  1. Jun 2023
    1. Author Response

      Reviewer #1 (Public Review):

      This work reports an important demonstration of how to predict the mutational pathways to antimicrobial resistance (AMR) emergence, particularly in the enzyme DHFR (dihydrofolate reductase). Epistasis, or non-additive effects of mutations due to their background dependence, is a major confounding factor in the predictability of protein evolution, including proteins that confer antimicrobial resistance. In the first approach, they used Rosetta to predict the mutant DHFR-drug binding affinity and the resulting selection coefficient, which then became inputs to a population genetics model. In the second approach, they use the observed clinical/environmental frequency of the variants to estimate the selection coefficient. Overall, this work is a compelling demonstration that a mechanistic model of the fitness landscape could recapitulate AMR evolution; however, considering that the number of mutations and pathways is small, a more compelling description of the robustness of the results and/or limitations of the model is needed.

      Major strengths:

      1) This is a compelling multi-disciplinary work that combines a mechanistic fitness landscape of DHFR (previously articulated in literature and cited by the authors), Rosetta to determine the biophysical effects of mutations, and a population genetics model.

      2) The study takes advantage of extensive data on the clinical/environmental prevalence of DHFR mutations.

      3) Provides a careful review of the surrounding literature.

      Major weakness:

      1) Considering that the number of mutations and pathways being recapitulated is rather small, I would suggest a more detailed description of the robustness of the results. For example:

      a) Please report the P-value for the correlation of the predicted DDG_{binding, theory} and DDG_{binding, experimental}.

      We thank the reviewer for the suggestion. We agree that the available experimental dataset is small, which limits the statistical power of the Pearson correlation test to determine how well Flex ddG predicts the change in binding free energy. However, as highlighted in the manuscript, two earlier studies by Aldeghi et al. (2018, 2019) considered much larger datasets and found a correlation in a similar range to the one we found here. Furthermore, as suggested by the Reviewer, we carried out a one-sided t-test with the alternative hypothesis that the correlation is greater than 0 and found a p-value of 0.040, suggesting that the correlation we observed is significant. We have added this test and p-value to the Results section.
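      For reference, a one-sided test of this kind is commonly computed from the sample correlation r and the number of data points n as shown below. This is the textbook form only; the response does not report r or n, and it is an assumption that this is the exact statistic used.

```latex
t = r\,\sqrt{\frac{n-2}{1-r^{2}}}, \qquad
p = \Pr\left(T_{n-2} \ge t\right),
\quad \text{where } T_{n-2} \text{ follows a Student } t \text{ distribution with } n-2 \text{ degrees of freedom.}
```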

      If interested in showing the correct assignment of mutational effects, perhaps use a contingency matrix to derive a P-value.

      As suggested by the Reviewer, we used a contingency matrix, known as a confusion matrix, to determine how accurately Flex ddG classifies mutations as stabilising or destabilising. This gave an accuracy of 0.89, a sensitivity of 0.83 and a specificity of 1. The p-value associated with this contingency table was 0.14, despite the high accuracy, sensitivity and specificity. This is likely because the small sample size makes it difficult to establish significance. This analysis has been included in the Results section.
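      As a minimal sketch of how the three summary statistics follow from a 2x2 confusion matrix: the counts below are hypothetical, chosen only because they round to the reported values. The response does not give the actual table, nor which test produced the p-value, so no p-value is computed here.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from 2x2 confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,        # fraction of all calls that are correct
        "sensitivity": tp / (tp + fn),        # true-positive rate
        "specificity": tn / (tn + fp),        # true-negative rate
    }

# Hypothetical counts (an assumption, not the authors' data).
metrics = confusion_metrics(tp=5, fn=1, tn=3, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```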

      b) Although the DDG_binding calculation in Rosetta seems to converge (Appendix figures 3 and 4), I do not think the DDG values before equilibration should be included in the final DDG estimate. In practice, there is a "burn in" number of runs where the force field optimizes the calculation to account for potential clashes in the structure, etc. This is particularly important since the starting structures are modeled from homology. Consequently, the distributions of DDG that include the equilibration runs are multimodal (Appendix figure 2), which means that calculating an average may be inappropriate.

      Each Flex ddG prediction is independent (see Figure 1 of Barlow et al. 2018 for a summary of the Flex ddG method), i.e. the distribution of values does not represent an MCMC process that needs a burn-in period to equilibrate. The structures of both the wild-type and the mutant are equilibrated in each run using the backrub algorithm. So many runs are required because each prediction is a draw from a distribution of possible ddG values associated with that specific mutation; the authors of Flex ddG suggest performing 35 runs or more and taking the average of the distribution. Therefore, to get an accurate prediction, enough simulations must be run per mutation to adequately characterise the distribution so that the average converges to a constant value.
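      The convergence idea described here can be sketched as a running mean over independent runs. The function names, the window of 10 runs and the tolerance of 0.05 are illustrative assumptions, not values taken from the manuscript or from Flex ddG.

```python
def running_means(ddg_runs):
    """Mean of the first k runs, for k = 1..len(ddg_runs)."""
    means, total = [], 0.0
    for k, value in enumerate(ddg_runs, start=1):
        total += value
        means.append(total / k)
    return means


def has_converged(ddg_runs, window=10, tol=0.05):
    """True if the running mean varied by at most `tol` over the last `window` added runs."""
    means = running_means(ddg_runs)
    if len(means) <= window:
        return False  # too few runs to judge convergence
    recent = means[-(window + 1):]
    return max(recent) - min(recent) <= tol
```

      Because every run is an independent draw, no initial runs are discarded; the only question is whether enough runs have been averaged.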

      2) The geographical areas over which the mutational pathways are independently estimated are not isolated, allowing for the potential that an AMR variant in one region arose due to "migration" from another area. For example, the S58R-S117N is the most frequent double mutant of PvDHFR in geographically proximate Southern/Southeastern Asia (Fig. 4). To a certain extent, similar mutational patterns occur for PfDHFR in Southern/Southeastern Asia (Fig. 3). Although accounting for mutant migration in the model may be beyond the scope of the study, a clear argument for the validity of the "isolated island" assumption is needed.

      The Reviewer is correct that some variants in one region may have arisen due to “migration” from another area. This would impact the method for inferring mutational pathways from regional isolate frequency data but not when considering the worldwide population. If this occurred, we would expect to see a multiple mutant appearing in a region without the precursor (single, double etc) mutations, even in the case of large sample size. However, this does not seem to have been an issue for the pathways we have been predicting here. If it were the case that a variant migrated, and the precursor mutations could not be found in that region, we could look to mutations from neighbouring regions to infer the pathway, under the assumption of migration.

      We have added some discussion on this between lines 517-523:

      “When inferring pathways at a regional level, it is possible we may encounter instances where genotypes with multiple mutations are observed in a specific region, but the precursor mutations in the pathway are absent. This could happen either due to insufficient sampling of the region or due to "migration" of the variant from a neighbouring region. To infer pathways in the former case more samples would be required, whereas in the latter case we can look to the data from neighbouring regions where the variant is present and use the frequency data of the precursor mutations.”
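      The check described in this passage can be sketched as follows: flag any multiple mutant in a region for which no genotype with one fewer mutation is observed. The counts and the names mutA and mutB are hypothetical placeholders; only S58R and S117N appear in the review.

```python
from itertools import combinations


def orphan_genotypes(region_counts):
    """Multi-mutation genotypes with no observed precursor (one mutation fewer)."""
    observed = {g for g, n in region_counts.items() if n > 0}
    orphans = []
    for genotype in observed:
        if len(genotype) < 2:
            continue  # single mutants have no mutant precursor to look for
        precursors = [frozenset(c) for c in combinations(genotype, len(genotype) - 1)]
        if not any(p in observed for p in precursors):
            orphans.append(genotype)
    return sorted(orphans, key=sorted)


# Hypothetical isolate counts for one region (an assumption for illustration).
region = {
    frozenset({"S117N"}): 12,
    frozenset({"S58R", "S117N"}): 30,
    frozenset({"mutA", "mutB"}): 4,
}
print(orphan_genotypes(region))
```

      A flagged genotype would then prompt either more sampling in that region or a look at precursor frequencies in neighbouring regions, as the quoted text suggests.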

    1. Author Response

      Reviewer #2 (Public Review):

      1) Analytical approaches are in the current form preliminary and not enough to draw firm biological conclusions. While the datasets are large (which is highly appreciated), they represent a relatively early stage of ENS development and possible differences between vagal and sacral-derived populations could partially be attributed to differences in maturity. Maturity will surely not explain the whole difference observed but needs to be factored into the interpretation. As scRNA-seq datasets from the mature chicken ENS are lacking (as well as a detailed IHC-based neural classification system), the inferences made in the paper between molecular classes and functional types are premature.

      We appreciate this comment and think it is an excellent suggestion that we definitely plan to do. This made us realize that we failed to clarify in the text why we chose this particular time point for our study, which is two-fold.

      First, we are particularly interested in how neural crest cells choose their prospective fates. E10 is a time when the post-umbilical gut has been completely populated by both vagal and sacral neural crest cells for 2 days, so cells are in the process of differentiation but there still exists a large precursor pool. For this reason, we can capture both precursors and some differentiated neuronal subtypes. We have clarified this point in the revised manuscript and now focus much more on the precursor population to identify both genes that are common to vagal and sacral neural crest cells and those that are distinct. This enables us to formulate testable hypotheses for the potential role of particular transcription factors in the allocation of cell fate. Of particular interest, we find that at E10, the sacral neuronal precursor pool is largely depleted whereas the vagal crest has a substantial neuronal precursor pool. Thus, we believe this is the perfect time point for initial analysis.

      Second and perhaps even more important, in the US, chick embryos are not considered vertebrates until after E10. Thus, E10 represents the last timepoint to which we can raise embryos without animal approvals, which are not currently in hand. We completely agree that performing experiments at later timepoints will be incredibly valuable and therefore are now applying for approvals. But realistically, these take several months and thus would delay publication of our datasets (already delayed due to Covid restrictions) for at least another year. Therefore, we propose to publish the mature dataset as a Research Advance that would focus on differences among mature neuronal subtypes in the pre-umbilical vagal, post-umbilical vagal and sacral datasets and would nicely complement the current work. In the meantime, we have refocused this paper on the precursor to differentiated neuron transition.

      I should mention that this refocusing seems particularly important given that our original aim was to explore differences between vagal and sacral neural crest contributions to the gut. However, the single cell data reveals strong overlap between sacral and vagal neural crest contributions to the postumbilical gut, suggesting a strong environmental influence on cell fate decisions.

      Specific concerns:

      1) Analysis of scRNA-sequenced sacral- versus vagal-derived ENS reveals clusters consistent with a non-ENS identity (endothelial, muscle, vascular and more). Previous studies in mouse using the neural crest tracing line Wnt1-Cre have not demonstrated such diverse progenies of neural crest from any region. An exception being a small population of mesenchymal-like cells (Ling and Sauka-Spengler, Nat Cell Biol. 2019; Zeisel et al., Cell 2018; Morarach et al., 2021; Soldatov et al., Science 2019). Therefore, the claimed broad potential of 6 of 13 neural crest giving rise to diverse gut cell populations warrants more validating experiments.

      We thank the reviewer for this comment. We clarify that the hematopoietic clusters have dropped out upon reanalysis. We believe the other clusters are real, based on gene markers used in previous studies to identify cell types, such as Mlana, Dct, and Mitf for neural crest-derived melanocytes.

      2) Several earlier studies have revealed that parts of the ENS are derived from neural crest cells that attach to nerve bundles, obtain a Schwann cell precursor-like identity and thereafter migrate into the gut (Uesaka et al. J Neurosci 2015 and Espinosa-Medina et al, PNAS 2017). The current work in chicken needs to be interpreted in the light of these findings and the publications should be discussed in relevant sections of the introduction and discussion.

      Thank you for this suggestion. We agree and indeed our data cannot differentiate between SCPs, which are neural crest-derived, versus early migrating neural crest cells. We have added this point to the discussion and also discuss these papers in more detail.

      3) The analysis indicates the presence of melanocytes. It is not clear why they are part of the GI-tract preparations. Could they correspond to another cell type, with partially overlapping gene expression profile as melanocytes?

      We have assigned these as melanocytes based on expression of Mlana, Mitf, and Dct as highly upregulated genes. These have been used in previous studies to identify neural crest-derived melanocytes in the heart (Chen et al., 2021).

      4) As evident, the sacral- and vagal-derived ENS are not clonally related. To decipher differentiation paths and relations between clusters, individual analyses of the different datasets are needed. With only one UMAP representing the merged datasets combined with little information on markers, it is hard to evaluate the soundness of the conclusions regarding cell-identities of clusters and lineage differentiation.

      This is an excellent suggestion and we apologize for not including this previously. We have now added individual pre-umbilical vagal, post-umbilical vagal and sacral neural crest datasets as well as trajectory analysis for each.

      5) E10 is a relatively early stage in chicken ENS development. Around E7, the intestines do not even contain differentiated neurons. The relatively high expression of Hes5 (marking mature enteric glia in the mouse; Morarach et al., 2021) in the vagal neural crest population might be explained by the more mature state of vagal versus sacral ENS. As also outlined below, Th/Dbh are known to be transiently expressed in the developing ENS, so they could indicate the relative immaturity of sacral neural crest rather than differential neural identities. These issues need to be taken into account when interpreting biology from scRNA-seq data.

      We completely agree. We now clarify that we are particularly interested in how neural crest cells choose their prospective fates. We chose the E10 time point because this reflects a time point when the post-umbilical gut has been completely populated by both vagal and sacral neural crest cells for 2 days so cells are in the process of differentiation but there still exists a large precursor pool. For this reason, we can capture both precursors and some differentiated neuronal subtypes. Notably, the sacral derived precursors seem to be glial in flavor whereas neuronal precursors appear to be absent. We have clarified this point in the revised manuscript.

      6) Unlike the guinea pig, and to some extent pig and murine ENS, the physiology of chicken enteric neurons has not been well characterized yet. Therefore, it is highly advisable to refrain from a nomenclature of clusters designating functions. Several key molecular markers are known to differ between murine, guinea pig, rat and human systems. IPANs are a good example where differential expression is seen (SST in human but not mice; CGRP labels some IPANs in mouse, but not in guinea pig, where Tac1 instead is expressed). IPANs are not defined in the chicken very well, and molecular markers found in other species may not be valid. Adrenergic and noradrenergic neurons have not been validated in the ENS (although TH and Dbh have been observed, especially in the submucosal ENS). Cholinergic neurons are also mentioned in the text, but do not appear in the figures as a defined group.

      Another reason to refrain from functional nomenclature is that a rather early stage is analysed in the present study, without possibilities to compare with scRNA-seq data from the mature chicken ENS (which was performed in Morarach et al, 2021 for the mouse). Recent data suggest that considerable differentiation may occur even in postmitotic neurons, and several markers are known to display a transient expression pattern (TH, DBH and NOS1; Baetge and Gershon 1990; Bergner et al., 2014; Morarach et al., 2021), so caution should be taken when inferring neuronal identities for clusters.

      This is an excellent point and we thank the reviewer for this valuable input. Accordingly, we have now renamed the clusters based on prominent gene expression rather than neuronal or precursor subtype. Indeed we struggled with finding appropriate names making this comment all the more useful.

      7) The immunohistochemical analysis (Figure 5,6) is an essential complementary addition and validation of scRNA-seq. However, it is very difficult to discern staining when magenta and red are combined to display coexpression.

      Good point. This has been changed to be more readily discernible and higher magnification views have been added.

      8) To give more information to the field and body of evidence for claims made, quantifications relating to the analysis in Figures 5 and 6 are warranted as well as an expanded set of marker genes that align with the scRNA-seq results.

      Good point. We have added additional markers as suggested. In terms of quantitation, we can include numbers of labeled cells in a particular region but this may give a false impression of degree of contribution since we are using different viruses for vagal vs sacral that may have different titers making it a bit like comparing apples and oranges. We now emphasize that our labeling approach does not mark the entire population and that the degree of labeling can be variable.

      9) Correlations between genes and functions/neuron class are in many cases wrong (including Grm3, Gad1, Nts, Gfra3, Myo9d, Cck and more).

      Good point. We have toned this down.

      10) Attempts to subcluster neuronal populations are needed (Figure 7). However, to understand the biology, it is important to address which cells are sacral versus vagal-derived. Additionally, related to the previous comment, as the vagal and sacral neurons are not clonally related, it would be important to perform separate analyses of neurons relating to each region.

      Good point. We have added additional analysis to address this important point in what is now Fig 6 and in particular validated sacral contributions to glial cells (new Fig 8).

    1. Author Response

      Reviewer #2 (Public Review):

      In this study, Yang et al. used single-cell technology to construct the cell profiles of normal and pathological ligaments and identified the critical cell subpopulations and signaling pathways involved in ligament degeneration. The authors identified four major cell types: fibroblasts, endothelial cells, pericytes, and immune cells from four normal and four pathological human ligament samples. They further revealed the increased number of fibroblast subpopulations associated with ECM remodelling and inflammation in pathological ligaments. In addition, the authors further resolved the heterogeneity of endothelial and immune cells and identified an increase in pericyte subpopulations with muscle cell characteristics and macrophages in pathological ACL. Ligand-receptor interaction analysis revealed the involvement of FGF7 and TGFB signaling in interactions between pathological tendon subpopulations. Spatial transcriptome data analysis also validated the spatial proximity of disease-specific fibroblast subpopulations to endothelial cells and macrophages, suggesting their interactions in pathological ligaments. This study offers a comprehensive atlas of normal and pathological cells in human ligaments, providing valuable data for understanding the cellular composition of ligaments and screening for critical pathological targets. However, more in-depth analyses and experimental validation are needed to enhance the study.

      1) In this study, the authors performed deconvolution analysis between bulk RNA sequencing results and scRNA-seq results (L204-L208). However, the analysis of this section is not sufficiently in-depth and the authors failed to present the proportion of different cell subpopulations of the bulk sequencing samples to further increase the reliability of the results of the single cell data analysis.

      Thank you for the suggestion. We selected the top 50 DEGs in each scRNA-seq subpopulation and scored these gene sets in the bulk RNA sequencing data using the GSVA method, so as to present, to some extent, the proportion of different cell subpopulations in the bulk sequencing samples. The results showed that, in the bulk RNA-seq data, fibroblast subpopulations 1, 2, 8 and 9 scored higher in the diseased group than in the normal group, whereas fibroblast subpopulations 3 and 4 scored higher in the normal group than in the diseased group, which is consistent with the scRNA-seq results.
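      To illustrate the general idea of scoring a marker gene set in bulk samples, the sketch below averages per-gene z-scores across a set. This is a deliberately simplified stand-in and not the GSVA algorithm itself; the gene names, sample labels and expression values are hypothetical.

```python
from statistics import mean, pstdev


def gene_set_scores(expression, gene_set):
    """expression: {gene: {sample: value}}. Returns {sample: mean z-score over gene_set}."""
    samples = sorted(next(iter(expression.values())))
    z_by_sample = {s: [] for s in samples}
    for gene in gene_set:
        if gene not in expression:
            continue  # marker gene not measured in the bulk data
        values = [expression[gene][s] for s in samples]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # gene is constant across samples, so it carries no signal
        for s in samples:
            z_by_sample[s].append((expression[gene][s] - mu) / sd)
    return {s: (mean(z) if z else 0.0) for s, z in z_by_sample.items()}


# Hypothetical bulk expression for two marker genes in two samples.
bulk = {
    "geneA": {"normal": 1.0, "diseased": 3.0},
    "geneB": {"normal": 2.0, "diseased": 6.0},
}
scores = gene_set_scores(bulk, ["geneA", "geneB"])
print(scores)
```

      A higher score in one group indicates that the subpopulation's marker genes are, on average, more highly expressed there, which is the kind of comparison reported in the response.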

      2) In results 5, the authors should clearly describe whether the analysis is based only on pathological subpopulations of ligament cells or includes a mixture of normal and pathological subpopulations; the corresponding description should also be indicated in Figure 5. Besides, although the authors claimed that "the TGF-β pathway was involved in many cell-cell interactions among fibroblasts subpopulations and macrophages", Figure 5C displayed that the CD8+NKT-like cells displayed the most TGFB signaling interactions with fibroblasts subpopulations.

      Thank you for your great questions. In results 5, our analysis is based on the mixture of normal and diseased subpopulations. We have also added a description of the data sample in the corresponding position in our manuscript.

      As for the question of the TGF-β pathway in the cell-cell interaction analysis, we claimed that “the TGF-β pathway was involved in many cell-cell interactions among fibroblasts subpopulations and macrophages” because we took into account the proportion of each subpopulation of immune cells. Macrophages are the largest subpopulation of immune cells, and the number of macrophages is significantly increased in the degenerative group, suggesting that they are closely related to disease progression. In contrast, CD8+NKT-like cells made up a very small proportion of immune cells, and their number was essentially unchanged between the normal and diseased groups. Macrophages were therefore the focus of our attention, and after comprehensive analysis we did not mention the strength of the TGFB signaling interactions of CD8+NKT-like cells.

      3) In result 6, the authors performed spatial transcriptome sequencing, however, the sample numbers were relatively limited, with only one sample from each group; in addition, the results of this part failed to correlate and correspond well with the single-cell results. The subgroups labelled in L382 and L384 should be carefully checked. Besides, expression data of FGF7 and TGFB ligand and receptor molecules based on the spatial transcriptomes should be added to further confirm the critical signalling pathway in regulating the cellular interactions in pathological ACL.

      Thank you for the reminder. The purpose of our spatial transcriptome sequencing (spRNA-seq) was to verify the scRNA-seq results, so only one representative sample from each group was selected for spRNA-seq. We believe that the results of our spRNA-seq correlate and correspond well with the scRNA-seq results. The scRNA-seq results were validated on the spRNA-seq data using marker transfer and spotlight methods, respectively. The enrichment of fibroblast4 in the normal group and of fibroblast9 in the diseased group seen in the scRNA-seq data was also reflected in their distribution across the spRNA-seq samples. As shown in the spotlight plots, the fibroblast subsets that were more abundant in the scRNA-seq data of the disease group (fibroblast1,2,8,9) were more widely distributed in the spRNA-seq sample of the disease group, and were closer to endothelial cells and immune cells in spatial location. We have revised the subgroups labelled in L382 and L384.

      According to your suggestions, FGF7 and TGFB related ligand and receptor genes were mapped on spRNA-seq data, and the results were consistent with the results of cellchat analysis in scRNA-seq.

    1. Author Response

      Reviewer #1 (Public Review):

      It has been previously shown that defective autophagy and a disorganized microtubule network contribute to the pathogenesis of Duchenne muscular dystrophy (DMD). The authors previously reported that NADPH oxidase 2 (NOX2) regulates these alterations. It was also shown that acetylated tubulin facilitates autophagosome-lysosome fusion and thus autophagy. In the present study, the authors showed that autophagy is differentially regulated by redox and acetylation modifications in dystrophic mdx mice. The ablation of Nox2 in mdx mice activated the autophagosome maturation but not its fusion with the lysosome. On the other hand, the inhibition of histone deacetylase 6 (HDAC6) restored microtubule acetylation, promoted autophagosome-lysosome fusion, and improved muscle function in mdx mice. The strength of this paper is the combination of different approaches to decipher the mechanism, including the evaluation of the level and interaction of several proteins involved in the maturation of autophagosomes and in the fusion between autophagosomes and lysosomes.

      This study reveals an important molecular mechanism by which increasing microtubule acetylation improves autophagy and muscle function in dystrophic mice. This has a translational impact on several diseases in which autophagy is impaired. The improvement of autophagosome-lysosome fusion with HDAC6 inhibitor is supported by several data, but some parts merit further analysis:

      1) To add appropriate controls (e.g. without antibodies) to support protein-protein interaction for all co-immunoprecipitation assays.

      Thank you for your valuable suggestion. We appreciate your input and have taken it into consideration. Based on your recommendation, we have conducted an experiment by including IP-IgG as a negative control to support the protein-protein interaction results obtained from the co-immunoprecipitation assays. The results of the negative control have been included in the respective figures. Additionally, to ensure the accuracy of the negative control, we ran the positive controls on the same blot. We have immunoprecipitated the same amount of samples for the negative control as we did for the actual IP samples presented in the manuscript. We believe that the inclusion of the negative control has strengthened the validity of our results and the conclusion drawn from our study.

      2) The simple evaluation of the protein levels of p62 and LC3-II is not sufficient to claim autophagy improvement after HDAC6 inhibition. It would be good to evaluate the autophagic flux in vivo in all groups of mice (to treat the mice with or without autophagy inhibitor and evaluate whether the difference in the level of LC3-II between the two conditions is higher with HDAC6 inhibitor than without in the mdx mice).

      Thank you for your suggestion to further evaluate the role of TubA on autophagic flux in vivo. We have included data using chloroquine to test the effect of TubA on autophagic flux in vivo. We found that chloroquine increased LC3 and p62 in skeletal muscle from mdx and mdx + TubA mice, suggesting. We have now included this information in the revised manuscript.
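      The comparison the reviewer asks for can be written compactly. The notation below is an illustrative assumption, not taken from the manuscript: for each group of mice, flux is read out as the inhibitor-induced increase in LC3-II.

```latex
\Delta_{\mathrm{LC3\text{-}II}}(g) = \mathrm{LC3\text{-}II}_{g,\,+\mathrm{inhibitor}} - \mathrm{LC3\text{-}II}_{g,\,-\mathrm{inhibitor}},
\qquad
\text{improved flux if }\;
\Delta_{\mathrm{LC3\text{-}II}}(\mathit{mdx}+\mathrm{TubA}) > \Delta_{\mathrm{LC3\text{-}II}}(\mathit{mdx}).
```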

      Reviewer #2 (Public Review):

      Agrawal et al. propose an interesting model in which the autophagy pathway in adult mouse skeletal muscle fibers is orchestrated by two independent mechanisms: a) the activity of the NADPH oxidase (Nox) 2 enzyme necessary for autophagosome biogenesis and maturation and b) the level of acetylation of the microtubule (MT) network more selectively responsible for the fusion of the autophagosomes to the lysosomes. Using the well-known mdx mouse, a model for Duchenne muscular dystrophy, the authors perform a quite impressive (but rather traditional) biochemical characterization of the autophagy pathway and found that biogenesis and maturation of the autophagosomes are impaired in mdx mice muscle fibers by means of altered expression of components of the class III phosphatidylinositol 3-kinase complex (PI3K) such as Beclin, VPS15 (both upregulated in mdx mice), ATG14L and VPS34 (both downregulated), and by the reduced expression of JNK and JIP-1, required for the formation of the heterodimer between Beclin and ATG14L-VPS34. In mdx mice, defective nucleation of the phagophore appears to be coupled to altered elongation and expansion as confirmed by decreased expression of WIPI-1, an early marker of autophagosome formation, required for the assembly of the ATG5-12 complex. Clearance of sequestered cytosolic components necessitates the fusion of the autophagosome with the lysosome, a process that the authors found impaired in mdx mice due to altered formation of the SNARE tertiary complex (STX17-SNAP29-VAMP8), as a result of the marked reduction of STX17 expression.

      In a previous work (Pal et al., Nat Commun 2014), the same group described the generation of an mdx-based mouse model where Nox2 activity was abolished by genetic ablation of the p47phox component. These mice presented with a better outcome in terms of dystrophic pathophysiology by means of reduced oxidative stress and improved autophagy. Further characterization of these mice in the present study reveals that in p47-/-/mdx mice abolishment of Nox2 activity restores autophagosome nucleation and maturation thanks to the increased expression of p-JNK, JIP-1 and improved stability of the Beclin-ATG14L complex, but no amelioration is observed on the formation of the SNARE tertiary complex indicating that the biogenesis of autophagosomes is dependent on Nox2 activity but not the fusion between autophagosomes and lysosomes. Given the existing body of evidence in non-muscle cells pointing at alpha-tubulin acetylation as a regulator of MT activity facilitating the fusion of autophagosomes to lysosomes, the authors thought to investigate the level of MT acetylation in mdx mice muscle fibers and found that acetylation is reduced but can be restored by inhibiting the HDAC6 enzyme via the FDA-approved, highly selective pharmacological inhibitor Tubastatin A (Tub A). Treatment of mdx mice at 3 weeks of age (before the onset of pathological manifestations) with Tub A not only restored the normal level of alpha-tubulin acetylation (without altering the organization and density of the MT network) but also curbed the intracellular redox status and improved the autophagic flux by stabilizing the SNARE tertiary complex. 
      Interestingly, treatment of dystrophic mice with Tub A results in substantial improvement of the dystrophic phenotype as confirmed by a reduced level of apoptosis, diminished tissue inflammation, improved sarcolemma integrity, and superior force generation capacity in ex vivo experiments using the diaphragm and Extensor Digitorum Longus (EDL) muscle fibers of Tub A-treated mdx mice compared to untreated mdx and healthy counterparts.

      The in-depth characterization of the steps orchestrating the autophagy pathway in the mdx mouse model on the one hand, and the comprehensive evaluation of the phenotype of the mdx mice treated with the HDAC6 inhibitor Tubastatin A on the other, support the conclusions proposed by the authors. Nonetheless, some aspects deserve consideration.

      1) The effect of increased alpha-tubulin acetylation by means of genetic and pharmacological strategies (i.e., in vivo overexpression of alpha-tubulin acetyltransferase-aTAT1 and treatment with Tubacin or Tubastatin A, respectively) has been previously explored in isolated cardiomyocytes and skeletal muscle fibers and revealed that augmented MT acetylation, due to selective inhibition of HDAC6, increases cytoskeletal stiffness and favors Nox2 activation (Coleman et al., J Gen Physiol 2021).

      We have added a discussion of the work by Coleman and colleagues. In brief, that work was in wild-type cardiac and skeletal muscle and showed that MT acetylation controlled stiffness in control muscle cells. Interestingly, while they did not quantify MT organization, their data suggest that HDAC6 inhibition does not alter organization. Here, we are assessing the role of MT acetylation in a diseased model, mdx. Taken together, our data along with that from Ward and colleagues highlight the importance of a proper balance of tubulin acetylation in order to maintain cellular signaling, which is different between non-diseased and diseased skeletal muscle.

      2) Altered organization and density of the MT network in mdx FDB muscle fibers with loss of vertical directionality is also not novel and has been reported by others (see Randazzo et al., Hum Mol Genet 2019), who also observed that overexpression of a single beta-tubulin (tubb6) in normal Flexor Digitorum Brevis (FDB) muscle fibers mimics the disruption to the MT network of mdx FDB fibers, increases the level of detyrosinated tubulin and increases Nox2 activity (through elevated expression of gp91phox). Conversely, downregulation of the same beta-tubulin restores normal MT organization in mdx FDB. Previous work from the authors (Loehr et al., eLife 2018) reported that in p47-/-/mdx mice MT organization in diaphragm muscle fibers is normalized and autophagy improved. Accordingly, it is puzzling that increased alpha-tubulin acetylation determines such a wide range of ameliorations in terms of physiological and morphological aspects in dystrophic skeletal muscle fibers treated with Tubastatin A whereas no improvement in the overall MT organization is observed, as reported by Agrawal and colleagues.

      Our findings are also supported by Coleman et al., who show that HDAC6 inhibition did not alter levels of DT-tubulin. Although that group did not specifically measure MT organization, their representative images of alpha-tubulin (Figure 1D, control and tubacin) indicate that HDAC6 inhibition does not alter MT organization in wild-type FDBs.

      3) Given that p47-/-/mdx mice present with levels of acetylated alpha-tubulin and HDAC6 expression comparable to mdx while showing significant improvement of the dystrophic phenotype despite partial rescue of the autophagic flux (as reported in Loehr et al., eLife 2018), it would have been of great interest to investigate the effect of HDAC6 inhibition in p47-/-/mdx mice as well.

      We would like to thank the reviewer for acknowledging our in-depth characterization of the steps orchestrating the autophagy pathway in the mdx mouse model and the comprehensive evaluation of the phenotype of the mdx mice treated with the HDAC6 inhibitor Tubastatin A. While we believe these experiments are of interest, we think that they merit a detailed investigation that is beyond the scope of the current work.

    1. Author Response

      Reviewer #1 (Public Review):

      In the manuscript, titled "Comparative single-cell profiling reveals distinct cardiac resident macrophages essential for zebrafish heart regeneration," Wei et al. perform bulk and single-cell RNA-sequencing on uninjured and injured zebrafish hearts with or without prior macrophage depletion by clodronate. For the single-cell RNA sequencing, the authors sort macrophages and neutrophils prior to sequencing by using fluorescent reporters for each of the two lineages. The authors characterize the differential gene expression between injured and uninjured hearts with and without prior macrophage depletion. The single-cell analyses allow the characterization of nine discrete subpopulations of macrophages and two distinct neutrophil types. The manuscript is largely descriptive with lots of discussion of specific differentially expressed genes. The authors conclude that tissue-resident macrophages are important for heart regeneration through the remodeling of the microenvironment and by promoting revascularization. Circulating monocyte-derived macrophages cannot adequately replace the resident macrophages even after recovery from clodronate depletion.

      The manuscript presents a very large catalog of useful gene expression data and further characterizes the diversity of macrophages and neutrophils in the heart following injury. Although the conclusions that resident macrophages are important for regeneration and that circulating macrophages cannot adequately substitute for them are not particularly novel, this manuscript provides additional support for those ideas and extends that work by providing a wealth of gene expression data from the different macrophage sub-populations in the zebrafish and how they respond to and promote regeneration. The authors also present a nice analysis supporting the interactions of macrophages with neutrophils via comparing receptors and ligands (from gene expression data) on the two populations - this should be a useful resource.

      We appreciate how reviewer #1 recognizes the work we have put into sample preparation, data collection, and all the bioinformatic analyses to delineate and characterize the inflammatory cells during zebrafish heart regeneration.

      Reviewer #2 (Public Review):

      Wei et al. analysed the composition of immune cells, mostly macrophages and neutrophils, in the context of zebrafish cardiac injury while utilizing clodronate liposomes (CL) to inhibit regeneration via alteration of the immune response. This work is a direct continuation of Shih-Lei et al. which compared the regenerative outcomes of zebrafish vs the non-cardiac regenerative medaka. In that work, the authors used CL to pre-deplete macrophages and showed significant effects on neutrophil clearance, revascularization, and cardiomyocyte proliferation. In this work, the authors used the same pre-depletion method to study the dynamics, composition, and transcriptomic state of macrophages and neutrophils, to overall assess the effect on cardiac regeneration. Bulk RNA-seq of CL- vs PBS-treated hearts at 7 and 21 days post cryoinjury (dpci) revealed a delayed/altered immune response. Single-cell analysis at 1, 3, and 7 dpci showed a wide range of immune populations, of which the macrophage populations are the most diverse. Pre-depletion using CL altered the composition of immune cells, resulting in the complete removal of a single resident macrophage population (M2) or dramatically reducing the overall numbers of other resident populations, while other populations were retained. Looking at the injury time course and distribution of macrophage populations, the authors identified several macrophage populations and neutrophil population 1 as pro-regenerative, as their presence compared to CL-treated hearts correlates with regeneration. CL-treated hearts also show a marked sustained neutrophil retention, suggesting that interaction with depleted macrophage populations is required for neutrophil clearance.
As the marked reduction in populations 2 and 3 occurs after CL treatment, the authors tested whether early CL treatment (8 days or 1 month prior to injury) could reduce the non-recoverable populations and affect regenerative outcomes, and indeed they observed a reduction in key genes characterizing M2 and M3 which caused marked reduction in revascularization, CM proliferation, neutrophil retention, and overall higher scarring of the heart.

      The findings of this paper could be broadly separated into the characterization of myeloid cells after injury and in non-regenerating animals and assessing the effects of early pre-depletion of macrophages on various cardiac functions involved in regeneration. Both parts draw conclusions that are supported by the facts; however, several questions remain to be clarified.

      We thank the reviewer for recognizing that the conclusions we drew were supported by the data we presented, and we respond to the specific suggestions below.

      1) In Figures 2 and 3 the main claim is that the main resident macrophage populations, M2 and M3, are depleted and are largely unable to replenish after injury, similar to resident macrophages in mice [1]. However, as the identification of this population is made solely using scRNA-seq, an alternative explanation would be that these cell populations do replenish but are sufficiently changed due to CL treatment (directly or indirectly) and thus would be a part of another cluster. To address this, we suggest:

      A. Run trajectory analysis to ascertain whether the different cell clusters are due to differentiating states of the cells

      B. Create a reporter line for M2 and M3 macrophages and assess whether they are indeed depleted or changing.

      We followed the reviewer’s suggestion and performed trajectory analyses (Figure 6). The results suggest that Mac 2 and Mac 3 form a unique trajectory, which was not shifted by -1d_CL treatment but only diminished in cell number. Conditionally-enriched gene ontology analysis (Figure 4) also suggests that Mac 2 and 3 do not change their properties under the -1d_CL condition (unlike monocyte-derived Mac 1 and some other clusters). When we examined homx1a expression (Mac 2) and timp4.3 expression (Mac 3) in -8d_CL treated hearts, we again observed diminished cell numbers (Figure 8C and Figure 7-figure supplement 1D). These results support the view that the resident macrophages Mac 2 and Mac 3 are more likely to be non-recoverable than to have changed their properties so much that they are grouped into other subsets.

      We also agree with the reviewer that specific reporter and CreER driver lines for lineage tracing would provide the most concrete answer to this question. We have now generated an endogenous Tg(mpeg1-2A-CreERT2) line in the lab (collaborative work with McGrail lab) and reporter lines using Mac2/3 enriched genes. Unfortunately, this work will take much longer and might not fit within the scope of the current study.

      2) One of the major findings of this paper is that some macrophage populations can persist throughout injury and promote the regenerative response. Considering that macrophages have a half-life of less than a day in tissue [2] (although this could be different in zebrafish and in this population), we estimate that the resident populations should be proliferative. As there is only a single proliferating macrophage population (M5), we speculate that it is a combination of several populations which are clustered together due to the high expression of cell cycle genes. To verify whether the resident populations are proliferating, we suggest:

      A. Perform cell-cycle scoring and regression (found in Seurat package) and assess whether after regressing out cell cycle genes there are contributions of M5 to other clusters.

      B. Perform EdU labelling experiments with cell cycle identifiers (staining for hbaa1, Timp4.3) and assess their proliferative dynamics.

      We followed the reviewer’s suggestion and performed cell-cycle scoring and regression (Figure 2-figure supplement 4). Cell-cycle scoring suggests that cells in both Mac 2 and 3 are in the G2/M phase and presumably proliferative. Cell-cycle regression results suggest that most macrophage subsets, including Mac 5, still stand as unique clusters after regression (Figure 2-figure supplement 4). These results suggest that Mac 5 might not consist of proliferating cells from other clusters.

      On the other hand, we also tried to double-stain the proliferating resident macrophages by EdU and ISH of hbaa1 and timp4.3. Unfortunately, these methods were not compatible in our hands, and we failed to confirm their proliferative dynamics. We did show proliferating macrophages residing in the untouched hearts and will further check their identity once we have the cluster-specific reporter lines ready.

      Last but not least, using the Tg(mpeg1-2A-CreERT2) line to label embryonic macrophages under the Tg(ubi:loxP-EGFP-loxP-mCherry)cz1701 background before 7 dpf, we observed mCherry+ macrophages in juvenile fish at 50 dpf, suggesting some embryonically derived macrophages can last more than a week in the system, presumably by self-renewing. As noted in our previous reply, these results might not be included in this study.
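
      The idea behind the cell-cycle scoring and regression discussed above can be sketched in a few lines of plain Python. This is a toy illustration only, not the Seurat routines the reviewer refers to: the score here is simply the mean expression of a small cell-cycle gene set, the regression is an ordinary least-squares fit on that single score, and all expression values and the gene name marker_x are made up for the example.

```python
from statistics import mean


def cell_cycle_score(cell, gene_set):
    """Toy score: mean expression of the cell-cycle genes in one cell."""
    return mean(cell[g] for g in gene_set)


def regress_out(values, covariate):
    """Return residuals of an ordinary least-squares fit values ~ covariate."""
    mx, my = mean(covariate), mean(values)
    sxx = sum((x - mx) ** 2 for x in covariate)
    if sxx == 0:
        # Covariate is constant: only the mean can be removed.
        return [y - my for y in values]
    slope = sum((x - mx) * (y - my) for x, y in zip(covariate, values)) / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Hypothetical normalized expression for four cells (values are invented).
cells = [
    {"mki67": 0.1, "top2a": 0.3, "marker_x": 1.2},
    {"mki67": 2.0, "top2a": 1.8, "marker_x": 3.1},
    {"mki67": 0.2, "top2a": 0.0, "marker_x": 1.0},
    {"mki67": 1.5, "top2a": 1.7, "marker_x": 2.6},
]
cycle_genes = ["mki67", "top2a"]  # assumed stand-in for a cell-cycle gene list

scores = [cell_cycle_score(c, cycle_genes) for c in cells]
residuals = regress_out([c["marker_x"] for c in cells], scores)
print(scores)
print(residuals)
```

      After regression, the residual expression of marker_x is uncorrelated with the cell-cycle score, so a cluster that still separates on such residuals is not defined by cell-cycle genes alone.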

      3) In connection to the previous point, if indeed these resident macrophage populations are proliferative, even a smaller portion of remaining cells should be sufficient to partly replenish given sufficient time after CL [1]. However, as seen in Fig. 3B, the M2 population has a similar proportion of cells on days 1 and 3 after CL treatment and by day 7 it declines in numbers. Given that CL should not be present anymore, we expect this population to increase in numbers over time.

      We thank the reviewer for pointing out that Figure 3B might be misleading, as it reports the proportion of each macrophage subset. The persistence of the Mac 2 proportion at 1 and 3 dpci might be due to the overall depletion of both resident and recruited macrophages after CL treatment. CL treatment still has profound effects on total macrophage numbers 2 days after treatment (Figure 7-figure supplement 1A and Lai et al., 2017), and the overall macrophage numbers only recovered to the same level as those in untouched or PBS-treated injured hearts by 7 days (Figure 7-figure supplement 1A and Lai et al., 2017). We have also confirmed that Mac 2 was diminished in CL-treated hearts by both qPCR and ISH/IHC of homx1a in Figure 7-figure supplement 1C and Figure 8B.

      4) In Figure 6 the authors show a reduction in the mpeg+ population; however, a persistent, large population (±70% of the original mpeg+) is retained. The authors suggest that this population comprises other, non-macrophage cell types; however, as this method is the very core of the paper and the persistence of macrophages could alter our understanding of the results, it must be verified.

      [1] Dick, S. A. et al. Self-renewing resident cardiac macrophages limit adverse remodeling following myocardial infarction. Nature Immunology 20, 29-39, doi:10.1038/s41590-018-0272-2 (2019).

      [2] Leuschner, F. et al. Rapid monocyte kinetics in acute myocardial infarction are sustained by extramedullary monocytopoiesis. J Exp Med 209, 123-137, doi:10.1084/jem.20111009 (2012).

      We acknowledge that mpeg1 might not be the perfect marker for pan-macrophage labeling, as shown by Ferrero et al., J Leukoc Biol. 2020, which was published when our profiling work was already under way. Fortunately, scRNAseq profiling is an unbiased method to reveal gene expression/cell identity, and our results indeed identified non-macrophage/non-neutrophil populations in the clustering, including mpeg1+ B cells, consistent with the literature. Thus, the mixed input from the mpeg1 reporter does not affect the identity of Mac 2 and 3 as mpeg1-positive macrophages, which diminished after both -1d_CL and -8d_CL treatment. Following the reviewer’s suggestion, we further verified this point by both qPCR of hbaa1 and timp4.3 and ISH/IHC of homx1a and timp4.3 in the CL-treated hearts (Figure 7-figure supplement 1C and D and Figure 8B and C).

      Reviewer #3 (Public Review):

      Macrophages play an important role during heart regeneration. This has been shown in the mouse and zebrafish for example by treating the animals with clodronate liposomes to eliminate phagocytic cells.

      The manuscript follows up on a previous observation by the authors performing these experiments in the zebrafish (Lai et al., eLife 2017). When comparing regenerative and non-regenerative teleosts (zebrafish and medaka, respectively), they found that macrophages and neutrophils were the cell types responding most differentially to a cardiac injury in these two species.

      Here the authors analyze in extenso neutrophil and macrophage populations using single-cell RNA-seq at different stages of regeneration. They perform FACS sorting of the two populations using specific reporter lines. They also assess the change in these populations upon clodronate treatment. They find that clodronate treatment affects the gene expression profiles of different subsets of macrophages and neutrophils as well as their abundance.

      They also show that clodronate treatment performed several days before cryoinjury depleted macrophages from the heart, but after injury overall macrophage number recovers. However, heart regeneration does not. Cardiomyocyte is the only parameter that is not affected, but vasculogenesis and scar resolution are impaired.

      The authors conclude that (1) there are different subsets of macrophages and neutrophils, (2) that they interact with each other during regeneration through specific ligand and receptor pairs, and (3) that a cardiac resident population rather than a circulating macrophage population is important for heart regeneration.

      The transcriptomic characterization of the two immune cell populations is very exhaustive and rigorous. No functional validation of subpopulation marker genes was performed, but the data as it stands will already be of great value to the community. The figure quality is outstanding.

      We thank the reviewer for recognizing the value of our study and the quality of the data presented. We further examined the subpopulation markers and their functional relevance in the revised manuscript, as suggested.

    1. Author Response

      Reviewer #1 (Public Review):

      In the current work, the authors aimed to investigate the genetic and non-genetic factors that impact structural asymmetry.

      A major strength is the number of data samples included in the study to assess brain structural asymmetry. A consequence of the inclusion of many samples is then also the sample size.

      We thank the reviewer for their supportive and insightful comments that have helped improve our paper.

      Comment #1: Given that the authors also work with longitudinal data, it would be nice to be able to appreciate the individual effects across time points, this is now a little unclear.

      Our lifespan analysis incorporated both single and repeat measures over time in the trajectory estimation, and hence these will be an intermediate estimate of cross-sectional and longitudinal trajectories. We have clarified this in the Methods (see 1). A comprehensive analysis of the individual-specific asymmetry change effects in the current paper is thus hindered by many properties of the data, including that many participants contribute a single measure, that participants vary in their number of repeat-measures (1-6 timepoints), that the number of repeat-measures is dependent on age, and that the degree of asymmetry change differs between cortical metrics, clusters, and along the age variable. Most importantly, the average degree of asymmetry change is small; Fig. 3 indicates thickness asymmetry typically corresponds to a ~0.1 - 0.2mm difference, such that changes therein will be smaller and thus likely unclear at the individual level. Nevertheless, we have modified the average plots in Figures 2 and 3 to allow better visualization of the individual hemispheric measures across timepoints, as well as an appreciation of the density of our longitudinal data.

      1 – (line 646) “GAMMs incorporate both single and repeat measures over time to capture nonlinearity of the mean level trajectories across persons, resulting in population estimates that are intermediate between cross-sectional and longitudinal trajectories”

      Comment #2: A possible less well-developed approach is the genetic basis, as this was stated as the main question, here the investigations are not that deep and may only touch upon the question.

      We agree the previous formulation of our Abstract did convey this impression, and have thus made the following important amendment:

      (Abstract) “Cortical asymmetry is a ubiquitous feature of brain organization that is subtly altered in some neurodevelopmental disorders, yet we lack knowledge of how its development proceeds across life in health. Achieving consensus on the precise cortical asymmetries in humans is necessary to uncover the developmental timing of asymmetry and extent to which it arises through genetic or later influences in childhood.”

      Our paper aims to serve as a critical reference for the normative childhood development and lifespan change of cortical asymmetry. We performed heritability analyses as they are informative regarding development and shed light on the timing of influences shaping cortical asymmetry (also possibly prior to age ~4 at which our sample starts). Similarly, genetic correlation analysis sheds light on whether the replicable interregional correlations are underpinned by genetic differences, indicative of coordinated genetic development of asymmetries. We apologize the rationale behind these analyses was not well-specified, and have clarified this (see response #4). Thus, we respectfully disagree the genetic aspect represented the main research question, but rather lends support to our developmental perspective.

      Given the density of analyses already included and that these are well-specified within the context of our overarching question, we do not see how adding more genetic analyses will be beneficial for our paper. However, we agree with the Reviewer’s subsequent comment (#8) that the genetic correlations in HCP data should also have been reported, and now incorporate these (see response #8).

      Comment #3: Moreover, the association with cognition, handedness, sex, and ICV is somewhat interesting yet seems also a bit minimal to fully grasp its implications.

      In the asymmetry field it has been commonplace to assume these factors are strongly related to asymmetry, particularly sex. Here, despite optimizing the delineation of asymmetries, associations with factors purportedly related to it were all very small. We believe this is an important message that may help reorient the field away from entrenched views; unless we show it is not the case, researchers may think the effects of these factors are larger than they are. Further, because questions pertaining to sex and handedness differences will certainly arise for many, we chose to address them by quantifying the average effects in big data, because our lifespan trajectory analysis was not well-suited to assessing e.g. sex differences in asymmetry trajectories (i.e. 3-way non-linear interactions; sex × age × hemisphere). We have strengthened the reasoning for this analysis in the Introduction (see 1):

      1 – (line 118) “Therefore, as a final step, we reasoned that combining an optimal delineation of population-level cortical asymmetries with big data would optimize detection and quantification of the effects of factors commonly assumed important for asymmetry, namely general cognitive ability, handedness and sex.”

      Contrary to approaches that often place emphasis on p-values (e.g. pheWAS), our targeted approach using variables long considered important for asymmetry enabled transparent reporting of the effect sizes and directions. We hope the Reviewer agrees we have taken care in this regard, and are careful to communicate the found effects are small. The small effects seem typical of structural brain associations in big data, as may be expected when relating complex phenotypes to any single structural measure. For these reasons, we opt not to extend the analysis beyond our initial targeted approach, arguing instead that the size of the effects is reason enough to report them.

      Despite being small, however, we argue they are not negligible (see 2-4). Of note, though it may appear so in Fig. 7, the p-value for the cognitive association was far from just surviving Bonferroni correction (it would survive >13,000 comparisons at our alpha level [⍺=.01], whereas we corrected for our 136). Note we did not accept a 5% false positive rate. We have clarified this in the Results (see 5):

      2 – (line 485) “Other factors commonly espoused to be important for asymmetry were associated with only small average effects in adults. For example, we found one region – SMG/perisylvian – wherein higher leftward areal asymmetry related to subtly higher cognitive ability. Since interhemispheric anatomy here is likely related to brain torque 2,3, this may agree with work suggesting torque relates to cognitive outcomes 4,5. Interestingly, that ~94% of humans exhibit leftward asymmetry in this region (Figure 1G) suggests tightly regulated genetic-developmental programs control its lateralized direction in humans (see Figure 6). This result may therefore suggest disruptions in areal lateralization early in life are associated with cognitive deficits detectable in later life as small effects in big data 6. While speculative, this may also agree with evidence that differences in general cognitive ability that show high lifespan stability 6 relate primarily to areal phenotypes formed early in life 7–9.”

      3 – (line 461) “We also found areal asymmetry in anterior insula is, to our knowledge, the most heritable asymmetry yet reported with genomic methods 10–14, with common SNPs explaining ~19% variance. This is notably higher than in our recent report (< 5%) 14, illustrating a benefit of our approach. As we reported recently 14, we confirm asymmetry here associates with handedness.”

      4 - (line 495) “Consistent with our recent analysis in UKB 14, we confirmed leftward areal asymmetry of anterior insula, and leftward somatosensory thickness asymmetry is subtly reduced in left-handers. Sha et al. 14 reported shared genetic influences upon handedness and asymmetry in anterior insula and other more focal regions. Anterior insula lies within a left-lateralized functional language network 15, and its structural asymmetry may relate to language lateralization 16–18 in which left-handers show increased atypicality 19–21. Since asymmetry here emerges early in utero 22 and is by far the most heritable (Figure 6), we agree with others 16 that this ontogenetically foundational region of cortex may be fruitful for understanding genetic-developmental mechanisms influencing laterality 23,24. Less leftward somatosensory thickness asymmetry in left-handers also echoes our recent report 14 and fits a scenario whereby thickness asymmetries may be partly shaped through use-dependent plasticity and detectable through group-level hemispheric specializations of function. Still, the small effects show cortical asymmetry cannot predict individual handedness. Associations with other factors typically assumed important were similarly small, and mostly compatible with the ENIGMA report 25 and elsewhere 26,27.”

      5 - (line 3221) “Although small, we note this association was far from only just surviving correction at our predefined alpha level (⍺ = .01; corrected for 136 tests; Methods).”

      6 - (line 348) “we … uncover novel and confirm previously-reported associations with factors purportedly related to asymmetry – all with small effects”
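
      The Bonferroni arithmetic behind the statement about the cognitive association can be checked with a short sketch. It uses only the figures quoted in the response (⍺ = .01, 136 tests corrected for, and the claim that the association would survive more than 13,000 comparisons); the function names are illustrative, and the underlying p-value itself is not reproduced here, only the upper bound the claim implies.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def survives(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value remains significant after correcting for n_tests."""
    return p_value <= bonferroni_threshold(alpha, n_tests)


ALPHA = 0.01               # predefined alpha level quoted in the response
N_TESTS_CORRECTED = 136    # number of tests actually corrected for
N_TESTS_CLAIMED = 13_000   # comparisons the association is said to survive

threshold_used = bonferroni_threshold(ALPHA, N_TESTS_CORRECTED)
implied_p_bound = bonferroni_threshold(ALPHA, N_TESTS_CLAIMED)

print(f"threshold for 136 tests:       {threshold_used:.2e}")
print(f"p-value bound for 13,000 tests: {implied_p_bound:.2e}")
```

      Under these figures, the threshold actually applied is about 7.4e-05, while surviving 13,000 comparisons implies a p-value no larger than about 7.7e-07, roughly two orders of magnitude below the threshold used.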

      Thus, in quantifying effects we could not include in our lifespan analysis we preempt the questions likely to arise for many researchers, provide a sobering account of the effect sizes of factors typically assumed important for asymmetry, and find results that fit the developmental framework we lay out in the paper. We therefore opt to keep these together with the lifespan and heritability results in the current paper.

      Comment #4: To some extent, the aim of the study could still be written with more clarity. However, the authors have in part achieved their aims - assuming it is found a consensus on the brain asymmetry patterns in humans as is stated in the abstract.

      Alongside the amendment to the Abstract that better clarifies our aims (response #2), we have restated the aims in the Introduction:

      1 - (line 121) “Here, we first aimed to delineate population-level cortical areal and thickness asymmetries using vertex-wise analyses and their overlap in 7 international datasets. With a view to gaining insight into cortical asymmetry development, we then aimed to trace a series of lifespan and genetic analyses. Specifically, we chart the developmental and lifespan trajectories of cortical asymmetry for the first time longitudinally across the lifespan. Next, we examine phenotypic interregional asymmetry correlations, under the assumption correlations indicate coordinated development of left-right asymmetries through genes or lifespan influences. To shed light on the extent to which differences in asymmetry are genetic, we test heritability of asymmetry using genome-wide single nucleotide polymorphism (SNP) and extended twin data, and examine whether or not phenotypic associations are underpinned by genetic correlations suggestive of coordinated development through genes. Finally, we screen our set of robust, population-level asymmetries for association with general cognitive ability and factors purportedly related to asymmetry in UK Biobank (UKB). 28”

      Comment #5: Overall the results support the conclusions, yet the strong interpretation of early life factors in particular is not empirically investigated as far as I gather.

      The reviewer is correct that we do not have data on neonates to directly support interpretations of prenatal factors. We have therefore tempered strong interpretations pertaining to prenatal accounts accordingly, have added text at the start of the Discussion to address this (see 1), and qualified all discussion of prenatal factors:

      1 – (line 366) “Tracing their lifespan development, we show the trajectories of areal asymmetry primarily suggest this form of asymmetry is developmentally stable at least from age ~4, maintained throughout life, and formed early on – possibly in utero 13,29,30 (while we cannot extrapolate to ages before our sample begins, we note this agrees with findings in neonates 29,30). One interpretation of lifespan stability combined with low heritability may be stochastic early-life developmental influences determine individual differences in areal asymmetry more than later developmental change, but work linking prenatal and childhood trajectories is needed to affirm this”

      2 – (Abstract) “Results suggest areal asymmetry is developmentally stable and arises early in life through genetic but mainly subject-specific stochastic effects”

      We have also added argumentation regarding a just-published study suggesting the average pattern of neonatal areal asymmetry is largely similar to adults 1. In addition, we reiterate what our data can and cannot say about the developmental timing of asymmetry in several places in the Discussion (see 3 & 5). In other places, we have removed reference to prenatal factors (see 4). Still, while we agree we previously used the terms “prenatal” and “early life factors” interchangeably, we note the latter often encompasses periods of early childhood covered here and is not necessarily restricted to factors present at birth 2,3. Thus, we have amended the Discussion to qualify the age-range the interpretation pertains to (see 5), and then retain the conclusion as follows (see 6).

      3 - (line 383) “For areal asymmetry, adult-like patterns of lateralization were strongly established before age ~4, indicating areal asymmetry traces back further and does not primarily emerge through later cortical expansion 33. Rather, the lifespan trajectories predominantly show stability from childhood to old age, as asymmetry was maintained through periods of developmental expansion and aging-related change that were region-specific and bilateral. This may align with evidence indicating areal asymmetry may be primarily determined in utero 29,30, including evidence suggesting little change in areal asymmetry from birth to 2 years 29,33,34, and little difference between maps derived from neonates and adults 29,30. It may also fit with the principle that the primary microstructural basis of cortical area 8 – the number of and spacing between cortical minicolumns – is determined in prenatal life 8,9, and agree with work suggesting asymmetry at this microstructural level may underlie hemispheric differences in surface area 35. The developmental trajectories agree with studies indicating areal asymmetry is established and strongly directional early in life 29,36. That change in surface area later in development follows embryonic gene expression gradients may also agree with a prenatal account for areal asymmetry 9”

      4 - (line 439) “The strongest relationships all pertained to asymmetries that were proximal in cortex but opposite in direction. Several of these were underpinned by high asymmetry-asymmetry SNP-based genetic correlations, illustrating some lateralizations in surface area exhibit coordinated genetic development.”

      5 - (line 481) “Regardless, these results support a differentiation between early-life (i.e. before age ~4) and later developmental factors in shaping areal and thickness asymmetry, respectively.”

      6 - (Conclusion) “Developmental and lifespan trajectories, interregional correlations and heritability analyses converge upon a differentiation between early-life and later-developmental factors underlying the formation of areal and thickness asymmetries, respectively. By revealing hitherto unknown principles of developmental stability and change underlying diverse aspects of cortical asymmetry, we here advance knowledge of normal human brain development.”

      Overall this is a nice and thorough work on asymmetry that may inform further work on brain asymmetry, its genetic basis, development, environmentally induced change, and link to behavioural variation.

    1. Author Response

      Reviewer #1 (Public Review):

      Bacterial carboxysomes are compartments that enable the efficient fixation of carbon dioxide in certain types of bacteria. A focus of the current work is on two protein components that provide spatial regulation over carboxysomes. The McdA system is an ATPase that drives the positioning of carboxysomes. The McdB system is essential for maintaining carboxysome homeostasis, although how this role is achieved is unclear. Previous studies, by the lead author's lab, showed that the McdB system is a driver of phase separation in vitro and in cells. They proposed a putative connection between McdB phase separation and carboxysome homeostasis. The central premise of the current work is as follows: In order to understand if and how phase separation of McdB impacts carboxysome homeostasis, it is important to know how the driving forces for phase separation are encoded in the sequence and architecture of McdB. This is the central focus of the current work. The picture that emerges is of a protein that forms hexamers, which appear to be trimers of dimers. The domains that drive dimerization and trimerization appear to be essential for driving phase separation under the conditions interrogated by the authors. The N-terminal disordered region regulates the driving forces for phase separation - referred to as the solubility of McdB by the authors. To converge upon the molecular dissections, the authors use a combination of computational and biophysical methods. The work highlights the connection between oligomerization via specific interactions and emergent phase behavior that presumably derives from the concentration (and solution condition) dependent networking transitions of oligomerized McdB molecules.

      Having failed to obtain specific structural resolution for the full-length McdB as a monomer or oligomer, the authors leverage a combination of computational tools, the primary one being iTASSER. This, in conjunction with disorder predictors, is used to identify / predict the domain structure of McdB. The domain structure predictions are tested using a limited proteolysis approach and, for the most part, the predictions stand up to scrutiny affirming the PONDR predictions. SEC-MALS data are used to pin down the oligomerization states of McdB and the consensus that emerges, through the investigations that are targeted toward a series of deletion constructs, is the picture summarized above.

      Is the characterization of the oligomerization landscape complete and likely perfect? Quite possibly, the answer is no. Deletion constructs pose numerous challenges because they delete interactions and inevitably impose a modularity to the interpretation of the totality of the data.

      This is a good point and always a possibility with truncations – the protein McdB may not be as modular in nature as it seems in our tripartite model. But the deletion constructs were more so intended to be tools for identifying key regions of oligomerization and condensate formation as others have done, and for this, they were indeed useful. Additionally, we were able to strategically aim our substitution mutations based on data from the deletion constructs. These substitutions provided data consistent with the deletions, but in the context of the full-length protein (see Fig. 5 vs. Figs. 2, 4). However, we ultimately agree with the reviewer that this is always a possibility with truncations, and we have therefore mentioned this caveat in the discussion.

      Line 415 “Truncated proteins have been useful in the study of biomolecular condensates. But it is important to note that using truncation data alone to dissect modes of condensate formation can lead to erroneous models since entire regions of the protein are missing. However, data from our truncation and substitution mutants were entirely congruent. For example, deletion of the CTD or substitutions to this region caused destabilization of the hexamer to a dimer, and deletion of the IDR or substitutions to this region caused solubilization of condensates without affecting hexamer formation.”

      Accordingly, we are led to believe that the N-terminal IDR plays no role whatsoever in the oligomerization.

      Our updated data still strongly supports this interpretation. Both truncation of the IDR (Fig. 2) and the six-Q-substitution mutant in the IDR (Fig. 5) form a monodispersed hexamer in solution via SEC-MALS, as does wild-type McdB.

      Close scrutiny, driven by the puzzling choice of nomenclature and the Lys to Gln titrations in the N-terminal IDR, raises certain unresolved issues. First, the central dimerization domain is referred to as being Q-rich. This does not square with the compositional biases of this region. If anything, it is Q/L-rich or just L-rich. This in fact makes more sense because the region does have the architecture of canonical Leu-zippers, which do often feature Gln residues. However, there is nothing about the sequence features that mandates the designation of being Q-rich, nor are there any meaningful connections to proteins with Q-rich or polyQ tracts. This aspect of the analysis and discussion is a serious and erroneous distraction.

      We changed the language here, and no longer refer to the central region as “Q-rich”. However, we would like to note that the second half of the McdB central domain is indeed enriched in glutamines (14/53 = 26.4%) to a comparable extent as the region of FUS, which has been shown to help drive condensate formation via glutamine H-bonding (14/44 = 31.8%; Murthy et al 2019). We were simply proposing that, at a molecular level, there was some insight to be gained from this comparison. We agree, however, that there is no functionally meaningful comparison between McdB and polyQ-tract proteins, as we may have previously alluded to in our discussion, and that text has been removed.
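      The enrichment percentages quoted in this response are simple residue fractions. As a minimal sketch of that arithmetic, the snippet below recomputes them from the counts given above (14 of 53 and 14 of 44 residues). The helper `residue_fraction` and the short demo string are hypothetical illustrations, not the McdB or FUS sequences and not the authors' analysis code.

```python
def residue_fraction(sequence: str, residue: str = "Q") -> float:
    """Return the fraction of positions in `sequence` that equal `residue`."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must be non-empty")
    return sequence.count(residue.upper()) / len(sequence)


# Counts taken directly from the response text above.
mcdb_fraction = 14 / 53  # Gln count / length, second half of the McdB central domain
fus_fraction = 14 / 44   # Gln count / length, FUS region cited from Murthy et al. 2019

print(f"McdB central domain (second half): {mcdb_fraction:.1%}")
print(f"FUS region: {fus_fraction:.1%}")

# Hypothetical toy string (NOT a real protein sequence) showing the helper.
demo_fraction = residue_fraction("QLQA")
print(f"Toy sequence Gln fraction: {demo_fraction:.1%}")
```

      Given a real sequence window, the same helper would return the Gln fraction directly; whether a given fraction justifies calling a region "Q-rich" remains a judgment call, as the exchange above shows.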

      Back to the middle region that drives dimerization, the missing piece of the puzzle is the orientation of the dimers. One presumes these are canonical, antiparallel dimers. However, this issue is not addressed even though it is directly relevant to the topic of how the trimer of dimers is assembled.

      Indeed, we were unable to resolve the orientation issue, despite much effort. The story we present is not a complete and final model of McdB structure, nor of its molecular modes of oligomerization or condensate formation. However, we now provide a discussion section “McdB homologs have polyampholytic properties between their N- and C-termini” that highlights this issue. We also mention the remaining dimer orientation issue at the end of the results section “Se7942 McdB forms a trimer-of-dimers hexamer”. However, we believe the data presented still provide useful initial models, which, for example, allowed us to create a series of substitutions that tune McdB condensate solubility and verify that they do not affect oligomerization. We would like to further add that for other condensate-forming proteins in bacteria, like the PopZ protein we mention in the text, there remains no detailed structural model beyond the resolution we provide here for McdB, despite PopZ being first identified in 2008. Over 40 publications on PopZ have progressively provided useful and more detailed models that are only now being used to develop PopZ as a tool for condensate technologies that are furthering our understanding of the biological implications of condensate formation across all cell types. The intention with our current report is therefore not to generate a finalized molecular model of this entirely unstudied class of McdB proteins, but instead to generate useful insight into McdB biochemistry that can advance our understanding of this class of protein’s function in vivo. To this end, we now add in vivo data based on these initial models where we specifically link cellular phenotypes to McdB condensate solubility (Fig. 8). Of course, there are several follow-up studies that come from the current report, but we believe that speaks to the value of the presented research in advancing this field.

      If the trimer is such that all binding sites are fully satisfied (with the binding sites presumably being on the C-terminal pseudo-IDR), then the hexamer should be a network terminating structure, which it does not seem to be based on the data. Instead, we find that only the full-length protein can undergo phase separation (albeit at rather high concentrations) in the absence of crowder. We also find that the driving forces for phase separation are pH dependent, with pH values above 8.5 being sufficient to dissolve condensates. Substitution of Lys to Gln in the N-terminal IDR leads to a graded weakening of the driving forces for phase separation. The totality of these data suggest a more complex interplay of the regions than is being advocated by the authors.

      Thank you and we agree. As we discuss above in response #4 and below in response #7, we have changed the focus and tone of our report to say that, while the models we have generated are useful, we are aware they are incomplete at a molecular level. Furthermore, as we describe in response #6, we have added several new McdB mutants to investigate more deeply the role of the CTD, but this region was not amenable to mutagenesis as these mutants affected McdB oligomerization. Lastly, while network forming interactions are certainly important for condensate formation as the reviewer describes, so are solvent interactions. We have added new text and data related to Figs. 3, 4 that address these issues.

      Almost certainly, there are complementary electrostatic interactions among the N-terminal IDR and C-terminal pseudo IDR that are important and responsible for the networking transition that drives phase separation, even if these interactions do not contribute to hexamer formation. The net charge per residue of the 18-residue N-terminal IDR is +0.22 and the NCPR of the remainder is ≈ -0.1. To understand how the N-terminal IDR is essential, in the context of the full-length protein, to enable phase separation (in the absence of crowder), it is imperative that a model be constructed for the topology of the hexamer. It is also likely that the oligomer does not have a fixed stoichiometry.

      We agree and thank the reviewer for these comments. We have added several new substitution mutants aimed at addressing this (Figs. 5, S6). However, the C-terminus was not amenable to substitutions as the trimer-of-dimers was significantly destabilized in these mutants (Figs. 5, S7). Therefore, in this report we were unable to determine specifically how the basic residues in the IDR contribute to condensate formation. However, with the addition of new data in Fig. 8, we think we adequately show that the IDR mutants can be used to investigate McdB condensate formation in vivo, and that follow-up studies will be aimed at investigating these details. We have also added a new discussion section “McdB homologs have polyampholytic properties between their N- and C-termini” that highlights this very likely possibility suggested by the reviewer.
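      The net charge per residue (NCPR) values the reviewer quotes follow a standard definition: the number of basic residues minus the number of acidic residues, divided by the segment length. The sketch below assumes Lys/Arg count as +1, Asp/Glu as -1, and that His and the termini are neutral, which is a simplification. The 18-residue string is a hypothetical placeholder chosen to carry four net positive charges; it is not the McdB IDR sequence.

```python
POSITIVE = set("KR")  # assumed +1 at near-neutral pH
NEGATIVE = set("DE")  # assumed -1 at near-neutral pH


def net_charge_per_residue(sequence: str) -> float:
    """NCPR = (basic residue count - acidic residue count) / sequence length."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must be non-empty")
    positives = sum(residue in POSITIVE for residue in sequence)
    negatives = sum(residue in NEGATIVE for residue in sequence)
    return (positives - negatives) / len(sequence)


# Hypothetical 18-residue segment (NOT the McdB sequence): 4 basic, 0 acidic.
toy_idr = "MTSKKRAGSLPQKTALGS"
toy_ncpr = net_charge_per_residue(toy_idr)
print(f"length = {len(toy_idr)}, NCPR = {toy_ncpr:+.2f}")
```

      Under these assumptions, an 18-residue segment needs a net charge of +4 to reach the quoted value of about +0.22. Each Lys-to-Gln substitution removes one positive charge and lowers the NCPR by 1/18, which matches the graded charge titration discussed above.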

      Therefore, the central weakness of the current work is that it is too preliminary. A set of interesting findings are emerging but by fixating on Lys to Gln titrations within the N-terminal IDR and referring to these titrations as impacting solubility, a premature modular and confused picture emerges from the narrative that leaves too many questions unanswered.

      The work itself is very important given the growing interest in bacterial condensates. However, given that the focus is on understanding the molecular interactions that govern McdB phase behavior - a necessary pre-requisite in the authors minds for understanding if and how phase separation impacts carboxysome homeostasis - it becomes imperative that the model that emerges be reasonably robust and complete. At this juncture, the model raises far too many questions.

      We agree that our previous report was focused mainly on the molecular basis of McdB condensate biochemistry, and in that report we left the model short. In this revised version, we have added several pieces of new data that strengthen the model (Figs. 3-5), although it is still incomplete. However, in this revised version, we have also shifted the focus from a complete biochemical understanding of McdB condensates to a study that links McdB condensate formation in vitro to phenotypes in vivo. In this regard, we have added the in vivo data in Fig. 8 and somewhat changed the focus in the text.

      The MoRF analysis is a distraction from the central focus.

      The MoRF analysis has been removed.

      The problem, as I see it, is that the authors have gone down the wrong road in terms of how they have interpreted the preliminary set of results. Further, the methods used do not have the resolution to answer all the questions that need to be answered. Another issue is that a lot of standard tropes are erected and they become a distraction. For example, it is simply not true that in a protein featuring folded domains and IDRs it almost always is the case that the IDR is the driver of phase transitions. This depends on the context, the sequence details of the IDRs, and whether the interactions that contribute to the driving forces for phase separation are localized within the IDR or distributed throughout the sequence. In McdB it appears to be the latter, and much of the nuance is lost through the use of specific types of deletion constructs.

      Thank you. We have removed much of this and changed the diction on how our current model of McdB condensate formation fits into the literature in the discussion.

      Overall, the work represents a good beginning but the data do not permit a clear denouement that allows one to connect the molecular and mesoscales to fully describe McdB phase behavior. Significantly more work needs to be done for such a picture to emerge.

      Reviewer #2 (Public Review):

      In this work, Basalla et al. study the biochemical properties of the carboxysome positioning protein, McdB. Using in vitro experiments, the authors characterize McdB oligomeric states and the domains driving and modulating its phase separation. Based on bioinformatics analysis, the authors identify a putative binding recognition motif between McdB and its two-component system counterpart McdA. As McdAB-like systems emerge as spatial regulators of bacterial compartments, the data presented here may be of general interest. The study is well executed and provides exciting hypotheses to be tested in vivo.

      The authors found that McdB from S. elongatus PCC 7942 consists of three domains: an N-terminal 18 aa disordered region, a Q-rich helical domain, and a helical C-terminal domain (CTD). Analyzing these domains, the authors present three key results: (i) The Q-rich domains form dimers, and the CTD drives the formation of trimers of dimers (ii) Phase separation is pH sensitive, driven by the Q-rich domain, and modulated by basic residues in the IDR, (iii) The IDR contains a putative recognition motif that binds McdA. While these three sets of results are rich in data, they are disjointed. Relating the three datasets (oligomeric states of the protein, its phase separation behavior, and its ability to bind McdA) is required to provide a complete picture of the molecular mechanism driving McdB condensation.

      Specific comments:

      1) The main limitation of this manuscript is the lack of integration between the three areas of results. In particular: how do the IDR basic residues disrupt phase separation? Is that through interference with either the dimer or trimer interface? Does the McdB IDR regulate phase separation behavior when bound to McdA? Or, in other words, is the MoRF acting both as a binding interface and as a solubility regulator, and if so, can both functions be achieved simultaneously? It seems like the MoRF includes at least three basic residues.

      Indeed, we were unable to fully resolve the specific molecular interactions that give rise to condensates versus those that give rise to oligomers, and how these two modes of self-association contribute to one another. One limitation was that, as shown in our new data, the CTD was not amenable to mutagenesis, as it caused destabilization of the trimer-of-dimers (Fig. 5, Fig. S7). Therefore, we could not dissect how the CTD contributes to oligomerization versus driving condensates. However, we did include in vivo data showing how the IDR mutations allowed us to specifically link phenotypes to McdB condensate solubility (Fig. 8). As we discuss above in responses #4, #6, and #7, we changed the focus of the revised manuscript from the molecular basis of McdB condensate formation to linking McdB condensate formation in vitro and its functionality in vivo. To this end, we think the IDR mutation set has been useful, and follow-up studies will be done to further the molecular model of McdB condensate formation. Reviewers 1 and 3 deemed the MoRF section a distraction. Therefore, MoRF analysis and discussions of McdA interactions with this potential MoRF have been removed.

      Finally, what is the effective concentration of McdB in cells, and how does that translate to the in vitro studies?

      In our previous version, we used McdB concentrations between 50-100 µM. We do not know the in vivo concentration of McdB. We have tried several antibodies against McdB, and a few were good enough to detect the presence of McdB, but not quantifiably. We therefore believe in vivo McdB levels are low (sub-micromolar), and definitely lower than the range we previously used in our in vitro studies. In our revised manuscript, we include a titration of McdB at lower concentrations, and see condensates at McdB concentrations lower than 2 µM.

      2) How general are the conclusions made here to other McdBs? The authors have published nice work surveying the commonalities and differences between homologous McdB proteins. Can you comment on the applicability of your findings to other McdB proteins?

      This is a great point, which we have added to a new discussion section titled “McdB homologs have polyampholytic properties between their N- and C-termini”.

      Additional issues:

      3) Using SEC and SEC-MALS, the authors demonstrated that the Q-rich domain forms a stable dimer and that the full-length protein forms hexamers, suggesting trimers of dimers assembly. The authors also suggest that the CTD is responsible for forming those trimers of dimers based on SEC-MALS measurements. However, Figure 2D shows that while the full length runs at 6.6x the monomer, the Q-rich+CTD runs at 5.4x the monomer. First, I could not find SEC-MALS of the full-length protein, and it is not clear whether SEC-MALS was used for all or a fraction of the constructs discussed in Figure 2D. Second, could it be that the Q-rich domain+CTD is an ensemble of hexamers and dimers? Perhaps the IDR is playing a secondary role in stabilizing the hexamer?

      We have repeated the SEC-MALS experiments and included the full-length protein (Fig. 2). Furthermore, we have included SEC-MALS for some of the key substitution mutants (Figs. 5, S7). With the additional findings, our conclusions remain the same as in our previous version of the manuscript.

      4) The analysis of the phase separation results needs some extra quantification. The authors show that at 100 µM protein with 10% PEG the full-length protein phase separates, as does IDR+Q-rich. Lines 176-178: "The CTD, on the other hand, has no effect on the Q-rich domain condensates; Q-rich+CTD condensates formed at the same protein concentration and with identical droplet morphologies at the Q-rich domain alone." It is hard to draw this conclusion solely based on the data presented in Figure 3. An alternative interpretation might be that Q-rich+CTD reduces csat. I suggest the authors include turbidity assays (as shown for the pH effect) to quantitatively determine csat for these different constructs and perhaps perform FRAP to determine the mobility of these different constructs. In addition, how long after the addition of PEG were these droplets imaged?

      We now include an additional figure where we characterize condensates for full-length McdB (Fig. 3), including FRAP as suggested by the reviewer. We also include additional experiments for the truncations as requested (Fig. 4), and relate the truncation data to the model we propose for the full-length protein. All condensate samples were incubated for 30 mins prior to imaging unless otherwise stated, which we have added to the methods section “Microscopy of protein condensates”.

      5) Solubility assays shown in Figures 4A, B, D, and 5C are missing error bars. Without replicates, it is difficult to assess, for example, the effect of KCl.

      We have included replicates and error bars. Apologies for the omission.

      Also, please indicate the physiological ranges of KCl and pH in Figure 6. The phase separation sensitivity to pH is intriguing. By changing basic residues to glutamines, the authors conclude that the positive charge of the IDR modulates solubility. The Q-rich domain, however, is negatively charged. Can the authors comment on the role of acidic residues in the Q-rich domain? Are they required for phase separation? Also - based on your previous bioinformatics analysis, are the charges of the IDR and the Q-rich domains conserved across McdB homologs?

      Data from this report, and as described by reviewer #1, suggest that charge in the CTD, and not the central region, may be important. Our previous report (MacCready et al., Mol Biol Evol. 2020) touches on the conservation of charge in the NTD and CTD, which we have now added to the discussion section titled “McdB homologs have polyampholytic properties between their N- and C-termini”. However, we were unable to experimentally verify electrostatic associations between the NTD and CTD because the CTD was not amenable to mutagenesis, as shown in our new data added to the manuscript (Figs. 5, S7).

      6) In previous work, the authors showed a conserved RKR segment in the IDR is highly conserved and missing in S. elongatus PCC 7942 (MacCready et al., Mol Biol Evol. 2020). Given the current finding, it would be important to understand whether the RKR deletion carries functional implications for phase separation behavior.

      The RKR segment is not missing, but likely relates to the KKR residues from S. elongatus PCC 7942. We describe this in more detail elsewhere (MacCready et al., Mol Biol Evol. 2020). However, as we show here, these specific residue locations do not seem to be especially important for condensate formation, but instead the overall net charge of the IDR mediates condensate solubility regardless of the specific residues mutated (Fig. 6).

      7) McdB proteins with 2Q left mutated vs. 2Q middle and 2Q right seem to result in condensates with different material properties (e.g., DIC pictures show different droplet morphologies for the different constructs). Is that the case? And if so, can you comment on that?

      We have included a brief mention of this in the text. However, the overall interpretation of these results remains that regardless of the residues mutated, there is a comparable degree of condensate solubilization for constructs with the same IDR net charge (Fig. 6).

      Reviewer #3 (Public Review):

      Through a series of rigorous in vitro studies, the authors determined McdB's domain architecture, its oligomerization domains, the regions required for phase separation, and how to fine-tune its phase separation activity. The SEC-MALS study provides clear evidence that the α-helical domains of McdB form a trimer-of-dimers hexamer. Through analysis of a small library of domain deletions by microscopy and SDS-PAGE gels of soluble and pellet fractions, the authors conclude that the Q-rich domain of McdB drives phase separation while the N-terminal IDR modulates solubility. A nicely executed study in Figure 4 demonstrated that McdB phase separation is highly sensitive to pH and is influenced by basic residues in the N terminal IDR. The study demonstrates that net charge, as opposed to specific residues, is critical for phase separation at 100 micromolar. In addition, the experimental design included analysis of McdB constructs that lack fluorescent proteins or organic dyes that may influence phase separation. Therefore, the observed material properties have full dependence on the McdB sequence.

      Thank you for the kind words and this perspective. We have added a brief mention to it in the discussion section titled “McdB condensate formation follows a nuanced, multi-domain mechanism”: “Furthermore, it should be noted that the McdB constructs used in our in vitro assays were free from fluorescent proteins, organic dyes, or other modification that may influence phase separation. Therefore, the observed material properties of these condensates have full dependence on the McdB sequence.”

      Studies of proteins often neglect short, disordered segments at the N- or C- terminus due to unclear models for their potential role. This study was interesting because it revealed a short IDR as a critical regulator of phase separation. This includes experiments that remove the IDR (Fig 2 & 3) and mutate the basic residues to show their importance towards McdB phase separation. In a nice set of SDS-PAGE experiments, the authors showed that as the net charge of the IDR decreased the construct became more soluble.

      One challenge in the experimental design is choosing which substitutions to use when assessing the impact of residues on phase separation. The authors avoided substitutions to alanine, as alanine substitutions have synthetically stimulated phase separation in other systems. The authors, therefore, have a good rationale for selecting potentially milder mutations of lysine/arginine to glutamine. A potential caveat of mutation to glutamine is that stretches of glutamines have been associated with amyloid/prion formation. So, the introduction of glutamines into the IDR may also have unexpected effects on material properties. Despite these caveats, the authors show that mutation of six basic residues in the short IDR abolished phase separation at 100 µM.

      Thank you for the thoughtful consideration and appreciation of our work! Reviewer 1 had reservations about the Gln substitutions as well. We also used alanine in new data added to the manuscript. But as the reviewer notes, the alanine mutations artificially drove further phase separation activity, and even aggregation. We show that mutants with introduced glutamines, however, remain soluble in vitro and in E. coli even at very high concentrations. Furthermore, we now include SEC-MALS of the McdB variant with 6 glutamines introduced in the IDR and show that there is no impact on oligomeric state. Together, the data show no amyloidogenic properties of these glutamine-enriched mutants.

      We have added a note to this potential caveat in the discussion section “McdB condensate formation follows a nuanced, multi-domain mechanism”: “Glutamine-rich regions are known to be involved in stable protein-protein interactions such as in coiled-coils and amyloids (52, 53), and expansion of glutamine-rich regions in some proteins lead to amylogenesis and disease (54, 55). However, when we introduced glutamines into the IDR of McdB solubility was increased both in vitro and in vivo, and without any impact on hexamerization. Together, the data show that increasing the glutamine content in the IDR of McdB did not lead to amylogenesis, but rather increased solubility. Our findings therefore underpin the importance of positive charge in the IDR specifically for stabilizing McdB condensates.”

      Computational studies (Fig 7) also suggest that this short N-IDR region may play a role as a MoRF upon potential binding to a second protein, McdA. The formulation of this hypothesis is strengthened by the fact that for other ParA/MinD-family ATPases, the associated partner proteins have also been shown to interact with their cognate ATPase via positively charged and disordered N-termini. This aspect of understanding McdB's N-IDR as a MoRF is at a very early stage. This study lacks experimental evidence for an N-IDR:McdA interaction and experimental data showing conformational change upon McdA binding. However, the computational study sets up future work to consider whether and how the phase separation activity of McdB is related to its structural dynamics and interactions with McdA.

      Based on these comments and those from Reviewer 1, we have removed the MoRF analyses entirely. The MoRF analysis will be coupled to another study in the lab focused on McdB interactions with McdA.

      In summary, this study provides a strong foundation for understanding the contribution of domains to McdB's in vitro phase separation. This knowledge will inform and impact future studies on how McdB regulates carboxysomes and on the related family of ParA/MinD-family ATPases and their cognate regulatory proteins. For example, it is unknown if and how McdB's phase separation is utilized in vivo for carboxysome regulation. However, the revealed roles of the Q-rich domain and N-IDR will provide valuable knowledge in developing future research. In addition, the systematic domain analysis of McdB can be combined with a similar analysis of a broad range of other biomolecular condensates in bacteria and eukaryotes to understand the design principles of phase separating proteins.

    1. Author Response

      Reviewer #1 (Public Review):

      When we tilt our heads, we do not perceive objects to be tilted or rotated. In this study, the authors investigate the neural underpinnings by characterizing how neurons in monkey IT respond to objects when the entire body is tilted. They performed two experiments. In the first experiment, the authors record single neuron responses to objects rotating in the image plane, under two conditions - when the animals were tilted +20° or -20° relative to the gravitational vertical. Their main finding is that neural tuning curves for object orientation were highly correlated under these conditions. This high correlation is interpreted by the authors as indicative of encoding of object orientations relative to an absolute gravitational reference frame. To control for the possibility that the whole-body tilt could have induced compensatory torsional rotations of the eyes, the authors estimated the eye torsional rotation between the ±20° whole-body tilts to be only ±6°. In the second experiment, the authors recorded neural responses to objects rotated in the image plane with no whole-body tilt but with a visual horizon that could be tilted by the same ±20° relative to the gravitational vertical. Here too they find many neurons whose tuning curves were correlated between the two horizon tilt conditions. Based on these results, the authors argue that IT neurons represent objects relative to the gravitational or absolute vertical.
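      The logic of comparing tuning curves across tilt conditions can be illustrated with a toy simulation. The sketch below is not the authors' analysis. It assumes a von Mises-like tuning function, a 10° orientation sampling step, a retinal orientation equal to gravitational orientation minus body tilt, and no ocular counter-roll; all function names and parameters are hypothetical.

```python
import math

STEP_DEG = 10                                  # assumed orientation sampling step
ORIENTATIONS = list(range(0, 360, STEP_DEG))   # object orientation relative to gravity
TILTS = (+20, -20)                             # whole-body tilt conditions (degrees)


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def tuning(theta_deg, preferred_deg=90.0, kappa=2.0):
    """Toy von Mises-like orientation tuning (peak response of 1)."""
    return math.exp(kappa * (math.cos(math.radians(theta_deg - preferred_deg)) - 1))


def gravity_tuned(tilt_deg):
    """Model neuron whose response depends only on gravitational orientation."""
    return [tuning(g) for g in ORIENTATIONS]


def retina_tuned(tilt_deg):
    """Model neuron whose response depends on retinal orientation (g - tilt)."""
    return [tuning(g - tilt_deg) for g in ORIENTATIONS]


def to_retinal_axis(responses, tilt_deg):
    """Re-index a curve sampled in gravitational coordinates by retinal orientation."""
    shift = tilt_deg // STEP_DEG
    n = len(responses)
    return [responses[(i + shift) % n] for i in range(n)]


def frame_correlations(model):
    """Correlate the two tilt conditions in gravitational and in retinal alignment."""
    curve_a, curve_b = (model(t) for t in TILTS)
    gravitational = pearson(curve_a, curve_b)
    retinal = pearson(to_retinal_axis(curve_a, TILTS[0]),
                      to_retinal_axis(curve_b, TILTS[1]))
    return gravitational, retinal


grav_g, ret_g = frame_correlations(gravity_tuned)
grav_r, ret_r = frame_correlations(retina_tuned)
print(f"gravity-tuned model: gravitational r = {grav_g:.2f}, retinal r = {ret_g:.2f}")
print(f"retina-tuned model:  gravitational r = {grav_r:.2f}, retinal r = {ret_r:.2f}")
```

      In this toy setting each model neuron correlates perfectly only in its own reference frame, which is the contrast the tuning-correlation analysis relies on. It does not distinguish a gravitational frame from any other room-fixed frame, such as the monitor edges, which is the confound raised in the comments that follow.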

      The question of whether the visual system encodes objects relative to the gravitational vertical is an interesting and basic one, and I commend the authors for attempting this question through systematic testing of object selectivity under conditions of whole-body tilt. However, I found this manuscript extremely difficult to read, with important analyses and controls described in a very cursory fashion. I also have several major concerns about these results.

      First, the high tuning correlation in the ±20° whole-body tilt conditions could also occur if IT neurons encoded object orientation relative to other fixed contextual cues in the surrounding, such as the frame of the computer monitor. The authors ideally should have some experiment or analysis to address this potential confound, or else acknowledge that their findings can also be interpreted as the encoding of object orientation relative to contextual cues, which would dilute their overall conclusions.

      We think there are three possible interpretations of this comment. First, that visible edges, including the horizon and ground plane (in the scene stimuli), and the screen edges and other gravitationally aligned edges in the room, could serve as visual cues for the orientation of gravity. We agree with this wholeheartedly, and in fact showed a strong degree of gravitational alignment based purely on visual scene cues in Figures 3 and 4. This is consistent with our previous results suggesting computation of gravity’s direction in the middle channel of IT (Vaziri et al., Neuron 2014; Vaziri and Connor, Current Biology 2016). Our findings would not be diluted by the fact that multiple cues, not just vestibular/somatosensory but also visual, could help in computing the direction of gravity.

      Second, that overlap between objects and horizon could produce a shape-configuration interaction that changes with object orientation and produces a tuning effect that remains consistent across monkey tilts. We agree this was a possibility, and that is why we tested neurons in the isolated object condition. We have added text to better explain this concern and the control importance of the isolated object condition in the discussion of Fig. 1: “The Fig. 1 example neuron was tested with both full scene stimuli (Fig. 1a), which included a textured ground surface and horizon, providing visual cues for the orientation of gravity, and isolated objects (Fig. 1b), presented on a gray background, so that primarily vestibular and somatosensory cues indicated the orientation of gravity. The contrast between the two conditions helps to elucidate the additional effects of visual cues on top of vestibular/somatosensory cues. In addition, the isolated object condition controls for the possibility that tuning is affected by a shape-configuration (i.e. overlapping orientation) interaction between the object and the horizon or by differential occlusion of the object fragment buried in the ground (which was done to make the scene condition physically realistic for the wide variety of object orientations that would otherwise appear improbably balanced on a hard ground surface).”

      The comparable results in the isolated object condition address the reasonable concern about the horizon/object shape configuration interaction: “Similar results were obtained for a partially overlapping sample of 99 IT neurons tested with isolated object stimuli with no background (i.e. no horizon or ground plane) (Fig. 2b). In this case, 60% of neurons (32/53) showed significant correlation in the gravitational reference frame, 26% (14/53) significant correlation in the retinal reference frame, and within these groups 13% (7/53) were significant in both reference frames. The population tendency toward positive correlation was again significant in this experiment along both gravitational (p = 3.63 × 10⁻²²) and retinal axes (p = 1.63 × 10⁻⁷). This suggests that gravitational tuning can depend primarily on vestibular/somatosensory cues for self-orientation.”

      Third, that the object and screen edges in the isolated object condition have an orientation interaction that influences tuning in a way that remains consistent across monkey tilt. If this was intended, we do not think this is a reasonable concern that needs mentioning in the paper itself. The closest screen edges on our large display were 28° in the periphery, and there is no reason to suspect that IT encodes orientation relationships between distant, disconnected visual elements. Screen edges have been present in all or most studies of IT, and no such interactions have been reported. We will discuss this point in online responses.

      Second, I do not fully understand torsional eye movements myself, but it is not clear to me whether this is a fixed or dynamic compensation. For instance, have the authors measured torsional eye rotations on every trial? Is it fixed always at ±6° or does it change from trial to trial? If it changes, then could the high tuning correlation between the whole-body rotations be simply driven by trials in which the eyes compensated more? The authors must provide more data or analyses to address this important control.

      We now clarify that we could only measure ocular rotation outside the experiment, using high-resolution closeup color photography that was not possible on individual trials. The extensive literature on ocular counter-rotation gives no indication that the degree of rotation is changed by any conditions other than tilt. Our measurements were consistent with previous reports showing that counterroll is limited to 20% of tilt. Moreover, they are consistent with our analyses showing that maximum correlation in retinal coordinates is obtained with a 6° correction for counterroll, indicating equivalent counterroll during experiments. Our analytical compensation for counterroll was based on this value, which optimized results in the retinal reference frame, so our measurements of counterroll are used only to confirm this value. Ocular rotation would need to be five times greater than any previous observations to completely compensate for tilt and mimic the gravitational tuning we observed. For these reasons, counterroll is not a reasonable explanation for our results:

      “Compensatory ocular counter-rolling was measured to be 6° based on iris landmarks visible in high-resolution photographs, consistent with previous measurements in humans [6, 7], and larger than previous measurements in monkeys [41], making it unlikely that we failed to adequately account for the effects of counterroll. Eye rotation would need to be five times greater than previously observed to mimic gravitational tuning. Our rotation measurements required detailed color photographs that could only be obtained with full lighting and closeup photography. This was not possible within the experiments themselves, where only low-resolution monochromatic infrared images were available. Importantly, our analytical compensation for counter-rotation did not depend on our measurement of ocular rotation. Instead, we tested our data for correlation in retinal coordinates across a wide range of rotational compensation values. The fact that maximum correspondence was observed at a compensation value of 6° (Figure 1–figure supplement 1) indicates that counterrotation during the experiments was consistent with our measurements outside the experiments.”
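
      The compensation scan described in this response can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the synthetic tuning curve, the 5° orientation sampling, the sign convention, and the use of linear periodic interpolation are all assumptions made for illustration. The idea is to map the tuning curves recorded at the two body tilts into retinal coordinates under a candidate counterroll value, correlate them, and keep the candidate that maximizes the correlation.

```python
import numpy as np

def retinal_frame_correlation(theta, resp_pos, resp_neg, tilt, counterroll):
    """Correlate two tuning curves after mapping both into retinal coordinates.

    theta       : object orientations on the screen, in degrees (0 <= theta < 360)
    resp_pos    : responses recorded at body tilt +tilt
    resp_neg    : responses recorded at body tilt -tilt
    counterroll : candidate ocular counter-roll (degrees) opposing the tilt
    """
    # Net rotation of the eye relative to the screen (assumed sign convention).
    net = tilt - counterroll
    # At +tilt an object drawn at theta lands at retinal orientation theta - net;
    # at -tilt it lands at theta + net. Resample both curves on a common retinal
    # grid, wrapping around 360 degrees.
    on_retina_pos = np.interp(theta, theta - net, resp_pos, period=360)
    on_retina_neg = np.interp(theta, theta + net, resp_neg, period=360)
    return np.corrcoef(on_retina_pos, on_retina_neg)[0, 1]

def best_counterroll(theta, resp_pos, resp_neg, tilt, candidates):
    """Return the candidate counter-roll with the highest retinal-frame correlation."""
    scores = [retinal_frame_correlation(theta, resp_pos, resp_neg, tilt, c)
              for c in candidates]
    return candidates[int(np.argmax(scores))], scores

def synthetic_tuning(retinal_deg):
    """Hypothetical purely retinal tuning curve, used only to exercise the scan."""
    r = np.radians(retinal_deg)
    return 1.0 + np.cos(r - np.radians(70)) + 0.5 * np.cos(2 * r - np.radians(40))

# Simulate a retinally tuned neuron recorded at +/-20 degrees of tilt with a
# true counter-roll of 6 degrees (values taken from the response above).
theta = np.arange(0, 360, 5.0)
TILT, TRUE_ROLL = 20.0, 6.0
resp_pos = synthetic_tuning(theta - (TILT - TRUE_ROLL))
resp_neg = synthetic_tuning(theta + (TILT - TRUE_ROLL))

candidates = np.arange(0.0, 21.0, 1.0)
best, scores = best_counterroll(theta, resp_pos, resp_neg, TILT, candidates)
print(f"best candidate counter-roll: {best} deg")
```

      In this toy setting the scan recovers the simulated counterroll because only the correct compensation aligns the two curves on the retina. With real data the peak would be broader and noisier, and the sampling of candidate values would need to match the resolution of the recorded orientations.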

      Third, I find that when the objects were presented against a visual horizon, different object features are occluded at each orientation. This could reduce the correlation between the neural response in the retinal reference frame, thereby biasing all results away from purely retinal encoding. The authors should address this either through additional analyses or acknowledge this issue appropriately throughout.

      This idea of a shape interaction between object and horizon/ground is essentially the same concern discussed as the second interpretation of the first point, above. As outlined there, we addressed this concern in the best way possible, by removing the horizon/background (in the isolated object condition) and showing that the same results obtained. This comment raises the related point (also cured by the isolated object condition) of differential partial occlusion at the bottom of the object, 15% (by virtual mass) of which was buried below ground to provide a realistic physical interpretation for unbalanced orientations.

      We make both concerns explicit in the revised manuscript: “The Fig. 1 example neuron was tested with both full scene stimuli (Fig. 1a), which included a textured ground surface and horizon, providing visual cues for the orientation of gravity, and isolated objects (Fig. 1b), presented on a gray background, so that primarily vestibular and somatosensory cues indicated the orientation of gravity. The contrast between the two conditions helps to elucidate the additional effects of visual cues on top of vestibular/somatosensory cues. In addition, the isolated object condition controls for the possibility that tuning is affected by a shape-configuration (i.e. overlapping orientation) interaction between the object and the horizon or by differential occlusion of the object fragment buried in the ground (which was done to make the scene condition physically realistic for the wide variety of object orientations that would otherwise appear improbably balanced on a hard ground surface).”

      And we report that the control produces similar results in the absence of horizon/background: “Similar results were obtained for a partially overlapping sample of 99 IT neurons tested with isolated object stimuli with no background (i.e. no horizon or ground plane) (Fig. 2b). In this case, 60% of neurons (32/53) showed significant correlation in the gravitational reference frame, 26% (14/53) significant correlation in the retinal reference frame, and within these groups 13% (7/53) were significant in both reference frames. The population tendency toward positive correlation was again significant in this experiment along both gravitational (p = 3.63 × 10⁻²²) and retinal axes (p = 1.63 × 10⁻⁷). This suggests that gravitational tuning can depend primarily on vestibular/somatosensory cues for self-orientation.”

      Reviewer #3 (Public Review):

      This is a very interesting study examining for the first time the influence of lateral tilt of the whole body on orientation tuning in macaque IT. They employed two types of displays: one in which the object was embedded in a scene that had a horizon and textured ground surface, and a second one with only the object. For the first type, they examined the orientation tuning with and without tilting the subject. However, the effect of tilt for the scene stimuli is difficult to interpret in terms of gravitational reference frame since varying the orientation of the object relative to the horizon leads to changes in visual features between the horizon and object. If neurons show tolerance for the global orientation of the scene (within the 50° manipulation range) then the consistent orientation tuning across tilts may just reflect tuning for the object-horizon features (like the angle between the object and the horizon line/surface) that is tolerant for the orientation of the whole scene. Thus, the effects of tilt can be purely visually-driven in this case and may reflect feature selectivity unrelated to gravitation. The difference between retinal and gravitational effects can just reflect neurons that do not care about the scene/horizon background but only about the object and neurons that respond to the features of the object relative to the background. Thus, I feel that the data using scenes cannot be used unambiguously as evidence for a gravitational reference frame. The authors also tested neurons with an object without a scene, and these data provide evidence for a gravitational reference frame. The authors should concentrate on these data and downplay the difficult-to-interpret results using scenes.

      We still believe it is important to present these two experimental conditions in parallel, because we believe that visual driving of gravitational tuning by environmental cues is important in real life, and this is substantiated by the effects of visual cues alone. But, we have tried in this revision, in response to these comments and to comments from other reviewers, to clarify the potential concerns about visual effects in the full scene experiment, the importance and meaning of the isolated object condition as a control for concerns about other kinds of tuning, and the relationships between the two experimental conditions:

      Concerns about full scene experiment and the control importance of the isolated object condition: “The Fig. 1 example neuron was tested with both full scene stimuli (Fig. 1a), which included a textured ground surface and horizon, providing visual cues for the orientation of gravity, and isolated objects (Fig. 1b), presented on a gray background, so that primarily vestibular and somatosensory cues indicated the orientation of gravity. The contrast between the two conditions helps to elucidate the additional effects of visual cues on top of vestibular/somatosensory cues. In addition, the isolated object condition controls for the possibility that tuning is affected by a shape-configuration (i.e. overlapping orientation) interaction between the object and the horizon or by differential occlusion of the object fragment buried in the ground (which was done to make the scene condition physically realistic for the wide variety of object orientations that would otherwise appear improbably balanced on a hard ground surface) …

      Similar results were obtained for a partially overlapping sample of 99 IT neurons tested with isolated object stimuli with no background (i.e. no horizon or ground plane) (Fig. 2b). In this case, 60% of neurons (32/53) showed significant correlation in the gravitational reference frame, 26% (14/53) significant correlation in the retinal reference frame, and within these groups 13% (7/53) were significant in both reference frames. The population tendency toward positive correlation was again significant in this experiment along both gravitational (p = 3.63 × 10⁻²²) and retinal axes (p = 1.63 × 10⁻⁷). This suggests that gravitational tuning can depend primarily on vestibular/somatosensory cues for self-orientation. However, we cannot rule out a contribution of visual cues for gravity in the visual periphery, including screen edges and other horizontal and vertical edges and planes, which in the real world are almost uniformly aligned with gravity and thus strong cues for its orientation (but see Figure 2–figure supplement 1). Nonetheless, the Fig. 2b result confirms that gravitational tuning did not depend on the horizon or ground surface in the background condition.”

      Cell-by-cell comparisons of scene and isolated stimuli, for those cells tested with both, are shown in Figure 2–figure supplement 6. This figure shows 8 neurons with significant gravitational tuning only in the floating object condition, 11 neurons with tuning only in the scene condition, and 23 neurons with significant tuning in both. Thus, a majority of significantly tuned neurons were tuned in both conditions. A two-tailed paired t-test across all 79 neurons tested in this way showed that there was no significant tendency toward stronger tuning in the scene condition. The 11 neurons with tuning only in the scene condition by themselves might suggest a critical role for visual cues in some neurons. However, the converse result for 8 cells, with tuning only in the floating condition, suggests a more complex dependence on cues or a conflicting effect of interaction with the background scene for a minority of cells.

      Main text: “This is further confirmed through cell-by-cell comparison between scene and isolated stimuli for those cells tested with both (Figure 2–figure supplement 6).”

      Furthermore, the analysis of the single object data should be improved and clarified.

      We have added Figure 1–figure supplements 3–10, which expand the analysis of example cells and additional cells to include all stimuli shown and smoothed tuning curves for individual repetitions of the orientation range.

      We also now present results for individual monkeys in Figure 2–supplements 2,3, and the anatomical locations of individual neurons in Figure 2–supplements 4,5.

    1. Author Response

      Reviewer #1 (Public Review):

      This study optimized a protocol for analyzing microplastics (MPs) in bovine and human follicular fluid. The authors identified the most common plastic polymers in the follicular fluid and assessed the impact of polystyrene beads on bovine oocyte maturation based on the concentration of MPs in follicular fluid. The authors found a decrease in maturation rate in the presence of polystyrene beads and conducted proteomic analysis of oocytes treated with and without MPs, revealing protein alterations.

      Strengths:

      • The optimization of the protocol for analyzing MPs in follicular fluid, which is important for future research in this area.

      • Investigating the effects of MPs on oocyte maturation and proteomic profiles is significant.

      Thank you for the summary and for highlighting our manuscript’s strengths.

      Weaknesses:

      • The effects of polystyrene beads on oocyte maturation and proteomic profiles are not directly demonstrated, and insufficient analysis is performed to support the claims made in the manuscript.

      We disagree with this statement. We have shown that oocyte maturation is affected by the PS beads, which also clearly affect the zona pellucida, and these findings are supported by carefully designed experimental analysis. Regarding the proteomics data, as reviewer 3 suggested we emphasize, PS exposure in the oocyte maturation experiment was performed on cumulus-oocyte complexes, and we believe that the cumulus cells might, to a certain extent, protect the oocyte. We initially tried several methods to check for incorporation of PS beads into oocytes and cumulus cells but, unfortunately, could not validate a protocol. Therefore, although we observed some changes in the proteome, we were indeed unable to directly demonstrate which pathways could be responsible for the decreased oocyte maturation and increased zona pellucida fragility.

      • The use of polystyrene beads does not fully mimic the concentration and interaction of MPs in follicular fluid, which warrants careful interpretation and discussion.

      We are aware that the concentrations of polystyrene (PS) used in our experiments (0.01 µg/mL and 0.1 µg/mL) did not fully represent the PS concentrations found in human and bovine follicular fluid (FF) (0.0013 and 0.0043 µg/mL). We note, though, that PS is not the only MP detected in FF, and in this study we selected PS concentrations in the range of the total MPs found in FF (0.102 and 0.025 µg/mL for human and bovine, respectively). We will carefully re-read and revise the manuscript to ensure that we do not risk misleading readers about the environmental relevance of the chosen experimental concentrations. Nevertheless, we firmly believe that our study used substantially more realistic concentrations than the overwhelming majority of existing studies, which tend to use hundreds of thousands of times more plastic than occurs naturally (as described by Mills et al. - https://doi.org/10.1186/s43591-023-00059-1).
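
      The relationship between the experimental doses and the measured concentrations can be made concrete with a short worked calculation. The sketch below uses only the values quoted in this response. It assumes that the two measured PS concentrations are listed in the same order as the species (human first, then bovine), which the response implies but does not state with "respectively"; the variable and function names are illustrative.

```python
# Concentrations quoted in the response, in µg/mL.
# Assumption: the PS values follow the order "human and bovine" used in the text.
ps_in_ff = {"human": 0.0013, "bovine": 0.0043}
total_mps_in_ff = {"human": 0.102, "bovine": 0.025}
ps_doses = [0.01, 0.1]  # experimental PS exposures

def fold_over(dose, reference):
    """How many times larger the experimental dose is than a measured level."""
    return dose / reference

# One row per (species, dose): fold over measured PS and over measured total MPs.
rows = [
    (species, dose,
     fold_over(dose, ps_in_ff[species]),
     fold_over(dose, total_mps_in_ff[species]))
    for species in ("human", "bovine")
    for dose in ps_doses
]

for species, dose, vs_ps, vs_total in rows:
    print(f"{species:6s} dose {dose:5.2f} µg/mL: "
          f"{vs_ps:6.1f}x measured PS, {vs_total:5.2f}x measured total MPs")
```

      Under these assumptions the doses are roughly 2 to 77 times the measured PS levels, but between about 0.1 and 4 times the measured total MP load, which is the comparison the response relies on.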

      • A major weakness is the lack of mechanism. Determining the cause of meiotic arrest (decreased maturation rate) would be needed to strengthen the paper. Are spindle morphology, chromosome morphology/alignment and/or spindle assembly checkpoint mechanism perturbed in MPs-treated oocytes?

      • Functional assays to validate one or more of the pathways suggested by the proteomic analysis would be necessary to strengthen the paper.

      We appreciate that understanding the mechanisms underlying the observed changes is important; however, prior to this work, little was known about the effects of MPs on reproductive health. As such, the experimental plan for this work was focused on providing an assessment of the extent to which MPs occur in reproductive systems, and the effect of these MPs on general metrics of oocyte health and function. It is only with this baseline knowledge that experiments aimed at studying the mechanisms underlying these changes can/should be designed, which we will certainly consider for future research.

      • The analysis of broken zona pellucida is not sufficiently convincing. Definitely the breakage of zona pellucida is most likely a result of oocyte denudation. However, this may indicate increased fragility of polystyrene beads-treated oocytes. Investigating cytoskeletal components in oocytes treated with or without polystyrene beads would strengthen this paper.

      Indeed, the reviewer is correct that the breakage of the zona pellucida happened during denudation. Yet, because all groups were processed in the exact same way, the differences we observed between our experimental and control groups clearly indicate that the PS beads are causing some form of damage to the zona pellucida, or indirect effects through cumulus-oocyte interactions, irrespective of the initial breakage. This is a question we want to answer in future experiments.

      • The percentage of degenerated oocytes in the control group is abnormally high which raises concern that the oocytes are not healthy.

      The reviewer is correct in noting that the baseline number of degenerated oocytes is high. This is unlikely to be due to oocyte health, and is more likely attributable to the fact that the students working on this experiment needed a period of adaptation to learn to handle these cell types. In this regard, it is important to mention that we designed the experiment such that this effect was evenly distributed across all of the groups. In other words, the technique refinement did not introduce any systematic bias into the data. Thus, while the baseline number of degenerated oocytes is high, we are confident that the effects of MPs are robust.

      • The small font size of the figures (such as Fig. 1C) affects the quality of the manuscript.

      Thank you for pointing this out. We will improve readability of all our figures for a resubmission.

      • Finally, the authors should cite previous publications on the effects of MPs on female reproduction, as this is not a novel area of research, despite the use of different concentrations. For example, "Polystyrene microplastics lead to pyroptosis and apoptosis of ovarian granulosa cells via NLRP3/Caspase-1 signaling pathway in rats (DOI: 10.1016/j.ecoenv.2021.112012)".

      Yes, absolutely. We will include this interesting and relevant work in our revised manuscript.

      Reviewer #2 (Public Review):

      This study presents valuable findings including the use of an improved method of Raman spectroscopy to measure accumulation of microplastics in ovarian follicular fluid obtained from cows and women and demonstration that experimental direct exposure of bovine eggs to biologically relevant levels of polystyrene, a microplastic found in both cows and women's follicular fluid, negatively influenced ova maturation status and the abundance of proteins involved in oxidative stress, DNA damage, apoptosis, and oocyte maturation.

      Thank you for the summary and for highlighting our manuscript’s strengths.

      The evidence supporting the claims of the authors is solid but inclusion of details on the human population from which the follicular fluid was obtained (e.g., demographics, reason for assisted reproduction),

      Agreed. We will include all information regarding the reason for IVF, age, BMI, and IVF outcomes in the revised manuscript.

      and details about quality control for proteome profiling experiments (i.e., peptide count cut-off for significant proteins) would have strengthened the study. The work will be of interest to exposure scientists, reproductive toxicologists, regulatory scientists, and reproductive health clinicians.

      For protein identification, the default settings of MaxQuant were used. In brief, proteins are only considered as identified with at least one unique or razor peptide. Razor peptides are non-unique and assigned to a single protein to ensure that they are only used once for identification. Additionally, a false discovery rate of 1% was applied using a decoy sequence database approach. Quantification was performed on proteins with at least two different peptides. We will include this information in the revised manuscript.
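
      The acceptance rules stated in this response can be expressed as a small filter. The sketch below is hypothetical: it does not use MaxQuant or its output format, and the `ProteinGroup` fields (including the `q_value` field standing in for the decoy-based false discovery rate estimate) are assumptions introduced for illustration. It only encodes the three stated criteria: at least one unique or razor peptide for identification, a 1% false discovery rate, and at least two different peptides for quantification.

```python
from dataclasses import dataclass

@dataclass
class ProteinGroup:
    """Hypothetical summary of one protein group (not a MaxQuant data structure)."""
    name: str
    unique_peptides: int    # peptides found only in this group
    razor_peptides: int     # shared peptides assigned to this group alone
    distinct_peptides: int  # number of different peptides observed
    q_value: float          # assumed decoy-based FDR estimate for the group

def is_identified(p: ProteinGroup, fdr: float = 0.01) -> bool:
    """At least one unique or razor peptide, and within the 1% FDR threshold."""
    return (p.unique_peptides + p.razor_peptides) >= 1 and p.q_value <= fdr

def is_quantifiable(p: ProteinGroup, fdr: float = 0.01, min_peptides: int = 2) -> bool:
    """Identified, and supported by at least two different peptides."""
    return is_identified(p, fdr) and p.distinct_peptides >= min_peptides

# Made-up example groups, used only to exercise the filter.
groups = [
    ProteinGroup("A", unique_peptides=3, razor_peptides=1, distinct_peptides=4, q_value=0.001),
    ProteinGroup("B", unique_peptides=0, razor_peptides=1, distinct_peptides=1, q_value=0.005),
    ProteinGroup("C", unique_peptides=0, razor_peptides=0, distinct_peptides=2, q_value=0.002),
    ProteinGroup("D", unique_peptides=2, razor_peptides=0, distinct_peptides=2, q_value=0.050),
]

identified = [g.name for g in groups if is_identified(g)]
quantified = [g.name for g in groups if is_quantifiable(g)]
print("identified:", identified)
print("quantified:", quantified)
```

      Separating identification from quantification mirrors the two-tier rule in the response: a protein supported by a single razor peptide counts as identified but is excluded from quantification.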

      Reviewer #3 (Public Review):

      The study from Grechi et al showed that emerging environmental microplastics (MPs) are present in both human and bovine follicular fluid. Moreover, based on the characterization and quantification data, authors treated bovine oocytes with environmentally relevant levels of polystyrene (PS) MPs and found that PS MPs interfered with oocyte maturation in vitro. This study is novel, particularly the first part of MP characterization and quantification, and for the first time confirms the presence of MPs in follicular fluid of humans and large farm animals. These results provide a possible mechanism by which the female infertility rate has been increasing in both humans and large farm animals.

      Thank you for the summary and for highlighting our manuscript’s novelty.

      The section on exposing bovine oocytes to MPs and the related oocyte health evaluation can be further improved. For example, authors examined the morphology of the oocyte zona pellucida (ZP) and degeneration and stained oocyte DNA to determine the meiotic maturation status. However, a much more comprehensive oocyte health evaluation can be performed including but not limited to the examination of oocyte spindle morphology, meiotic division, fertilization, early embryo development, mitochondria, and accumulation of ROS. These additional endpoints can provide more robust evidence to determine the impact of MPs on oocyte health.

      We agree with the reviewer that a more comprehensive oocyte health evaluation can be performed. Doing so, however, is beyond the scope of any single study as there are many different pathways and mechanisms by which MPs may be affecting oocytes and attempting to include all of these experiments in a single study is simply not feasible. Indeed, we plan on continuing along this line of work in future experiments.

      While the oocyte proteomic analysis identified altered proteins, more functional studies and causation experiments can be performed.

      As noted in our reply to reviewer 1, we appreciate that understanding the mechanisms underlying the observed changes is important; however, prior to this work, little was known about the effects of MPs on reproductive health. As such, the experimental plan for this work was focused on providing an assessment of the extent to which MPs occur in reproductive systems, and the effect of these MPs on general metrics of oocyte health and function. It is only with this baseline knowledge that experiments aimed at studying the mechanisms underlying these changes can/should be designed, which we will certainly consider for future research.

      In addition, the authors exposed cumulus-oocyte complexes (COCs), not denuded oocytes, to MPs. It is crucial to determine whether MPs accumulate in cumulus cells, oocytes, or both, and whether the compromised oocyte quality is caused by a direct effect of MPs or by an indirect effect on somatic cumulus cells that secondarily affects the oocytes.

      As stated previously, we initially tried several methods to check for incorporation of PS beads into oocytes and cumulus cells but, unfortunately, could not validate a protocol. Therefore, although we observed some changes in the proteome, we were indeed unable to directly demonstrate which pathways could be responsible for the decreased oocyte maturation and increased zona pellucida fragility, or what role the cumulus cells might play.

    1. Author Response

      Reviewer #1 (Public Review):

      [...] This study brings a lot of new information on the regulation of flagellar genes, from the identification of novel sigma 28-dependent sRNAs to their effects on flagella production and motility. It represents a considerable amount of work; the experimental data are clear and solid and support the conclusions of the paper. Even though mechanistic details underlying the observed regulations by MotR or FliX sRNAs are lacking, the effect of these sRNAs on fliC, several rps/rpl genes, and flagellar genes and motility is convincing.

      The connection between r-protein genes regulation and flagellar operons is exciting and raises a few questions. First, from the RILseq data, chimeric reads with mRNA for r-proteins (including rpsJ) are not restricted to the sigma 28-dependent sRNAs (e.g. rpsJ-sucD3'UTR, rpsF-DicF, rplN-DicF, rplK-ChiX, rplU-CyaR, rpsT-CyaR, rpsK-CyaR, rpsF-MicA...), suggesting that regulation of r-protein synthesis by sRNAs is not necessarily related to flagella/motility. Second, it would be interesting to know if the flagellar operons are more sensitive than other long operons to antitermination following MotR overexpression? In other words, does pMotR similarly affect antitermination in rrn or other long operons?

      The general effect of pMotR or pFliX on the expression of multiple middle and late flagellar genes is also interesting even though the mechanism is not clear. While it may be difficult to fully address it, testing whether some of these regulatory events depend on the control of fliC and/or the S10 operon could be relevant (by analyzing the effects in strains deleted for fliC or nusB for instance).

      We also think the connection between r-protein genes regulation and flagellar operons is exciting and raises some intriguing questions. While there are other RIL-seq chimeras for r-protein genes, the highest numbers are found for MotR and FliX. Nevertheless, understanding the impact of these other sRNAs on the r-protein operons and elucidating which long operons are most sensitive to antitermination following MotR overexpression are important directions for further studies.

      Reviewer #2 (Public Review):

      [...] This is a very interesting study that shows how sRNA-mediated regulation can create a complex network regulating flagella synthesis. The information is new and gives a fresh outlook at cellular mechanisms of flagellar synthesis. The presented work could benefit from additional experiments to confirm the effect of endogenous sRNAs expressed at natural level.

      We agree that experiments on the effects of endogenous sRNAs expressed at natural levels are important. We provide such data in Figures 8 and S14 for MotR and FliX in a variety of assays: flagella numbers by electron microscopy, motility and competition assays, and expression of flagellar genes by RT-qPCR and western analysis. We went to the trouble of constructing strains carrying point mutations in the chromosomal copies of these genes rather than deletions to avoid interfering with expression of motA and fliC, given that MotR and FliX encompass the 5’ and 3’ UTRs, respectively.

      Reviewer #3 (Public Review):

      [...] Overall, this comprehensive study expands the repertoire of characterized UTR-derived sRNAs and integrates new layers of post-transcriptional regulation into the highly complex flagellar regulatory cascade. Moreover, these new flagella regulators (MotR, FliX) act non-canonically, and impact protein expression of their target genes by base-pairing with the CDS of the transcripts. Their findings directly connect flagella biosynthesis and motility, highly energy consuming processes, to ribosome production (MotR and FliX) and possibly to carbon metabolism (UhpU).

      Specific points to be considered:

      • The authors use a crl- hyper-motile strain as WT strain for the study and sometimes also a crl+ strain is used. Can the authors comment on potential reasons why some phenotypes (e.g., UhpU and MotR effects on motility) are only detectable in the crl+ strain or vice versa? Is σS regulation important for the function of these sRNAs?

      • In several experiments, a variant of MotR sRNA, MotR*, that harbors a 3 nt mutation upstream of the seed sequence is used and seems to mediate stronger phenotypes (impact on flagellar number) upon overexpression compared to WT, or phenotypes not retrieved for WT MotR (increased flagellin expression). It would be helpful to have some more clarification throughout the text on why this variant was used, even when OE of WT MotR already has an impact on the target, and on how these three mutated nucleotides impact target regulation. For example, does MotR* show increased RNA stability or Hfq binding compared to MotR? Does the mutation in MotR* impact MotR structure (e.g., based on secondary structure predictions) or increase the complementarity with selected targets at potential secondary binding sites (e.g., based on target predictions)? For example, Fig. S7 shows additional regions of interaction between MotR and fliC mRNA besides the seed sequence. It is also suggested that MotR might have multiple interaction sites on rpsJ mRNA. Additional structure probing or biocomputational predictions could clarify these points.

      • It is suggested that UhpU impacts motility via regulation of LrhA, which represses transcription of flhDC, and therefore the flagellar cascade. While LrhA-mediated regulation by UhpU is validated based on reporter genes, the effect of UhpU OE on FlhDC levels is not directly examined (Fig. 3). Furthermore, as deletion of LrhA de-represses the flagellar cascade and UhpU was also shown to increase motility, the conclusions could be further strengthened by examining flhDC levels and/or the effect of ∆UhpU (if the sRNA part can be deleted) on motility (reduction) due to relieved down-regulation of LrhA.

      • This study provides many opportunities for future follow-up work. Now that the four sRNAs and some of their targets and opposing effects on flagella biogenesis have been identified, it will be interesting to see how the sRNAs themselves are temporally regulated throughout the flagella biogenesis cascade and which other targets are regulated by them. Future studies could also provide insights into the mechanism and function of FlgO sRNA, which seems to act via a different mechanism than base-pairing to target RNAs, as well as the global effects of regulation of ribosomal genes via FliX and MotR.

      We thank the reviewer for the constructive comments about the variation between the crl- and crl+ strains, and about the use of MotR versus MotR*, and will address these points in a revised version of the manuscript. Regarding the UhpU-mediated regulation, we agree that assays of flhDC expression will strengthen our conclusions. We share the reviewer's opinion regarding the many opportunities for future follow-up work.

    1. Author Response

      Reviewer #1 (Public Review):

      This article describes the development and refinement of an open-source software framework that is used to track how the COVID-19 pandemic impacted healthcare use in England over a range of key healthcare use indicators.

      Important strengths of this study include the high coverage of 99% of practices in England, the development of health care indicators with the input of a clinical advisory group, extensive online documentation, and rigorous safeguards for the protection of patient confidentiality.

      Perhaps the largest limitation is that only high-level descriptive data on the monthly volume of health outcomes are presented. It is not clear whether the system could be used to generate more fine-grained or stratified information, e.g. weekly or daily data, or data stratified by important characteristics of practices or patients. As such, the utility of the system for answering new scientific questions is unclear, as are its utility and long-term potential uses past the COVID-19 pandemic.

      OpenSAFELY allows access to the full primary care record for patients registered with a TPP or EMIS practice in England. This includes medical diagnoses, clinical tests, prescriptions, as well as demographic details such as age, sex, ethnicity. Dates attached to these records allow for daily analyses to be performed. This data is updated weekly. Through linkage of other data sources, it also provides information such as hospital admissions, registered deaths or COVID-19 testing data. Detailed subgroup analysis is possible; OpenSAFELY has already been used to understand disease risk 1, monitor vaccination coverage 2,3 and novel treatments 4, assess patient safety 5, inform public health guidance and policy and much more 6. These are all widely applicable beyond the COVID-19 pandemic.
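
      As a rough illustration of the point about dated records, the sketch below shows how the same events can be counted per day, per ISO week, per month, or per month within a subgroup. It is a minimal sketch only: the records, the `age_band` field and the `count_by` helper are invented for illustration and do not reflect the OpenSAFELY API or its data model.

```python
from collections import Counter
from datetime import date

# Hypothetical dated events (event_date, age_band); values are invented.
events = [
    (date(2020, 3, 2), "18-64"),
    (date(2020, 3, 2), "65+"),
    (date(2020, 3, 4), "65+"),
    (date(2020, 3, 10), "18-64"),
    (date(2020, 4, 1), "65+"),
]


def count_by(records, key):
    """Count records per bucket, where `key` maps a record to its bucket."""
    return Counter(key(record) for record in records)


# The same records at three time resolutions.
daily = count_by(events, lambda e: e[0])
weekly = count_by(events, lambda e: e[0].isocalendar()[:2])  # (ISO year, ISO week)
monthly = count_by(events, lambda e: (e[0].year, e[0].month))

# Stratified by a patient characteristic as well as by month.
monthly_by_age = count_by(events, lambda e: (e[0].year, e[0].month, e[1]))
```

      Because each record carries its own date, the choice of reporting interval is made at analysis time rather than being fixed by the data.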

      Reviewer #3 (Public Review):

      This manuscript by Fisher and colleagues documents the change in clinical activity in English general practices during the COVID-19 pandemic according to a set of indicators of clinical activity. The indicators include measures of clinical reviews (e.g. blood pressure, asthma, chronic obstructive pulmonary disease, medication, and cardiovascular risk reviews) and blood tests (e.g. cholesterol, liver function, thyroid function, full blood counts, diabetes monitoring blood tests, and kidney function). All these measures saw a drop during the pandemic, to a varying degree, and some recovered afterwards but others did not.

      Clinical activity was measured using SNOMED CT codes, which are standard codes used for recording clinical events in UK GP records.

      Strengths:

      This is a large and comprehensive study including data from 99% of general practices in England. The indicators are clinically relevant, cover a broad range of disease areas, and have been chosen in a sensible manner, involving relevant stakeholders such as GPs, pharmacists, and pathologists.

      The OpenSAFELY platform has the ability to enable federated analyses to be run on raw coded data of almost all patients registered with a GP in England.

      The study demonstrates the value of OpenSAFELY in being able to monitor clinical activity in general practice at a detailed level, which is essential for planning and improving health services. The statistical methodology is broadly sound.

      Weaknesses:

      The measures are all related to chronic physical diseases in adults, with a particular focus on cardiometabolic and respiratory conditions. There are no measures related to mental health, maternal or child health.

      Results from preliminary analyses of a wider range of clinical conditions can be found in our previous work 7. This includes mental health and female and reproductive health, with details on why these were not covered by the initial key measures described.

      The description of the measures does not distinguish between different types of clinical activity e.g. lab tests, clinical measurements, or diagnoses, and all are lumped together as 'codes'. This is a peculiarity of the way that information is recorded in GP systems - many different types of clinical information (such as diagnoses and lab tests) are recorded using a SNOMED CT 'code', and only the exact code differentiates what type of information is in the record.

      Multiple codes of different types can arise from a single encounter, all of which could be indicative of a clinical event of interest. The codelists for each key measure, available at opencodelists.org, show the type of clinical activity (e.g. procedure or observable entity) captured by each code within the codelist (see e.g. https://www.opencodelists.org/codelist/opensafely/red-blood-cell-rbc-tests/576a859e/#tree).

      The codelists were broad and comprehensive, but it is unclear how necessary this is because for some measures e.g. lab tests, laboratories typically record a particular type of test using a single standardised code. Instead of using a broad set of codes in the analysis, the authors could have initially verified which codes are associated with the clinical activity being measured (e.g. a numerical value of a blood pressure measurement) in all practices, as I would expect the same single or small number of codes would be used in all practices. This would have provided a smaller and simpler final codelist.

      Supplementary table 1 shows up to 5 of the most common codes for each key measure across the two electronic health record (EHR) systems used in this analysis. This shows that whilst a single code is often used for many of the clinical activities assessed here, there are exceptions and there can be variation in coded activity between different EHR systems. We have previously described how design features of EHR systems can impact clinical practice 8. Broad codelists allow us to capture activity across multiple EHR systems.
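
      The trade-off between a broad and a narrow codelist can be sketched with a toy example. Everything below is hypothetical: the codes are placeholders rather than real SNOMED CT codes, and the records and the `count_matches` helper are invented for illustration.

```python
from collections import Counter

# Placeholder codes standing in for one key measure; not real SNOMED CT codes.
broad_codelist = {"CODE_A", "CODE_B", "CODE_C"}
narrow_codelist = {"CODE_A"}

# Hypothetical coded events from two EHR systems.
records = [
    {"system": "TPP", "code": "CODE_A"},
    {"system": "TPP", "code": "CODE_A"},
    {"system": "EMIS", "code": "CODE_B"},
    {"system": "EMIS", "code": "CODE_C"},
    {"system": "EMIS", "code": "OTHER"},
]


def count_matches(rows, codes):
    """Count events per EHR system whose code is in the given codelist."""
    return Counter(row["system"] for row in rows if row["code"] in codes)


broad = count_matches(records, broad_codelist)
narrow = count_matches(records, narrow_codelist)
```

      In this invented data, the single-code list captures all of the activity in one system and none in the other, whereas the broad list captures both.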

      1. Williamson, E. J. et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature 584, 430–436 (2020).
      2. Trends and clinical characteristics of 57.9 million COVID-19 vaccine recipients: a federated analysis of patients’ primary care records in situ using OpenSAFELY | British Journal of General Practice. https://bjgp.org/content/early/2021/11/08/BJGP.2021.0376.
      3. Parker, E. P. et al. Factors associated with COVID-19 vaccine uptake in people with kidney disease: an OpenSAFELY cohort study. BMJ Open 13, e066164 (2023).
      4. Green, A. C. A. et al. Trends, variation, and clinical characteristics of recipients of antiviral drugs and neutralising monoclonal antibodies for covid-19 in community settings: retrospective, descriptive cohort study of 23.4 million people in OpenSAFELY. BMJ Med. 2, (2023).
      5. Collaborative, T. O. et al. Potentially inappropriate prescribing of DOACs to people with mechanical heart valves: a federated analysis of 57.9 million patients’ primary care records in situ using OpenSAFELY. 2021.07.27.21261136 https://www.medrxiv.org/content/10.1101/2021.07.27.21261136v1 (2021) doi:10.1101/2021.07.27.21261136.
      6. OpenSAFELY Pubmed search results. PubMed https://pubmed.ncbi.nlm.nih.gov/?term=OpenSAFELY.
      7. OpenSAFELY NHS Service Restoration Observatory 2: changes in primary care activity across six clinical areas during the COVID-19 pandemic | medRxiv. https://www.medrxiv.org/content/10.1101/2022.06.01.22275674v1.
      8. Suboptimal prescribing behaviour associated with clinical software design features: a retrospective cohort study in English NHS primary care | British Journal of General Practice. https://bjgp.org/content/70/698/e636.
    1. Author Response

      eLife assessment:

      This important study represents a comprehensive computational analysis of Plasmodium falciparum gene expression, with a focus on var gene expression, in parasites isolated from patients; it assesses changes that occur as the parasites adapt to short-term in vitro culture conditions. The work provides technical advances to update a previously developed computational pipeline. Although the findings of the shifts in the expression of particular var genes have theoretical or practical implications beyond a single subfield, the results are incomplete and the main claims are only partially supported.

      The authors would like to thank the reviewers and editors for their insightful and constructive assessment. We are particularly glad to read of the technical advances of the methods developed here. We will rephrase parts of the manuscript and move some analysis to the supplementary materials. This will improve the clarity of the results and ensure the main claims are supported.

      Reviewer #1 (Public Review):

      The authors took advantage of a large dataset of transcriptomic information obtained from parasites recovered from 35 patients. In addition, parasites from 13 of these patients were reared for 1 generation in vitro, 10 for 2 generations, and 1 for a third generation. This provided the authors with a remarkable resource for monitoring how parasites initially adapt to the environmental change of being grown in culture. They focused initially on var gene expression due to the importance of this gene family for parasite virulence, then subsequently assessed changes in the entire transcriptome. Their goal was to develop a more accurate and informative computational pipeline for assessing var gene expression and secondly, to document the adaptation process at the whole transcriptome level.

      Overall, the authors were largely successful in their aims. They provide convincing evidence that their new computational pipeline is better able to assemble var transcripts and assess the structure of the encoded PfEMP1s. They can also assess var gene switching as a tool for examining antigenic variation. They also documented potentially important changes in the overall transcriptome that will be important for researchers who employ ex vivo samples for assessing things like drug sensitivity profiles or metabolic states. These are likely to be important tools and insights for researchers working on field samples.

      One concern is that the abstract highlights "Unpredictable var gene switching..." and states that "Our results cast doubt on the validity of the common practice of using short-term cultured parasites...". This seems somewhat overly pessimistic with regard to var gene expression profiling and does not reflect the data described in the paper. In contrast, the main text of the paper repeatedly refers to "modest changes in var gene expression repertoire upon culture" or "relatively small changes in var expression from ex vivo to culture", and many additional similar assessments. On balance, it seems that transition to culture conditions causes relatively minor changes in var gene expression, at least in the initial generations. The authors do highlight that a few individuals in their analysis showed more pronounced and unpredictable changes, which certainly warrants caution for future studies but should not obscure the interesting observation that var gene expression remained relatively stable during transition to culture.

      Thank you for the suggestion and we are happy to modify the wording to ensure the correct results are presented. We will reword the abstract and emphasise the main change was observed in the core transcriptome. We will also add clarity to the different var transcriptome results presented.

      It is important to note this study was in a unique position to assess changes at the individual patient level as we had successive parasite generations. This is not done in most cross-sectional studies and therefore these small changes in the var transcriptome would have been missed.

      Reviewer #2 (Public Review):

      In this study, the authors describe a pipeline to sequence expressed var genes from RNA sequencing that improves on a previous one that they had developed. Importantly, they use this approach to determine how var gene expression changes with short-term culture. Their finding of shifts in the expression of particular var genes is compelling and casts some doubt on the comparability of gene expression in short-term culture versus var expression at the time of participant sampling. The authors appear to overstate the novelty of their pipeline, which should be better situated within the context of existing pipelines described in the literature.

      Other studies have relied on short-term culture to understand var gene expression in clinical malaria studies. This study indicates the need for caution in over-interpreting findings from these studies.

      The novel method of var gene assembly described by the authors needs to be appropriately situated within the context of previous studies. They neglect to mention several recent studies that present transcript-level novel assembly of var genes from clinical samples. It is important for them to situate their work within this context and compare and contrast it accordingly. A table comparing all existing methods in terms of pros and cons would be helpful to evaluate their method.

      We are grateful for this suggestion and agree that a table comparing the pros and cons of all existing methods would be helpful for the reader, not just malaria researchers. This will also highlight the key benefits of our new approach. This will be included in the updated manuscript as a supplementary table.

      Reviewer #3 (Public Review):

      This work focuses on the important problem of how to access the highly polymorphic var gene family using short-read sequence data. The approach that was most successful, and utilized for all subsequent analyses, employed a different assembler from their prior pipeline, and impressively, more than doubles the N50 metric.
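
      For readers unfamiliar with the N50 metric mentioned above, it is the contig (or transcript) length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of the standard calculation follows; the lengths used in the usage note are invented and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

      For example, for lengths 2, 3, 4, 5 and 6 (20 bases in total), the two longest sequences already cover 11 bases, so `n50([2, 3, 4, 5, 6])` returns 5.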

      The authors then endeavor to utilize these improved assemblies to assess differential RNA expression of ex vivo and short-term cultured samples, and conclude that their results "cast doubt on the validity" of using short-term cultured parasites to infer in vivo characteristics. Readers should be aware that the various approaches to assess differential expression lack statistical clarity and appear to be contradictory. Unfortunately there is no attempt to describe the rationale for the different approaches and how they might inform one another.

      It is unclear whether adjusting for life-cycle stage as reported is appropriate for the var-only expression models. The methods do not appear to describe what type of correction variable (continuous/categorical) was used in each model, and there is no discussion of the impact on var vs. core transcriptome results.

      The reviewer raises a fair point, and we agree the different methods and results of the var transcriptome analysis are difficult to interpret together without further clarification. Var transcript differential expression analysis has been used several times previously and hence was used here. As mentioned above, this study was in a unique position to perform a more focussed analysis of var transcriptional changes across paired samples. This allowed for changes in the var transcriptome to be identified that would have gone unnoticed in the "traditional" differential expression analysis. To address this point, we will add further explanation to the results and move the var differential expression analysis to the supplementary, to allow for comparison with previous studies.

      We thank the reviewer for this highly important comment about adjusting for life cycle stage. Var gene expression is highly stage dependent, so any quantitative comparison between samples does need adjustment for developmental stage. In the differential expression analysis, life cycle stage was adjusted for by using the proportions determined by the mixture model as covariates in the design matrix. The var group level analysis and the global var gene expression analysis were also adjusted for life cycle stage using the same proportions, by including them as independent variables. The rank-expression analysis was not adjusted for life cycle stage, as the values were determined as a percentage contribution to the total var transcriptome.

      We will update the methods section to ensure this is clearer.
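
      To make the two kinds of analysis concrete, the sketch below lays out what a design matrix with stage proportions as covariates might look like, and how a within-sample percentage contribution is computed. It is an illustration under stated assumptions, not the authors' pipeline: the sample names, stage labels, proportions, transcript names and counts are all invented, and dropping one stage column is simply one common way to avoid redundancy with the intercept when proportions sum to 1.

```python
# Hypothetical samples: culture status plus stage proportions (invented values).
samples = [
    {"id": "p1_exvivo", "cultured": 0, "ring": 0.90, "troph": 0.08, "schizont": 0.02},
    {"id": "p1_gen1", "cultured": 1, "ring": 0.60, "troph": 0.30, "schizont": 0.10},
]


def design_row(sample):
    """One design-matrix row: intercept, condition, stage covariates.

    The proportions sum to 1, so one stage (here the last) is left out
    to avoid redundancy with the intercept column.
    """
    return [1.0, float(sample["cultured"]), sample["ring"], sample["troph"]]


design = [design_row(s) for s in samples]


def percent_contribution(counts):
    """Express each transcript as a percentage of the sample's var total."""
    total = sum(counts.values())
    return {name: 100.0 * value / total for name, value in counts.items()}


# Hypothetical var transcript counts for a single sample.
var_counts = {"var_A": 600, "var_B": 300, "var_C": 100}
pct = percent_contribution(var_counts)
ranked = sorted(pct, key=pct.get, reverse=True)
```

      The percentages are normalised within each sample to that sample's total var expression, which is the property the response gives as the reason for not adjusting the rank-expression analysis.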

    1. Author Response

      eLife assessment

      This important study addresses both the native role of the Plasmodium falciparum protein PfFKBP35 and whether this protein is the target of FK506, an immunosuppressant with antiplasmodial activity. The genetic evidence for the essentiality of FKBP35 in parasite growth is compelling. However, the conclusion that the role of FKBP35 is to secure ribosome homeostasis and the claim that FK506 exerts its antimalarial activity independently of FKBP35 rely on incomplete evidence.

      We thank the Reviewers and Editors for their careful evaluation of our manuscript and the constructive criticism. We realized that some of our conclusions may be regarded/misunderstood as overstatements. This was by no means our intention and we apologize for the unnecessary inconvenience. The phenotype of FKBP35 knock-out parasites clearly centers on failing ribosomes and protein synthesis, which in our opinion, provides an important leap towards understanding the role of this drug target in P. falciparum biology. It is however correct that, at this point, we can only make evidence-based hypotheses about direct interaction partners and we will emphasize this more clearly in a revised version of the manuscript. In order to prevent misinterpretation of our work, and as detailed in the point-by-point responses to the reviewer comments, we propose changing the manuscript title to “Genetic validation of PfFKBP35 as an antimalarial drug target”. To address the criticism regarding the effects of FK506, we will perform specific additional experiments. We are convinced that this new data set will resolve any remaining ambiguities and allow for a conclusive assessment of FK506 drug activity in P. falciparum.

      Reviewer #1 (Public Review):

      In this study, the authors investigate the biological function of the FK506-binding protein FKBP35 in the malaria-causing parasite Plasmodium falciparum. Like its homologs in other organisms, PfFKBP35 harbors peptidyl-prolyl isomerase (PPIase) and chaperoning activities, and has been considered a promising drug target due to its high affinity to the macrolide compound FK506. However, PfFKBP35 has not been validated as a drug target using reverse genetics, and the link between PfFKBP35-interacting drugs and their antimalarial activity remains elusive. The manuscript is structured in two parts addressing the biological function of PfFKBP35 and the antimalarial activity of FK506, respectively.

      The first part combines conditional genome editing, proteomics and transcriptomics analysis to investigate the effects of FKBP35 depletion in P. falciparum. The work is very well performed and clearly described. The data provide definitive evidence that FKBP35 is essential for P. falciparum blood stage growth. Conditional knockout of PfFKBP35 leads to a delayed death phenotype, associated with defects in ribosome maturation as detected by quantitative proteomics and stalling of protein synthesis in the parasite. The authors propose that FKBP35 regulates ribosome homeostasis but an alternative explanation could be that changes in the ribosome proteome are downstream consequences of the abrogation of FKBP35 essential activities as chaperone and/or PPIase. It is unclear whether FKBP35 has a specific function in P. falciparum as compared to other organisms. The knockdown of PfFKBP35 has no phenotypic consequence, showing that very low amounts of FKBP35 are sufficient for parasite survival and growth. In the absence of quantification of the protein during the course of the experiments, it remains unclear whether the delayed death phenotype in the knockout is due to the delayed depletion of the protein or to a delayed consequence of early protein depletion. This limitation also impacts the interpretation of the drug assays.

      We thank the Reviewer for the compliments regarding our experimental setup and the clarity of our manuscript. We agree that the link between FKBP35 knock-out and ribosome homeostasis is indirect and we now emphasize this more clearly in the revised manuscript. To prevent a general misinterpretation of our manuscript, we will adapt the title accordingly.

      We would still like to reiterate that the phenotype of FKBP35 knock-out parasites is best described by their defects in maintaining functional ribosomes. It is for several reasons that we believe the links between FKBP35 and ribosome function are purely evidence driven: First, pre-ribosomal and nucleolar factors are the first proteins (in generation 1 schizonts) to be affected upon knock-out of fkbp35 (Figure 2A, Table S1). We realized that Figure 2A falls short in showing this observation, which is why we will update the figure accordingly. Second, the dysregulation of ribosomal factors and the general stall in protein synthesis dominate the phenotype of FKBP35 knock-out parasites in generation 2. We thus believe it is appropriate to say that knock-out cells are most likely killed in response to defective ribosome maintenance – which is a consequence of reduced FKBP35 levels. We are aware that our experiments (and possibly any other reverse genetics approach) cannot rule out that FKBP35 affects ribosomal factors indirectly. Clearly, more work is required to disentangle this question in more detail in the future.

      We agree with the Reviewer that it is not possible to tell if the delayed death-like phenotype is due to a “delayed protein depletion”. We would however like to note that the DiCre/loxP approach allows for an immediate knock-out at the genome level and is thus as precise as possible. Further, in addition to the substantial depletion of FKBP35 in knock-out cells during the phenotypically silent generation, knocking out fkbp35 at earlier time points (TPs 24-30 and 34-40 hpi in the preceding generation) resulted in the very same phenotype cycle (Figure 1). Here, parasite death was delayed substantially longer, i.e. more than one complete cycle. Together with the dysregulation of early ribosome maturation in generation 1, these findings point towards a delayed death phenotype. It is of course still possible to explain the delayed death-like phenotype by remnant activity of proteins synthesized prior to the genomic knock-out. We address this possibility and describe the two scenarios mentioned by the Reviewer in lines 141-144. Disentangling the two possibilities in future experiments will be difficult, not only with regards to FKBP35, but regarding “delayed death” phenotypes in general.

      In the second part, the authors investigate the activity of FK506 on P. falciparum, and conclude that FK506 exerts its antimalarial effects independently of FKBP35. This conclusion is based on the observation that FK506 has the same activity on FKBP35 wild type and knock-out parasites, suggesting that FK506 activity is independent of FKBP35 levels, and on the fact that FK506 kills the parasite rapidly whereas inducible gene knockout results in delayed death phenotype. However, there are alternative explanations for these observations. As mentioned above, the delayed death phenotype could be due to delayed depletion of the protein upon induction of gene knockout. FK506 could have a similar activity on WT and mutant parasites when added before sufficient depletion of FKBP35 protein. In some experiments, the authors exposed KO parasites to FK506 later, presumably when the KO is effective, and obtained similar results. However, in these conditions, the death induced by the knockout could be a confounding factor when measuring the effects of the drug. Furthermore, the authors show that FK506 binds to FKBP35, and propose that the FK506-FKBP35 complex interferes with ribosome maturation, which would point towards a role of FKBP35 in FK506 action. In summary, the study does not provide sufficient evidence to rule out that FK506 exerts its effects via FKBP35.

      Notably, we were also very much surprised by data indicating that the antimalarial activity of FK506 is independent of FKBP35. It is for this reason that we conducted a comprehensive set of experiments to disprove our initial observations, but couldn't find any evidence for an FKBP35-dependent mode of action of FK506:

      We were not able to see altered FK506 sensitivity in (i) inducible knock-down parasites, (ii) inducible overexpression parasites and (iii) inducible knock-out parasites. Parasites with altered FKBP35 levels (as assessed by Western blot and quantitative proteomics at 36-42 hpi, respectively) were equally sensitive to FK506. Importantly, at no sub-lethal FK506 concentration did lower FKBP35 levels lead to an altered response of FKBP35KO compared to the wild-type control population. Furthermore, (iv) induction of the knock-out in the cycle preceding FK506 exposure also had no effect on parasite sensitivity. As mentioned by the Reviewer, we also exposed the parasites to FK506 at 30-36 hpi and (v) did not see any effect, even though we measured a 19-fold difference in FKBP35 protein levels between the parasite populations at 36-42 hpi. At this point, parasite death induced by the knock-out cannot be a confounding factor (as it was mentioned by the Reviewer), because the FKBP35 knock-out has no effect on parasite survival in generation 1 in the absence of FK506 (Figure 1F). This demonstrates that the observed effect is only due to drug-mediated killing and not due to the FKBP35 knock-out.

      To account for a scenario in which the drop in FKBP35 levels only occurs after 36 hpi, we will perform an additional set of experiments, in which we induce the knock-out at 0-6 hpi and treat the parasites at 36-42 hpi (i.e. the time point at which the 19-fold difference in protein levels was measured by quantitative proteomics). This setup will allow determining whether or not the parasite killing activity of FK506 depends on FKBP35 levels.

      So far, our experiments cannot support any scenario in which FK506 kills P. falciparum parasites via inhibiting the essential role of FKBP35 and we would therefore want to insist that this statement is based on highly solid evidence. In this context, it is important to note that our conclusion includes two scenarios: “This indicates that either the binding of FK506 does not interfere with the essential role of PfFKBP35, or that PfFKBP35 is inhibited only at high FK506 concentrations that also inhibit other essential factors.” While this phrase is already present in our initial submission, we will emphasize this point more clearly in the revised manuscript. We are convinced that this information is of high importance for ongoing and future drug development.

      Reviewer #2 (Public Review):

      The manuscript by Thommen et al., "FKBP secures ribosome homeostasis in Plasmodium falciparum", focuses on the importance of the PfFKBP35 protein, its interaction with the FK506 compound, and the role of PfFKBP35 in ribosome biogenesis. The authors showed the interaction of PfFKBP35 with FK506, but the role of FK506 and PfFKBP35 in ribosome biogenesis is unclear from the data.

      The introduction is built around two parallel stories about PfFKBP35 and FK506, with ribosome biogenesis as the central question at the end. In its current form, the manuscript suffers from two stories that are not entirely interconnected, unfinished, and somewhat confusing. Both stories need additional experiments to make the manuscript(s) more complete. The results from PfFKBP35 need more evidence for the proposed ribosome biogenesis pathway control. On the other hand, the results from the drug FK506 point to different targets with lower EC50, and other follow-up experiments are needed to substantiate the authors' claims.

      The strengths of the manuscript are the figures and experimental design. The combination of omics methods is informative and gives an opportunity for follow-up experiments.

      We thank the Reviewer for the evaluation of the manuscript. We apologize for the fact that the Reviewer found the manuscript to be inaccessible. We will use the comments as an incentive to restructure the manuscript and do our best to clarify the presentation, interpretation and conclusion of the presented data in the revised version. We believe that the FKBP35 data are strongly interlinked with the findings on FK506. We will emphasize these links more clearly and are convinced that the complementary nature of the datasets is a particular strength of the presented work.

      Reviewer #3 (Public Review):

      The study by Thommen et al. sought to identify the native role of the Plasmodium falciparum FKBP35 protein, which has been identified as a potential drug target due to the antiplasmodial activity of the immunosuppressant FK506. This compound has multiple binding proteins in many organisms; however, only one FKBP exists in P. falciparum (FKBP35). Using genetically-modified parasites and mass spectrometry-based cellular thermal shift assays (CETSA), the authors suggest that this protein is involved in ribosome homeostasis and that the antiplasmodial activity of FK506 is separate from its activity on the FKBP35 protein. The authors first created a conditional knockdown using the destruction domain/shield system, which demonstrated no change in asexual blood stage parasites. A conditional knockout was then generated using the DiCre system. FKBP35KO parasites survived the first generation but died in the second generation. The authors called this "a delayed death phenotype", although it was not secondary to drug treatment, so this may be a misnomer. This slow death was unrelated to apicoplast dysfunction, as demonstrated by lack of alterations in sensitivity to apicoplast inhibitors. Quantitative proteomics on the FKBP35KO vs FKBP35WT parasites demonstrated enrichment of proteins involved in pre-ribosome development and the nucleolus. Interestingly, the KO parasites were not more susceptible to cycloheximide, a translation inhibitor, in the first generation (G1), suggesting that mature ribosomes still exist at this point. The SunSET technique, which incorporates puromycin into nascent peptide chains, also showed that in G1 the FKBP35KO parasites were still able to synthesize proteins. But in the second generation (G2), there was a significant decrease in protein synthesis. Transcriptomics were also performed at multiple time points. The effects of knockout of FKBP35 were transcriptionally silent in G1, and the parasites then slowed their cell cycles as compared to the FKBP35WT parasites.

      The authors next sought to evaluate whether killing by FK506 was dependent upon the inhibition of PfFKBP35. Interestingly, both FKBP35KO and FKBP35WT parasites were equally susceptible to FK506. This suggested that the antiplasmodial activity of FK506 was related to activity targeting essential functions in the parasite separate from binding to FKBP35. To identify these potential targets, the authors used MS-CETSA on lysates to test for thermal stabilization of proteins after exposure to drug, which suggests drug-protein interactions. As expected, FK506 bound FKBP35 at low nM concentrations. However, given that the parasite IC50 of this compound is in the uM range, the authors searched for proteins stabilized at these concentrations as putative secondary targets. Using live cell MS-CETSA, FK506 bound FKBP35 at low nM concentrations; however, in these experiments over 50 ribosomal proteins were stabilized by the drug at higher concentrations. Of note, there was also an increase in soluble ribosomal factors in the absence of denaturing conditions. The authors suggested that the drug itself led to these smaller factors disengaging from a larger ribosomal complex, leading to an increase in soluble factors. Ultimately, the authors conclude that the native function of FKBP35 is involved in ribosome homeostasis and that the antiplasmodial activity of FK506 is not related to the binding of FKBP35, but instead results from inhibition of essential functions of secondary targets.

      Strengths:

      This study has many strengths. It addresses an important gap in parasite biology and drug development, by addressing the native role of the potential antiplasmodial drug target FKBP35 and whether the compound FK506 works through inhibition of that putative target. The knockout data provide compelling evidence that the FKBP35 protein is essential for asexual parasite growth after one growth cycle. Analysis of the FKBP35KO line also provides evidence that the effects of FK506 are likely not solely due to inhibition of that protein, but instead must have secondary targets whose function is essential. These data are important in the field of drug development as they may guide development away from structure-based FK506 analogs that bind more specifically to the FKBP35 protein.

      Weaknesses:

      There are also a few notable weaknesses in the evidence that call into question the conclusion in the article title that FKBP35 is definitely involved in ribosomal homeostasis. While the proteomics supports alterations in ribosome biogenesis factors, it is unclear whether this is a direct role of the loss of the FKBP35 protein or is more related to non-specific downstream effects of knocking down the protein. The CETSA data clearly demonstrate that FK506 binds PfFKBP35 at low nM concentrations, which is different from the IC50 noted in the parasite; however, the evidence that the proteins stabilized by uM concentrations of drug are actual targets is not completely convincing, especially given the high uM amounts of drug required to stabilize these proteins. This section of the manuscript would benefit from validation of at least one or two of the putative candidates noted in the text. In the live cell CETSA, it is noted that >50 ribosomal components are stabilized in drug treated but not lysate controls. Similarly, the authors suggest that the soluble fraction of ribosomal components increases in drug-exposed parasites even at 37 °C and that this is likely from smaller ribosomal proteins disengaging from larger ribosomal complexes. While the evidence is convincing that this protein may play a role in ribosome homeostasis in some capacity, it is not sure that the title of the paper "FKBP secures ribosome homeostasis" holds true given the lack of mechanistic data. A minor weakness, but one that should nonetheless be addressed, is the use of the term "delayed death phenotype" with regards to the knockout parasite killing. This term is most frequently used in a very specific setting of apicoplast drugs that inhibit apicoplast ribosomes, so the term is misleading. It is also possible that the parasites are able to go through a normal cycle because of the kinetics of the knockout and the time needed for protein clearance in the parasite to reach a level that is lethal.

      Overall, the authors set out to identify the native role of FKBP35 in the P. falciparum parasites and to identify whether this is, in fact, the target of FK506. The data clearly demonstrate that FKBP35 is essential for parasite growth and provide evidence that alterations in its levels have proteomic but not transcriptional changes. However, the conclusion that FKBP35 actually stabilizes ribosomal complexes remains intermediate. The data are also very compelling that FK506 has secondary targets in the parasite aside from FKBP35; however, the high uM concentrations of the drug needed to attain results and the lack of biological validation of the CETSA hits make it difficult to know whether any of these are actually the target of the compound or instead are nonspecific downstream consequences of treatment.

      We appreciate the detailed and valuable suggestions to improve the manuscript. We agree that CETSA could only identify potential targets of FK506 in the micromolar range, while FK506 showed a high affinity for FKBP35, consistent with earlier reports (2). We would however like to point out that FK506 kills P. falciparum at exactly these relatively high concentrations and not at those presumed from the high-affinity interactions between FK506 and FKBP35. The relatively high FK506 concentration required to stabilize potential off-target proteins is therefore not a concerning observation, but rather corroborates our conclusion that FK506 fails to inhibit the essential function of FKBP35 at concentrations that leave off-targets unaffected. As mentioned in response to Reviewer 1, we will describe and discuss these data more clearly in the revised manuscript.

      We thank the Reviewer for pointing out the potential issues regarding the use of the term “delayed death phenotype”. We now refer to the FKBP35 phenotype as “delayed death-like” in the revised manuscript.

      We believe that follow-up work on specific FK506 CETSA hits is out of scope of the current and already quite complex manuscript.

      As mentioned in the response to Reviewer 1, we realize that the short title of the manuscript can be regarded as an overstatement. Again, this was clearly not our intention and we apologize that the Reviewers had to indicate this issue. While we believe that the message of the title holds true (see response to Reviewer 1), we recognize the misconception that might arise from it, which is why we propose the new title: “Genetic validation of _Pf_FKBP35 as an antimalarial drug target”.

      1. Kennedy K, Cobbold SA, Hanssen E, Birnbaum J, Spillman NJ, McHugh E, et al. Delayed death in the malaria parasite Plasmodium falciparum is caused by disruption of prenylation-dependent intracellular trafficking. PLoS Biol. 2019;17(7):e3000376.
      2. Kotaka M, Ye H, Alag R, Hu G, Bozdech Z, Preiser PR, et al. Crystal structure of the FK506 binding domain of Plasmodium falciparum FKBP35 in complex with FK506. Biochemistry. 2008;47(22):5951-61.
      3. Kasahara K, Nakayama R, Shiwa Y, Kanesaki Y, Ishige T, Yoshikawa H, et al. Fpr1, a primary target of rapamycin, functions as a transcription factor for ribosomal protein genes cooperatively with Hmo1 in Saccharomyces cerevisiae. PLoS Genet. 2020;16(6):e1008865.
    1. Author Response:

      The following is the authors' response to the original reviews.

      eLife assessment

      This study presents important findings regarding the quantification of dynamics in fish communities in changing ecosystems by combining a large-scale environmental DNA metabarcoding time series with novel statistical approaches. The methods are convincing, with controlled experiments, thorough statistical analyses, and a substantial dataset covering two years of detailed observation, which can provide sufficient power to detect fine-scale ecological interactions. This work is relevant for informing future research on assessing community stability under climate change.

      Thank you so much for your careful evaluation of our manuscript. We are very pleased to hear that you found our study important. We have revised our manuscript according to the helpful comments to further improve our manuscript.

      Reviewer #1 (Public Review):

      […] Their work provides a highly relevant approach to perform species-interaction strength analysis based on eDNA biodiversity assessments, and as such provides a research framework to study marine community dynamics by eDNA, which is highly relevant in the study of ecosystem dynamics. The models and analytical methods used are clearly described and made available, enabling application of these methods by anyone interested in applying it to their own site and species group of interest.

      Thank you so much for your time and effort to evaluate our manuscript. We are very pleased to hear that you found our study interesting. We have further revised the manuscript according to your comments and hope that the revised manuscript is now better than the original one.

      Strengths: The authors have a study setup that is suitable to measure the effects of temperature on the eDNA diversity, and have taken a large number of samples and all appropriate controls to be able to accurately measure and describe these dynamics. The applied internal spike-in to enable relative eDNA copy number quantification is convincing.

      We are happy to hear that you found the study design and the method to estimate eDNA copy number are suitable and convincing.

      Weaknesses: The authors aim to study the relationship between species interaction strength and ecosystem complexity, and how temperature will influence this. However, there is only limited ecological context discussed explaining their results, and a link with climate change scenarios is also limited. A further discussion of this would have strengthened the manuscript.

      Thank you so much for the comment. We have added discussion about how our study contributes to understanding the fish community assembly process and predicting the community-level response under ongoing climate change. We have added one subsection, "Implications for fish community assembly and the effect of global climate change", at L679. As for the ecological discussion for each specific fish-fish interaction, we provided this in Supplementary file 1c.

      The authors were able to find a correlation between water temperature and interaction strengths observed. However, since water temperature is dependent on many environmental variables that are either directly or indirectly influencing ecosystem dynamics, it is hard to prove a direct correlation between the observed changes in community dynamics and the temperature alone.

      Thank you for pointing this out. We have discussed the possibility of the effects of other environmental variables (e.g., oxygen) and how we could overcome this issue at L661. Some of the sentences were originally in the subsection "Interaction strengths and environmental variables", but were moved to the subsection "Potential limitations of the present study and future perspectives".

      Reviewer #2 (Public Review):

      In this work Ushio et al. combine environmental DNA metabarcoding with novel statistical approaches to demonstrate how fish communities respond to changing sea temperatures over a seasonal cycle. These findings are important due to the need for new techniques that can better measure community stability under climate change. The eDNA metabarcoding dataset of 550 water samples over two years is, I feel, of sufficient scale to provide power to detect fine-scale ecological interactions, the experiments are well controlled, and the statistical analysis is thorough.

      Thank you so much for your time and effort to evaluate our manuscript. We are happy to hear that you found our study technically sound and important. We have revised the manuscript according to your comments to improve our manuscript further.

      The major strengths of the manuscript are: (1) the magnitude of the dataset, which provides densely replicated sampling that can overcome some of the noise associated with eDNA metabarcoding data and scale up the number of data points to make unique inferences; (2) the novel method of transforming the metabarcode reads using endogenous qPCR "spike-in" data from a common reference species to obtain estimates of DNA concentration across other species; and (3) the statistical analysis of time-series and network data and translating it into interaction strengths between species provides a cross-disciplinary dimension to the work.

      Thank you for your positive comments. We are very pleased to hear that (1) our intensive and extensive water sampling, (2) our method for using the common fish species eDNA as "spike-in," and (3) our nonlinear time series analysis were positively evaluated.

      I feel like this kind of study showcases the power of eDNA metabarcoding to answer some really interesting questions that were previously unobtainable due to the complexities and cost of such an exercise. Notwithstanding the problems associated with PCR primer bias and PCR stochasticity, the qPCR "spike-in" method is easy to implement and will likely become a standardised technique in the field. Further studies will examine and improve on it.

      We must admit that our endogenous "spike-in" method does not overcome all problems associated with PCR. However, we agree with you and believe that we are heading in the right direction. The method does not require the addition of external internal standard DNAs and enables post-hoc evaluation of eDNA absolute concentrations. Although this approach requires an additional experiment (qPCR), the method may be an alternative for quantifying eDNA concentrations.

      Overall I found the manuscript to be clear and easy to follow for the most part. I did not identify any serious weaknesses or concerns with the study, although I am not able to comment on the more complex statistical procedures such as the "unified information-theoretic causality" method devised by the authors. The section on limitations of the study is important and acknowledges some issues with interpretation that need to be explained. The methods, while brief in parts, are clear. The code used to generate the results has been made available via a GitHub repository. The figures are clear and attractive.

      We are very happy to hear that you found our manuscript clear and free of serious weaknesses.

      Reviewer #1 (Recommendations For The Authors):

      This is a very nice manuscript discussing highly relevant methods to use eDNA analysis to study interactions in marine ecosystems. There are some minor concerns that we will address below:

      - As already mentioned above, based on the statements in the introduction we expected a very elaborate discussion section concerning the ecological interaction observed between species. This is however missing, and a more extensive general discussion of the biological interactions would be appreciated, either based on existing literature, or by suggesting further experiments. Alternatively, the claims made in e.g. line 124-128 (Overcoming these difficulties....) could be amended so this expectation is not raised.

      Thank you so much for the comment. As answered in the response above, we have added discussion about how our study contributes to the fish community assembly process and predicting the community-level response under ongoing climate change at L679.

      Specifically, we argued that our study provides a piece of evidence that temperature exerts influences on fish-fish interactions under field conditions at a relatively short time scale (weeks to months). We suggested that temperature effects on fish community assembly involve effects at different time scales, and thus, integrating results from different temporal (and spatial) scales are necessary to understand the fish community assembly process in nature. As stated above, we provided the detailed ecological discussion for each specific fish-fish interaction in the Supporting Information.

      - A lot of negative controls were taken and described in the material & methods. However, there is no clear mention of what was done with the outcome of these negative controls. How did the results of the negative controls influence your analysis? Or were they all completely negative?

      Thank you for pointing this out. The negative controls produced negligible reads (177 ± 665 reads [mean ± S.D.]), which accounted for ca. 0.1% of the positive sample reads. Moreover, all the reads were assigned to non-target taxa, such as fish species that had never been observed in the study region and freshwater fish species. Therefore, we conclude that any contaminations in our experiments were negligible, and we discarded the sequence reads from the negative control samples. We have explained this in L533–L539 in the main text.

      - Line 423 states: "..suggesting that weak interactions are key to the maintenance of species-rich communities." We are wondering if this can be stated like this, as it seems the other way around would also be true, since in a species rich community it can be expected that most interactions are weak?

      Thank you for pointing this out. We agree that there is a possibility that the high species diversity could be a cause of weak interactions. To clarify this, we have revised the sentence as follows in L568: "...suggesting that understanding the causes and effects of weak interactions is key to understanding the maintenance of species-rich communities."

      - There is a correlation between DNA concentration and temperature (e.g. shown in fig. S2b). We are wondering what the argument could be for not correcting for this temperature effect on eDNA concentrations (as now described), or whether it would be better to apply a correction factor for this, as it is also shown that there is a correlation between DNA concentration and interaction strengths.

      In the unified information theoretic (UIC) analysis, we took the effect of temperature into account if temperature had statistically clear influence on eDNA dynamics of a particular fish species (L439). This means that temperature was included as a conditional variable in the calculation of TE (i.e., Zt in Eqn. [1]). Other environmental variables were also included if they had statistically clear influence. Similarly, in the MDR S-map, we included temperature or other environmental variables as conditional variables if they had statistically clear influence on eDNA dynamics of a particular fish species. We explained this in L479.
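
For readers unfamiliar with the quantity, a generic textbook form of conditional transfer entropy is sketched below. It is offered only as an illustration and is not necessarily identical to Eqn. [1] in the manuscript: the symbols $x_t$ (putative cause), $y_t$ (effect) and $z_t$ (conditioning variable such as temperature) are assumptions made here, and the UIC method additionally works with time-delay embeddings and lags that are omitted for brevity.

$$
\mathrm{TE}_{X \to Y \mid Z} \;=\; \sum p(y_{t+1}, y_t, x_t, z_t)\,
\log \frac{p(y_{t+1} \mid y_t, x_t, z_t)}{p(y_{t+1} \mid y_t, z_t)}
$$

Under this form, the measure is zero whenever $x_t$ adds no information about $y_{t+1}$ beyond what $y_t$ and $z_t$ already provide, which is why including temperature in $z_t$ guards against attributing a shared seasonal signal to a fish-fish interaction.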

      - The models used for the interaction dynamics calculations are extensively discussed in this manuscript, although these details are also present in the original papers describing these models, and therefore the manuscript could be shortened by removing some of this explanation.

      Thank you for your suggestion. As you understood, the details of the method (S-map and MDR S-map) are available in Sugihara (1994), Chang et al. (2021), and elsewhere. However, we have kept the explanation so that readers who are not familiar with the methods can briefly understand them without needing to read the details of the previous studies.

      Reviewer #2 (Recommendations For The Authors):

      L50-L72: I feel like the abstract could be snappier, i.e. quicker to read with less detail. Consider reducing it a little.

      Thank you for your suggestion. We have deleted some redundant phrases and shortened the abstract a little.

      L173-L176: I don't understand exactly what is suggested here. Perhaps rephrase?

      We have revised the sentence as follows (L165): " As our eDNA time series was taken twice a month, the interactions detected should also have the same time scale (e.g., the interactions detected may cause changes in the population size at the same time scale), which means that we tend to focus on behavior-level interactions (e.g., schooling) rather than birth-death process in the present study (except for predation)."

      L228: How many PCR replicate reactions were undertaken per sample?

      We performed eight technical replicates for the same eDNA template. This information is described in the third paragraph of the section "Paired-end library preparation and MiSeq sequencing." This section has been moved from the previous supplementary methods to the main text in the revision.

      L236: There is no mention later of how these blanks are used to clean up or filter the dataset from the effects of contamination. Consider adding this information.

      Thank you for pointing this out. As in the responses above, we have described the negative controls in L533–L539 in the main text. The negative controls generated negligible reads, so we simply discarded the sequence reads.

      L252-L253: "Primer sequences were removed from merged reads and reads without the primer sequences underwent quality filtering"? Wouldn't all of the reads not have primers after the primers were trimmed off? Or is something else intended here?

      All primer sequences were removed after merging the paired-end reads (see "Sequence analysis"). There is no specific reason for this process, and we think that the primer removal before merging the paired-end reads will generate the same results.

      L264-L265: "To refine the above taxon assignments". I assume because there were lots of assignments to species that were not known from the study area? Explain why this was done.

      At present, the reference sequences are available for about 70% of 4,500 fish species in Japan. However, due to the unknown degree of intraspecific variation, using a uniform threshold of 98.5% to delineate species can result in over-splitting or over-clustering MOTUs. To solve this issue, the manual refinement of the taxon assignments was performed based on the phylogenetic tree. This has been explained in L335.
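
The logic of a uniform identity cutoff followed by manual refinement can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `assign_taxon`, the `(species, percent_identity)` hit format, and the rule for flagging a MOTU for review are all hypothetical; only the 98.5% threshold comes from the text above.

```python
# Hypothetical sketch: names and data layout are illustrative, not the authors' pipeline.
IDENTITY_THRESHOLD = 98.5  # percent identity, the uniform cutoff mentioned above


def assign_taxon(hits, threshold=IDENTITY_THRESHOLD):
    """Return (species, needs_manual_review) for one MOTU.

    hits: list of (species_name, percent_identity) tuples from a reference search.
    """
    if not hits:
        # No reference sequence matched at all: leave unassigned and review.
        return None, True
    best_species, best_identity = max(hits, key=lambda h: h[1])
    if best_identity < threshold:
        # Below the cutoff: no species-level call; refine by phylogenetic placement.
        return None, True
    # More than one species at or above the cutoff hints at over-clustering.
    above = {name for name, identity in hits if identity >= threshold}
    return best_species, len(above) > 1
```

In this sketch, any MOTU returned with `needs_manual_review` set to `True` would be the kind of case handed to the tree-based manual refinement described above.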

      L274: More details of the qPCR assay are required, or a citation of previous study or supporting information.

      The details of the qPCR assay are provided in the section "Quantitative PCR and estimation of DNA copy numbers." This section has been moved from the previous supplementary methods to the main text in the revision.

      L327: Explain further how seasonality was treated here? This is an important part of the study, so deserves further attention.

      We included water temperature (if it had statistically clear influence on fish eDNA dynamics) as a conditional variable z(t) in the calculation of TE, and this took the effect of the seasonality in detecting causation into account. We have described this in L436–444.

      L407: Consider giving the code repository a DOI to cite.

      We have archived the analysis codes at Zenodo and provided the DOI in L39 and L521.

      L411: How many MiSeq runs exactly?

      We performed 21 MiSeq runs (often with other eDNA samples). We have described this in the main text (L299).

      L411: What proportion of your total sequencing data were assigned to fishes? This is a useful statistic to compare methods between studies.

      About 98% of the total sequence reads were assigned to fish. We have described this in the main text (L528).

      Figure 2: There does not appear to be a key to the color-coded species ecologies.

      We have added a legend for the fish ecology in Figure 2.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We thank the editor and reviewers for their careful consideration of our manuscript and very helpful feedback, which guided us in improving our manuscript. We would like to highlight three main areas of improvement in this version:

      • Statistical rigor: we have added more detail to justify our 2% cutoff for GLM variable coding, implemented stricter shuffling and cutoffs for value and history coding, and provided more information on the statistical significance of our pairwise comparisons across regions and groups. These go well beyond the field standard for identifying and comparing neural encoding of task features.

      • Identification of value coding: we have implemented reviewer suggestions about kernel regression and value coding shuffles, providing even stronger evidence that value signaling among cue neurons is more prevalent than expected by chance, more prevalent than any other cue coding patterns, and present in all recorded regions. The rigor of this analysis is only possible due to our unique task design with 6 cues across two stimulus sets, and our consideration of 153 possible coding models exceeds standard practice for identifying value signals. We now implement population decoding, as well, providing additional support for a robust and widely-distributed value code.

      • Stability of value code: we have updated our terminology to better highlight that the value signals in our imaging dataset are indeed identified across days, and we add new analysis to show conservation of value-like signals across training days.

      Thanks to the reviewers’ suggestions, our manuscript now has substantially stronger support for the presence of stable and distributed cue value signaling. We address the specific points below.

      Excerpts from the Consensus Public Reviews:

      One limitation is the lack of focus on population-level dynamics from the perspective of decoding, with the analysis focusing primarily on encoding analyses within individual neurons.

      To address this limitation, we now include population-level decoding analysis (new panels, Figs. 3G-H, 4E). This new analysis reveals that, although value neurons can be used to decode cue identity on par with other cue cells, value neurons are more accurate at predicting the value of held out cues (never seen by the model), highlighting the utility of a value signal as a way to consistently represent the value of different stimulus sets.

      Moreover, we find comparable value prediction performance when using value neurons from each region (Fig. 4E), adding more support for the similarity of this signal across regions.

      The authors use reduced-rank kernel regression to characterize the 5332 recorded neurons on a cell-by-cell basis in terms of their responses to cues, licks, and reward, with a cell characterized as encoding one of these parameters if it accounts for at least 2% of the observed variance. At least 50% of cells met this inclusion criterion in each recorded area. 2% feels like a lenient cutoff, and it is unclear how sensitive the results are to this cutoff, though the authors argue that this cutoff should still only allow a false positive rate of 0.02% (determined by randomly shuffling the onset time of each trial.)

      We have provided more information about the 2% cutoff in a new figure, Figure 2-figure supplement 3. We reanalyzed the false positive rate and found that at a cutoff of 2% (but not 0.5% or 1%) there were no false positives (Figure 2-figure supplement 3B). Thus, we are confident that all neurons contain true task-related signals. Moreover, we found that the pattern of results remains largely unchanged as we change the cutoff over a range from 0.5% to 5%. With more stringent cutoffs, we begin to lose neurons with robust task-related responses (Figure 2-figure supplement 3E), so we continue to use the 2% cutoff in this version of the manuscript.
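
The cutoff check described above can be sketched in Python as follows. This is an illustration only: the function name `false_positive_rate` and the shuffled variance fractions are hypothetical placeholders, not the authors' code or data. The idea is that fits to trial-onset-shuffled data should rarely reach the cutoff, so the fraction that does estimates the false positive rate at that cutoff.

```python
def false_positive_rate(shuffled_fractions, cutoff):
    """Fraction of shuffled-data fits whose explained variance reaches the cutoff."""
    if not shuffled_fractions:
        raise ValueError("need at least one shuffled value")
    hits = sum(1 for f in shuffled_fractions if f >= cutoff)
    return hits / len(shuffled_fractions)


# Hypothetical variance fractions from shuffled fits (placeholder values, not real data).
shuffled = [0.001, 0.004, 0.006, 0.012, 0.003, 0.0005, 0.008, 0.002]

# Sweep the candidate cutoffs discussed above: 0.5%, 1% and 2% of variance.
rates = {cutoff: false_positive_rate(shuffled, cutoff) for cutoff in (0.005, 0.01, 0.02)}
```

With these placeholder values the rate falls as the cutoff rises and reaches zero at 2%, which mirrors the qualitative pattern reported above without reproducing the actual numbers.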

      First, they show that the correlation between cell responses on all periods except for the start of day 1 is more correlated with day 3 responses than expected by chance (although the correlation is still quite low, for example, 0.2 on day 2).

      We agree that a correlation of 0.2 does not seem like a large effect; however, the variability in neuronal responses and the noise level of the measurement enforce a ceiling that we can estimate by predicting data from the same session that the model was trained on. We have replotted these data (new panel Fig. 7G) with the correlation normalized to the cross-validated performance on the training day's data. This shows that the models do about half as well in session 1 and session 2 compared to session 3. The original plot is in a new supplementary figure, Figure 7-figure supplement 1B.
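
The normalization described here amounts to dividing the cross-session correlation by a within-session, cross-validated ceiling. A minimal Python sketch follows; the function name is hypothetical, and the ceiling value of 0.4 used in the usage note is an assumed number chosen only to illustrate the arithmetic, not a value reported by the authors.

```python
def normalized_correlation(cross_session_r, within_session_cv_r):
    """Express cross-session model performance as a fraction of the noise ceiling.

    cross_session_r: correlation between model predictions and another session's data.
    within_session_cv_r: cross-validated correlation on the training session itself.
    """
    if within_session_cv_r <= 0:
        raise ValueError("ceiling must be positive")
    return cross_session_r / within_session_cv_r
```

For example, a raw correlation of 0.2 against an assumed ceiling of 0.4 gives a normalized value of 0.5, that is, the model does about half as well as it could given the measurement noise.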

      To further emphasize the similarity across days, we have added new panels (Fig. 7E and Figure 7-figure supplement 1A) showing that, across mice, a typical neuron was more correlated with its own activity on the subsequent day than with ~90% of the other neurons (shuffle controls, 50%).
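
The "more correlated with itself than with ~90% of other neurons" comparison can be sketched as a per-neuron percentile. The Python below is a hedged illustration: the function name and the layout of the correlation matrix are assumptions, and the authors' exact procedure (for instance, how ties or shuffles are handled) is not specified in the text above.

```python
def self_match_percentile(corr):
    """For each neuron i, the fraction of other neurons j whose next-day activity
    correlates less with neuron i than neuron i's own next-day activity does.

    corr[i][j]: correlation between neuron i on one day and neuron j on the next day.
    """
    n = len(corr)
    if n < 2:
        raise ValueError("need at least two neurons")
    percentiles = []
    for i in range(n):
        others = [corr[i][j] for j in range(n) if j != i]
        below = sum(1 for r in others if r < corr[i][i])
        percentiles.append(below / len(others))
    return percentiles
```

If neuron identity carried no information across days, these values would centre on 0.5, which corresponds to the shuffle-control level quoted above.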

      Second, they show that cue identity is able to capture the highest unique fraction of variance (around 8%) in day 3 cue cells across three days of imaging, and similarly for lick behavior in lick cells and cue+lick in cue+lick cells. Nonetheless, their sample rasters for all imaged cells also indicate that representations are not perfectly stable, and it will be interesting to see what *does* change across the three days of imaging.

      We agree that the representations are not perfectly stable and that is an interesting point of further investigation. One difference we did observe is increased cue coding across training (Figs. 6H, 7H).

      Importantly, the authors do not present evidence that value itself is stably encoded across days, despite the paper's title. The more conservative claim in the Discussion seems more appropriate: "these results demonstrate a lack of regional specialization in value coding and the stability of cue and lick [(not value)] codes in PFC."

      Due to confusing terminology on our part, the reviewers were mistaken about the timing of the experiment where we assess the stability of value coding. In the imaging sessions, odor sets were always presented on separate days. Thus, when we identify value coding in our imaged population, it is across two consecutive days with different odor sets, which is in itself evidence of a stable value code. We have updated our terminology and the text to make this clearer. We also added a new set of plots (Fig. 8H-I) showing the conservation of value-like signaling in cells we tracked across the first three sessions of odor set A, and, as above, that the correlation of these neurons across days is greater than expected by chance. These analyses lend further support to the stability of the value signal.

      Additional technical comments:

      1) The "shuffle #33" in figure 3B is confusing. The fit kernel in this shuffle shows that the "high" and "medium" responses increase above the pre-stimulus baseline. The "high" response is a combination of set 2 CS+ and set 1 CS50, both of which strongly suppressed the cell's firing over the 2.5-second window shown. Why then does the cue kernel fit these two trials predict an increase in firing rate above baseline at the 2.5-second time point? Is it a consequence of the reduced rank regression process, and if so, how? This strange-looking fit that does not well capture the response of the original cell makes me worry that the high fraction of identified "value" cells may be due to some constraint on the shuffle fits that leads them to often perform poorly.

      To address this concern, we refit the value shuffle and its models using a full kernel regression model (rather than reduced ranks). It does improve the appearance of the kernel fits (updated Fig. 3B), and we now use this new approach when fitting cue coding models in the revised manuscript. The regularization inherent in reduced rank constrains the shape of the cue kernel somewhat, which contributed to the shape of the fits (although this did not negatively impact the variance explained); however, because of the importance of the shape of these alternative cue coding models to the interpretation of the analysis, we agree with the reviewers that this was worth improving. The main constraint on the value model and its shuffles, however, is that all cues must use the same template, scaled according to particular values assigned to each cue in each shuffle, which will doubtless lead to compromised (and strange-looking) fits when the shuffled values do not match the ranking of the neuron’s cue activity. Critically, this constraint is applied equally to the value model and all the shuffles and would not bias the fits of any one model.

      2) The "shuffle" condition when testing for value cells always assumes two high responses, two medium responses, and two low responses. This strategy doesn't account for cells that respond to only a subset of cues, as one might expect in a sparse-coding olfactory region. We suggest adding a set of shuffles where responses are split into two groups, with either 3 conditions per group or 2 in one group and 4 in the other.

      We appreciate this valuable suggestion. We added all permutations of models with high responses to 6, 5, 4, 3, 2, or 1 odor cues to the analysis. We still find that the value model is the most frequent best model, displayed in new panels Fig. 3C-D and Figure 3-figure supplement 1A-B. The additional models allowed us to identify other neurons with cue activity best fit by models highly correlated with the ranked value model, which we term “value-like” neurons, including most neurons previously described as “trial-type” neurons. All 153 models and the fraction of neurons best fit by each one are depicted in Figure 3-figure supplement 1.
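
      A minimal sketch of how a model set of this size could be enumerated. It assumes the set consists of every distinct assignment of two high, two medium, and two low values to the six cues, plus every model with a high response to a non-empty subset of the cues; this decomposition is an assumption made for illustration.

```python
from itertools import combinations, permutations

N_CUES = 6

# Ranked models: distinct ways to assign two high, two medium, and two low
# values to six cues (the value model and its shuffles).
ranked_models = set(permutations(("high", "high", "medium", "medium", "low", "low")))

# Binary models: a high response to any non-empty subset of the six cues
# (6, 5, 4, 3, 2, or 1 cues) and a low response to the rest.
binary_models = [
    subset
    for k in range(1, N_CUES + 1)
    for subset in combinations(range(N_CUES), k)
]

n_ranked = len(ranked_models)   # 6! / (2! * 2! * 2!) = 90
n_binary = len(binary_models)   # 2**6 - 1 = 63
n_total = n_ranked + n_binary   # 153
```

      Under this assumption the count matches the 153 models shown in the supplement: 90 ranked assignments (the value model plus 89 shuffles) and 63 subset models.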

      After implementing the changes to both the method of model fitting (full kernel regression, as noted above) and the possible alternative models, the distribution of value cells has changed slightly. All regions contain value cells, supporting our original conclusion that the value signal is distributed, but there is slight enrichment in PFC when combining these five regions together (Fig. 4A).

      We have updated the conclusions of the paper accordingly:

      Introduction: “Unexpectedly, in contrast to the graded cue and lick coding across these regions, the proportion of neurons encoding cue value was more consistent across regions, with a slight enrichment in PFC but with similar value decoding performance across all regions.”

      Results: “Interestingly, the frequency of value cells was similar across the recorded regions (Fig. 4A). Indeed, despite the regional variability in number of cue cells broadly (Fig. 2F-G), there were very few regions that statistically differed in their proportions of value cells (Fig. 4A, Figure 4-figure supplement 1). Overall, though, there were slightly more value cells across all of PFC than in motor and olfactory cortex (Figs. 4A, Figure 4-figure supplement 1). Although there were the most cue neurons in olfactory cortex, these were less likely to encode value than cue neurons in other regions (Figure 4-figure supplement 2). Value-like cells were also widespread; they were less frequent in motor cortex as a fraction of all neurons, but they were equivalently distributed in all regions as a fraction of cue neurons (Fig. 4B, Figure 4-figure supplement 1, Figure 4-figure supplement 2).”

      Discussion: “In contrast to regional differences in the proportion of cue-responsive neurons, cue value cells were present in all regions and could be used to decode value with similar accuracy regardless of region.” AND “The distribution of cue cells with linear coding of value was mostly even across regions, with slight enrichment overall in PFC compared to motor and olfactory cortex, but no subregional differences in PFC. Importantly, cue value could be decoded from the value cells in all regions with similar accuracy.”

      3) On pages 11-12, the authors write "value coding is similarly represented across the regions we sampled." I feel this isn't quite what was shown: the authors have shown that all recorded regions contain a roughly comparable number of individual cells that are modulated by value, i.e. "value cells". However, the authors also showed that some recorded cells have mixed selectivity for value and other factors- it is possible that these mixed selectivity cells do vary between brain regions in their quantity or degree of value coding. Regions could potentially also vary in the dynamics of their value response, or in the trial-to-trial variability in the activity of value cells. I suggest the authors revise their original statement, for example by writing "we find a similar proportion of value-specific cells across the regions we sampled."

      We thank the reviewer for carefully reviewing our claims. In addition to showing similar proportions of value cells, we also show that the value-related activity is similar (by plotting the first principal component of value and value-like cells, Fig. 4C-D) and that cue value could be decoded from the value cells in all regions with similar accuracy (new panel, Fig. 4E). We have updated the text to more accurately reflect these observations:

      “In contrast to regional differences in the proportion of cue-responsive neurons, cue value cells were present in all regions and value could be decoded from them with similar accuracy regardless of region.”

      4) We appreciate the authors' idea to introduce a history term to their value cell model but worry that the distinction between history-dependent value cells and lick/cue+lick cells in Figure 4 has gotten fuzzy. At this point, history-dependent value cells are the product of a set of steps: 1) they are identified as "cue" neurons because the cue type accounts for at least 2% of the variance, while the lick rate does not, then 2) among the cue neurons, a subset are identified as "value" neurons because their activity scales with the cue type across both odor sets, and then 3) among value neurons, the "history-dependent" value neurons show a response rate that scales with a model that predicts anticipatory licking. Our concern comes down to this: your conclusion that these cells are not licking cells hinges on the initial point that licking does not account for 2% of the observed variance in cell activity. But if you had dedicated an equal number of model parameters and selection steps to your licking model, might it still not turn out that a licking model predicts their activity as well as the history-dependent cue value model?

      What would bolster our confidence here would be a comparison of variance explained: if you compare the predictions of the history-dependent value-encoding cue neuron model to the predictions of a simple lick neuron model, how much better does the former predict what the cells are doing? Are all those extra parameters and selection steps really contributing to an improved description of how neurons will respond?

      First, we would like to emphasize that “cue” neurons, as a population, have no discernible modulation by licks, which can be seen when comparing their activity on CS50 trials with and without reward, when licking clearly varies (Figure 2-figure supplement 2D). A new panel, Figure 5E now depicts the improvement in variance explained by the history model over a lick only model. The improvement is robust and universal. This is because even though the number of anticipatory licks per trial is used to fit the weights of our trial value model, these cue neurons have temporal dynamics that are more consistent with cue presentation than the presence of licks. We explain more below in our response to point 7.

      5) The paper's title claims that the coding of cue value is both stable and distributed. While the point for value coding being distributed is well supported with analysis, the claim that cue value coding is "stable" is weaker. The authors show in Figure 6 that cue identity best accounts for unique variance among cue cells across three days of imaging, but it does not follow that cue value is similarly stable. Figure 7 shows that on day 3 of imaging, the two odor sets have similar encoding- but this analysis is only performed within day 3, not across days. Why not examine unique variance among value cells over days, as was done for a cue, lick, and both cells in Figure 6G? That seems to be an important missing piece and a logical next step. The Discussion is more conservative in its claims- "these results demonstrate a lack of regional specialization in value coding and the stability of cue and lick [(not value)] codes in PFC." But this subtlety is missing from the paper's title and introduction.

      First, an important correction. “This analysis is only performed within day 3, not across days,” is a misunderstanding of our experiment brought on by our confusing terminology, which we have updated. This figure (now Figure 8) analyzes two sessions performed on consecutive days: Odor Set A day 3 (A3) and Odor Set B day 3 (B3), which constitute days 5 and 6 of our experiment (see updated panels Fig. 1B, 6A). This is why identifying value signaling across both of these sessions is justification for a stable code; by definition, it was present on two consecutive days.

      A limitation of our imaging experiment prevents us from evaluating value signaling in each individual session (like we did for cues and licks). For the imaging, we only presented one odor set per session (unlike the electrophysiology, where odor sets were presented in blocks). Our method of identifying value signals relies on two odor sets, so we cannot quantify it on a per session basis in the imaging. However, to address this as best we could, we identified CS+-preferring cue cells in session A3 (odor set A day 3) and plotted them for sessions A1-A3 (Fig. 8H), which reveals a conserved value-like signal across days. We also found that the correlation of the activity of these neurons across days was higher than expected by chance (Fig. 8I).

      We have edited the discussion text about coding stability, adding in more detail and caveats:

      “Previous reports have observed drifting representations in PFC across time (Hyman et al., 2012; Malagon-Vina et al., 2018), and there is compelling evidence that odor representations in piriform drift over weeks when odors are experienced infrequently (Schoonover et al., 2021). On the other hand, it has been shown that coding for odor association is stable in ORB and PL, and that coding for odor identity is stable in piriform (Wang et al., 2020a), with similar findings for auditory Pavlovian cue encoding in PL (Grant et al., 2021; Otis et al., 2017) and ORB (Namboodiri et al., 2019). We were able to expand upon these data in PL by identifying both cue and lick coding and showing separable, stable coding of cues and licks across days and across sets of odors trained on separate days. We were also able to detect value coding common to two stimulus sets presented on separate days, and conserved value features across the three training sessions. Notably, the model with responses only to CS+ cues best fit a larger fraction of imaged PL neurons than the ranked value model, a departure from the electrophysiology results. It would be interesting to know if this is due to a bias introduced by the imaging approach, the slightly reduced CS50 licking relative to CS+ licking in the imaging cohort, or the shorter imaging experimental timeline.

      The consistency in cue and lick representations we observed indicates that PL serves as a reliable source of information about cue associations and licking during reward seeking tasks, perhaps contrasting with other representations in PFC (Hyman et al., 2012; Malagon-Vina et al., 2018). Interestingly, the presence of lick, but not cue coding at the very beginning of the first session of training suggests that lick cells in PL are not specific to the task but that cue cells are specific to the learned cue-reward associations. Future work could expand upon these findings by examining stimulus-independent value coding within session across many consecutive days.”

      6) Considering licking as the readout of value has pros and cons. Anticipatory licking may be correlated with subjective value, but certainly nonlinearly. After all, licking has a ceiling and floor (bounded rate from 0->10 Hz). Are results consistent with the objective value of the cues (which are 0, .5, 1)? Which measure better explained the data?

      Thanks to this important suggestion, we tried fitting another set of models with 0, 0.5, 1 as the cue values. We found the same pattern of results. Overall, the fits were slightly better with 0, 0.5, 1, with 50.6% of potential value neurons (found with either version of the model) better fit by 0, 0.5, 1, and with mean variance explained of 0.265 with 0, 0.5, 1 (compared to 0.264 with the anticipatory lick values). Without strong evidence to choose one model over the other, we decided to use 0, 0.5, 1 because it exactly reflects reward probability, and is more objective as the reviewer notes, whereas before we relied on a noisier estimate of subjective value. We have changed the text accordingly.

      7) How can a neuron encode "Cue" in a value-dependent manner and not also encode licking, given they are correlated? If the kernel window includes anticipatory licking, and anticipatory licking is by definition related to value, then how could a licking kernel not at least explain some of that neuron's variance?

      The trial estimates of value from the lick linear regression are derived from typical licking patterns across all sessions and do not incorporate the particular number of licks on a given trial or the latency of licking relative to cue onset. Although the trial value model is predicting the number of licks on each trial, it only uses cue identity and reward history to make its prediction, so it is not tightly correlated with the stochastic licks on a given trial. And, importantly, we input the trial value as a cue kernel spanning the entire cue period, whereas lick kernels, per our definition, are restricted to a window around when licking occurs, which generously encompasses neural signals relating to both lick initiation and feedback. Licking can explain some of the value (and history) neurons’ variance, which you can see in our new panel Fig. 5E, but it does not contribute any unique variance to the model. That is, with or without licks, the model performs just as well, so the activity of the neuron does not track any of the unique features of licks over cues (like whether or not the mouse licked on a given trial, or when the mouse started licking). Without cues, however, the model does worse, which means that the neuron’s activity is modulated by cues separately from when the mouse is licking. Thus, we can conclude the neuron encodes cues, but we have no evidence the neuron encodes licks (beyond the extent to which licks are correlated with cues). In our example fit in 5E, you can see how, although licks track value, they cannot recapitulate the temporal dynamics of this cue neuron. We added more description of this distinction in the manuscript.
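
      The logic of unique variance can be sketched with a toy example: fit the full model, refit it without one predictor, and take the drop in variance explained. The simulated data, the in-sample ordinary least squares fit, and the function names are all assumptions for illustration; the manuscript's kernel regression is more elaborate.

```python
import numpy as np

def r_squared(X, y):
    """In-sample variance explained by a least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residual = y - X1 @ beta
    return 1.0 - residual.var() / y.var()

def unique_variance(X_full, y, drop_cols):
    """Variance explained by the full model minus that of the model
    refit without the columns listed in drop_cols."""
    keep = [j for j in range(X_full.shape[1]) if j not in drop_cols]
    return r_squared(X_full, y) - r_squared(X_full[:, keep], y)

# Toy neuron: activity follows the cue; licking is correlated with the cue
# but adds nothing of its own to the activity.
rng = np.random.default_rng(0)
n_trials = 500
cue = rng.integers(0, 2, n_trials).astype(float)
lick = cue + 0.5 * rng.standard_normal(n_trials)
activity = 2.0 * cue + 0.5 * rng.standard_normal(n_trials)

X = np.column_stack([cue, lick])
unique_cue = unique_variance(X, activity, drop_cols=[0])
unique_lick = unique_variance(X, activity, drop_cols=[1])
```

      In this toy case, licking alone explains part of the activity because it is correlated with the cue, yet removing it from the full model costs almost nothing, whereas removing the cue costs a great deal.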

      8) The ordering analysis with the 89 permutations is very nice for showing across the population the "value ordered" gains are the best explanation of the neural activity. However, it doesn't tell you that any one neuron significantly encodes value, or the strength of this effect if they do. For the former, they could compare to a null distribution of shuffled order of neural vs CS data, and consider neurons for which the model is better than chance (a 0.05 FDR on a null distribution would be appropriate). This is important for supporting their conclusion of the fraction of neurons encoding value for each region.

      In fact, with so many alternative models, the probability of a neuron being best fit by the value model but not encoding value above chance is extremely low. To confirm this, we ran the reviewer’s suggested shuffle analysis, and found that 100% of value neurons performed above the 0.05 FDR. We have added this result to the methods:

      “To verify the robustness of value coding in the neurons best fit by the ranked value model, we fit each of those neurons with 1000 iterations of the cue value model with shuffled cue order to create a null distribution. The fits of the original value model exceeded the 98th percentile of the null for all value neurons.”
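
      A hedged sketch of this kind of shuffle test, using per-cue mean responses and a simple linear fit in place of the full kernel model. The example responses, the cue values, and the function names are hypothetical.

```python
import numpy as np

def variance_explained(responses, cue_values):
    """R^2 of a straight-line fit of per-cue responses on assigned cue values."""
    r = np.corrcoef(responses, cue_values)[0, 1]
    return float(r ** 2)

def shuffle_null_percentile(responses, cue_values, n_shuffles=1000, seed=0):
    """Compare the observed fit with fits after shuffling which value
    is assigned to which cue."""
    rng = np.random.default_rng(seed)
    observed = variance_explained(responses, cue_values)
    null = np.array([
        variance_explained(responses, rng.permutation(cue_values))
        for _ in range(n_shuffles)
    ])
    percentile = 100.0 * float(np.mean(null < observed))
    return observed, percentile

# Hypothetical mean responses to six cues, ordered CS+, CS+, CS50, CS50, CS-, CS-.
responses = np.array([2.0, 1.9, 1.0, 1.1, 0.1, 0.0])
cue_values = np.array([1.0, 1.0, 0.5, 0.5, 0.0, 0.0])
observed_r2, null_percentile = shuffle_null_percentile(responses, cue_values)
```

      A neuron would pass such a test when its observed fit sits above the chosen percentile of its own null distribution.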

      9) Similarly the 65% cutoff for trial history relative to shuffled is unusually low and therefore not convincing these neurons significantly encode the value. Usually, 95% or 99% is selected to give you a more standard significance criterion (FDR).

      We have changed the cutoff to 95%. We originally selected 65% because neurons in the 65% to 95% range had clear history effects, especially at the population level, but we appreciate the importance of rigorous selection. Note this shuffle is very strict, preserving CS+, CS50, CS- ranking but shuffling within-cue fluctuations in value due to trial history. With the stricter value and history shuffling, we now observe fewer history neurons, and they are most prevalent in PFC (Fig. 5I).

      10) "Regions with non-overlapping CIs were considered to have significantly different fractions of neurons of that coding type." This isn't a statistical test. Confidence intervals are not the same as significance.

      We now perform Bonferroni-corrected pairwise contrasts between all regions in the generalized linear mixed effects model. We added the p-values for all the comparisons that previously relied on non-overlapping confidence intervals in supplementary tables.
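
      For readers unfamiliar with the correction, a minimal sketch of the Bonferroni step applied to a set of pairwise p-values. The raw p-values are assumed to come from the contrasts of an already-fitted model; the region labels and numbers below are hypothetical.

```python
def bonferroni_pairwise(raw_p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    n_comparisons = len(raw_p_values)
    return {
        pair: min(1.0, p * n_comparisons)
        for pair, p in raw_p_values.items()
    }

# Hypothetical raw p-values for three pairwise contrasts between regions.
raw = {
    ("region_A", "region_B"): 0.01,
    ("region_A", "region_C"): 0.04,
    ("region_B", "region_C"): 0.50,
}
adjusted = bonferroni_pairwise(raw)
```

      Here a raw p-value of 0.04 no longer falls below 0.05 after correction for three comparisons.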

      Minor comments:

      The methods are hard to read. Most of the information seems to be there but in general, paragraphs need to be read over multiple times for meaning to emerge.

      We have edited for clarity, and if there are particular sections that remain unclear, we would be happy to know which ones.

      Why is there a block predictor in the encoding model?

      Because not every odor is present in every block, we did not want our models to use the specific cue predictors to try to account for differences in baseline activity that naturally occur across the session. Thus, each of the six blocks has its own predictor that serves as a constant that can adjust for changing baseline firing rate. Importantly, the block predictor simply marks the passage of blocks and contains no information about the odors present. We added more information about this to the methods:

      “For electrophysiology experiments, the model also included 6 constants that identified the block number, accounting for tonic changes in firing rate across blocks. Because not all cues were present in every block, this strategy prevented the cue kernels from being used to explain baseline changes across blocks.”
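
      A minimal sketch of what such block constants could look like in a design matrix: one indicator column per block that carries no information about which cues were presented. The function name and the toy block sequence are assumptions for illustration.

```python
import numpy as np

def block_constants(block_of_sample, n_blocks=6):
    """Return one indicator column per block (1 while that block is running)."""
    block_of_sample = np.asarray(block_of_sample, dtype=int)
    columns = np.zeros((block_of_sample.size, n_blocks))
    columns[np.arange(block_of_sample.size), block_of_sample] = 1.0
    return columns

# Toy session: two samples from each of six blocks, in order.
blocks = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
block_columns = block_constants(blocks)
```

      Each row has exactly one active column, so the fitted weight on that column acts as a block-specific baseline that the cue kernels do not need to absorb.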

      Did you use an elastic net rather than a lasso? What is the alpha parameter for lasso?

      We used an elastic net with alpha = 0.5. We added this information to the methods.

      Figure 3F legend doesn't seem to match the figure.

      Corrected.

    1. Author Response:

      The following is the authors' response to the original reviews.

      Consolidated response to public comments:

      We are grateful to the reviewers for their careful examination of our manuscript and for their insights for improving our work. We appreciate that they recognize the potential of the TARDIS approach for diverse transgenesis applications.

      We address two primary concerns that the reviewers raise. First is a concern that this approach is not as innovative as stated. We acknowledge that our work builds upon previous studies in the field, such as those by Nonet, Mouridi et al., with Malaiwong coming after our initial preprint. However, we believe that our approach offers a unique contribution, in that prior work does not provide a protocol or process for large-scale multiplexed transgenesis. Specifically, our contribution is the introduction of large sequence library arrays (TARDIS Library Arrays or TLAs). While high throughput multiplexed transgenesis is discussed in Nonet & Mouridi manuscripts, it is never demonstrated. It is the combination of library construction, heritable transmission of the library itself, and then induced transgenesis of library components at a defined location within single individuals that makes this approach particularly useful.

      Second, there were concerns that we have not demonstrated that this approach will work beyond C. elegans. We agree that our discussion of the potential application of TARDIS beyond C. elegans is speculative at this point. Our intention was to highlight the potential for future development and application in other systems. In some cases, large integrations into the genome are possible, such as in the case of H11 locus in mice, which could provide a means to inherit a sequence library. We are hopeful that our success in C. elegans will inspire work in other systems. The motivation for this will naturally depend on the usefulness of actual TARDIS implementations, which will be forthcoming in due course.

      Reviewer #1 (Recommendations For The Authors):

      1. Section titled "Integration from TARDIS array to F1" beginning on line 161 has some missing details that make it difficult to follow. Many of those details are present in the following section titled "Generation and Integration of TARDIS promoter library", but should have been present sooner.
         a. How many barcodes were in the array in line PX786?
         b. Clarify the use of G-418, heat shock, hygromycin, etc. in this paragraph.
         c. Please clarify that the L1 death is due to selection with G-418 - "We found that a portion of the initially plated worms die, likely due to lack of array inheritance." is confusing unless you add that they are selected in this step.
         d. "These results suggest that approx. 100-200 worms need to be heat shocked to obtain an integrated line" - the math actually looks like 200-300, and this would be to get a single integrant.
      2. In general, the barcoding study and results reported here read like a teaser/proof-of-concept but do not really robustly demonstrate the application of the method for barcoding and tracing individual lineages in a population of C. elegans. How many barcodes were in the array, and how many ended up in F1s? Would one need to screen for duplicate barcodes after integration?
      3. The promoter library study is impressive but again, rather limited.
      4. The Discussion section about extending this technology to other systems is fairly balanced, acknowledging the limitations that would need to be overcome. The language in the abstract and introduction is less balanced and oversells the current translation of this approach to systems outside C. elegans.

      Reviewer #2 (Recommendations For The Authors):

      As I mentioned in the Public Review, I appreciate the design of the selection markers for integration. However, I do not see a major advance in the field. The use of barcoding of individuals to address a biological question would change that impression.

      Regarding the integration of promoters, I think this is something that anyone could address in diverse forms using existing knowledge.

      Suggestions:
      - Use one or two more landing pads for barcoding of animals and check numbers, efficacy, enrichments..etc. About 500 sequences overrepresented may be too much for future applications;
      - Increase the number of landing pads for inserting promoters. Genomics context matters and this could help to have a better summary of the real expression patterns driven by the promoter of interest;
      - Other references about landing pads would be Vicencio et al, Genetics 2019, and Nonet microPublication Biology 2021.

      In addition to the general comments, the reviewers provided useful suggestions to the text that we have used to clarify the manuscript.

    1. Author Response

      We believe that these findings make a significant contribution to the field of CNS endothelial cell biology and blood-brain barrier. We thank you for your time and consideration.

      Reviewers 1 and 2: concern about endothelial cell (EC) transcriptional changes in culture.

      We would like to express our gratitude to the reviewers for their critical comments. We are pleased to address the concerns raised by performing FACS sorting of the CNS ECs from E-13.5 and adult brain. However, it is important to note that both E-13.5 ECs and adult ECs were cultured in the same media. It is worth mentioning that this work was initiated in 2017, whereas the article mentioned by Reviewer 1 was published in 2020. We went through a series of standardization steps before identifying the Corning endothelial cell culture media (Cat#355054) with 2% FCS as the optimal medium for preserving EC identity in culture. Conversely, if PromoCell media (C-22110) is used, a decrease in the Wnt pathway can be observed, and the use of 5% FCS enhances the Wnt pathway in E-13.5 ECs. The article mentioned by Reviewer 1 (https://elifesciences.org/articles/51276) did not take these differences in culture media into account. Additionally, we did not employ puromycin for obtaining pure ECs, and the ECs were cultured for a maximum of 8 days. Our in vitro study serves as a model for identifying the epigenetic regulators HDAC2 and PRC2 as controllers of BBB gene transcription, which is subsequently validated in an in vivo model.

      Reviewer-1 Comment 2- An additional concern is that for many experiments, siRNA knockdowns are performed without validation of the efficacy of the knockdown

      In the revised version of this manuscript, we will include validation results to demonstrate the effectiveness of siRNA knockdown experiments.

      Reviewer-1 Comment 3- Some experiments in the paper are promising, however. For example, the knockout of HDAC2 in endothelial cells resulting in BBB leakage was striking. Investigating the mechanisms underlying this phenotype in vivo could yield important insights.

      We appreciate your positive comment. The in vivo HDAC2 knockout experiment will serve as a validation of our in vitro findings, indicating that the epigenetic regulator HDAC2 can control the expression of endothelial cell (EC) genes involved in angiogenesis, blood-brain barrier (BBB) formation, and maturation. We are actively working on this model, and we plan to publish additional molecular data on epigenetically regulated CNS vascular development and maintenance in our future publications.

      Reviewer 2 Comment-2 The use of qPCR assays for quantifying ChIP and transcript levels is inferior to ChIPseq and RNAseq. Whole genome methods, such as ChIPseq, permit a level of quality assessment that is not possible with qPCR methods. The authors should use whole genome NextGen sequencing approaches, show the alignment of reads to the genome from replicate experiments, and quantitatively analyze the technical quality of the data.

      We appreciate the reviewer's comment. While it is true that whole-genome methods such as ChIP-seq and RNA-seq provide comprehensive and high-throughput analysis compared to qPCR assays, it would be incorrect to consider qPCR as inferior. qPCR assays offer advantages in terms of sensitivity, specificity, validation, confirmation, and targeted analysis. We agree that performing a comprehensive analysis of HDAC2 and PRC2 targeted endothelial cell (EC) genes is important. We are currently in the process of generating this data, and as soon as it is complete, we will publish it accordingly.

      Reviewer 2 Comment-3 Third, the observation that pharmacologic inhibitor experiments and conditional KO experiments targeting HDAC2 and the Polycomb complex perturb EC gene expression or BBB integrity, respectively, is not particularly surprising as these proteins have broad roles in epigenetic regulation in a wide variety of cell types.

      We appreciate the comments from the reviewers. Our results provide valuable insights into the specific epigenetic mechanisms that regulate BBB genes. It is important to recognize that different cell types possess stage-specific distinct epigenetic landscapes and regulatory mechanisms. Rather than having broad roles across diverse cell types, it is more likely that HDAC2 (even though there are several other classes and subtypes of HDACs) and the Polycomb complex exhibit specific functions within the context of EC gene expression or BBB integrity.

      Moreover, the significance of our findings is enhanced by the fact that epigenetic modifications are often reversible with the assistance of epigenetic regulators. This makes them promising targets for BBB modulation. Targeting epigenetic regulators can have a widespread impact, as these mechanisms regulate numerous genes that collectively have the potential to promote vascular repair.

      A practical advantage is that FDA-approved HDAC2 inhibitors, as well as PRC2 inhibitors (such as those mentioned in clinical trials NCT03211988 and NCT02601950), are already available. This facilitates the repurposing of drugs and expedites their potential for clinical translation.

      Please note: illustrations of Fig-1, 4 and 6 are created using Biorender.com, license purchased by Spiros Blackburn. This will be added to the Acknowledgments.

    1. Author Response

      eLife assessment

      This study presents a potentially valuable discovery which indicates that activation of the P2RX7 pathway can reduce the degree of lung fibrosis caused by other inflammatory pathways. If confirmed, the study could clarify the role of specific immune networks in the establishment and progression of lung fibrosis.

      Thanks for this positive comment. Indeed, knowing that lung fibrosis is partly driven by inflammation, with a dysregulated Th1/Th2/Th17 ratio (PMID 20176803, PMID 19682929), we hypothesized that modulating the immune response would be able to attenuate lung fibrosis. To address this issue, we proposed to boost the activation of P2RX7, a purinergic receptor with immunomodulatory properties (PMID 8614837, PMID 11035104), in the well characterized bleomycin-induced lung fibrosis mouse model (PMID 25959210). In this study, we used a pyroglutamic derivative compound (HEI3090) able to specifically enhance P2RX7-dependent biological activities (cationic channel and macropore opening) only in the presence of extracellular ATP, which was qualified as the first representative of an immunotherapy relying on the activation of P2RX7 expressed by dendritic cells (PMID 33510147), and we showed that lung fibrosis is attenuated in mice treated with HEI3090 as compared to vehicle treated mice.

      However, the presented data and analyses are incomplete as they rely on limited pharmacological treatments and because there is an absence of key control studies, validation experiments and statistical analyses.

      Quantification of lung fibrosis:

      Quantification of lung fibrosis was made on the basis of a modified Ashcroft score which assigns 8 grades to quantify lung fibrosis reliably and reproducibly (PMID 18476815). To be even more accurate and not biased by the patchy lesions observed in all existing mouse models of induced lung fibrosis, the whole lungs (left and right lobes) were divided into sections of 880 µm² and each section was scored individually. A total of 80 to 110 sections were analyzed per mouse. We agree that our text requires clarification. In parallel, the collagen amount given by the polarization intensity of the Sirius red staining of the lung slices was quantified with a homemade ImageJ/Fiji macro program. Further, we recently analyzed by FACS the percentage of PDGFRα (a specific marker of fibroblasts and myofibroblasts) positive cells in lungs isolated from vehicle and HEI3090-treated mice. All 3 of these markers of lung damage show that HEI3090 attenuates bleomycin-induced lung fibrosis and therefore validate the use of the Ashcroft score to accurately study the extent of lung fibrosis. We are going to add quantification of collagen fibers in all figures.
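      The per-section scoring described above can be sketched as follows. This is only an illustration: the response does not state how section scores are combined, so taking the arithmetic mean per mouse is an assumption, as are the function name and the 0 to 8 validation range.

```python
from statistics import mean

def mouse_ashcroft_score(section_scores):
    """Summarize one mouse by the mean of its per-section scores.

    Assumptions (not stated in the response): sections are combined by an
    arithmetic mean, and valid grades lie between 0 and 8 inclusive.
    """
    if not section_scores:
        raise ValueError("at least one scored section is required")
    for score in section_scores:
        if not 0 <= score <= 8:
            raise ValueError(f"score out of range: {score}")
    return mean(section_scores)

# Hypothetical toy scores for three sections of one mouse.
print(mouse_ashcroft_score([1, 2, 3]))
```

      In practice each mouse would contribute 80 to 110 section scores rather than three.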

      Limited pharmacological treatments:

      We have designed and characterized HEI3090 in a previous study and have shown that it is a positive modulator of P2RX7 (PMID 33510147).

      To test its effect on lung fibrosis, we tested two pharmacological regimens using HEI3090 and have shown that both regimens are effective in limiting the progression of fibrosis. While having shown the requirement of P2RX7 for the activity of HEI3090 (PMID 33510147), we used in this study p2rx7 KO mice which were adoptively transferred with splenocytes isolated from p2rx7 KO mice to demonstrate the involvement of P2RX7 to mediate the antifibrotic effect of HEI3090. This experiment also serves as control to validate the adoptive transfer experiment.

      We agree that further proving and validating that activation of the P2RX7/IL-18 pathway can limit the progression of fibrosis requires the use of other activators of P2RX7. However, to date, HEI3090 is the only pharmacological compound described to activate the receptor. Indeed, the other chemical compounds described in the literature are negative allosteric modulators of P2RX7 (PMID 27935479).

      Absence of key control studies and validation experiments:

      The importance of P2RX7 in the antifibrotic effect of HEI3090 was demonstrated thanks to P2RX7 KO mice (supplementary figure S6B). We are going to supplement this figure with additional mice.

      The importance of immune cells was demonstrated thanks to adoptive transfer of WT splenocytes (expressing P2RX7) into P2RX7 KO mice. We agree that lung fibrosis is attenuated in vehicle-treated P2RX7 KO mice, but lung fibrosis is still present and can be modulated by treatments, as demonstrated by adoptive transfer of splenocytes isolated from IL-1B KO mice, which still respond to HEI3090 as shown in Supplementary figure S6C.

      As suggested by the reviewers, we examined the effect of genetic background using a two-way ANOVA test, and the result is “the interaction is considered not significant”.

      The prevalence of transferred immune cells on endogenous cells is demonstrated in supplementary figure S5, where intravenous injection of splenocytes isolated from P2RX7 KO mice into WT mice abolishes the antifibrotic effect of HEI3090. This experiment further validates the requirement of immune cells and the efficacy of the adoptive transfer approach.

      Statistical analyses:

      In this study we compared side by side the effect of HEI3090 versus vehicle in different genetic backgrounds in order to characterize the implication of actors of the P2RX7/IL-18 pathway in the antifibrotic effect of HEI3090. We also examined the effect of genetic background using the two-way ANOVA test. Following European recommendations, and in agreement with the ARRIVE guidelines for mice studies, we performed provisional statistics to evaluate the number of mice required in the study and stopped the experiments when statistically significant results were observed. We agree that results are heterogeneous; however, this heterogeneity does not prevent data analyses, as shown in supplementary figure S6D, where adoptive transfer of splenocytes isolated from IL-1B KO mice into P2RX7 KO mice dampens BLM-induced lung fibrosis (with an Ashcroft score of 1.8 versus 3 in WT mice) but still responds to HEI3090, thus indicating that IL-1B is not required to mediate the antifibrotic effect of HEI3090.

    1. Author Response

      We thank all reviewers for their constructive critiques. We plan to perform new experiments and revise our manuscript accordingly. The text and Figures are currently undergoing revision. Our revision plan is highlighted below.

      eLife assessment

      The findings of this article provide valuable information on the changes of cell clusters induced by chronic periodontitis. The observation of a new fibroblast subpopulation, which was named as AG fibroblasts, was quite interesting, but needs further evidence. The strength of evidence presented is incomplete.

      RESPONSE: We discovered a new subpopulation of gingival fibroblasts, named AG fibroblasts, using non-biased single cell RNA sequencing (scRNA-seq) of mouse gingival samples undergoing the development of ligature-induced periodontitis. AG fibroblasts exhibited a unique gene expression profile: [1] constitutive expression of type XIV collagen; and [2] ligature-induced upregulation of chemokines such as CXCL12. As a biomedical data science experiment, we validated the scRNA-seq observation by immunohistochemistry, which showed the presence of type XIV collagen-positive and CXCL12-positive gingival fibroblasts localized immediately under the gingival epithelium and in the coronal region of the periodontal ligament.

      We agree that the functional/pathological role of AG fibroblasts must be further explored. We have hypothesized that AG fibroblasts initially sense the pathological stress including oral microbial stimuli and secrete inflammatory signals through chemokine expression. To address this hypothesis, in this revision, we plan to analyze a separate scRNA-seq data for AG fibroblast gene expression profile derived from mouse gingival tissues that have been stimulated by Toll-Like Receptor 9 (TLR9) ligand (unmethylated CpG oligonucleotide) and TLR2/4 ligand (LPS). This approach mimics the initial pathological stress applied to gingival tissue. The new insight of AG fibroblasts will be presented in the revision.

      Reviewer #1 (Public Review):

      In this article, the authors found a distinct fibroblast subpopulation named AG fibroblasts, which are capable of regulating myeloid cells, T cells and ILCs, and proposed that AG fibroblasts function as a previously unrecognized surveillant to orchestrate chronic gingival inflammation in periodontitis. Generally speaking, this article is innovative and interesting, however, there are some problems that need to be addressed to improve the quality of the manuscript.

      RESPONSE: We appreciate this comment. As suggested, we further investigated the surveillant function of AG fibroblasts by reanalyzing the scRNA-seq data for stress sensing receptors such as Toll-Like Receptors (TLR). Therefore, we analyzed the AG fibroblast gene expression profile when the putative ligands of TLR2/4 and TLR9 are applied to mouse gingival tissue instead of ligature placement. We believe that this first-step analysis justifies further dissection of the function of AG fibroblasts in the future.

      Results:

      1) It is recommended to add HE staining and immunohistochemistry staining to observe the inflammation, tissue damage, and repair status from 0 to 7 days, so that readers can understand cell phenotype changes corresponding to the periodontitis stage. The observation index can include inflammation and vascular related indicators.

      RESPONSE: As recommended, representative histological figures will be included. We will further perform new immunohistochemistry experiments on mouse gingival tissue (D0, D1, D4, D7). We plan to highlight the infiltration of CD45+ immune cells. We also plan to highlight the progressive degeneration of gingival collagen fibers by picrosirius red staining.

      2) Figure 1A-1D can be placed in the supplementary figure.

      RESPONSE: Combining the new data above, Figure 1 will be revised as suggested.

      3) I suggest the authors to put the detection of the existence of AG fibroblasts before exploring its relationship with other types of cells.

      4) The layout of the figures should be closely related to the topic of the article. It is recommended to readjust the layout of the figures. Figure 1 should be the detection of AG cells and their proportion changes from 0 to 7 days. In other figures, the authors can separately describe the proportion changes of myeloid cells, T cells and ILCs, and explore the association between AG fibroblasts and these cell types.

      RESPONSE: As suggested, the presentation order of Figures and text will be revised to bring the information about AG fibroblasts first. The chemokine-receptor analysis is moved below.

      Methods:

      It is recommended to separately list the statistical methods section. The statistical method used in the article should be one-way ANOVA.

      RESPONSE: A separate statistical method section is created. As pointed out, we used one-way ANOVA with post-hoc Tukey test (when multiple groups were compared).
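      For readers unfamiliar with the test named above, the F statistic of a one-way ANOVA can be sketched as follows. This is a minimal illustration with hypothetical toy values, not the analysis pipeline used in the manuscript; it omits the p-value and the post-hoc Tukey comparison, and the function name is an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical toy measurements for three groups (not data from the study).
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

      A statistics package would normally be used for the p-value and for Tukey's test; the sketch only shows how the between-group and within-group variances enter the ratio.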

      Reviewer #2 (Public Review):

      This study proposed the AG fibroblast-neutrophil-ILC3 axis as a mechanism contributing to pathological inflammation in periodontitis. However, the immune response in vivo is very complex. It is difficult to determine which is the cause and which is the result. This study explores the relevant issue from one dimension, which is of great significance for a deeper understanding of the pathogenesis of periodontitis. It should be fully discussed.

      RESPONSE: We agree with this comment. We expanded the description of the current understanding of oral immune signal communication in the Discussion and highlighted how AG fibroblasts may fit into it.

      1) Many host cells participate in immune responses, such as gingival epithelial cells. AG fibroblast is not the only cell involved in the immune response, and the weight of its role needs to be clarified. So the expression in the conclusion should be appropriate.

      RESPONSE: Following this critique, we revised INTRODUCTION, DISCUSSION and CONCLUSION, to highlight how AG fibroblasts function within a comprehensive immune response network.

      2) This study cannot directly answer the issue of the relationship between periodontitis and systemic diseases.

      RESPONSE: We agree with this critique. We either deleted or de-emphasized the relationship between periodontitis and systemic diseases throughout the text.

    1. Author Response:

      We appreciate the thorough, fair and concise comments and agree with most, if not all, of the interpretations and critiques. We also value the recommendations and guidance for what constitute the most important additional experiments and analyses. Thank you for your hard work and time. Your investment helps improve the impact and clarity of our work and that is very much appreciated. We look forward to submitting a revised version soon.

    1. Author Response:

      Reviewer #1 (Public Review):

      Overall, I find the work performed by the authors very interesting. However, the authors have not always included literature that seems relevant to their study. For instance, I do not understand why two papers Dunican et al 2013 and Dunican et al 2015, which provide important insight into Lsh/HELLS function in mouse, frog and fish were not cited. It is also important that the authors are specific about what is known and in particular about what is not known about CDCA7 function in DNA methylation regulation. Unless I am mistaken, there is currently only one study (Velasco et al 2018) investigating the effect of CDCA7 disruption on DNA methylation levels (in ICF3 patient lymphoblastoid cell lines) on a genome-wide scale (Illumina 450K arrays). Unoki et al 2019 report that CDCA7 and HELLS gene knockout in human HEK293T cells moderately and extremely reduces DNA methylation levels at pericentromeric satellite-2 and centromeric alpha-satellite repeats, respectively. No other loci were investigated, and it is therefore not known whether a CDCA7-associated maintenance methylation phenotype extends beyond (peri)centromeric satellites. Thijssen et al performed siRNA-mediated knockdown experiments in mouse embryonic fibroblasts (differentiated cells) and showed that lower levels of Zbtb24, Cdca7 and Hells protein correlate with reduced minor satellite repeat methylation, thereby implicating these factors in mouse minor satellite repeat DNA methylation maintenance. Furthermore, studies that demonstrate a HELLS-CDCA7 interaction are currently limited to Xenopus egg extract (Jenness et al 2018) and the human HEK293 cell line (Unoki et al 2019). Whether such an interaction exists in any other organism and is of relevance to DNA methylation mechanisms remains to be determined. 
Therefore, in my opinion, the conclusion that "Our co-evolution analysis suggests that DNA methylation-related functionalities of CDCA7 and HELLS are inherited from LECA" should be softened, as the evidence for this scenario is not very compelling and seems premature in the absence of molecular data from more species.

      We appreciate this reviewer’s thorough reading of our manuscript.

      Regarding the citation issues, we will cite Dunican 2013 and Dunican 2015.

      As pointed out by the reviewer, the role of CDCA7 in genome DNA methylation was extensively studied in Velasco et al 2018. The result, together with Thijssen et al (2015), and Unoki et al. (2018), supports the idea that ZBTB24, CDCA7 and HELLS act within the same pathway to promote DNA methylation, the pattern of which is overlapping but distinct from DNMT3B-mediated methylation. This observation suggests that a ZBTB24-CDCA7-HELLS mechanism for DNA methylation may involve an alternative DNMT. Interestingly, our analysis of the gene presence-absence pattern revealed that the presence of CDCA7 coincides with DNMT1 more than DNMT3 genes. Indeed, while CDCA7 is lost from diverse branches of eukaryote species, genomes encoding CDCA7 always encode HELLS, and almost always encode DNMT1. Based on this observation, we speculate the role of CDCA7 is tightly linked to HELLS and DNA methylation throughout evolution.
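      The kind of presence-absence comparison described above can be illustrated with a short sketch. The species names and gene sets below are hypothetical toy data, not results from our analysis, and the function is a simple conditional co-occurrence count rather than the CoPAP method itself.

```python
def cooccurrence(table, gene_a, gene_b):
    """Fraction of species encoding gene_a that also encode gene_b.

    `table` maps a species name to the set of genes detected in its genome.
    Returns None when no species encodes gene_a.
    """
    with_a = [genes for genes in table.values() if gene_a in genes]
    if not with_a:
        return None
    return sum(gene_b in genes for genes in with_a) / len(with_a)

# Hypothetical toy presence-absence table (illustration only).
toy_table = {
    "species_1": {"CDCA7", "HELLS", "DNMT1", "DNMT3"},
    "species_2": {"CDCA7", "HELLS", "DNMT1"},
    "species_3": {"HELLS", "DNMT3"},
    "species_4": {"HELLS"},
    "species_5": {"CDCA7", "HELLS"},
}

for partner in ("HELLS", "DNMT1", "DNMT3"):
    print(partner, cooccurrence(toy_table, "CDCA7", partner))
```

      Unlike this raw count, CoPAP takes the species phylogeny into account, so the sketch conveys only the intuition behind the comparison.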

      As pointed out by Reviewer 1, the link between CDCA7, HELLS and DNA methylation has not been determined experimentally across these species. However, based on our previously published and unpublished data, we are confident about the functional interaction between CDCA7 and HELLS in Xenopus laevis and Homo sapiens. Furthermore, the importance of HELLS homologs in DNA methylation has been extensively studied in human, mouse and plants. We hope our current study will motivate the field to experimentally test the evolutionary conservation of HELLS-CDCA7 interaction, as well as their importance in DNA methylation, in other species.

      The authors used BLAST searches to characterize the evolutionary conservation of CDCA7 family proteins in vertebrates. From Figure 2A, it seems that they identify a LEDGF binding motif in CDCA7/JPO1. Is this correct and if yes, could you please elaborate and show this result? This is interesting and important to clarify because previous literature (Tesina et al 2015) reports a LEDGF binding motif only in CDCA7L/JPO2.

      We searched for a LEDGF binding motif ({E/D}-X-E-X-F-X-G-F, also known as IBM described in Tesina et al 2015) in vertebrate CDCA7 proteins, and reported their position in Figure 2A. Examples of identified LEDGF-binding motifs will be presented.
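      As an illustration of such a motif search, the consensus quoted above can be translated into a regular expression. The sketch below is a hedged example: the function name and the toy sequence are hypothetical, and it is not necessarily the exact procedure used to generate Figure 2A.

```python
import re

# Consensus as written above: {E/D}-X-E-X-F-X-G-F, where X is any residue.
# The lookahead lets overlapping occurrences be reported as well.
LEDGF_MOTIF = re.compile(r"(?=([ED].E.F.GF))")

def find_ledgf_motifs(sequence):
    """Return (1-based start position, matched residues) for each hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in LEDGF_MOTIF.finditer(seq)]

# Hypothetical toy sequence, not a real CDCA7 protein.
toy_sequence = "MKTAEAEDFSGFLLPDGEWFKGFQ"
print(find_ledgf_motifs(toy_sequence))
```

      Positions are reported 1-based to match the usual convention for protein coordinates.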

      To provide evidence for a potential evolutionary co-selection of CDCA7, HELLS and the DNA methyltransferases (DNMTs) the authors performed CoPAP analysis. Throughout the manuscript, it is unclear to me what the authors mean when referring to "DNMT3". In the Material and Methods section, the authors mention that human DNMT3A was used in BLAST searches to identify proteins with DNA methyltransferase domains. Does this mean that "DNMT3" should be DNMT3A? And if yes, should "DNMT3" be corrected to "DNMT3A"? Is there a reason that "DNMT3A" was chosen for the BLAST searches?

      As described in the Methods section, both human DNMT1 and DNMT3A were used to initially identify any proteins containing a domain homologous to the DNA methyltransferase catalytic domain. Within Metazoa, if their orthologs exist, the top hit from a BLAST search using human DNMT1 or DNMT3A shows an E-value of 0.0, and thus their orthology is robust. This is true even for the DNMT1 and DNMT3 homologs in the sponge Amphimedon queenslandica, which is one of the earliest-branching metazoan species. For other DNMTs, such as DNMT2, DNMT4 and DNMT6, we conducted separate BLAST searches using those proteins as baits, as described in Methods. The domain was then isolated using the NCBI conserved domains search. The selected DNMT domain sequences were aligned with CLUSTALW to generate a phylogenetic tree to further classify DNMTs (Figure S6). It has been suggested that vertebrate DNMT3A and DNMT3B are derived from duplication of a DNMT3 gene of a chordate ancestor (e.g., Liu et al 2020, PMID 31969623). As such, many invertebrates encode only one DNMT3. As previously shown (Yaari et al., 2019, PMID 30962443), plants have two distinct DNMT3-like protein families, the ‘true DNMT3’ and DRM, the plant-specific de novo DNMT that is often considered to be a DNMT3 homolog (see Reviewer 2’s comment). Our phylogenetic analysis successfully separated the DNMT3 and DRM clades from the rest of the DNMTs (Figure S6). Yaari et al noted that PpDNMT3a and PpDNMT3b, the two DNMT3 orthologs encoded by the basal plant Physcomitrella patens, are not orthologs of mammalian DNMT3A and DNMT3B, respectively. Therefore, to minimize such nomenclature confusion, any DNMTs that belong to either the DNMT3 or DRM clades indicated in Figure S6 are collectively referred to as ‘DNMT3’ throughout the paper (see Figure S2 for overview).

      CoPAP analysis revealed that CDCA7 and HELLS are dynamically lost in the Hymenoptera clade and either co-occurs with DNMT3 or DNMT1/UHRF1 loss, which seems important. Unfortunately, the authors do not provide sufficient information in their figures or supplementary data about what is already known regarding DNA methylation levels in the different Hymenoptera species to further consider a potential impact of this observation. What is "the DNA methylation status" of all these organisms? This information cannot be easily retrieved from Table S2. A clearer presentation of what is actually known already would improve this paragraph.

      As the DNA methylation status of the species in the Hymenoptera clade has not been comprehensively tested, this precluded us from adding this information to Figure 7. However, we have included the published reports of DNA methylation status for these species in Supplementary Table S2 (see column ‘5mC’; species for which 5mC is detected are marked with Y and the relevant PMID). As indicated, DNA methylation was detected in most tested species except for Microplitis demolitor. Many of these data are based on Bewick et al. 2017 (PMID 28025279). During the preparation of this response, we realized that the DNA methylation status reported for some species in Bewick et al. was inferred from the CpG frequency instead of the direct experimental detection of methylated cytosines. Therefore, we have amended Table S2 to indicate the presence of DNA methylation only for those species where this was experimentally tested. As such, we now consider the DNA methylation status of Fopius arisanus, which lacks DNMT1 and CDCA7, to be unknown. In addition, we realized that Bewick et al. reported that DNA methylation is absent in Aphidius ervi. We originally conducted synteny analysis on Aphidius gifuensis, which lacks DNMT1 and CDCA7, since Aphidius ervi protein data were not available in NCBI. By conducting tBLASTn search against the Aphidius ervi genome, we confirmed that the presence and absence pattern of CDCA7, HELLS, DNMT1, DNMT3 and UHRF1 in Aphidius ervi is identical to that of Aphidius gifuensis. In other words, DNA methylation is known to be absent in Aphidius ervi, which has lost DNMT1 and CDCA7. Altogether, among the 17 Hymenoptera species that we analyzed (listed in the amended Table S2), the 6 species that have detectable DNA methylation all encode CDCA7, whereas the 2 species that do not have detectable DNA methylation lack CDCA7. We will note this finding in the revised text.

      Furthermore, A. thaliana DDM1, and mouse and human Lsh/Hells are known to preferably promote DNA methylation at satellite repeats, transposable elements and repetitive regions of the genome. On the other hand, DNA methylation in insects and other invertebrates occurs in genic rather than intergenic regions and transposable elements (e.g. Bewick et al 2017; Werren JH PlosGenetics 2013). It would be helpful to elaborate on these differences.

      This point was discussed in the third paragraph of the Discussion, but we will better highlight this. It should be noted that, in the Arabidopsis ddm1 mutant, reduction of CG methylation of gene bodies is common (50% of all methylated euchromatic genes) (Zemach et al, 2013). In addition, hypomethylation is not limited to satellite repeats and transposable elements in ICF patients defective in HELLS or CDCA7 (Velasco et al., 2018).

      Reviewer #2 (Public Review):

      In this manuscript, Funabiki and colleagues investigated the co-evolution of DNA methylation and nucleosome remodeling in eukaryotes. This study is motivated by several observations: (1) despite being ancestrally derived, many eukaryotes lost DNA methylation and/or DNA methyltransferases; (2) over many genomic loci, the establishment and maintenance of DNA methylation relies on a conserved nucleosome remodeling complex composed of CDCA7 and HELLS; (3) it remains unknown if/how this functional link influenced the evolution of DNA methylation. The authors hypothesize that if CDCA7-HELLS function was required for DNA methylation in the last eukaryote common ancestor, this should be accompanied by signatures of co-evolution during eukaryote radiation.

      [...]

      The data and analyses reported are significant and solid. However, using more refined phylogenetic approaches could have strengthened the orthologous relationships presented. Overall, this work is a conceptual advance in our understanding of the evolutionary coupling between nucleosome remodeling and DNA methylation. It also provides a useful resource to study the early origins of DNA methylation-related molecular processes. Finally, it brings forward the interesting hypothesis that since eukaryotes are faced with the challenge of performing DNA methylation in the context of nucleosome-packed DNA, losing factors such as CDCA7-HELLS likely led to recurrent innovations in chromatin-based genome regulation.

      Strengths:
      - The hypothesis linking nucleosome remodeling and the evolution of DNA methylation.
      - Deep mapping of DNA methylation related process in eukaryotes.
      - Identification and evolutionary trajectories of novel homologs/orthologs of CDCA7.
      - Identification of CDCA7-HELLS-DNMT co-evolution across eukaryotes.

      Weaknesses:
      - Orthology assignment based on protein similarity.
      - No statistical support for the topologies of gene/proteins trees (figure S1, S3, S4, S6) which could have strengthened the hypothesis of shared ancestry.

      We appreciate the reviewer’s accurate summary, which nicely emphasizes the importance of our study. We agree that better phylogenetic analysis for orthology assignment will strengthen our conclusion, and we would like to explore this. Having anticipated this weakness, we specifically conducted a CoPAP analysis exclusively for Ecdysozoa species, where orthology assignment is straightforward, which supported our major conclusion. For example, if we conduct a BLAST search of the clonal raider ant Ooceraea biroi using human HELLS as a query, the top hit is a protein sequence annotated as one of three isoforms of ‘lymphoid-specific helicase’ (i.e., HELLS), with an E-value of 0.0. Similarly, a BLAST search of Ooceraea biroi using human DNMT1 as a query returns isoforms of DNMT1 as the top hit, with an E-value of 0.0. As such, there is little dispute about orthology assignment in Ecdysozoa. Outside of Chordata, regardless of the alternative methods employed for orthology assignment, this will never be perfect (particularly in Excavata and SAR). Our current orthology assignment for the major targets in this study (HELLS, DNMT1, DNMT3, DNMT5) is largely consistent with published results (Ponger et al., 2005 PMID 15689527; Huff et al, 2014 PMID 24630728; Yaari et al., 2019 PMID 30962443; Bewick et al., 2019 PMID 30778188). However, while preparing this response and cross-checking our assignments against these references, we realized that we had erroneously missed the DNMT5 orthologs of Leucosporidium creatinivorum, Postia placenta, Armillaria gallica and Saitoella complicata, and the DNMT6 ortholog of Fragilariopsis cylindrus. We had also recognized that DNMT4 orthologs were identified in Fragilariopsis cylindrus and Thalassiosira pseudonana in Huff et al 2014 (PMID 24630728), but in our phylogenetic analysis these proteins form a distinct clade between DNMT1/Dim-2 and DNMT4 (Figure S6). Due to this ambiguity, we did not count them as DNMT1 or DNMT4 in our CoPAP analysis. These minor errors and ambiguities should not affect the presence-absence pattern in our original CoPAP analysis, and thus we feel that further refinement is unlikely to significantly affect our major conclusion.

    1. Author Response:

      The following is the authors' response to the current reviews.

      Reviewer #1 (Public Review):

      This revised manuscript by Walker et al. addresses some of the editorial points and conceptual discussion, but in general, most of my suggestions (as the previous reviewer #1) for additional experimentation or addition were not addressed as discussed below. Therefore, my overall review has not changed.

      In our previous response, we included i) extra experimental data illustrating the reproducibility of our results and ii) added transcription start site data at the request of this reviewer. We included the information because we agreed with the reviewer that these were important points to address. For the points raised again below, we explained why the additional analysis was unlikely to add much in terms of insight or rigour. We have elaborated further below.   

      1) For example, in point 1, the suggested analysis was not performed because it is not trivial. My reason for making this suggestion is that the original manuscript was limited to Vibrio cholerae, and the impact of the manuscript would increase if the findings here were demonstrated to be more broadly applicable. I expect papers published in eLife to have such broad applicability. But no changes were made to the manuscript in this regard. The revised version is still limited to only Vibrio cholerae.

      Our paper is focused on the unexpected co-operative interactions between HapR and CRP. Such co-binding of two transcription factors to the same DNA site is unexpected. Consequently, it is this mode of DNA binding that is likely to be of broad interest. With this in mind, we did provide experimental, and bioinformatic, analyses for other regulatory regions and other vibrio species (Figures S3 and S6). This, in our view, is where the “broad applicability” for papers published in eLife comes from.

      The analysis the reviewer suggests is not related to the main message of our paper. Instead, the reviewer is asking how many HapR binding sites seen here by ChIP-seq are also seen in other vibrio species by ChIP-seq. This is only likely to be of interest to readers with an extremely specific interest in both vibrio species and HapR. The reviewer states above that we did not make the change “because it is not trivial”. This is an oversimplification of the rationale we presented in our response. The analysis is indeed not straightforward. However, much more importantly, the outcome is unlikely to be of interest to many readers, and has no bearing on the rigour of work. With this in mind, we do not think our position is unreasonable. We also stress that, should a reader with this very specific interest want to explore further, all of our data are freely available for them to do so.

      2) For point 2, the activity of FLAG-tag luxO could have been simply validated in a complementation assay. Yes, they demonstrated DNA binding, but that is not the only activity of LuxO.

      DNA binding by LuxO is the only activity of the protein with which we are concerned in our paper. Furthermore, LuxO is very much a side issue; we found binding to only the known targets and potentially, at very low levels, one additional target. No further LuxO experiments were done for this reason. Indeed, even if these data were removed completely, our conclusions would not change or be supported any less vigorously. We are happy to remove the LuxO data if the reviewer would prefer but this would seem like overkill.

      3) For point 7, the transcriptional fusions were not explored at different times or different media, which is also something that was hinted at by other reviewers. In regard to exploring expression at different time points, this seems particularly relevant for QS regulated genes.

      In their previous review, the reviewer did not request that such experiments were done. Similarly, no other reviewer requested these experiments. Instead, this reviewer i) commented that lacZ fusions were not as sensitive as luciferase fusions ii) asked if we had done any time point experiments. We agreed with the first point, whilst also noting that lacZ is not unusual to use as a reporter. For the second point, we responded that we had not done such experiments (which by the reviewer’s own logic would have been complicated using lacZ as a reporter). This seems like a perfectly reasonable way to respond.   

      We should stress that these comments all refer to Figure 2a, which was our initial screening of 23 promoter::lacZ fusions, supported by separate in vitro transcription assays. Only one of these fusions was followed up as the main story in the paper. Given that the other 22 fusions were not investigated further, and do not form part of the main story, there would seem little value in now going back to assay them at different time points.

      4) For point 13, the authors express that doing an additional CHIP-Seq is outside of the scope of this manuscript. Perhaps that is the case, but the point of the comment is to validate the in vitro binding results with an in vivo binding assay. A targeted CHIP-Seq approach specifically analyzing the promoters where cooperative binding was observed in vitro could have addressed this point.

      We did appreciate the original comment, and responded as such, but we do think additional ChIP-seq assays are outside the scope of this paper.

      Reviewer #2 (Public Review):

      This manuscript by Walker et al describes an elegant study that synergizes our knowledge of virulence gene regulation of Vibrio cholerae. The work brings a new element of regulation for CRP, notably that CRP and the high density regulator HapR co-occupy the same site on the DNA but modeling predicts they occupy different faces of the DNA. The DNA binding and structural modeling work is nicely conducted and data of co-occupation are convincing. The work seeks to integrate the findings into our current state of knowledge of HapR and CRP regulated genes at the transition from the environment and infection. The strength of the paper is the nice ChIP-seq analysis and the structural modeling and the integration of their work with other studies.

      We thank the reviewer for the positive comments.

      The weakness is that it is not clear how representative these data are of multiple hapR/CRP binding sites

      This comment does not consider all data in our paper. We did test our model experimentally at multiple HapR and CRP binding sites. These data are shown in Figure S6 and confirm the co-operative interaction between HapR and CRP at 4 of a further 5 shared binding sites tested. We also used bioinformatics to show the same juxtaposition of CRP and HapR sites in other vibrio species (Figure S3). Hence, the model seems representative of most sites shared by HapR and CRP.

      or how the work integrates as a whole with the entire transcriptome that would include genes discovered by others.

      At the request of the reviewers, our revision integrated our ChIP-seq data with dRNA-seq data. No other suggestions to integrate transcriptome data were made by the reviewers.

      Overall this is a solid work that provides an understanding of integrated gene regulation in response to multiple environmental cues.

      We thank the reviewer for the positive comment.

      —————

      The following is the authors' response to the original reviews.

      Reviewer #1 (Public Review):

      This manuscript by Walker et al. explores the interplay between the global regulators HapR (the QS master high cell density (HCD) regulator) and CRP. Using ChIP-Seq, the authors find that at several sites, the HapR and CRP binding sites overlap. A detailed exploration of the murPQ promoter finds that CRP binding promotes HapR binding, which leads to repression of murPQ. The authors have a comprehensive set of experiments that paints a nice story providing a mechanistic explanation for converging global regulation.

      We thank the reviewer for their positive evaluation.

      I did feel there are some weak points though, in particular the lack of integration of previously identified transcription start sites

      For completeness, we have now added the position and orientation of the nearest TSSs to each HapR or LuxO binding peak in Table 1 (based on Papenfort et al.).

      the lack of replication (at least replication presented in the manuscript) for many figures,

      We assume that the reviewer is referring to gel images rather than any other type of assay output (where error bars, derived from replicates, are shown). As is standard, we show representative gel images. All associated DNA binding and in vitro transcription experiments have been done multiple times. Indeed, comparison between figures reveals several instances of such replication (e.g. Figures 4b & 5d, Figures 4d & 5e). We have added details of repeats done to the methods section.

      some oddities in the growth curve

      We do not know why cells lacking hapR have a growth curve that appears biphasic. We can only assume that this is due to some regulatory effect of HapR, distinct from the murQP locus. Despite the unusual shape of the growth curve, the data are consistent with our conclusions.

      and not reexamining their HapR/CRP cooperative binding model in vivo using ChIP-Seq.

      We agree that these would be interesting experiments and, in the future, we may well do such work. Even without these data, our current model is well supported by the data presented (and the reviewer seems to agree with this above).

      Reviewer #2 (Public Review):

      This manuscript by Walker et al describes an elegant study that synergizes our knowledge of virulence gene regulation of Vibrio cholerae. The work brings a new element of regulation for CRP, notably that CRP and the high density regulator HapR co-occupy the same site on the DNA but modeling predicts they occupy different faces of the DNA. The DNA binding and structural modeling work is nicely conducted and data of co-occupation are convincing. The work could benefit from doing a better job in the manuscript preparation to integrate the findings into our current state of knowledge of HapR and CRP regulated genes and to elevate the impact of the work to address how bacteria are responding to the nutritional environment. Importantly, the focus of the work is heavily based on the impact of use of GlcNAc as a carbon source when bacteria bind to chitin in the environment, but absent the impact during infection when CRP and HapR have known roles. Further, the impact on biological events controlled by HapR integration with the utilization of carbon sources (including biofilm formation) is not explored.

      We thank the reviewer for their overall positive evaluation.

      The rigor and reproducibility of the work needs to be better conveyed.

      Reviewer 1 made a similar comment (see above) and we have modified the manuscript accordingly.

      Specific comments to address:

      1)  Abstract. A comment on the impact of this work should be included in the last sentence. Specifically, how the integration of CRP with QS for gene expression under specific environments impacts the lifestyle of Vc is needed. The discussion includes comments regarding the impact of CRP regulation as a sensor of carbon source and nutrition and these could be quickly summarized as part of the abstract.

      We have added an extra sentence. However, we have used cautious language as we do not show impacts on lifestyle (beyond MurNAc utilisation) directly. These can only be inferred.

      2)  Line 74. This paper examines the overlap of HapR with CRP, but ignores entirely AphA. HapR is repressed by Qrrs (downstream of LuxO-P) while AphA is activated by Qrrs. With LuxO activating AphA, it has a significant sized "regulon" of genes turned on at low density. It seems reasonable that there is a possibility of overlap also between CRP and AphA. While doing an AphA CHIP-seq is likely outside the scope of this work, some bioinformatic or simply a visual analysis of the promoters of known AphA regulated genes would be of interest to comment on with speculation in the discussion and/or supplement.

      In short, everything that the reviewer suggests here has already been done and was covered in our original submission (see text towards the end of the Discussion). Also, we would like to point the referee to our earlier publication (Haycocks et al. 2019. The quorum sensing transcription factor AphA directly regulates natural competence in Vibrio cholerae. PLoS Genet. 15:e1008362).

      3)  Line 100. Accordingly with the above statement, the focus here on HapR indicates that the focus is on gene expression via LuxO and HapR, at high density. Thus the sentence should read "we sought to map the binding of LuxO and HapR of V. cholerae genome at high density".

      Note that expression of LuxO and HapR is ectopic in these experiments (i.e. uncoupled from culture density).

      4)  Line 109. The identification of minor LuxO binding site in the intergenic region between VC1142 and VC1143 raises whether there may be a previously unrecognized sRNA here. As another panel in figure S1, can you provide a map of the intergenic region showing the start codons and putative -10 to -35 sites. Is there room here for an sRNA? Is there one known from the many sRNA predictions / identifications previously done? Some additional analysis would be helpful.

      We have added an extra panel to Figure S1 showing the position of TSSs relative to the location of LuxO binding. We have altered the main text to accommodate this addition.

      5)  Line 117. This sentence states that the CHIP seq analysis in this study includes previously identified HapR regulated genes, but does not reveal that many known HapR regulated genes are absent from Table 1 and thus were missed in this study. Of 24 HapR regulated genes investigated by Tsou et al, only 1 is found in Table 1 of this study. A few are commented in the discussion and Figure S7. It might be useful to add a Venn Diagram to Figure 1 (and list table in supplement) for results of Tsou et al, Waters et al, Lin et al, and Nielson et al (and any others). A major question is whether the trend found here for genes identified by CHIP-seq in this study hold up across the entire HapR regulon. There should also be comments in the discussion on perhaps how different methods (including growth state and carbon sources of media) may have impacted the complexity of the regulon identified by the different authors and different methods.

      We have added a list of known sites to the supplementary material (new Table S1). We were unsure what was meant by the comment “A major question is whether the trend found here for genes identified by CHIP-seq in this study hold up across the entire HapR regulon”. We have added the extra comment to the discussion re growth conditions, also noting that most previous studies relied on in vitro, rather than in vivo, DNA binding assays.

      6)  The transcription data are generally well performed. In all figures, add comments to the figure legends that the experiments are representative gels from n=# (the number of replicate experiments for the gel based assays). Statements to the rigor of the work are currently missing.

      See responses above. We have added a comment on numbers of repeats to the methods section.

      7)  Line 357-360. The demonstration of lack of growth on MurNAc is nice for the impact of the work. However, more detailed comments are needed for M9 plus glucose for the uninformed reader to be reminded that growth in glucose is also impaired due to lack of cAMP in glucose replete conditions and thus minimal CRP is active. But why is this now dependent of hapR? A reminder also that in LB oligopeptides from tryptone are the main carbon source and thus CRP would be active.

      We find this point a little confusing and, maybe, two issues (murQP regulation, and growth in general) are being conflated. In particular, we do not understand the comment “growth in glucose is also impaired due to lack of cAMP in glucose replete conditions and thus minimal CRP is active”.

      Growth in glucose should indeed result in lower cAMP levels*, and hence less active CRP, but this does not impair growth. This is simply the cell’s strategy for using its preferred carbon source. If the reviewer were instead referring to some aspect of P_murQP_ regulation then yes, we would expect promoter activity to be lower because less active CRP would be available in the presence of glucose. The reviewer also comments “why is this now dependent of hapR?”. We assume that they are referring to some aspect of growth in minimal media with glucose. If so, the only hapR effect is the change in growth rate as cells enter mid-late log-phase (i.e. the growth curve looks somewhat biphasic). A similar effect is seen in all conditions. We do not know why this happens and can only conclude this is due to some unknown regulatory activity of HapR. Overall, the key point from these experiments is that loss of luxO, which results in constitutive hapR expression, lengthens lag phase only for growth with MurNAc as the sole carbon source.

      *Although in V. fischeri (PMID: 26062003) cAMP levels increase in the presence of glucose.

      8)  A great final experiment to demonstrate the model would have been to show co-localization of the promoter by CRP and HapR from bacteria grown in LB media but not in LB+glucose or in M9+glycerol and M9+MurNAc but not M9+glucose. This would enhance the model by linking more directly to the carbon sources (currently only indirect via growth curves)

      This is unlikely to be as straightforward as suggested. The sensitivity of CRP binding to growth conditions is not uniform across different binding sites. For instance, the CRP dependence of the E. coli melAB promoter is only evident in minimal media (PMID: 11742992) whilst the role of CRP at the acs promoter is evident in tryptone broth (PMID: 14651625). Similarly, as noted above, in Vibrio fischeri glucose causes an increase in cAMP levels (PMID: 26062003).

      9) Discussion. Comments and model focus heavily on GlcNAc-6P but HapR has a regulator role also during late infection (high density). How does CRP co-operativity impact during the in vivo conditions?

      We really can’t answer this question with any certainty; we have not done any infection experiments in this work.

      Does the Biphasic role of CRP play a role here (PMID: 20862321)?

      Again, we cannot answer this question with any confidence as experimentation would be required. However, the suggestion is certainly plausible.

      Reviewer #3 (Public Review):

      Bacteria sense and respond to multiple signals and cues to regulate gene expression. To define the complex network of signaling that ultimately controls transcription of many genes in cells requires an understanding of how multiple signaling systems can converge to effect gene expression and ensuing bacterial behaviors. The global transcription factor CRP has been studied for decades as a regulator of genes in response to glucose availability. Its direct and indirect effects on gene expression have been documented in E. coli and other bacteria, including pathogens such as Vibrio cholerae. Likewise, the master regulator of quorum sensing (QS), HapR, is a well-studied transcription factor that directly controls many genes in Vibrio cholerae and other Vibrios in response to autoinducer molecules that accumulate at high cell density. By contrast, low cell density gene expression is governed by another regulator AphA. It has not yet been described how HapR and CRP may together work to directly control transcription and what genes are under such direct dual control.

      We thank the reviewer for their assessment of our work.

      Using both in vivo methods with gene fusions to lacZ and in vitro transcription assays, the authors proceed to identify the smaller subset of genes whose transcription is directly repressed (7) and activated (2) by HapR. Prior work from this group identified the direct CRP binding sites in the V. cholerae genome as well as promoters with overlapping binding sites for AphA and CRP, thus it appears a logical extension of these prior studies is to explore here promoters for potential integration of HapR and CRP. This rationale was not stated when CRP protein was introduced to the in vitro experiments.

      We understand the reviewer’s comment. However, the rationale for adding CRP was not that we had previously seen interplay between AphA and CRP (although this is a relevant discussion point, which we did make). Rather, we had noticed that there was an almost perfect CRP site perfectly positioned to activate PmurQP. Hence, CRP was added.

      Seven genes are found to be repressed by HapR in vivo, but the promoter regions of only six are repressed in vitro with purified HapR protein alone. The authors propose and then present evidence that the seventh promoter, which controls murPQ, requires CRP to be repressed by HapR both using in vivo and in vitro methods. This is a critical insight that drives the rest of the manuscript's focus. The DNase protection assay conducted supports the emerging model that both CRP and HapR bind at the same region of the murPQ promoter, but interpretation is difficult due to the poor quality of the blot.

      There are areas of apparent protection at positions +1 to +15 that are not discussed, and the areas highlighted are difficult to observe with the blot provided.

      We disagree on this point. The region between +1 and +15 is inherently resistant to attack by DNase I and there are only ever very weak bands in this region (lane 1). Other than seeing small fluctuations in overall lane intensity (e.g. lanes 7-12 have a slightly lower signal throughout) the +1 to +15 banding pattern does not change. Conversely, there are dramatic changes in the banding pattern between around -30 and -60 (again, compare lane 1 to all other lanes). That CRP and HapR bind the same region is extremely clear. Also note that this is backed up by mutagenesis of the shared binding site (Figure 4c).

      The model proposed at the end of the manuscript proposes physiological changes in cells that occur at transitions from the low to high cell density. Experiments in the paper that could strengthen this argument are incomplete. For example, in Fig. 4e it is unclear at what cell density the experiment is conducted.

      Such details have been added to the figure legends and methods section.

      The results with the wild type strain are intermediate relative to the other strains tested.

      This is correct, and exactly what we would expect to see based on our model.

      Cell density should affect the result here since HapR is produced at high density but not low density. This experiment would provide important additional insights supporting their model, by measuring activity at both cell densities and also in a luxO mutant locked at the high cell density. Conducting this experiment in conditions lacking and containing glucose would also reveal whether high glucose conditions mimic the crp results.

      We agree with this idea in principle but note that the output from our reporter gene, β-galactosidase, is stable within cells and tends to accumulate. This is likely to obscure the reduction in expression as cells transition from low to high cell density. Since we have demonstrated the regulatory effects of HapR and CRP both in vivo using gene knockouts, and in vitro with purified proteins, we think that our overall model is very well supported. Further experimental additions may provide an incremental advance but will not alter our overall story. Also note the unexpected increase in intracellular cAMP due to addition of glucose, in Vibrio fischeri (PMID: 26062003).

      Throughout the paper it was challenging to account for the number of genes selected, the rationale for their selection, and how they were prioritized. For example, the authors acknowledged toward the end of the Results section that in their prior work, CRP and HapR binding sites were identified (line 321-22).

      This is not quite what we say, and maybe the reviewer misunderstood, which is our fault. The prior work identified CRP sites whilst the current work identified HapR sites. We have made a slight alteration to the text to avoid confusion.

      It is unclear whether the loci indicated in Table 1 are all from this prior study. It would be useful to denote in this table the seven genes characterized in Figure 2 and to provide the locus tag for murPQ.

      Again, we are unsure if we have confused the reviewer. The results in Table 1 are all HapR sites from the current work, not a prior study. However, some of these also correspond to CRP binding regions found in prior work.

      The reviewer mentions “the seven genes characterised in Figure 2” but 23 targets were characterised in Figure 2a and 9 in Figure 2b. The “VC” numbers used in Figure 2 are the same as used in Table 1 so it is easy to cross reference between the two. We have added a footnote to Table 1, also referred to in the Figure 2 legend, to allow cross referencing between gene names and locus tags (including for murQP and hapR).

      Of the 32 loci shown in Table 1, five were selected for further study using EMSA (line 322), but no rationale is given for studying these five and not others in the table.

      This is not quite correct, we did not select 5 from the 32 targets listed in Table 1. We selected 5 targets from Table 1 that were also targets for CRP in our prior paper. This was the rationale.

      Since prior work identified a consensus CRP binding motif, the authors identify the DNA sequence to which HapR binds overlaps with a sequence also predicted to bind CRP. Genome analysis identified a total of seven sites where the CRP and HapR binding sites were offset by one nucleotide as seen with murPQ. Lines 327-8 describe EMSA results with several of these DNA sequences but provide no data to support this statement. Are these loci in Table 1?

      This comment is a little difficult to follow, and we may have misunderstood, but we think that the reviewer is asking where the EMSA data referred to on lines 327-328 resides. We can see that the text could be confusing in this regard. We had referred to the relevant figure (Figure S6) on line 324 but did not again include this information further down in the description of the result. We have changed the text accordingly.

      Using structural models, the authors predict that HapR repression requires protein-protein interactions with CRP. Electromobility shift assays (EMSA) with purified promoter DNA, CRP and HapR (Fig 5d) and in vitro transcription using purified RNAP with these factors (Figure 5e) support this hypothesis. However, the model purports that HapR "bound tightly" and that it also had a "lower affinity" when CRP protein was used that had mutations in a putative interaction interface. These claims can be bolstered if the authors calculate the dissociation constant (Kd) value of each protein to the DNA. This provides a quantitative assessment of the binding properties of the proteins.

      The reviewer is correct that we do not explicitly provide a Kd. However, in both Figures 5d and 5e, we do provide very similar quantification. In 5d, our quantification is the % of the CRP-DNA complex bound by HapR (using either wild type or E55A CRP). Since the % of DNA bound is shown, and the protein concentrations are provided in the figure legend, information regarding Kd is essentially already present. In 5e, we show the % of maximal promoter activity. This is a reasonable way of quantifying the result. Furthermore, Kd is not a metric we can measure directly in this experiment, which is not a DNA binding assay.
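
      To illustrate why a percentage bound together with a known protein concentration carries Kd-like information, the following is a minimal sketch. It assumes a simple one-site binding isotherm with protein in large excess over DNA, which may not hold for the cooperative CRP-HapR system described here. The function name and the numbers are hypothetical and are not taken from Figure 5d.

```python
def apparent_kd(protein_conc_nM: float, fraction_bound: float) -> float:
    """Estimate an apparent Kd (nM) from a single titration point.

    Assumption (illustrative only): one-site binding with free protein
    approximately equal to total protein, so that
        fraction_bound = [P] / (Kd + [P])
    which rearranges to Kd = [P] * (1 - fraction_bound) / fraction_bound.
    """
    if not 0.0 < fraction_bound < 1.0:
        raise ValueError("fraction_bound must lie strictly between 0 and 1")
    return protein_conc_nM * (1.0 - fraction_bound) / fraction_bound


# Hypothetical example: half of the DNA is shifted at 500 nM protein,
# so the apparent Kd under this model equals the protein concentration.
example_kd_nM = apparent_kd(protein_conc_nM=500.0, fraction_bound=0.5)
```

      Under this simplified model, a lower fraction bound at the same protein concentration corresponds to a larger apparent Kd, that is, weaker binding.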

      The concentrations of each protein are not indicated in panels of the in vitro analysis, but only the geometric shapes denoting increasing protein levels.

      The protein concentrations are all provided in the figure legend. It is usual to indicate relative concentrations in the body of the figure using geometric shapes.

      Panel 5e appears to indicate that an intermediate level of CRP was used in the presence of HapR, which presumably coincides with levels used in lane 4, but rationale is not provided.

      There was no particular rationale for this, it was simply a reasonable way to do the experiment.

      How well the levels of protein used in vitro compare to levels observed in vivo is not mentioned.

      The protein concentrations that we use (in the nM to low μM range) are very typical for this type of work and consistent with hundreds of prior studies of protein-DNA interactions. The general rule of thumb is that 1000 molecules of a protein per bacterial cell equates to a concentration of around 1 μM. However, molecular crowding is likely to increase the effective concentration. Additionally, the DNA concentration is higher in vitro.
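
      The rule of thumb above can be checked with a short calculation. This is a sketch only: the cell volume of about 1 fL is an assumed order-of-magnitude figure for a rod-shaped bacterium and was not measured in this work, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_to_micromolar(molecules_per_cell: float,
                            cell_volume_fL: float = 1.0) -> float:
    """Convert a copy number per cell to a concentration in micromolar.

    cell_volume_fL defaults to 1 fL, which is an assumption made for
    illustration; real cell volumes vary with species and growth state.
    """
    volume_litres = cell_volume_fL * 1e-15      # 1 fL = 1e-15 L
    moles = molecules_per_cell / AVOGADRO       # molecules -> moles
    return moles / volume_litres * 1e6          # mol/L -> micromol/L


# 1000 molecules in an assumed 1 fL cell: on the order of 1 micromolar.
conc_uM = molecules_to_micromolar(1000)
```

      With these assumptions the result is between 1 and 2 μM, consistent with the "around 1 μM" rule of thumb; a larger assumed cell volume gives a proportionally lower value.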

      The authors are commended for seeking to connect the in vitro and in vivo results obtained under lab conditions with conditions experienced by V. cholerae in niches it may occupy, such as aquatic systems. The authors briefly review the role of MurPQ in recycling of the cell wall of V. cholerae by degrading MurNAc into GlcNAc, although no references are provided (lines 146-50). Based on this physiology and results reported, the authors propose that murPQ gene expression by these two signal transduction pathways has relevance in the environment, where Vibrios, including V. cholerae, form biofilms on exoskeleton composed of GlcNAc.

      We have added a citation to the section mentioned.

      The conclusions of that work are supported by the Results presented but additional details in the text regarding the characteristics of the proteins used (Kd, concentrations) would strengthen the conclusions drawn. This work provides a roadmap for the methods and analysis required to develop the regulatory networks that converge to control gene expression in microbes. The study has the potential to inform beyond the sub-field of Vibrios, QS and CRP regulation.

      As noted above, quantification essentially equivalent to Kd is already shown (% of bound substrate is indicated in figures and all protein concentrations are given in the figure legends).

      Reviewer #1 (Recommendations For The Authors):

      1.  As similar experiments have been performed in other Vibrios, it would be interesting to do a more detailed analysis of the similarities and differences between the species. Perhaps a Venn diagram showing how many targets were found in all studies versus how many are species specific.

      We appreciate this suggestion but would prefer not to make this change. A cross-species analysis would be very time consuming and is not trivial. The presence and absence of each target gene, for all combinations of organisms, would first need to be determined. Then, the presence and absence of binding signals for HapR, or its equivalent, would need to be determined taking this into account. For most readers, we feel that this analysis is unlikely to add much to the overall story. Given the amount of effort involved, this seems a “non-essential” change to make.

      2.  Line 101-Are the FLAG tagged versions of LuxO and HapR completely functional? Can they complement a luxO or hapR deletion mutant?

      The activity of FLAG tagged HapR (LuxR in other Vibrio species) has been shown previously (e.g. PMIDs 33693882 and 23839217). Similarly, N-terminal HapR tags are routinely used for affinity purification of the protein without ill effect. We have not tested LuxO-3xFLAG for “full” activity, though this fusion is clearly active for DNA binding, the only activity that we have measured here, since all known targets are pulled down.

      3.  Line 106-As the authors state later that there are additional smaller peaks for HapR that could be other direct targets, I think a brief mention of the methodology used to determine the cutoff for the 5 and 32 peaks for LuxO and HapR, respectively, would be informative here.

      We have added a little more text to the methods section. The added text states “Note that our cut-off was selected to identify only completely unambiguous binding peaks. Hence, weak or less reproducible binding signals, even if representing known targets, were excluded (see Discussion for further details)”.

      4.  Line 118-Need a reference here to the prior HapR binding site.

      This has been added.

      5.  Figs. 1e-What do the numbers on the x-axis refer to? Why not just present these data as bases? The authors also refer to distance to the nearest start codon, but this is irrelevant for 4/5 of the luxO targets as they are sRNAs. They should really refer to the distance to the transcription start site. Likewise, for HapR, distance to the nearest start codon is not as informative as distance to the nearest transcription start site. A recent paper used transcriptomics to map all the transcription start sites of V. cholerae, and these results should be integrated into the author's study rather than just using the nearest start codon (PMID: 25646441).

      The numbers are kilo base pairs, this has been added to the axis label. We have also changed “start codon” to “gene start” (since “gene start” is also suitable for genes that encode untranslated RNAs).

      Re comparing binding peak positions to transcription start sites (TSSs) rather than gene starts, this analysis would be useful if TSSs could be detected for all genes. However, some genes are not expressed under the conditions tested by PMID: 25646441, so no TSS is found. Consequently, for HapR or LuxO bound at such locations, we would not be able to calculate a meaningful position relative to the TSS. We stress that the point of the analysis is to determine how peaks are positioned with respect to genes (i.e. that sites cluster near gene 5’ ends). Also note that nearest TSSs are now shown in the revised Table 1. In some cases, these are unlikely to be the TSS actually subject to regulation (e.g. because the regulated gene is switched off).
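
      The kind of positional analysis described above can be sketched as follows. This is an illustrative example only, not the pipeline used in the paper: the function name and coordinates are hypothetical, and the sketch ignores gene strand and chromosome circularity.

```python
from bisect import bisect_left
from typing import Sequence


def distance_to_nearest_start(peak_centre: int,
                              gene_starts: Sequence[int]) -> int:
    """Signed distance (bp) from a peak centre to the nearest gene start.

    gene_starts must be sorted in ascending order. A negative value means
    the peak centre lies at a lower coordinate than the nearest gene start.
    """
    if not gene_starts:
        raise ValueError("gene_starts must not be empty")
    i = bisect_left(gene_starts, peak_centre)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = gene_starts[max(i - 1, 0): i + 1]
    nearest = min(candidates, key=lambda start: abs(peak_centre - start))
    return peak_centre - nearest


# Hypothetical coordinates (bp), chosen for illustration only.
starts = [1000, 5000, 9000]
offsets = [distance_to_nearest_start(p, starts) for p in (4800, 500, 10000)]
```

      Binning such offsets (for example in kilobase pairs) would show whether binding peaks cluster near gene 5' ends; the same calculation could use TSS coordinates where these are available.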

      6.  Fig. 1e-Is there directionality to the site? In other words, if a HapR binding site is located between two genes that are transcribed in opposite directions, is there a way to predict which gene is regulated? It looks like this might be the case with the list presented in Table 1, but how such determination is made and what the various symbol in Table 1 mean are not clear to me. This also has ramifications for Fig. 2a as the direction to construct the fusion is critical for the experiment.

      The site is a palindrome so lacks directionality. The best prediction re regulation is likely to be positioning with respect to the nearest TSS (which is now included in Table 1). However, this would remain just a prediction and, where TSSs are in odd locations with respect to binding sites (taking into account the caveats above) predictions would be unreliable.

      We are unsure which symbol the reviewer refers to in Table 1, a full explanation of any symbols used is provided in the table footnotes.

      With respect to Figure 2a, if sites were between divergent genes, and met our other criteria, we tested for regulation in both directions. For example, see the divergent genes VCA0662 (classified inactive) and VCA0663 (classified repressed).

      7.  Fig. 2a-It is a little disappointing that the authors use LacZ fusions to measure transcription as this is not the most sensitive reporter gene. Luciferase gene fusions would have been much more sensitive. Also, did the authors examine multiple time points? The methods only describe "mid-log phase" but some of the inactive promoters could be expressed at other time points. Also, it would be simple to repeat this experiment in different media, such as minimal with glucose or another non-CRP carbon source, to expand which promoters are expressed.

      The reviewer is correct regarding the sensitivity of β-galactosidase, which is very stable and so accumulates as cells grow. Even so, this reporter has been used very successfully, across thousands of studies, for decades. We did not examine multiple timepoints. We appreciate that the 23 promoter::lacZ fusions could be re-examined using varying growth conditions but this is unlikely to impact the overall conclusions, though it could generate some new leads for future work.

      8.  Fig. 2a legend-typos

      This has been corrected.

      9.  Line 138-I think you mean Fig. 2a here.

      This has been corrected.

      10.  Fig. 2b and many additional figures quantify band intensity but do not show any replication or error. Therefore, it is impossible to gauge reproducibility of these experiments.

      We have added a reproducibility statement (all experiments were done multiple times with similar results) as is standard throughout the literature. Also note that there is a lot of internal replication between figures. Figure 4d and Figure 5e lanes 1-9 show essentially the same experiment (albeit with slightly different protein concentrations) and very similar results. To the same effect, Figure 5e lanes 10-18 and lanes 19-27 show the same experiment for two different mutations of the same CRP residue. Again, the results are very similar. Also see the response to your comment 15 below.

      11.  Fig. 4a-lanes 2-4-the footprint does not change with additional CRP. In other words, it looks the same at the lowest concentration of CRP versus the highest concentration of CRP. The footprints for HapR look similar. This is somewhat troubling as in these types of experiments one would like to observe a dose dependent change in the footprint correlating with more DNA occupancy.

      For CRP we agree but are not concerned at all by this. The site is simply fully occupied at the lowest protein concentration tested. Given that the footprint exactly coincides with a near consensus CRP site (which, when mutated, abolishes CRP binding in EMSAs, and regulation by CRP in vivo) all our results are perfectly consistent. Note that i) our only aim in this experiment was to determine the positions of CRP and HapR binding ii) our conclusions are independently backed up using gel shifts and by making promoter mutations. With respect to HapR, there are changes at the periphery of the main footprint.

      12.  Fig. 4e-Why does the transcriptional activation of murQP decrease with increasing concentrations of CRP? This is also seen in Fig. 5e.

      In our experience, this often does happen when doing in vitro transcription assays (with CRP and many other activators). The anecdotal explanation is that, at higher concentrations, the regulator can start to bind the DNA non-specifically and so interfere with transcription.

      13. The authors demonstrate in vitro that HapR requires binding of CRP to bind the murQP promoter. It would strengthen their model if they demonstrated this in vivo. To do this, the authors only need to repeat their ChIP-Seq experiment in a delta CRP mutant and the HapR signal at murQP would be lost. In fact, such an experiment would experimentally confirm which of the in vivo HapR binding sites are CRP dependent.

      We agree, appreciate the comment, and do plan to do such experiments in the future as a wider assessment of interactions between transcription factors. However, doing this does have substantial time and resource implications that we cannot devote to the project at present. We feel that our overall conclusions, regarding co-operative interactions between HapR and CRP at PmurQP, are well supported by the data already provided. This also seems the overall opinion of the reviewers.

      14.  Fig. 5b-I am confused by the Venn diagram. The text states that "In all cases, the CRP and HapR targets were offset by 1 bp", but the diagram only shows 7 overlapping sites. The authors need to better describe these data.

      We mean that, in all cases where sites overlap, sites are offset by 1 bp (i.e. we didn’t find any sites overlapping but offset by 2, 3, 4 bp etc).

      15. Line 287-288 and Fig. 5d-The authors state that HapR binds with less affinity to the CRP E55A mutant protein bound to DNA. There does seem to be a difference in the amount of shifted bands at the equivalent concentrations of HapR, but the difference is subtle. In order to make such a conclusion, the authors should show replication of the data and calculate the variability in the results. The authors should also use these data to determine the actual binding affinities of HapR to WT and the E55A mutant CRP, along with error or confidence intervals.

      All of these experiments have been run multiple times and we are absolutely confident of the result. With respect to Figure 5d, this was done many times. We note that not all experiments were exact repeats. E.g. some of the first attempts had fewer HapR concentrations. Even so, the defect in HapR binding to the CRP E55A complex was always evident. The two gels to the left show the final two iterations of this experiment (these are exact repeats). The top image is that shown in Figure 5d. The lower image is an equivalent experiment run a day or so previously. Both clearly show a defect in HapR binding to the CRP E55A complex. We appreciate that our conclusion re these experiments is somewhat qualitative (i.e. that HapR binds the CRP E55A complex less readily) but this is not out of kilter with the vast majority of similar literature and our results are clearly reproducible.

      16.  Fig. 6a-It is odd that the locked low cell density mutants have such a growth defect in MurNAc, minimal glucose, and LB. To my knowledge, such a growth defect is not common with these strains. Perhaps this has to do with the specific growth conditions used here, but I can't find that information in the manuscript (it should be there). Furthermore, the growth rate of the luxO and hapR mutants appears to be similar up to the branch point (i.e. slope of the curve), but the lag phase of the luxO mutant is much longer. The authors need to address these issues in relationship to previous published literature and specify their growth conditions because the results are not consistent with their simple model described in Fig 6b.

      This comment is a little difficult to pick apart as it covers several different issues. We’ll try and answer these individually.

      a)     The unusual “biphasic” growth curve with hapR and hapRluxO cells: We do not know why cells lacking hapR have a growth curve that appears biphasic. We can only assume that this is due to some regulatory effect of HapR, distinct from the murQP locus. Despite the unusual shape of the growth curve, the data are consistent with our conclusions.

      b)     The extended lag phase of the luxO mutant in minimal media + MurNAc: We appreciate this comment and had considered possible explanations prior to submission. In the end, we left out this speculation but are happy to include it as part of our response. The extended lag phase might be expected if CRP/HapR regulation is largely critical for controlling the basal transcription of murQP. The locus is likely also regulated by the upstream repressor MurR (VC0204) as in E. coli. So, if derepression of MurR overwhelms the effect of HapR on murQP, we think you would expect that, once the cells start growing on MurNAc, the growth rates are unchanged. The extended lag would then arise because those cells took longer to reach the critical threshold of intracellular MurNAc-6-P necessary to drive murR derepression. Obviously, we cannot provide a definitive answer.

      c)     We have added further details regarding growth conditions to the methods section and the Figure 6a legend.

      17.  Fig. S6-The data to this point with murPQ suggested a model in which CRP binding then enabled HapR binding. But these EMSA suggest that both situations occur as in some cases, such as VCA0691, HapR binding promotes CRP binding. How does such a result fit with the structural model presented in Fig. 5?

      This is to be expected and is fully consistent with the model. Cooperativity is a two-way street, and each protein will stabilise binding of the other. Clearly, it will not always be the case that the shared DNA site will have a higher affinity for CRP than HapR (as at PmurQP). Depending on the shared site sequence, it is expected that sometimes HapR will bind “first” and then stabilise binding of CRP.

      18. Line 354-356-The HCD state of V. cholerae occurs in mid-exponential phase and several cell divisions occur before stationary phase and the cessation of growth, at least in normal laboratory conditions. Therefore, there is not support for the argument that QS is a mechanism to redirect cell wall components at HCD because cell wall synthesis is no longer needed.

      We did not intend to suggest cell wall synthesis is not needed at all, rather that there is a reduced need. We made a slight change to the discussion to reflect this.

      19. Line 357-360-Again, as stated in point 16, the statement that cells locked in the HCD are "defective for growth" is an oversimplification. The luxO mutants have a longer lag phase, but they actually outgrow the hapR mutants at higher cell densities and reach the maximum yield much faster.

      In fairness, we do go on to specify that the defect is an extended lag phase. Also see our response above.

      Reviewer #2 (Recommendations For The Authors):

      Comments to improve the text

      1)  Line 103-106, line 130, line 136, etc. Details of the methods and the text directing to presentations of figures should be in the methods and/or figure legends with (Figure x) in citation after the statement. The sentences in lines indicated can be deleted from the results. Although several lines are noted specifically here, this comment should be applied throughout the entire results section.

      We appreciate this comment but would prefer not to make this change (it seems mainly an issue of personal stylistic choice). It is sometimes helpful for the reader to include such information as it avoids them having to cross reference between different parts of the manuscript.

      2)  Line 115. Recommend a paragraph between content on LuxO and HapR (before "Of the 32 peaks for HapR binding")

      We agree and have made this change.

      3)  Line 138 and Figure 1a. I am not convinced this gel shows that VC1375 is activated by HapR. Is the arrow pointing to the wrong band? There does seem to be an induced band lower down.

      We understand this comment as it is a little difficult to see the induced band. This is because this is a compressed area of the gel and the transcript is near to an additional band.

      4)  Line 147. Add the VC0206-VC0207 next to murQP (and the gene name murQP into Table 1).

      We have added the gene name to the figure foot note. The text has been changed as requested.

      5) Methods. It is essential for this paper to have detailed methods on the bacterial growth conditions. Referring to prior paper, bacteria were grown in LB (add composition...is this high salt LB often used for vibrios or low salt LB often used for E. coli). Growth is to "mid log". Please provide the OD at collection. Is mid log really considered "high density"? Provide a reference regarding HapR activity at mid log to support the method. Could the earlier collection of bacteria account for missing known HapR regulated genes? In preparing the requested Venn diagram, include growth conditions for other experiments in the legends.

      Note that we have included a new supplementary table, rather than a Venn diagram. We have also added further details of growth conditions as mentioned above. Also note that, for the ChIP-seq, HapR and LuxO were expressed ectopically and so uncoupled from the switch between low and high cell density.

      6)  Content of Table 1, HapR ChIP-seq peaks, needs to be closely double checked against the collected data as there seem to be some errors. Specifically, VC0880 and VC0882 listed under Chromosome I are most likely VCA0880 (MakD) and VCA0882 (MakB), both known HapR induced genes on Chromosome II with VCA0880 previously validated by EMSA. This notable error suggests the table may have other errors and thus requires a very detailed check to assure its accuracy.

      We appreciate the attention to detail! We have double checked, thankfully this is not an error, the table is correct (even so, we have also checked all other entries in the table). As an aside, VCA0880 is one of the locations for which we see a weak HapR binding signal below our cut-off (included in the new Table S1). In cross checking between Table 1 and all other data in the paper we noticed that we had erroneously included assay data for VC0620 in Figure 2A. This was not one of our ChIP-seq targets but had been assayed at the same time several years ago. This datapoint, which wasn’t related to any other part of the manuscript, has been removed.

      If VCA0880 and VCA0882 are correctly placed on Chr. I, then add comment to text that the Mak toxin genomic island found on Chromosome II in N16961 is on Chr. I in E7946. (See recent references PMID: 30271941, 35435721, 36194176, 34799450).

      See above, this is not an error.

      7)  Alternatively for both comments 8 & 9, are these problems of present/missing genes or misannotations the result of the annotation of E7946 gene names not aligning with gene names of N16961? (if so, in Table 1, please give the gene name as in E7946 but include a separate column with the N16961 name for cross study comparison)

      See above and below, this is not an issue.

      8)  Line 126-127. Also regarding Table 1, please add a column with function gene annotation. For example, VC0916 needs to be identified as vpsU. If function is unknown, type unknown in the column. This will help validate the approach of selecting "HapR target promoters where adjacent coding sequence could be used to predict protein function."

      We added an extra column to Table 1 in response to a separate reviewer request (TSS locations). This leaves no space for any additional columns. Instead, to accommodate the reviewer’s request, we have added alternative gene names to the footnote.

      Not following up on VCA0880 (promoter for the mak operon) is a sad missed opportunity here as it is one of the most strongly upregulated genes by HapR (PMC2677876)

      As noted above, this was not an error and VCA0880 was not one of our 32 HapR targets. As such, we would not have followed this up.

      9)  Figure Legends. Add a unit to the bar graphs in Figure 1e (should be kb??)

      This has been corrected.

      10) The yellow color text labels in figures 3c, 4a, 4c are difficult to read. Can you use an alternative darker color for CRP.

      We have made this slightly darker (although to our eye it is easily readable). We haven’t changed the colour too much, for consistency with colour coding elsewhere.

      11) Figure S3. Binding is misspelled. Add units to the x-axis

      This has been corrected.

      12) Figure S7. The text in this figure is too small to read. Figure could be enlarged to full page or text enlarged. Are these 4 the only other known regulated promoters? Could all the known alternative promoters linked to HapR be similarly probed?

      We have increased the font size and included a new Table S1 for all previously proposed HapR sites.

      13) Figure S8. Original images..are any of these the replicate gels (see public comment 6)

      We have added a statement regarding reproducibility, and also note the internal reproducibility between different figures in our reviewer response. The gels in Figure S8 are full uncropped versions of those shown in the main figures.

      Reviewer #3 (Recommendations For The Authors):

      None

      Whilst there were no specific recommendations from this reviewer, we have still responded to the public review and made changes if required.

    1. Author Response:

      Regarding the two main points emphasized by the eLife assessment:

      • Potentially confounding effects of overcrowding: This is indeed an important point, which we avoided, unfortunately without explicitly mentioning it in the manuscript (assuming that it went without saying). We will point out that our proliferation assays, already part of the original manuscript, indicated that cells were not overcrowded. Nevertheless, we will include additional evidence indicating that our cells were not overcrowded and remained subconfluent.

      • Mechanisms: We will mention even more explicitly than we already did that this is beyond the scope of this story and why that is. As we did say, there are lots of factors directly or indirectly involved in translation that depend on Hsp90. Figuring out which one or which ones it might be is a whole new and totally open-ended project.

      Regarding some of the other public comments:

      • While we did provide quantitative (!) data on changes in cytoplasmic density (e.g. diffusion coefficients, total amount of protein relative to cell size), we will emphasize in the revised manuscript that the changes in cell size, as measured by both flow cytometry and image analyses, are a relative and approximate measure of the 3D changes in cell volume. Although our data on the diffusion coefficients, which report on cytoplasmic density, are directly comparable, our measurements of the amounts of protein relative to cell size (if this is what the comment meant with "cell density") have at least relative value.

      • Results of proteomic data not shown in sufficient detail: We recognize that it is not trivial to "read" the data as presented in the paper (volcano plots, full datasets as an Excel file and through ProteomeXchange). We will add subsets of the proteomic data to the Excel file and include some Gene Ontology analyses.

      • We did demonstrate that Hsf1 most likely acts transcriptionally to promote the observed cell size increase.

      • We acknowledge that a large fraction of our data is "observational", but some experiments clearly go beyond providing correlations. When we manipulate some of the players genetically (KO, knockdown, overexpression) or pharmacologically, we get results that support our conclusions about underlying mechanistic connections.

      • GADD34: This protein is not known to be an Hsp90 client (or interactor), which is also supported by our mass spec data since its steady-state levels don't change in Hsp90α or β KO cells compared to wild-type cells.

      • Non-dividing cells: it would indeed be exciting to determine whether the same phenomena and mechanisms apply to non-dividing cells. However, there are likely to be substantial technical challenges. We would need primary human (or alternatively murine) cells such as B-cells or hepatocytes, and it is difficult to predict whether they would tolerate mild heat stress for several days. It might also be possible to explore this with a mouse model, but clearly, this must be left to future studies.

  2. May 2023
    1. Author Response:

      The following is the authors' response to the original reviews.

      We’d like to take this opportunity to thank the reviewers and editors for their consideration of our work. As detailed below, we have made the majority of the corrections suggested by the reviewers and believe these have greatly improved our manuscript. The reviewers’ comments are in blue font below and our response to each of these is in black font.

      Reviewer #1 (Recommendations For The Authors):

      Suggestions to improve the manuscript:

      -  Line 33 and 34: "This protein" is vague. Please reword to state whether you are referring to TcaA or to WTA

      This has been corrected in the revised manuscript (Line 33)

      -  Intro: It would be helpful to provide more rationale for testing serum as a surrogate to whole blood in the GWAS screen. Serum is obviously lacking components of the clotting cascade, and some of these components have antimicrobial functions. However, this is easily justified in the text- e.g. to avoid clumping during the screen, to focus only on serum-derived antimicrobial compounds, etc.

      This has been edited in the revised manuscript (Line 84-86)

      -  Line 120: Please state if the 300 clinical isolates represent 300 distinct patients, or if some of the isolates came from the same patient during sequential collections. If the latter, were there any instances in which the tcaA SNP appeared during the course of infection?

      They each came from individual patients so we were unfortunately unable to look for within host events. This information has been added to the revised manuscript (line 104).

      -  Line 133: the closed parenthesis sign is missing after "CC22"

      This has been corrected in the revised manuscript (Line 135)

      -  Table 1a - NE1296 is misspelled as ME1296. Also there is a typo in the last entry of this table for the locus tag

      This has been corrected in the revised manuscript.

      -  Table 1b - the authors should comment (in the discussion) on the potential reasons why tcaA was not identified in the CC30 background.

      A comment to this effect has been added to the revised manuscript (Lines 553-59)

      -  Figure 2a - Why is the mutant with the empty complementation vector not significantly different from WT JE2?

      The most widely used and reliable expression plasmid for complementation of mutated phenotypes in S. aureus is the pRMC2 plasmid, which requires chloramphenicol selection and anhydrotetracycline to induce expression of the cloned gene. These antibiotics, and the presence of the plasmid often affect the expression of other genes by the bacteria (as noted by this reviewer). As such, to verify complementation of a mutation the comparison we make is between the strain containing the empty plasmid induced with anhydrotetracycline with a strain with the gene containing plasmid induced with anhydrotetracycline. In that situation, the only difference between those two strains under those conditions is whether the gene is expressed or not. A comment explaining this has been added to the revised manuscript (lines 149-153).

      -  Line 188: Statistical analyses should be applied to figure 3C, which also appears to be underpowered.

      P values have been added to this in the revised manuscript. We present data points from three biological replicates, each of which is the mean of three technical replicates, which we believe provides sufficient power for this analysis.

      -  Figure 3 legend - Teicoplanin is mentioned in the title, but the data are not included here

      This legend title has been revised (Line 193).

      -  Figure 4 - here is an example where testing the actual tcaA SNP could have been enlightening. For example, what if the selective pressure makes the SNP more relevant to a specific AMP or AA?

      While we agree that this would be an interesting experiment to perform, the complementing vector that we would need to use to compare the wild type and SNP-containing gene requires one antibiotic to select for the plasmid and another to induce expression. As such it becomes quite a complex and messy experiment where synergy between the antimicrobial agents would be likely, the results of which would be difficult to interpret.

      -  Lines 317-321 - Suggest moving this to discussion

      We have left this here as we felt it a necessary summation/explanation of the results described in that section. It is discussed again later in the discussion section.

      -  Line 341 - I believe "serum" should actually be "teicoplanin"

      This has been corrected in the revised manuscript (Line 342).

      -  Figure 6e - wouldn't it be more powerful to determine the WTA levels in the supernatants of these strains and conditions?

      We could have done this both ways, but we focussed here only on how TcaA ligates WTA into the cell wall in the presence of serum.

      -  Figure 6 - What is the explanation for the different growth yields for JE2 in teicoplanin in panel A versus panel F? Are these actually two different concentrations? If so, please update the figure legend and the methods.

      The concentration used for panel A was inhibitory and that used for panel F was sub-inhibitory. To improve the clarity of this we have now used a table displaying the MICs for the six strains as panel A. We have also included the concentration of teicoplanin used for each experiment in the legend.

      -  Line 413: Consider more precise language than "the cell wall is stronger". E.g. More crosslinks?

      This has been edited in the revised manuscript (Line 421)

      -  Line 415: Consider changing "altered" to a directional term such as increases. It can be difficult for the reader to follow the expected change when you are discussing how the lack of a gene versus the presence of a gene changes susceptibility in one direction and another phenotype in the opposite direction.

      This has been edited in the revised manuscript (Line 423).

      -  Figure 7: The conclusions made from panels A and B need to be supported by statistical analyses. It is unclear if these lines are truly different from one another.

      These have been included in the revised fig 7.

      -  Line 426: I believe "tcaA" is missing following "producing"

      This has been corrected in the revised manuscript (Line 434).

      -  Line 446: "increase" to "increases"

      This has been corrected in the revised manuscript (Line 460).

      -  Figure 8C: if one goal of the mouse experiment was to look at survival during transit in whole blood, earlier timepoints are indicated based on the described kinetics of bloodstream dissemination in this model.

      The primary goal of this experiment was to see if TcaA contributed positively or negatively to the development of the infection. Work on this protein is ongoing, and so we hope in coming years to be able to provide more detail on its activity in vivo.

      -  Line 506: "changes to the structural integrity of peptidoglycan" seems overstated without additional studies.

      This has been edited in the revised manuscript (Line 524).

      -  Line 564: "represents" to "represent"

      This has been corrected in the revised manuscript (Line 603).

      -  Line 588: The figures all refer to "100 net". Please confirm the concentration used.

      This has been corrected in the revised manuscript (Line 628).

      -  Line 609: This refers to capsule production? Is this a copy error from a prior paper?

      Yes it is, and has been corrected in the revised manuscript (Line 650).

      - Line 763: Please provide the concentrations of arachidonic acid used for each experiment.

      This has been included in the revised manuscript (Line 805)

      - Line 836 and 837: This mentions a time course for blood culture from the infected mice. Where are these data?

      Apologies, this is another cut and paste mistake from another paper, and has been removed.

      -  Line 870: please discuss how multiple comparisons testing was handled.

      This has been included in the revised manuscript (Line 908).

      -  Supplemental figure 5 - Please add statistical analyses to support the conclusions in the manuscript. For example, there appears to be no differences for dalbavancin. Please also italicize tcaA in the legend.

      These have been included and corrected in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Line 65 - I would suggest adding the reference (doi: 10.1128/Spectrum.00116-21), which shows increased mortality in S. aureus bacteremia patients due to agr deficient isolates.

      The suggested manuscript shows this effect of Agr dysfunction to be limited to patients with moderate to severe SOFA scores. As such it would require a nuanced description here that we think will detract from the flow of the introduction.

      Line 68 - Please add DOI: 10.1016/j.cmi.2022.03.015 as a reference to support the mortality rate in S. aureus bacteremia. A systematic review and meta-analysis provides the highest level of evidence, and this is a contemporary study performed in 2022

      This has been included in the revised manuscript (Line 68).

      Line 70 - please add supporting reference for this statement

      This has been included in the revised manuscript (Line 70).

      Figure 2 - This image is low quality and appears pixelated. Please revise

      This has been replaced with a higher resolution image in the revised manuscript.

      Figure 3c Also appears slightly pixelated

      This has been replaced with a higher resolution image in the revised manuscript.

      Line 173 - I think it would be helpful to mention that the catalytic activity encoded by tcaA (aside from mediating sensitivity to glycopeptides) is unknown.

      This has been included in the revised manuscript (Line 174)

      Line 174 - also confers sensitivity to vancomycin https://doi.org/10.1128/AAC.48.6.1953-1959.2004

      This has been included in the revised manuscript, albeit at a later point than suggested here (Line 406)

      Line 209 - did the authors test any other antimicrobial fatty acids such as palmitoleic acid? If common mechanism would also expect decreased sensitivity to other HDFA

      No, we focused on arachidonic acid as this is the most relevant antimicrobial fatty acid in serum and it is produced by neutrophils and macrophages during the inflammatory burst.

      Figure 4a-D: it would be useful to know what the MIC to these different components is and how that MIC relates to the concentration in human serum

      We do not have MICs for all of these compounds tested here but can confirm that the concentrations used are physiologically relevant.

      Figure 4b - Can you mention in the legend how the killing assays varied for arachidonic acid versus the other AMPs? I am not immediately clear how this experiment was performed, despite referring to methods

      This has been included in the text of revised manuscript (Line 211-213) and the figure legend.

      Figure 5 - there is no panel D

      This has been corrected in the revised manuscript.

      Figure 6a: Lines 328-329 state the experiment was performed in the MIC for each strain. The legend (line 374) states 0.5 ug/ml teicoplanin was used, which is below the MIC for all of the strains tested per supp table 2. Please correct this discrepancy.

      This figure has been revised and the additional information included to improve the clarity of this section in the revised manuscript.

      Figure 6a: On line 328, the authors state that the tcpA knockout increases the MIC for teicoplanin in each background. Figure 6a is performed in the presence of teicoplanin at 1x the MIC of the wild type (which will be below the MIC for the knockout). Therefore, we know each tcpA mutant will be able to grow in the presence of sub-mic concentrations of teicoplanin. Would a more informative way of conveying this information be to have MIC on the Y axis and background on the X axis?

      This has been corrected and clarified in the revised manuscript with a table showing the MICs (fig. 6a).

      Figure 6b-c: Similarly, would it be more helpful to show how the MIC varies with the different clinical isolate tcpA mutants?

      While MICs have uses in clinical settings, they are a relatively crude and binary (growth vs no growth) way to measure and compare sensitivity. For these two groups of isolates the MICs did not vary, which is why we used a concentration that sat at the threshold and quantified growth of all the isolates in this. This information has been added to the legend.

      Figure 6e: The figure legend instructs us to refer to supplemental figure 3 to see the densitometry results. However, Figure 6e appears to be 4 conditions (WT and mutant +/- serum) and only examines the cell wall, whereas the supplemental figure refers to two conditions (WT + mutant) and looks at the cell wall and supernatant. I would recommend providing the densitometry data associated with the conditions in figure 6e, especially as differences seem more subtle by eye.

      This has been included in the revised manuscript (fig. 6f)

      Line 689-691 - description of teicoplanin concentrations used in figure 2. However, no teicoplanin was used in figure 2. Assume is referring to a different figure (figure 6?)

      This has been corrected and clarified in the revised manuscript. Line 724.

      Please add a section in the methods describing how the MIC was determined for JE2, SH1000 and Newman. Was it performed in CA-MHB or in the media used for the experiment in figure 6a? Serum can alter the MIC of several antibiotics.

      This has been corrected and clarified in the revised manuscript. Line 724-29.

      Please add a section to the methods describing the whole blood killing assay, ideally describing how the blood was not frozen and was used on the same day as venipuncture. This is important as freeze/thaw or time periods >12 hours are likely to severely affect the function of phagocytes, especially neutrophils.

      This has been corrected and clarified in the revised manuscript. Lines 635-639

      Line 588: ng/ul should read ng/µl

      This has been corrected to ng/ml in the revised manuscript. Line 628

      Reviewer #3 (Recommendations For The Authors):

      We have now included a graphical abstract (Fig. 9)

      Major:

      1-    Line 102: I was not able to find the accession numbers of these 300 genomes, did the authors submit it to any public repository (e.g. NCBI)?

      These were submitted previously to a public repository and the associated reference cited, but we have provided these in supplementary Table 1.

      Minor:

      1 -    Typo in line 133. Fix parenthesis after CC22.

      Corrected.

      2 -    Typo: Fix figure 5 panels (5e should be 5d).

      Corrected.

      3 -    Line 276: It is not clear why the extract for this experiment was supplemented at 2% while the other part of the experiment was done with 10%. Clarification is needed.

      The experiments at 10% used overnight supernatant, whereas those at 2% used a purified WTA extract. This has been clarified in the revised manuscript (line 283 and in the figure legend).

      4 -    Line 278: Typo: Figure 6e should be figure 5d.

      Corrected. (Line 278)

      5 -    Figure 5f: There is no explanation in the text or in the figure legend what the purpose of using mprF was.

      A comment has been included in the figure legend.

      6 -    Line 328: It would be good if the authors reported the CC of Newman and SH1000 to give readers better context.

      This has been added. (Line 332)

      7 -    Line 341: Did the authors mean less sensitive to teicoplanin?

      Corrected. (Line 342)

      8 -    Line 367: Dose dependent effect does not seem to be followed not only in panel H of Supp. Fig. 4(LL37 and EMRDA15) but also panels C, D and G.

      Corrected.

      9 -    Line 587: Typo: Table 2.

      These have all been corrected and/or clarified in the revised manuscript.

    1. Author Response:

      First and foremost, we would like to thank all the editors and reviewers for their thoughtful and thorough evaluations of our manuscript. We greatly appreciate their assessment about the novelty and strength in this study and will revise the manuscript according to their recommendations. Here we offer a provisional response to Reviewer 2 to clarify our rationale for using TH-Cre rather than DAT-Cre mice in our study of frontal cortical dopaminergic projections.

      We agree with Reviewer 2 that the DAT-Cre line can provide specific labeling of midbrain dopamine neurons projecting to the striatum, as discussed in the cited study (Lammel et al., 2015). But unfortunately, mesocortical dopamine neurons in the VTA are known to express very little DAT (Lammel et al., 2008; Li, Qi, Yamaguchi, Wang, & Morales, 2013; Sesack, Hawrylak, Matus, Guido, & Levey, 1998). This limitation in the use of the DAT-Cre line to target mesocortical dopamine neurons has been acknowledged in the cited publication (Lammel et al., 2015). It is an issue we have also observed when testing the DAT-Cre line in our lab. Additionally, and interestingly, recent extensive evaluations of the DAT-Cre line reported ectopic labeling of multiple non-dopaminergic neuronal populations (Papathanou, Dumas, Pettersson, Olson, & Wallen-Mackenzie, 2019; Soden et al., 2016; Stagkourakis et al., 2018). Our own evaluation of the DAT-Cre line’s utility for cortical imaging also captured sporadic ectopic labeling of cortical cell somas.

      Because mesocortical dopamine neurons have stronger TH expression than DAT (Lammel et al., 2008; Lammel et al., 2015; Li et al., 2013; Sesack et al., 1998), TH-Cre lines have been frequently used to study the mesocortical pathway (Ellwood et al., 2017; Gunaydin et al., 2014; Lammel et al., 2012; Lohani, Martig, Deisseroth, Witten, & Moghaddam, 2019; Vander Weele et al., 2018). While TH-Cre expression itself is not restricted to dopaminergic neurons, we targeted our viral injections to the VTA and optogenetic stimulation to the cortical dopaminergic projection target area (Patriarchi et al., 2018) to specifically modulate mesocortical dopaminergic axons. In addition, we tested the effects of a D1 antagonist in our manipulations. Although we targeted dopamine neurons in our adolescent stimulation, the final behavioral outcome likely includes contributions from co-released neurotransmitters and non-dopaminergic neurons via network effects. We will revise our discussion and methods sections to clarify these points of interest. Additionally, we will provide DAT-Cre images in the revised supplementary materials to further explain our choice of the TH-Cre line rather than the DAT-Cre line for our study.

      References

      Ellwood, I. T., Patel, T., Wadia, V., Lee, A. T., Liptak, A. T., Bender, K. J., & Sohal, V. S. (2017). Tonic or Phasic Stimulation of Dopaminergic Projections to Prefrontal Cortex Causes Mice to Maintain or Deviate from Previously Learned Behavioral Strategies. J Neurosci, 37(35), 8315-8329. doi:10.1523/JNEUROSCI.1221-17.2017

      Gunaydin, L. A., Grosenick, L., Finkelstein, J. C., Kauvar, I. V., Fenno, L. E., Adhikari, A., ... Deisseroth, K. (2014). Natural neural projection dynamics underlying social behavior. Cell, 157(7), 1535-1551. doi:10.1016/j.cell.2014.05.017

      Lammel, S., Hetzel, A., Haeckel, O., Jones, I., Liss, B., & Roeper, J. (2008). Unique properties of mesoprefrontal neurons within a dual mesocorticolimbic dopamine system. Neuron, 57(5), 760-773. doi:10.1016/j.neuron.2008.01.022

      Lammel, S., Lim, B. K., Ran, C., Huang, K. W., Betley, M. J., Tye, K. M., ... Malenka, R. C. (2012). Input-specific control of reward and aversion in the ventral tegmental area. Nature, 491(7423), 212-217. doi:10.1038/nature11527

      Lammel, S., Steinberg, E. E., Foldy, C., Wall, N. R., Beier, K., Luo, L., & Malenka, R. C. (2015). Diversity of transgenic mouse models for selective targeting of midbrain dopamine neurons. Neuron, 85(2), 429-438. doi:10.1016/j.neuron.2014.12.036

      Li, X., Qi, J., Yamaguchi, T., Wang, H. L., & Morales, M. (2013). Heterogeneous composition of dopamine neurons of the rat A10 region: molecular evidence for diverse signaling properties. Brain Struct Funct, 218(5), 1159-1176. doi:10.1007/s00429-012-0452-z

      Lohani, S., Martig, A. K., Deisseroth, K., Witten, I. B., & Moghaddam, B. (2019). Dopamine Modulation of Prefrontal Cortex Activity Is Manifold and Operates at Multiple Temporal and Spatial Scales. Cell Rep, 27(1), 99-114 e116. doi:10.1016/j.celrep.2019.03.012

      Papathanou, M., Dumas, S., Pettersson, H., Olson, L., & Wallen-Mackenzie, A. (2019). Off-Target Effects in Transgenic Mice: Characterization of Dopamine Transporter (DAT)-Cre Transgenic Mouse Lines Exposes Multiple Non-Dopaminergic Neuronal Clusters Available for Selective Targeting within Limbic Neurocircuitry. eNeuro, 6(5). doi:10.1523/ENEURO.0198-19.2019

      Patriarchi, T., Cho, J. R., Merten, K., Howe, M. W., Marley, A., Xiong, W. H., ... Tian, L. (2018). Ultrafast neuronal imaging of dopamine dynamics with designed genetically encoded sensors. Science, 360(6396), 1420-+. doi:10.1126/science.aat4422

      Sesack, S. R., Hawrylak, V. A., Matus, C., Guido, M. A., & Levey, A. I. (1998). Dopamine axon varicosities in the prelimbic division of the rat prefrontal cortex exhibit sparse immunoreactivity for the dopamine transporter. J Neurosci, 18(7), 2697-2708. doi:10.1523/JNEUROSCI.18-07-02697.1998

      Soden, M. E., Miller, S. M., Burgeno, L. M., Phillips, P. E. M., Hnasko, T. S., & Zweifel, L. S. (2016). Genetic Isolation of Hypothalamic Neurons that Regulate Context-Specific Male Social Behavior. Cell Rep, 16(2), 304-313. doi:10.1016/j.celrep.2016.05.067

      Stagkourakis, S., Spigolon, G., Williams, P., Protzmann, J., Fisone, G., & Broberger, C. (2018). A neural network for intermale aggression to establish social hierarchy. Nat Neurosci, 21(6), 834-842. doi:10.1038/s41593-018-0153-x

      Vander Weele, C. M., Siciliano, C. A., Matthews, G. A., Namburi, P., Izadmehr, E. M., Espinel, I. C., ... Tye, K. M. (2018). Dopamine enhances signal-to-noise ratio in cortical-brainstem encoding of aversive stimuli. Nature, 563(7731), 397-401. doi:10.1038/s41586-018-0682-1

    1. Author Response:

      I appreciate the time and effort of both Reviewers, who have raised important points that I would like to briefly discuss before I start working on a full revision of the paper.

      Generality. First, there is the question of how much these conclusions broadly apply across experimental paradigms and subjects, which could give rise to potentially very different TGMs. As the Reviewers mention, I have focussed on one specific TGM that I assumed prototypical, and it could be that these conclusions fit other TGMs less well. Further, the model has quite a few hyperparameters so that it can flexibly accommodate a broad span of scenarios. This flexibility comes at a price, as pointed out by Reviewer 1: that “a different selection of parameters could lead to similar results”, i.e. that other configurations could fit this specific TGM just as well. This is related to the next point, so I will address them jointly.

      Lack of quantitative evaluation, “making it hard to draw firm conclusions”. Indeed, I have not explicitly quantified the fit of the hyperparameters to this empirical TGM using a specific measure, and (related to the previous point) I have not made a systematic search through the space of model configurations based on such measure.

      There is here a trade-off between generality and specificity. In fact, it is intentional that I did not optimise the hyperparameters to this specific TGM, and that I chose not to show a quantitative measure of fitness. This is because the TGM that I show in the paper is only meant as an example. Instead of focussing on fitting a specific TGM, I aimed at characterising some prominent general features that we often see throughout the literature, which this specific TGM shows in its own specific way. That is, if the paper was meant to focus on a specific paradigm (e.g. passive vision), then the use of a specific metric to fit the model to one or various empirical TGMs would have perhaps been more adequate, but this was not the case here. In future work, when focussing on specific paradigms, I will adapt methods of Bayesian optimisation (Lorenz et al., 2017) for this purpose, as mentioned in the Discussion. Note that doing this right is not trivial and would complicate the paper significantly; for this reason, I feel it should belong to a different piece of work.

      I would also like to note that evaluating the different features of the data one by one (“in a stepwise manner”) was necessary for interpretation. One can loosely think of it as a sort of F-test: one is showing how important a feature is by comparing the full model vs. a nested model that does not have that feature. While the Reviewer is right that there might be interactions between the features that we can only unveil through a joint evaluation, my approach is at least valid as a first approximation. I will discuss this limitation in an updated version of the paper in more detail.

      In a future revision of the paper, I will argue more specifically why and how these model configurations are, in general terms, necessary to produce these main effects in the TGM, and why other alternative configurations could not easily generate them.

      Practical guidelines for researchers. It was suggested to make it clearer how researchers could leverage this model in their own studies to understand their data better and to help relating their TGMs to specific neurobiological mechanisms.

      In a future revision of the paper, I will introduce a new section explaining how to use genephys practically, emphasising both opportunities and current limitations.

      Neurobiological interpretation. It was criticised that the results were a mere characterisation of sensor space data, and that these were not related clearly to any neurobiological aspect.

      In a future revision, I will work toward relating the main findings to existing literature in order to strengthen the neurobiological interpretation of the results, and toward a better justification of how genephys can help shed light on specific brain mechanisms.

      Above and beyond these specific points, I intend to restructure the text so that the main goals of the study become clearer. This includes clarifying in the Introduction more unambiguously what is the gap of knowledge this work is specifically tackling.

      Again, I would like to thank the Reviewers for helping me realise the limitations of the current version of the paper.

    1. Author Response:

      The following is the authors' response to the original reviews.

      In brief, we incorporated all wording and clarity suggestions into the manuscript. We also updated figure legends to include additional details, including replicate numbers. New data have been added in response to requests from the reviewers. Volumetric intake data are included as a supplemental figure (Figure 1–Figure Supplement 1A) and we will include movies of the confocal stacks from our CaMPARI imaging. We worked hard to address all the reviewers’ concerns and provide a detailed response below to the reviewers’ public comments as well as their author-specific comments.

      Reviewer #1 (Public Review):

      1) All feeding data presented in the manuscript are from the interactions of individual flies with a source of liquid food, where interaction is defined as 'physical contact of a specific duration.' It would be helpful to approach the measurement of feeding from multiple angles to form the notion of hedonic feeding since the debate around hedonic feeding in Drosophila has been ongoing for some time and remains controversial. One possibility would be to measure food intake volumetrically in addition to food interaction patterns and durations (e.g. via the modified CAFE assay used by Ja).

      We acknowledge that our FLIC assays address only one dimension of feeding behavior, physical interaction with liquid food. However, there is clear evidence that interactions are strongly predictive of consumption, and it would be technically difficult to measure feeding durations at the resolution of milliseconds using a CAFE assay. Nevertheless, we appreciate the spirit of this comment and agree that expanding our inference to other measures of feeding, as well as feeding environments, is an important next step. To this end, we now include measures of feeding on more traditional solid food, using the ConEx assay, and find that flies in the hedonic environment consume twice as much sucrose by volume as flies in the control environment. These have been added as supplemental data (Figure 1 – Figure Supplement 1A), and the text has been updated to reflect our findings.

      2) Some of the statistical analyses were presented in a way that may make understanding the data unnecessarily difficult for readers. Examples include:

      a) In Table I the authors present food interaction classifications based on direct observation. These are helpful. However, the classification system is updated or incompletely used as the manuscript progresses, most importantly changing from four categories with seven total subcategories to three categories and no subcategories. In subsequent data analyses, only one or two of these categories are assessed. It would be helpful, especially when moving from direct observation to automated categorization, to quantify the exact correspondences between all of the prior and new classifications, as well as elaborate on the types of data that are being excluded.

      We appreciate the feedback on our usage of the behavioral classification system and have made several adjustments to improve it. We renamed some of the behaviors to make them more intuitive (see Reviewer #2, comment #1), and updated the main text and Table 1 to reflect these changes. We updated the text and figures to be more transparent about when we group subcategories into main categories for quantification and when we quantify all subcategories separately. Because these videos required manual scoring by an experimenter, after our initial characterizations we opted to score only main categories (which contain subcategories). We agree that it would be useful to quantify correspondence between subcategories and the automated FLIC signal. However, we believe this task is better suited for more advanced and automated video tracking software, and, incidentally, more sophisticated analysis of FLIC data, which has a very high-dimensional character that has yet to be properly exploited. At the moment, therefore, we are not confident in the ability to understand the data at the desired resolution.

      b) The authors switch between a variety of biological and physiological conditions with varying assays, which makes the train of reasoning nearly impossible to follow. For example, the authors introduce us to circadian aspects of feeding behavior to introduce the concept of 'meal' and 'non-meal' periods of the day. It is then not clear in which of the subsequent experiments this paradigm is used to measure food interactions. Is it the majority of the subsequent figure panels? However, the authors also use starved flies for some assays, which would be incompatible with circadian-locked meals. The somewhat random and incompletely reported use of males and females, which the authors show behave differently, also makes the results more difficult to parse. Finally, the authors are comparing within-fly for the 'control environment' and between flies for their 'hedonic environment' (Figure 3A and subsequent panels), which I believe is not a good thing to do.

      We apologize for our difficulties conveying our inference, which was also noted by Reviewer #2.  We have worked hard to improve this component in the revision. With respect to the confusion about circadian feeding, we introduced circadian meal-times to complement starvation as a second (perhaps more natural) way to measure behaviors associated with hunger. Importantly, we do not use circadian meal-times beyond Figure 1; all subsequent FLIC experiments were conducted during non-meal times of day for 6 hours, which avoids confounding our data with circadian-locked meals even when we use starved flies. We have clarified this point in the revision.

      The reviewer also points out that we make both within-fly and between-fly comparisons, which is a point that we note. Perhaps some concern arises, again, from the challenges that we faced in properly delineating our inferences about different types of feeding measures (and motivations). Inference about homeostatic feeding was made using within-fly measures, comparing events on sucrose vs. those on yeast. Inference about hedonic feeding was made using between-fly measures (average durations of different flies on 2% vs. 20% sucrose). Treatment comparisons to control always used measures of the same type, such that inference was not made using between-fly measures for treatment and within-fly for control (i.e., all of our figure panels were either within-fly or between-fly). We have worked to clarify this in the revision.

      Importantly, our approach to all experiments avoided confounding by using a randomized design at multiple levels (e.g., randomizing control and hedonic environments to FLIC DFMs, alternating food choice sidedness in the DFMs), by ensuring that flies in both environments were sibling flies that came from the same vial environment before being tested, and by performing each experiment multiple times.

      c) Statistical analyses are not always used consistently. For example, in Figures 3B and C, post hoc test results are shown for sucrose vs. yeast interactions, but no such statistics are given for 3E and 3F, preventing readers from assessing if the assay design is measuring what the authors tell us it is measuring.

      We report p-values for two-way ANOVA interaction terms for all appropriate experiments. If (and only if) the interaction term is significant, we conduct post-hoc tests for more detailed statistical analysis and report the p-values. The reviewer points out that we do not perform post-hoc tests in figures 3E and 3F. These figures had a non-significant interaction term, and thus, we did not feel a post-hoc test was warranted.

      Reviewer #2 (Public Review):

      1) The dissection of feeding into distinct behavioral elements and its correlation with electrical FLIC signals that allow interpreting feeding types is a fundamental new method to dissect feeding in flies. However, the categories of micro-behaviors in Table 1 are not intuitive.

      We agree and have updated the Table, figures, and main text. Please see also our response to Reviewer #1, comment #1.

      2) The details for the behavioral data analysis are not clear and should be made more obvious. For example, how many males and females were used in each experiment? Were any of the females mated or were they all virgins? If all virgins, why not use mated females? Mating status may have an effect on the feeding drive. If mated and virgin females were used, are there any differences between them? Similarly, for diurnal feeding experiments, it is not immediately clear from the graphs how many animals were used and how the frequencies were obtained (Fig. 1F, presumably averages for each category per fly but that is inconsistent with the legend in the supplement for this figure). Why does the transition heat map not include all micro-behaviors (Fig. 1E, no LQ data which are significant in diurnal feeding)?

      We have clarified the number of flies and events for each behavioral experiment in Figure 1, and we updated the figure legend appropriately. We note that these behavioral datasets are non-overlapping, and each time we mention the number of events scored in the text, that number includes only “new” videos. Female and male flies for all experiments were mated, and we have clarified this in the main text and methods.

      For the diurnal experiment in Figure 1F, we scored over 700 events from new (non-overlapping) video compilations and updated the number of flies and event number in the figure legend. The diurnal data we present in the supplement for this figure is a separate experiment conducted on 38 flies, intended only to demonstrate the circadian nature of fly feeding.

      For the transition heat map, analysis of this sort seems to require a large amount of data to have sufficient power to return a transition matrix. LQ events are relatively low in frequency, so we opted to combine them with L events for this analysis. We have updated the figure and figure legend to reflect this.
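      The pooling step described above can be illustrated with a minimal sketch. It assumes that each fly's scored behavior is available as an ordered list of event labels and that the heat map shows first-order, row-normalized transition frequencies; neither detail is stated in the response, and the function name, the example sequence, and the `merge` argument are hypothetical rather than the authors' analysis code.

```python
from collections import Counter, defaultdict

def transition_matrix(events, merge=None):
    """Row-normalized first-order transition frequencies between event labels."""
    merge = merge or {}
    # Relabel rare categories (e.g. LQ -> L) before counting transitions.
    labels = [merge.get(e, e) for e in events]
    counts = defaultdict(Counter)
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    states = sorted(set(labels))
    matrix = {}
    for s in states:
        total = sum(counts[s].values())
        # A state with no outgoing transitions gets an all-zero row.
        matrix[s] = {t: (counts[s][t] / total if total else 0.0) for t in states}
    return matrix

# Hypothetical scored sequence for one fly (illustrative only, not study data).
events = ["F", "L", "LQ", "F", "I", "L", "F", "LQ", "L"]
tm = transition_matrix(events, merge={"LQ": "L"})
```

      Pooling LQ into L before counting means every transition into or out of an LQ event contributes to the L row and column, which is what gives the sparse category enough observations to estimate. If sequences from several flies were analyzed together, transitions would need to be counted within each fly and then summed, so that the last event of one fly is not treated as preceding the first event of the next.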

      3) The CaMPARI images do not look great, particularly in the pan-neuronal condition (Fig. 5A). It would be useful to include the movie of the stack. Did any other brain regions show activity differences, such as SEZ or PI? These regions are known to be involved in feeding so it seems surprising they show no effect.

      We find that CaMPARI imaging is subject to high levels of noise and background, especially when using a broad driver as the reviewer has pointed out. This is why we opted to follow-up our pan-neuronal CaMPARI experiment using a more specific mushroom body driver and to test our correlational findings of increased MB activity in hedonic environments with genetic approaches in the remainder of Figure 5. We have included movies of the confocal stacks for both CaMPARI experiments, as requested. 

      Reviewer #1 (Recommendations For The Authors): 

      Main concern: 

      No measurements of intake, either in volume or in caloric value. Hence, 'hedonic' feeding is only indirectly supported. 

      I would like to suggest to the authors that they measure intake volumetrically in addition to food interaction patterns and durations. For example, William Ja developed a modified CAFE assay that measures consumption volume in real-time in freely behaving flies (http://dx.doi.org/10.1038/nprot.2017.096). Liming Wang has another capable assay. Additional values of expanding measurement methods for feeding are that it helps tie the research more directly to that of others, and it helps remove the concern that any one assay may introduce unknown biases. 

      For the CaMPARI, it would be helpful to provide a demonstration of its effectiveness by recapitulating a deep brain neural pathway known to be engaged by a stimulus by GCaMP or electrophysiology. 

      Additional concern: 

      The authors assume satiety states during different circadian periods (line 253, for example). It seems critical to directly measure the satiety state. 

      Technical concerns: 

      Figure 5 A, B: there is reported near zero UV transmission through the head: https://doi.org/10.1364%2FBOE.6.000514, hence the CaMPARI measurements are suspect. It appears that there may be an effect in the optic lobes that may receive greater UV illumination by being more peripheral. A positive control to demonstrate deep brain access by UV is needed. 

      Y-axes vary for the same measurement types within figures, for example, Figure 5 C-G. Also Figures 3F, G, I, K, M and Figures 3D, E, H, J, L. This hinders direct comparisons. 

      Figure 2: why are there no statistics to distinguish interaction (I) events from F and L? Why are the example graphs presented using different scale x-axes? For A-C, why no averaged response graphs for the classifications? Were there other events that did not fit these classifications? 

      In lines 224-226, the claim of statistical significance at p=0.061 makes the reader suspicious of the statistical interpretations throughout the manuscript. 

      Figure 3B starved looks the same as Figure 3C sated for females, using the same assay and conditions. This implies a huge amount of variance in behavior between experiments. 

      We appreciate the recommendations from Reviewer #1 and have done our best to address many of their concerns. Regarding their main concerns, we have added volumetric feeding data to the manuscript, included movies of the confocal stacks for the CaMPARI experiments, and clarified the circadian timing of our behavioral experiments. These details are outlined in our public response to both reviewers. The reviewer also expressed a few technical concerns, mostly regarding statistical analyses. We agree that there seems to be a large amount of biological variability between experiments, which we do indeed find to be the case with behavioral experiments of this sort. For this reason, we avoid making direct comparisons on absolute values between experiments, as the reviewer suggests, and thus allow our Y-axes to vary for each figure to better facilitate within-experiment comparisons. The reviewer also points out that, in one instance, we refer to a p-value of 0.061 as statistically significant in the text. While we have changed our language to reflect the perceived convention, we note that there is little inferential difference between these values, and we report exact p-values to allow the reader to make an informed decision.

      Reviewer #2 (Recommendations For The Authors): The writing and data presentation in this paper is somewhat dense and confusing at times. Comments and questions below are intended to help improve data presentation and resolve questions that will help the reader navigate and understand the data to better appreciate the significance of the findings. 

      Comments and questions: 

      Line 160 cites Chen et al, 2002 as an example of behavioral characterization that is useful for read-outs of neural states, but no neural states were defined in that work. A better example where a circuit was linked to a specific behavioral category is PMID30415997 (Duistermars et al., 2018). 

      Line 171: were the females mated or virgin or was it variable? 

      The classification system in Table 1 is a bit confusing. For example, the distinction is made between Fast and Long feeding events as well as interactions with food and other events. FH meet the requirements of F and H, presumably meaning that flies are fast feeding and touching the food with their front legs. Why are front legs and hind legs touching food abbreviated H and FF respectively instead of something more obvious like IF and IH (referring to Interaction with Front legs or Interaction with Hind legs)? 

      Also was there never any tasting with the middle legs? In Fig1B, all the I events are grouped. Are most of these H or FF events? The frequency in Fig. 1B is shown as normalized as a frequency of all events. The statistical analyses are all parametric. Are these data normally distributed? 

      Lines 224-229: the relative frequency of L-type feeding is increased in starved flies and the relative frequency of F feeding is decreased. Is the relative L- or F-type feeding frequency considered on total behavior or just the sum of long and fast feeding or the sum of all types of feeding? 

      The events that are analyzed vary throughout the paper. Line 173 mentions 300 events, line 222, 500 events, and line 257, 700 feeding events. Are these all independent experiments, or are these overlapping data sets analyzed for different parameters? 

      For diurnal feeding behavior, the authors analyzed 700 events and found significantly more LQ events during meal time (i.e. at the beginning and end of the day). Based on the figure legend in the supplement to Figure 1, it appears that these data were collected on 38 female flies. But in Fig 1F, there are ~8 points per feeding type (F, L, and LQ) during meal and non-meal conditions. Shouldn't all 38 flies have an average frequency for each type of feeding during meal and non-meal times? Were these females mated or not? Is this effect also true for males? To help the reader understand the data better, it would be helpful to note the number of flies used in each experiment or in each analysis in the different figures and wherever the data are mentioned in the manuscript. It also seems likely that the mating state may have an effect on feeding so knowing the result in mated versus unmated would be a useful analysis. 

      It is interesting that there is a difference in feeding in starved flies versus diurnal feeding in the presumably hungry versus sated phase (meal versus non-meal phases). As mentioned by the authors earlier in the manuscript, starved flies have a relative increase in L-type feeding. However, they perform less LQ feeding than sated flies, and yet LQ feeding is the only significantly different type of feeding in the hungry state of diurnal feeding. In the morning, the transition to feeding is very abrupt compared to the gradual increase in the evening. Is there any difference between the type of feeding or the transition matrix in the evening versus morning meal times? Also, why is LQ feeding not included as a category in the transition matrix in Fig 1E? 

      In Fig 2, the authors examine FLIC signals with video data to identify feeding types from FLIC signals. Why are there signal durations for F-type feeding that are longer than 3 seconds when it is defined as 1-3 sec of the proboscis contact with food and conversely signals of L-type feeding shorter than 4 seconds when it is defined as >4 seconds of continuous proboscis contact? Does this mean that signal can be longer or shorter than the actual time the proboscis is in the food? 

      With these parameters, the authors develop an assay to identify homeostatic and hedonic feeding by applying the signal analysis to food choices representing homeostatic (2% sucrose versus yeast) and hedonic (2% sucrose versus 20%) conditions. In Fig 3C, they show that fully-fed females show a stronger preference for yeast food than sugar food compared to males (line 335). Is this in fully fed animals? The yeast preference in females looks almost the same as in the starved females in Fig 3B. 

      The CaMPARI images shown in Fig 5A (and to a lesser extent Fig 5B) are not particularly convincing although the quantification looks clear. Providing the movies of the stacks may help the reader better appreciate the difference in MB red signal in the hedonic state. It would also help to show the number of flies that were tested in these experiments as well as the sex and mating status. Provide the n in the figure legend and in the relevant sections in the text. 

      Were the mushroom bodies the only brain region with significant, measurable activity changes? One might expect changes in other feeding areas, such as the subesophageal zone (SEZ) and the peptidergic regions of the brain (PI), which are both known to affect feeding in flies. This may also be a useful method to examine differences in mated versus unmated flies. 

      In Fig 5C the caption reads MB lambda lobe inhibition. Shouldn't this be gamma lobe inhibition as suggested in the figure legend? 

      The paper largely distinguishes homeostatic from hedonic feeding only. It may be useful to discuss other non-homeostatic mechanisms as well or at least make the distinction in the introduction and/or discussion.

      We thank reviewer #2 for their thoughtful suggestions to improve the clarity of the manuscript. They suggest several improvements, which we implemented, including that we improve the classification system in Table 1 to make it more intuitive, state how we normalized observed behavioral frequencies, clarify that the events we cite for each experiment are non-overlapping, and explain the use of circadian meal vs. non-meal times. We also noticed, as did this reviewer, that the usage of L vs. LQ events differs between starved flies and flies observed during meal-time. We agree that it may be interesting to sort out the nuances of why and how these differences occur, as it suggests that starvation may in some ways be different from physiological hunger. However, our method of manually observing flies would make this difficult at present. We hope to utilize more advanced video tracking software in the future to investigate this question. The reviewer also posed several questions about the hunger/satiety state of flies that we used for each experiment, which we clarified throughout the main text, figure legends, and methods.

      This reviewer points out two technical concerns, which we have addressed. The concerns about our CaMPARI imaging are noted, and we have discussed them in response to reviewer #1 and in our public response. We now include movies of the confocal stacks, as requested. There was also a question about FLIC durations of F and L events in Figure 2, with some visually identified F events producing FLIC signals longer than 4 seconds and some L events producing FLIC signals shorter than 4 seconds. Although we show that population averages from the FLIC can reliably recapitulate our visual metrics, there is occasional noise at the individual level. For example, although a fly may have contact of its proboscis with the food for less than 4 seconds, the FLIC signal may persist slightly beyond that interaction due to sustained contact with a non-proboscis body part or due to liquid food contacting the signal pad. We also occasionally observed L events that we visually identified to last longer than 4 seconds, but nevertheless did not produce a FLIC signal of equal length. This can occur when a fly feeds on the liquid food but transiently loses contact with the signal pad. Although there is some noted technical noise, we show that population-level data is sufficient to reflect our visual observations.
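
      As an illustration only, and not the authors' analysis code, the duration-based definitions quoted in this review (F-type: 1-3 s of proboscis contact; L-type: more than 4 s) can be sketched as a simple classifier. The function names, the handling of durations that fall outside both definitions, and the choice to normalize over F and L events only are all assumptions made for this sketch.

```python
from collections import Counter

# Thresholds follow the definitions quoted in the review; how the authors
# treat durations outside these ranges is not stated, so this is an assumption.
F_MIN_S, F_MAX_S = 1.0, 3.0
L_MIN_S = 4.0


def classify_event(duration_s):
    """Label one contact duration (seconds) as 'F', 'L', or 'unclassified'."""
    if F_MIN_S <= duration_s <= F_MAX_S:
        return "F"
    if duration_s > L_MIN_S:
        return "L"
    return "unclassified"


def relative_frequencies(durations_s):
    """Fraction of F and L events, normalized to classified F + L events only."""
    counts = Counter(classify_event(d) for d in durations_s)
    classified = counts["F"] + counts["L"]
    if classified == 0:
        return {"F": 0.0, "L": 0.0}
    return {"F": counts["F"] / classified, "L": counts["L"] / classified}
```

      In a scheme like this, a signal that persists slightly beyond the true proboscis contact can move a single event across a threshold, which is one way the individual-level noise described above could arise while population-level frequencies remain informative.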

    1. Author Response

      Reviewer 1:

      The reviewer indicated the data convincingly demonstrates absence of Perlecan causes a severe perturbation of the ECM-based neural lamella, that synaptic terminals degenerate, and that axons and even entire nerve bundles break. The reviewer noted that future studies will be important to define the precise source of Perlecan and the underlying mechanism for axonal breakage, and suggested several follow-up experiments. We address these comments below.

      1. The reviewer noted our data indicate Perlecan’s role in synaptic retraction is not due to its absence from neurons and that some of the wording is confusing in this regard.

      We’ve tried to make it clear throughout the manuscript that Perlecan functions non-cell autonomously, as our failure to rescue with neuronal re-expression or recapitulate the phenotype with neuronal-only RNAi indicates. As such, we agree that the phenotypes are not due to Perlecan loss within neurons, consistent with our data showing breakdown of the neural lamella ECM and subsequent axonal breakage. These phenotypes do manifest in neurons, but the defect is triggered non-cell autonomously as described in our study and stated by the reviewer here.

      2. The reviewer suggested future experiments to resolve the source(s) of Perlecan secretion from defined tissues that control neuronal stability, noting that showing ubiquitous rescue with a pan-cellular Gal4 driver would be useful.

      We did do pan-cellular rescue and overexpression experiments with the ubiquitous Tubulin-Gal4 driver, but expression of our two UAS-trol transgenes with this strong driver resulted in lethality. This observation indicates too much Perlecan expression is also detrimental for ECM function. Interestingly, we found that NMJ synapses do not retract following ubiquitous Perlecan overexpression in wildtype larvae, so another aspect of ECM dysfunction is responsible for lethality under this condition. As reported in the manuscript, we found driving a Trol RNAi with multiple Gal4 lines expressed in specific cell populations was unable to recapitulate the synaptic retraction phenotype, including pan-neuronal (elavC155), neuronal and muscle (elavC155 and mef2-Gal4), glial (repo-Gal4), fat body (ppl-Gal4, Lsp2-Gal4), hemocytes (Hml-Gal4), and fat body and hemocytes (c564-Gal4) driven expression. These data suggest Perlecan secretion is required by multiple cell types to achieve sufficient accumulation in the ECM to prevent neuronal instability.

      3. The reviewer indicates future studies of the blood-brain barrier might reveal insights into the pathology and axonal breakage we observe. The reviewer also suggests we establish a detailed timeline of the axonal breakage.

      We agree with the reviewer that examination of the blood-brain barrier and glial dysfunction will be exciting experiments for future studies. For the phenotypic timeline, this was an important component of our study and was done in two ways and described in the manuscript. In Figure 4, we describe serial in vivo imaging of synapses with briefly anesthetized larvae over 4 full days of imaging. In Figure 9, we describe fixed imaging of larval axons at specific developmental stages (2nd, early 3rd, wandering 3rd instar). This set of experiments provided a detailed timeline for synaptic retraction and axonal breakage. As suggested, we also used single neuron drivers (MN1-Ib) to label a single motoneuron and examine axonal breakage and synaptic retraction at this scale. This data is shown in Figure 9E. Together, these experiments provided a timeline for the biology we observe – disruptions of the neural lamella ECM, disorganization of the axonal microtubule cytoskeleton, followed by axonal breakage and fragmentation (usually in a hemi-segment coordinated manner), with subsequent synaptic retraction at NMJs.

      4. The reviewer indicates the final model in Figure 10 may not be fully representative.

      We feel this model best describes our complete dataset on the Trol mutant. We provide evidence for each of these phenotypic events in detail in the paper. The disruptions to the neural lamella are described in Figure 8. The onset of synaptic retraction does occur in the 3rd instar stage and not the 2nd instar stage – Figure 4 shows this with serial in vivo imaging where we see normal synaptic morphology on Day 1 (2nd instar stage) and degeneration over the 3rd instar period (Days 2-4). The figure does not indicate Perlecan functions for synaptic stability by residing at the NMJ, only that synaptic retraction occurs. Indeed, as stated in the text, we argue against a role for Perlecan function directly at the NMJ for the phenotypes we describe, but rather as a downstream consequence of ECM disruption and following axonal breakage.

      Reviewer 2:

      The reviewer noted the work provided a strong and thorough genetic analysis of the role of Perlecan in neuronal stability and axonal retraction. The reviewer provided some suggestions for future experiments and requested a few clarifications.

      1. The reviewer wondered whether mutations in other neural lamella components also cause synaptic retraction and potential genetic interactions between Trol and Vkg.

      We agree further genetic studies of other neural lamella components will be of interest. In the case of Vkg, null mutations in the locus result in embryonic lethality, suggesting it plays a more critical role in overall ECM function. Although we did not perform genetic interaction studies between the two mutants (for example trans-heterozygotes), they have been shown to interact in multiple other contexts as described in the manuscript.

      2. The reviewer noted the lack of whole animal Trol rescue.

      As described in point #2 above, we did do pan-cellular rescue experiments with the ubiquitous driver Tubulin-Gal4, but driving our two UAS-trol transgenes resulted in lethality, indicating a strong dosage sensitivity to Perlecan function.

      3. The reviewer indicated the hyperactive Mhc mutant was an interesting experiment but only examines one alternative. They wondered if we could reduce muscle contraction and see if that "rescues" the trol phenotype.

      The Mhc1 null mutant is embryonic lethal, and the retraction phenotypes do not occur until the 3rd instar stage, so that experiment would not be possible. However, we did attempt to block muscle contraction by expressing a UAS-tetanus toxin to eliminate evoked neurotransmitter release with our MN1-Ib Gal4 driver (pan-neuronal expression of tetanus is lethal). This did not alter the synaptic retraction phenotype, but it was difficult to make strong conclusions for this experiment as the co-innervating Is motoneuron was not expressing tetanus toxin. As such, we did not include this data in the manuscript, though it does generally support the model that synaptic retraction is independent of muscle contraction and rather occurs downstream of the axonal breakage that we highlight.

      4. The reviewer wondered whether other Wnt signaling manipulations might be useful to test interactions with the Trol retraction phenotype.

      Given we used the same Sgg-CA that was used to block the previously reported ghost bouton phenotype in Trol mutants and saw no effect on retraction, we did not feel that was a fruitful pathway to keep pushing on. Indeed, all our evidence points to a non-Wnt role, with neural lamella disruption and axonal breakage being the key insults.

      Reviewer 3:

      The reviewer indicated the work described an interesting and important role for Perlecan in motor neuron axon maintenance. The reviewer suggested experiments to elucidate the mechanism of action of Perlecan would benefit the study.

      1. The reviewer indicated it would be beneficial to validate the Wnt and Wallerian degeneration transgenic lines used in the study to provide a positive control.

      Our study used previously published and well-established Sarm RNAi and Sgg-CA transgenic lines (Sarm RNAi from the DiAntonio lab and Sgg-CA from Kamimura et al., 2013, via BDSC) that have been published multiple times and are well-validated in the field. These were not new lines that we generated. We also blocked Wallerian degeneration with a number of other perturbations to the pathway and did not see rescue of synaptic retraction in these cases either. Sarm is an upstream pathway component and thus the manipulation we included in the manuscript.

      2. The reviewer notes similar questions on cell-autonomy that we addressed in point 2 to Reviewers 1 and 2 above.

      The reviewer noted it would be helpful to show that the single cell-type RNAi experiments are working by western blotting for Perlecan. We performed a similar approach by examining knockdown of the endogenous Trol-GFP by the RNAi with immunostaining. Pan-cellular knockdown with Tubulin-Gal4 eliminates the staining (validating the RNAi line, Figure 1D-I), while knockdown with the individual drivers does not (Figure 5C-G). Although we used well-established cell-type specific Gal4 drivers that have been used in many other studies, we cannot eliminate strength of expression of the driver as an issue for failure to recapitulate the phenotypes. However, other experiments we performed and presented in the figures support a non-cell autonomous role for Perlecan in axonal breakage and synaptic retraction.

      3. The reviewer suggested an approach similar to the one Reviewer 2 raised in point 3 above about the role of muscle contraction.

      We agree eliminating muscle contraction altogether would be a nice assay for the role of mechanical stress, but we don’t have muscle-specific drivers to eliminate contraction from only a single muscle (eliminating it everywhere is lethal). However, we did attempt to block muscle contraction by expressing a UAS-tetanus toxin to block evoked neurotransmitter release with our MN1-Ib Gal4 driver as described above. Future experiments with the newly described BoNT-C toxin produced by the Dickman lab might be a promising approach for a full elimination of all motoneuron release to achieve a similar effect and test in the Trol mutant.

      4. The reviewer wondered what other components of the ECM are affected beyond Vkg in the Trol mutant.

      This is an exciting question to pursue in future studies. Together with genetic interaction experiments with other ECM components, as well as a detailed analysis of the effects on glia that surround larval nerves, such studies will further refine mechanistic actions on how loss of Perlecan triggers axonal breakage and downstream synaptic retraction.

    1. Author Response

      Reviewer #1 (Public Review):

      In the present study, Yasuko Isoe, Ryohei Nakamura & colleagues follow a lineage analysis study aiming at identifying the clonal organization of the dorsal telencephalon. The authors use the teleost fish medaka to conduct their experiments since it displays a clearly delineated dorsal pallium. After identifying the clonal units that constitute the dorsal telencephalon, they analyze the epigenetic landscape in each unit. The authors identify then differential open chromatin patterns that they relate to functional aspects of each unit, and additionally, use the epigenetic landscape to infer the identity of transcription factors operating as putative regulators. Overall, the study consists of an impressive amount of data that shed light on the organization of a central brain region in vertebrates.

      The findings in the manuscript are organized into two main sections: lineage analysis and epigenetic organization. The authors combine genetic tools with laser dissections of specific clones and ATAC-seq and RNA-seq analysis in multiple samples, an approach that is very elegant and follows high technical standards. For lineage analysis, the authors used a basic, but appropriate, tool to induce and follow clones generated in early embryos, with the side note that lineages are followed using a non-ubiquitous promoter so that the authors restrict their analysis to neural progenitors. My overall impression is that the authors have collected a massive amount of high-quality data, which unfortunately is not properly integrated or discussed in the manuscript. There is only a superficial effort in incorporating the two main findings, which contrasts with the depth of acquired data.

      The observation of clonal sectors in the pallium is a great finding that deserves a more detailed analysis in terms of their developmental and evolutionary origin: How many progenitors are used to set up the entire pallium? What is the smallest clone that contributes to it? Is there any laterality bias in the clonal architecture?

      Thank you for the question. We interpret the first question as, “how many neural progenitors (or neural stem cells) at the early developmental stage contribute to the adult pallium?”. Based on the number of clonal units visualized in the pallium, we assume that there are around 50 neural stem cells at the neurula stage that provide cells in the pallium.

      In terms of the smallest clone, we found a dozen cells in the anterior lateral pallium region (Dla) as the smallest clone. But since the HuC promoter activity is not strong in Dla (shown in Figure 1 – figure supplement 2B), we didn’t observe the clones in a reproducible way, so we removed the clones in Dla from the comprehensive structural analysis. The second smallest clone is the cells in the Dcpm, in which only a few dozen cells were labeled at once.

      And for the last question, we didn’t find any lateral bias in the clonal architecture in the telencephalon (shown in Figure 1 – figure supplement 3A, 3C).

      We added the explanation above in the revised manuscript. (page 29, line 591 - 595)

      Is the clonal architecture exclusive for progenitors or does it extend to neurons as well?

      Though we used the HuC promoter to visualize the clones, which should label the neural progenitors, we observed long axonal projections from Dp to the olfactory bulb, which suggests that this transgenic line labels both neural progenitors and young mature neurons, at least in some brain regions. So yes, we assume this clonal architecture extends to neurons as well, and we added descriptions to the revised manuscript. (page 10, line 205-207)

      How has the clonal architecture impacted the morphological diversity of the pallium among teleosts? What are possible evolutionary paths to explain this phenomenon? The authors' discussion on this point circles around one concept, illustrated in the following sentence: " (The clonal architecture) ... possibly explains how the difference in diversity between the pallium and subpallium has emerged: the subpallium is conserved because cells belong to various clonal units intertwined with each other, which has constrained their modification during evolution; whereas the pallium is diverse because of the modular nature of the clonal units which allows for the emergence of diversity". This is the concept that I have the most problems with. The authors reason that a more defined clonal structure (pallium) makes a system more prone to evolutionary novelties, while a region where clones intermingle (subpallium) is more rigid and therefore more conserved between species. Is there experimental data that backs up this statement in any other systems? If there is, I urge the authors to share these here. If this is just a speculation, then the argument would benefit from an explanation of how this clonal organization allows for evolutionary novelty.

      We appreciate the reviewer’s question. In order to make our point, we added the following paragraph to the revised manuscript:

      “Our structural analysis in the adult medaka telencephalon revealed that the clonal architecture between the pallium and subpallium differs in the distribution of cells in clonal units: clonal units in the subpallium intertwine with each other, whereas the pallium is formed by the compartmentalized clonal units, giving rise to a modular structure. Modular structure is frequently seen in the animal body, including brain; central complex in insect 40, cerebellum in vertebrates 41. And the modularity of cell populations or organs is generally thought to contribute to evolutionary flexibility; one module can acquire a new phenotype without impacting the others 42, 43, 44. We assume that the modular nature of the clonal units in the pallium plays a key role in the diversity across teleost.” (page 23, line 448-452)

      Would it happen by the appearance of more clones at the early stages of development? The authors leave this central point untouched even when discussing the evolutionary origin of the pallium in teleosts.

      Thank you for the comment. As shown in the previous report, when the Cre-loxp recombination was induced at the early developmental stage, a wider expression of GFP is observed across the whole brain (Okuyama et al. 2013). This suggests that the neural stem cells at the earlier developmental stage generate daughter neural stem cells which produce neural progenitors later. We added a few sentences mentioning this in the revised manuscript. (page 7, line 146-149)

      Having shown the clonal architecture of the pallium and conducted a detailed epigenetic analysis in clones, the authors could also speculate on what is special about this type of organization. Particularly, how they envision that cells belonging to the same clone inherit a common epigenetic landscape that will define their function later on.

      Thank you for the comment. To explain the epigenetic feature of this pallial organization, we added the following paragraph in the revised manuscript.

      “As shown in mammals, the epigenetic landscape can be inherited from apical progenitors, which have a multipotency, to the late neural progenitors during development 37. Since the teleost exhibit post-hatch neurogenesis in the entire life, we think that the common epigenetic landscape is inherited in each clonal unit in the adult medaka telencephalon. And as a result, we make the assumption that function and characteristic of each clonal unit is defined already in progenitors by specific regulators (e.g. TFs), and those progenitors continuously produce neurons that possess the same property to function in a coordinated manner.” (page 22, line 433-439)

      There is little analysis of the cellular organization of each clone, mainly because the authors labeled only a subset of the real, genetic clone. The authors present images of entire brains and optical horizontal and transverse sections, which largely sustain their claims for a clonal organization. The difference in the clonal arrangements between the Dld and the Vd is clear, but the authors could provide a higher-resolution image of some clones in the telencephalon to get an idea of the cellular composition of the regions they use for their analysis.

      Here, we added a new panel in Figure 2 which is a combination of previous supplemental figures S3-1,2,3 to show our analysis on the cellular organization of each clone. We showed how the pallial regions, other than Dld, are formed by multiple genetic clones in different colors, and also the projection from each clone. (page 9, Figure 2B)

      What is the extent of non-GFP cells in the regions they use for RNAseq and ATACseq? From the images shown it is very difficult to realize whether all cells in the clonal sector do indeed belong to the clone.

      Thank you for your question. In our revised manuscript, we analyzed the ratio of cells labeled in this transgenic line (HuC:loxp-DsRed-loxp-GFP). We found that a large portion of cells (around 60-70% cells) are DsRed positive in our transgenic line (Figure 1 - figure supplement 2B). (page 7, line 142-143)

      Reviewer #2 (Public Review):

      In this study, Isoe and team produced an atlas of the telencephalon of the adult medaka fish with which they better defined pallial and subpallial regions, characterized the expression of neurotransmitters, and performed clonal analysis to address their organization and maintenance during the continuous neurogenesis. They show that pallial anatomical regions are formed by independent clonal units. Furthermore, the authors demonstrate that pallial compartments exhibit region-specific chromatin landscapes, suggesting that gene expression is differentially regulated. Specifically, synaptic genes have a distinct chromatin landscape and expression in one of the regions of the dorsal pallium, the Dd2. Using the region-specific RNA expression and chromatin accessibility data they have generated; the authors propose several transcription factors as candidate regulators of Dd2 specification. Lastly, the authors use the enrichment of transcription factor binding motifs to establish homology between medaka and human telencephalon, aiming to describe an evolutionary origin for the Dd2 region.

      Overall, the study carefully describes diverse aspects of neurogenesis in the telencephalon of the adult medaka fish. As such, the manuscript has the potential to contribute insights to the understanding of circuits and neurogenesis in teleosts and the medaka fish, as well as the evolution of cellular heterogeneity and organization of the telencephalon. Furthermore, the atlas, if easily accessible to the broader community, could be a substantial resource to researchers interested in medaka and teleosts neuroscience. However, there are some conceptual and technical concerns that should be addressed to strengthen this work.

      Improving the atlas: The different interpretations of the imaging data generated remain isolated or fragmented and could be better integrated to describe anatomical, connectivity, and ontogeny differences through pallial and subpallial regions.

      In the revision process, we described the details of anatomical and connectivity differences in the adult pallial and subpallial regions in Table 2. It also includes a comparison of the brain regions with previous atlases.

      In terms of the ontogeny differences, we described the neural stem cell localization in the telencephalon in Figure 1 – figure supplement 4: “The cell-body distribution in the pallium and subpallium is consistent with the pattern of the neural stem cell (radial glial) (Figure 1 – figure supplement 4). In the teleost telencephalon, the cell bodies of radial glia are located in the surface of the hemispheres and project inside the telencephalon 15. Since neural progenitors migrate along those axons, it is consistent that the cell bodies of the pallial clonally-related units are clustered along those axons in a cylindrical way.” (page 8, line 175-179; page 22, line 427-431)

      Molecular differences across regions and species: Differential gene expression and chromatin accessibility throughout regions should be better and more deeply characterized and presented, exhibiting more region-specific features, and leading to a better description of candidate transcription factors that could differentially regulate regional gene expression.

      The comparison between medaka fish and human telencephalon regions would benefit from a more extensive molecular analysis. Comparison of gene expression and accessible regions could expand the analysis together with TF-binding motif enrichment.

      In order to check the gene expression across brain regions in the different vertebrate species, we examined the mammal gene expression data (in situ hybridization) from the Allen Institute database. We analyzed the expression of all the Dd-specific expressing genes (809 genes) across the mammalian brain regions (12 regions), but we could not observe strong correlations with any specific brain regions in mice. Therefore, we have revised our conclusions regarding the correspondence between medaka's Dd2 and mammalian brain regions to be more cautious. (page 20, line 396 - page 21, line 401)

      Lineage tracing: The authors claim that the functional compartmentalization of the pallium relies on different cell lineages, which also mostly share connectivity patterns and, at least to some extent, expression patterns. It would be interesting to see how homogenous these lineages are at the molecular level and whether their compartmentalization is retained when neurons reach maturity.

      Thank you for the comment. We think that future single-cell RNA-seq of cell lineages will allow us to see how homogeneous cells derived from the same lineage are at the molecular level and to assess their cell types.

    1. Author Response

      Reviewer #2 (Public Review):

      The paper by Arribas et al. examines the coding properties of adult-born granule cells in the hippocampus at both single cell and network level. To address this question, the authors combine electrophysiology and modeling. The main findings are:

      Noisy stimulus patterns produce unreliable spiking in adult-born granule cells, but more reliable responses in mature granule cells.

      Analysis of spike patterns with a spike response model (SRM) demonstrates that adult-born and mature GCs show different coding properties.

      Whereas mature GCs are better decoders on the single cell level, heterogeneous networks comprised of both mature and adult-born cells are better encoders at the network level.

      Based on these results, the authors conclude that granule cell heterogeneity confers enhanced encoding capabilities to the dentate gyrus network.

      Although the manuscript contains interesting ideas and initial data, several major points need to be addressed.

      Major points:

      1) The authors use a noisy stimulation paradigm to activate granule cells at a relatively high frequency. However, in the intact network in vivo, granule cells fire much more sparsely. Furthermore, granule cells often fire in bursts. How these properties affect the coding properties of granule cells proposed in the present paper remains unclear. At the very least, this point needs to be better discussed.

      In vivo whole cell recordings of granule cells are very scarce. In our study, we based the design of our stimulus on recordings from the intact network in vivo (Pernia-Andrade and Jonas 2014), which show that granule cells receive a wide range of frequencies, with a power spectrum that exhibits a power law decay. These properties are built into our noisy stimuli. These in vivo recordings have also reported the presence of theta oscillations, showing a peak in the spectrum. However, in our approach we deliberately removed these oscillations from our stimuli because it is best to fit GLMs using white noise or noise with an exponentially decaying autocorrelation (Paninski et al. 2004).

      Thus, our choice of the stimuli is far from arbitrary, but rooted in experimental evidence from intact network in vivo recordings, together with previous knowledge about GLM/SRM fitting. This comment reveals to us that we did not clarify this enough in the manuscript. We are grateful to the reviewer for revealing this omission, since this is in fact an important aspect of the study strategy. In the revised manuscript, we brought these points up front in the results section when we introduce the stimulus for the first time, and more thoroughly discussed it in the Methods section that describes the stimulus.
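
      To make the phrase "noise with an exponentially decaying autocorrelation" concrete, the following is a minimal sketch of one standard way to generate such noise (a discretized Ornstein-Uhlenbeck process). It is not the stimulus-generation procedure used in the study, which is described as having a power-law spectrum derived from in vivo recordings; the function names and all parameter values here are hypothetical.

```python
import math
import random


def ou_noise(n_samples, dt_ms, tau_ms, sigma, seed=0):
    """Zero-mean Gaussian noise whose autocorrelation decays as exp(-lag / tau_ms)."""
    rng = random.Random(seed)
    a = math.exp(-dt_ms / tau_ms)           # correlation between successive samples
    scale = sigma * math.sqrt(1.0 - a * a)  # keeps the stationary std at sigma
    x = rng.gauss(0.0, sigma)               # start in the stationary distribution
    samples = []
    for _ in range(n_samples):
        samples.append(x)
        x = a * x + scale * rng.gauss(0.0, 1.0)
    return samples


def autocorrelation(samples, lag):
    """Sample autocorrelation at an integer lag (in samples)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    cov = sum(
        (samples[i] - mean) * (samples[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return cov / var
```

      With a sampling step of 1 ms and a correlation time of 10 ms, the sample autocorrelation at a lag of 10 samples should be close to exp(-1) for a sufficiently long trace.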

      Still, the bursts observed in granule cells are an important feature and they have been observed to be phase locked to the theta-gamma oscillations in vivo (Pernia-Andrade and Jonas 2014). In the revised version of the manuscript we included new experiments and simulations with stimuli that include a peak in theta frequency. We found that immature neurons also improve decoding performance with these theta modulated stimuli.

      2) The authors induce spiking in granule cells by injection of current waveforms. However, in the intact network, neurons are activated by synaptic conductances. As current and conductance have been shown to affect spike output differently, controls with conductance stimuli need to be provided. Dynamic clamp is not a miracle anymore these days.

      The use of dynamic clamp sounds in principle like a good suggestion. However, in the manuscript we have taken a different approach to enable the use of a single neuron GLM that uses currents as inputs. To control for the differences between mature and immature neurons we used currents with amplitude normalized by the input resistance, and both types of neurons were measured with the same technique to allow for the comparison.

      Importantly, the GLM type model that we use assumes that the membrane potential is a linear convolution of the input, which permits a straightforward and robust fitting approach. We argue that this is not a minor issue, since using dynamic clamp would require a drastic modification of the model. Furthermore, the use of conductance stimuli would not allow for the straightforward model fitting we perform with our approach. The key point here is that the membrane potential would not be correctly approximated as a linear function of the conductance stimulus, precluding the fitting strategy.
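
      The linearity assumption can be illustrated with a small sketch. The subthreshold membrane potential is modelled as a causal convolution of the injected current with a filter, so scaling or summing inputs scales or sums the output. The exponential kernel and all parameter values below are hypothetical stand-ins, not the filters fitted in the manuscript. A conductance input breaks this property because the resulting current, g(t)(V - E_rev), depends on the voltage itself.

```python
import math


def exponential_filter(n_taps, dt, tau, gain):
    """Causal exponential kernel; a hypothetical stand-in for a fitted membrane filter."""
    return [gain * math.exp(-k * dt / tau) * dt for k in range(n_taps)]


def causal_convolve(current, kernel):
    """v[t] = sum_k kernel[k] * current[t - k], using only present and past input."""
    voltage = []
    for t in range(len(current)):
        acc = 0.0
        for k in range(min(t + 1, len(kernel))):
            acc += kernel[k] * current[t - k]
        voltage.append(acc)
    return voltage


kernel = exponential_filter(n_taps=5, dt=1.0, tau=2.0, gain=1.0)
current = [0.0, 1.0, 0.5, -0.5, 0.0, 2.0]

v_single = causal_convolve(current, kernel)
v_double = causal_convolve([2.0 * i for i in current], kernel)

# Linearity: doubling the input current doubles the predicted voltage.
print(all(abs(b - 2.0 * a) < 1e-12 for a, b in zip(v_single, v_double)))
```

      Because the prediction is linear in the filter coefficients, the filter can be estimated with standard convex fitting, which is the practical advantage referred to above.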

      Finally, at the moment we do not have the equipment to perform the suggested experiment, so this suggestion would require a substantial amount of time to acquire the equipment and set up the experiments in mature and immature neurons. In addition, we would have to change the model and develop a different fitting strategy. With the controls that we already have in the manuscript, we do not think dynamic clamp experiments would fundamentally change the conclusions of the manuscript. Thus, we argue that this is beyond a reasonable timeframe for this revision, but could be something to explore further in the future. We now mention this possibility in the discussion.

      3) The greedy procedure is a good idea, but there are several issues with its implementation. First, it is unclear how the results depend on the starting value. Would we end up with the same mixed network if we would start with adult-born cells? Second, the size of the greedy network is very small. It is unclear whether the main conclusion holds in larger networks, up to the level of biological network size (1 million). Finally, the fraction of adult-born granule cells in the optimal network comes out very large. This is different from the biological network, where clearly four or five-week-old granule cells cannot represent the majority. Much more work is needed to address these issues.

      The reviewer approves the greedy procedure that we apply in our manuscript and poses three issues for consideration.

      First, the reviewer queries what would be the result of starting the procedure with a different pool of simulated neurons, and whether we would obtain “the same mixed network if we would start with adult-born cells”. Let us remark that the outcome of the greedy procedure is not always the same mixed population of neurons. For each different mature neuron that we use to start the procedure, the trajectory (see Fig. 4A) of selected neurons will be different. Thus, the final population (network) will be different, and this is reflected in the error bars that we obtain in Fig. 4. Presumably, starting with adult-born cells will change the outcome of the greedy procedure. However, note that this is not the point of the approach. The motivation to start with mature neurons is to ask whether adult-born cells can contribute something to decoding, given that mature cells on their own perform better.

      Second, the reviewer questions the size of the population that we reach with the greedy procedure. Note that for the population sizes that we show in the manuscript the decoding performance already begins to saturate (Fig. 4F-H). Furthermore, it is unfeasible to construct a population of 1M neurons due to the computational cost, that is, the time it takes to run the algorithm. These two facts motivated us to stop at 12 neurons, as it strikes a good balance between computational time and saturation. Importantly, as we expand below, the aim of the greedy procedure simulation is not to reconstruct the actual network of the dentate gyrus. Rather, we seek to understand whether immature neurons could improve coding in a population.

      Third, the reviewer observes that the fraction of adult-born cells in the populations reconstructed using the greedy procedure is large compared to the biological network. Again, note that the aim of the whole in-silico experiment is not to recover the biological network, where other aspects are at play. More simply, we query the possible contribution of adult-born cells to coding. In fact, if we obtained the same proportion it would be by chance, since we do not think that adult-born cells in the dentate gyrus are chosen according to the greedy algorithm.

      Still, this comment from the reviewer motivated us to include further simulations of the greedy procedure with constraints. In the revised manuscript we show new results using the greedy procedure, but constraining the fraction of immature neurons in the resulting populations, see Figure 4-figure supplement 2.

      More generally, we think that these comments reveal a possible misunderstanding about the approach, its purpose and the interpretation of the results. The point of the greedy procedure is to show that immature neurons do in fact contribute to improve the decoding, despite being generally worse individually. We do not claim that the population obtained with the greedy procedure faithfully reflects the actual shape of the in vivo network. We are aware that it does not. We see that this may not have been clear in the original version. In the revised version, we now explain the purpose of the greedy procedure when we introduce it. Additionally, we comment on the proportion of immature neurons in the same paragraph.
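
      The logic of the greedy procedure can be sketched as a generic forward selection: start from one mature neuron and repeatedly add the candidate that most improves a population score. The sketch below is a toy illustration, not the simulation used in the manuscript. The neuron names, the sets of "stimulus features" each neuron encodes, and the coverage score are all hypothetical; in the study the score would be a decoding measure. It shows how a neuron that is worse on its own can still be selected because it is complementary to the cells already chosen.

```python
def greedy_population(candidates, score, start, size):
    """Forward selection: grow a population one neuron at a time.

    candidates: iterable of neuron identifiers
    score:      function mapping a list of identifiers to a number (higher is better)
    start:      identifier of the neuron that seeds the population
    size:       target population size
    """
    population = [start]
    remaining = [c for c in candidates if c != start]
    while len(population) < size and remaining:
        # Pick the candidate that gives the best score when added to the current set.
        best = max(remaining, key=lambda c: score(population + [c]))
        population.append(best)
        remaining.remove(best)
    return population


# Hypothetical toy data: the stimulus features each neuron encodes.
features = {
    "mature_1": {1, 2, 3, 4},
    "mature_2": {2, 3, 4, 5},
    "immature_1": {6, 7},
    "immature_2": {7},
}


def coverage(population):
    """Toy stand-in for decoding performance: number of distinct features covered."""
    covered = set()
    for name in population:
        covered |= features[name]
    return len(covered)


print(greedy_population(features, coverage, start="mature_1", size=3))
```

      In this toy case the immature cell is chosen second even though it covers fewer features alone than either mature cell. Starting from a different seed neuron changes the selection trajectory, which is the source of the variability across runs discussed above.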

      4) Likewise, the idea of dynamic pattern separation seems quite nice. However, the authors focus on the differences between mixed and pure networks, which are extremely small. Furthermore, the correlation coefficients of "low", "medium", and "high" correlation groups are chosen completely arbitrarily. A correlation coefficient of 0.99, considered low here, would seem extremely high in other contexts. Whether dynamic pattern separation is possible over a wider range of input correlation coefficients is unclear (see O'Reilly and McClelland, 1995, Hippocampus, for a possible relationship). Finally, aren't code expansion and lateral inhibition the key mechanisms underlying pattern separation? None of these potential mechanisms are incorporated here.

      The reviewer positively appreciates the idea of the pattern separation task that we propose in the manuscript, and poses some questions concerning the extent of the contribution of adult-born neurons.

      We agree that code expansion and lateral inhibition are key mechanisms for pattern separation in the DG, and we do not claim that adult-born neurogenesis is the key mechanism behind pattern separation. Rather, in our work we explore the role of adult-born immature neurons in coding in general, and in pattern separation in particular, given that it is a function commonly attributed to the DG.

      We note that the correlation in O'Reilly and McClelland 1994 (actually, what they call pattern overlap) is of a very different nature than the one we compute in our work. They compute the overlap between different patterns of activation in a population of neurons, that is the probability that a single neuron is active in two different patterns of activation. In our manuscript we compute the correlation between different continuous time-varying stimuli that stimulate single neurons.
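
      The difference between the two quantities can be made explicit with a short sketch. Pattern overlap is computed on binary population patterns, whereas the correlation used here is a Pearson coefficient between two continuous time-varying traces. The code is illustrative only: the function names are assumptions, and the noise-mixing recipe for producing a pair of traces with a chosen correlation is a generic construction, not necessarily the one used in the manuscript.

```python
import math
import random


def pattern_overlap(active_a, active_b, n_neurons):
    """Fraction of the population that is active in both binary patterns."""
    return len(set(active_a) & set(active_b)) / n_neurons


def pearson(x, y):
    """Pearson correlation coefficient between two equally long traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlated_pair(n_samples, target_r, seed=0):
    """Two Gaussian white-noise traces with expected Pearson correlation target_r."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    z = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    mix = math.sqrt(1.0 - target_r ** 2)
    y = [target_r * a + mix * b for a, b in zip(x, z)]
    return x, y


# Overlap: 2 of 10 neurons are active in both patterns.
print(pattern_overlap({0, 1, 2, 3}, {2, 3, 4, 5}, n_neurons=10))

# Correlation: two continuous stimuli that differ only slightly.
x, y = correlated_pair(n_samples=20_000, target_r=0.99, seed=3)
print(round(pearson(x, y), 3))
```

      Two traces with a correlation of 0.99 still differ at every time point, which is why such values can be a meaningful "fine discrimination" regime for continuous stimuli even though the same number would indicate near-identical patterns in an overlap measure.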

      Importantly, previous work has shown that ablating neurogenesis particularly affects fine spatial discrimination, that is when the separation between patterns is small, but not when it is large (Clelland 2009, Science). Hence, we were actually expecting the impact of adult-born neurons to be important only for relatively large correlation coefficient values.

      In the revised manuscript, we now explain the rationale for the choice of correlation values, both in the main text when we introduce the task, and in the Methods when we set the values for the low, medium and high correlation classes. We also added a sentence to the discussion on pattern separation, bringing in the importance of the ideas of lateral inhibition, code expansion, and the work of O’Reilly 1994.

      5) A main conclusion of the paper is that while mature GCs are better decoders on the single cell level, heterogeneity in mixtures improves coding in neuronal networks. However, this seems to be true only for r^2 as a readout criterion (Fig. 4F). For information, the result is less clear (Fig. 4G). The results must be discussed in a more objective way. Furthermore, intuitive explanations for this paradoxical observation are not provided. Saying that "this is an interesting open question for future work" is not enough.

      This is an interesting point raised by the Reviewer. While r^2 is quantified by comparing the decoded stimuli with the true stimuli, mutual information is related to the uncertainty about the decoding. That is, it quantifies the correspondence between decoded and true stimuli, but does not tell us whether it is a good approximation to it. For example, a decoder could achieve perfect mutual information but result in a poor reconstruction by performing a perfectly scrambled one-to-one mapping of the true stimulus [Schneidman et al. 2003], see also our reply to point [5] by Reviewer #1 above.
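
      The scrambled-mapping example can be worked through numerically. In the sketch below, a decoder that applies a fixed one-to-one permutation to a discrete stimulus preserves all the information about it, yet reconstructs it poorly. This is an illustration under stated assumptions: the stimulus is discretized into four levels, the permutation is arbitrary, and r^2 is computed as the coefficient of determination, which may differ from the exact definition used in the manuscript.

```python
import math
from collections import Counter


def r_squared(true, decoded):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(true) / len(true)
    ss_res = sum((t - d) ** 2 for t, d in zip(true, decoded))
    ss_tot = sum((t - mean) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


def mutual_information_bits(true, decoded):
    """Plug-in estimate of the mutual information between two discrete sequences."""
    n = len(true)
    joint = Counter(zip(true, decoded))
    count_true = Counter(true)
    count_dec = Counter(decoded)
    mi = 0.0
    for (t, d), c in joint.items():
        # p(t, d) * log2( p(t, d) / (p(t) * p(d)) ), written with counts
        mi += (c / n) * math.log2(c * n / (count_true[t] * count_dec[d]))
    return mi


stimulus = [0, 1, 2, 3] * 25                 # four equally likely stimulus levels
scramble = {0: 2, 1: 0, 2: 3, 3: 1}          # arbitrary one-to-one relabelling
scrambled = [scramble[s] for s in stimulus]  # "decoder" that permutes the levels

print(mutual_information_bits(stimulus, stimulus), r_squared(stimulus, stimulus))
print(mutual_information_bits(stimulus, scrambled), r_squared(stimulus, scrambled))
```

      Both decoders reach the maximum of 2 bits, but only the identity mapping gives a good reconstruction; the scrambled one yields a negative r^2. The two readouts therefore need not rank populations in the same way.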

      We agree that this is an important point and we realize that it was not clear in the original version of the manuscript. In the revised manuscript we added some sentences to clarify this point.

      6) The authors ignore possible differences in the output of mature and adult-born granule cells in their thinking. If mature and adult-born granule cells had different outputs, this could affect their contributions to the code (either positively or negatively). At the very least, this possibility should be discussed.

      Newborn neurons contact the same targets as mature neurons, born during development: pyramidal cells in CA3, and interneurons in CA3 and the DG. During maturation, there is a sequence of connectivity with CA3 and within the DG (Toni et al. 2008). At 4 weeks, newborn cells are already contacting their postsynaptic targets. Still, there may be subtle differences in the strength of these connections compared to mature neurons.

      So, although the targets are the same, there may be quantitative differences in the way they contribute to the code. Thus the point raised by the reviewer is interesting, so we decided to discuss it further in the revision.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript by Toshima et al. describes a study of the organization of traffic in the endomembrane system of the budding yeast Saccharomyces cerevisiae. The authors address the relation between endocytosis and the Golgi (TGN: a collection of maturing membrane elements derived from the trans-Golgi). The study builds on a previous article by the group of Benjamin Glick. In that study (Day et al., 2018), it was postulated that the TGN is the first destination for yeast endocytic traffic after internalization from the plasma membrane. Additionally, Day et al. had shown that endocytic recycling traffic towards the plasma membrane departs from the TGN as well. Therefore, early endosome and recycling endosome compartments would be identical to the TGN or contained within it. Here, Toshima et al. use super-resolution confocal live imaging microscopy (SCLIM) to refine a model of endocytic pathway organization. This powerful imaging technology allows them to show that out of two partially overlapping TGN markers, namely Tlg2 and Sec7, the syntaxin Tlg2 correlates better with the arrival of fluorescently labeled endocytic cargo than alternative TGN marker Sec7. Building on this main finding, the authors conclude that a specific part of the TGN (an "independent sub-compartment") functions as the early endosome. Further experiments in mutants for GGA clathrin adaptors, required for departure of endocytic cargo from the TGN to the Rab5-positive prevacuolar endosome, show again that endocytosed cargo accumulation correlates better with Tlg2 than with Sec7. Furthermore, in GGA mutants the overlap between Tlg2 and Sec7 is decreased, suggesting that GGA is required for maturation of this Tlg2 sub-compartment.

      The study is well conducted and its main conclusion that a Tlg2 subregion within the TGN functions as the early endosome seems well supported by the superb live imaging and the analysis of GGA mutants.

      Although a technical feat in live superresolution imaging, this single kind of data (moving, shape-shifting blobs of fluorescently-labeled proteins) does not totally fill with meaning the terms "compartment", "sub-compartment", or "independent sub-compartment". This, I think, is the main limitation of the study. Are these compartments or sub-compartments individuated membrane elements, collections of vesicles, regions of the same cisterna or saccule? For this, electron microscopy would be needed.

      We are very grateful for the reviewer’s favorable evaluation of our study. In accordance with the editors’ judgment in "Essential Revision", we have not performed electron microscopy analysis for this revision. However, we have addressed all other valuable comments.

      Reviewer #2 (Public Review):

      In this manuscript Toshima et al document the use of sophisticated microscopy - with powerful spatial and time resolution - to image markers of the yeast endosomal system.

      The initial work documented in this paper does a good job of defining the compartment endocytic cargoes internalise to. This is convincingly shown to be a compartment that is not marked by Sec7 but is instead a distinct (sub)compartment marked by the SNARE protein Tlg2. This agrees with many previous studies, (including biochemical experiments and microscopy of cargoes in a series of membrane trafficking mutants) but has different conclusions to another study (Day et al 2018 - Developmental Cell). Although the microscopy techniques used in the two studies are different, the yeast system and many of the reporters (FP tagged Tlg1, Sec7, Vps21 and fluorescently labelled mating factor) are the same. The Day et al study is suitably referenced throughout the manuscript but as to why the authors have come to fundamentally different answers about endocytic cargoes internalising to a Sec7+ compartment, is not discussed.

      According to the reviewer's suggestion, we have added a paragraph discussing this (line 533-539).

      The work goes on to show endocytic carriers (marked by Abp1) and endocytic cargoes like fluorescently labelled mating factor internalise to the Tlg2+ compartment. The forward trafficking of these molecules is then observed to transit to a later endosome compartment labelled by Vps21. The super-resolution and time lapse imaging, sometimes even using 3 colours - is of very high quality and fully support the model presented at the end of the paper for this trafficking itinerary. Trafficking mutants are also used (such as a defective allele of arp3 and deletion of VPS21 / YPT52 GTPases) to interrupt trafficking routes and define the pathways followed by endocytosed mating factor.

      The endocytic trafficking from Tlg2+ to Vps21 compartments is shown to be defective in mutants lacking GGA adaptors (gga1∆ gga2∆), with cargoes accumulating in the Tlg2+ compartment and other clathrin adaptor mutants not causing this defect. This research avenue also reveals that the GGA proteins are required to maintain the distinct Tlg2 sub compartment.

      The final section of the paper uses the same tools to analyse the localisation of the recycling v-SNARE protein Snc1. This is arguably the most important set of experiments in the paper: not only is Snc1 a putative v-SNARE that functionally interacts with Tlg2, but this cargo, unlike pheromone, allows the investigation of recycling back to the PM from TGN/endosomes. However, the authors do not comment on the fact that Snc1 does not localise to the plasma membrane in either of the experiments using different microscopy techniques (Figure 5A + 5B), calling into question whether the recycling pathway is operating properly or whether the FP-tagged machinery has disrupted processing. The steady state localisation of Snc1 in WT cells only looks normal in the Supplemental figure; this discrepancy should be discussed or addressed.

      As the reviewer points out, fluorescent protein-tagged Snc1p usually localizes to the plasma membrane in addition to cytosolic puncta, as shown in Fig. 6–figure supplement 1A. In Fig. 6A, localization of GFP-Snc1p is demonstrated by focusing on the cell surface using a TIRF microscope, which differs from that focusing on the medial focal plane. Therefore, Fig. 6A shows that GFP-Snc1p localizes to the plasma membrane, albeit with evident punctate localization.

      Localization of mCherry-Snc1p to the plasma membrane was also observed in the images obtained by SCLIM. However, since the intracellular signals of mCherry-Snc1p are partially blocked by those around the plasma membrane, in Fig. 6B the intracellular localization has been emphasized by modulating the contrast, thereby reducing the fluorescence signals at the plasma membrane. In the new manuscript, we have added an image with only slight contrast (Fig. 6–figure supplement 1C) in the same cell as that shown in Fig. 6B.

      Reviewer #3 (Public Review):

      The manuscript by JY Toshima et al. is an excellent and important study that demonstrates very clearly the existence of an endosomal compartment in yeast, distinct from the trans-Golgi network, to which endocytic vesicles fuse upon internalization. They show that this compartment is enriched in the SNARE protein Tlg2, a yeast homologue of syntaxin, and is segregated from the Golgi-localized Sec7-containing compartment, indicating that the organization of the endocytic system in yeast is similar to that of animal cells. Furthermore, they demonstrate the trafficking machinery required for maturation of this compartment, and that it is also a station on the pathway back to the plasma membrane. Because there have been conflicting reports in the literature as to the existence of an endosomal compartment in yeast distinct from the trans-Golgi network, this paper is of great importance for the cell biology community.

      Major strengths of this study are the cutting-edge imaging technology used, and the careful, quantitative analyses carried out. The authors use a super-resolution live cell imaging approach that allows them to discriminate to a high resolution different compartments and membrane domains of highly dynamic yeast organelles, and to follow an internalizing cargo over time. With their manuscript, they have provided a full set of movies, along with quantifications, to support their conclusions.

      The authors use fluorescent-protein-labelled endocytic cargo (alpha-factor) and fluorescent-protein-labelled compartment markers, assaying them in high resolution and super-resolution live cell imaging microscopy systems. In this way, they are able to follow trafficking of cargo through compartments in real time. The authors first demonstrate that the alpha-factor cargo substantially colocalized with the SNARE protein Tlg2, a marker of early endosomes, but very little with Sec7. They also show that Tlg2 marks a sub-compartment distinct from the Sec7 compartment, but adjacent to it. Furthermore, they demonstrate using super-resolution microscopy and triple color 4D imaging that endocytosed alpha-factor cargo structures make contact with the Tlg2 compartment, adjacent to the Sec7 compartment, then disappear, supporting the conclusion that endocytic vesicles first fuse with the Tlg2 compartment. Next the authors show that alpha factor is transported from the Tlg2 compartment to the Vps21 compartment, a process that requires the GGA adaptors Gga1 and Gga2. Finally, the authors show that recycling of the endocytic R-SNARE Snc1 also occurs by passage through the Tlg2 compartment.

      The use of mutants that affect different stages of endosomal trafficking is a strength of the manuscript, as it allows elucidation of the mechanism of transport through successive compartments. Importantly, using a gga1-delta gga2-delta mutant, the authors demonstrate convincingly that the GGA adaptors Gga1 and Gga2 are required for alpha factor transport from the Tlg2 compartment to the Vps21 compartment.

      Throughout this study, the authors use fluorescent protein-labelled cargo and compartment markers (EGFP, mCherry, iRFP), but don't explicitly state to what extent these fusion proteins are functional compared to the endogenous proteins. They could cite previous publications or their results describing the functionality of the fusion proteins used.

      According to the reviewer's suggestion, we have cited previous publications for GFP-Tlg2 (Seron et al., MBoC, 1998), Sec7-GFP/-mCherry (Seron et al, MBoC, 1998; Llinares et al., Sci Rep, 2015), Abp1-mCherry (Kaksonen et al., Cell, 2003; Picco et al., eLife, 2015), GFP-Vps21 (Toshima et al., Nat. Comm, 2016), Gga2-mCherry (Daboussi et al., NCB, 2012), GFP-Snc1p (Lewis et al., MBoC, 2000), and GFP-Ypt31 (Kim et al., Dev Cell, 2016). We have also added data showing the functionality of Abp1-mCherry (Fig. 2–figure supplement 1A), Sec7-iRFP (Fig. 1–figure supplement 1F), Gga2-mCherry (Fig. 5–figure supplement 2G), and GFP-Ypt31p (Fig. 7–figure supplement 1A) in the new manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      Luu et al. have developed a genome-edited African elite rice variety, Komboka. The work was initiated in response to the outbreak in Eastern Africa by Xanthomonas oryzae strains that are phylogenetically related to Asian strains and carry TALes, similar to strains from China, possessing an expanded repertoire of TALes compared to those in endemic strains. As these emerging strains contain TALe targeting SWEET11a, as well as that suppressing Xa1, pthXo1, and iTALes, the authors have generated edited lines targeting promoter regions of SWEET11a, 13 and 14 in African elite rice variety, Komboka. The same team has previously generated genome-edited lines targeting the promoter regions of SWEET11a, 13, and 14 in varieties Kitaake, IR64, and Ciherang-Sub1. Bacterial blight outbreaks and emerging pathogen lineages remain to be a threat to rice production. Thus, efforts in targeting pathogen weaknesses to generate genome-edited varieties possessing broad-spectrum resistance are required. The survey, collection of isolates, and strain characterization studies on >800 strains are commendable. This study has taken advantage of this ongoing collection to stay ahead in the arms race to deploy broad-spectrum resistance in an elite rice variety using TALe targets.

      Overall conclusions presented here are supported to some extent; however, I have listed some of my comments and concerns below.

      1) Data in supplementary table 2 suggests that Komboka is still a moderately resistant variety under field conditions in Africa, with a disease severity scale of 2 i.e. 4-6% disease severity, compared to other varieties having a disease severity scale of 5. Thus, I am not convinced that emerging strains are of concern on the Komboka variety under field conditions, thus, question the justification of Komboka being a choice for editing to tackle emerging strains.

      We apologize; Table 2 is admittedly hard to read with the geo data. We have thus added a new Figure 1 with maps. Please note that the data in this table are from 2022. If you look at, for example, the Morogoro region (Dakawa and Lukenge), it appears that there, too, the initial scale (number of plants infected) was low and became more severe in the subsequent years, as one might expect. We thus hypothesize that in the upcoming analyses the scale will also become much higher; this snapshot therefore cannot serve as a measure of general susceptibility. As we noted in the response to the Editor, the Kaufmann clipping assays are widely used by breeders to evaluate resistance in greenhouse conditions, and since the assay uses severe wounding and extremely high bacterial inocula, it is a reliable measure of susceptibility. Note also that Komboka was chosen before the outbreak was characterized. Our data show that Komboka is highly susceptible to Asian strains, as well as to the introduced strains. Note also that we characterized the R gene outfit as far as feasible, and found two R genes that can explain the resistance to the endemic African strains. Note that single, double and triple R gene mutant combinations have been broken in India; thus we deemed it necessary to create a rational approach that prevents SWEET gene recruitment to generate broad spectrum resistance. xa13 has likely only been broken by circumventing SWEET11a (by using SWEET13 or 14), but still stands up in quintuple breeding combinations in India. Thus, we expect that our lines will be rather robust, which will have to be tested in future field trials in Kenya where this variety is highly cultivated. We added text to the Results and Discussion sections and a new section on sampling in Methods with respective references that show the correlation of data from assays with the same strains in greenhouse and field.

      2) Is Xa4 from Komboka related to Xa4_Teqing? The breakdown of Xa4T was due to the mutant allele of avrXa4 in virulent Xoo CR6. But this breakdown was accompanied by a fitness penalty and residual QTL had a significant residual effect on virulent strains. Would this be why Komboka carrying Xa1 (Xa45(t) and Xa4 under field conditions still showed moderate resistance? But Xoo strains showed susceptibility in leaf clipping assays.

      We apologize, this was a typo that has been corrected. Komboka is a high-yielding variety; we thus cannot comment on any yield penalty here. It is superior and now widely accepted in Kenya. We responded regarding the moderate resistance in the previous paragraph. Komboka is fully susceptible to the Asian strains that induce SWEET11a.

      3) I felt a bit of a disconnect in sections on phenotypic assays confirming the virulence profile of strains on Komboka and then understanding mechanisms underlying virulence since the same strains used in path data were not the ones mentioned in WGS and TALe analysis, leaving the readers with the only one strain to support the hypothesis of the basis for higher disease severity on Komboka due to new TALes, pthXo1, and iTALe. Do authors have pathogenicity data for African strains T19, Dak16, and Xoo3-1 that grouped with endemic African strains on Komboka? Authors present data on CIX4457, 4458, and 4462 being virulent on Komboka, and show that they cluster with Asian strains. However, in the tree, 4462 is the only one shown to be closely related to Chinese strains. Where are 4457 and 4458 placed? Do 4457 and 4458 also contain pthXo1 and iTALe? Authors could also provide path data for 4506/4509 that they included in TALe figure and in the phylogenetic tree.

      We had initiated WGS of 8 strains (3 from Dakawa and 5 from Lukenge), but at the time of submission, not all genomes were fully polished. Although not all are in a publishable state by now, we were able to determine the similarity as well as the presence of pthXo1 and iTALes. The number of SNPs among the 8 strains is extremely low (between 1 and 4), strongly intimating that they are siblings. They are so similar that we cannot at present trace the origin. All eight strains isolated in Dakawa in 2019 and in Lukenge in 2021 contain iTALes and the PthXo1B variant. It is nearly certain that they are all derived from a single introduction event. We fully understand your comment. We apologize, since we should not have used the CIX nomenclature, which was introduced to obtain a more reliable code for the strains. We have introduced a clearer nomenclature while keeping the code for the database. We added a new Figure 2-supplement 1, which shows that Komboka is susceptible not only to the three strains isolated in Dakawa in 2019, but also to one of the strains isolated from Lukenge in 2021. We replaced Fig. 3 with a new phylogenetic tree including the eight strains and provide more detailed information on the relation of those strains. In principle it would be sufficient to use a single isolate in this case. We now provide, as far as possible, the new data (analysis is ongoing) as well as new data for some strains collected in 2022, and conclude that the strains identified in 2022 are also derivatives from an initial introduction in the Morogoro region. It is also clear from Fig. 2 and its supplement that Komboka is fully susceptible to the strains isolated from Dakawa and Lukenge, as susceptible as to the Philippine reference strain PXO99A, which also uses PthXo1.

      4) The authors present pathogenicity data on EBE-edited T0, T1, and T2 lines of Komboka which are promising against the Tanzanian strains carrying new TALes. The cas9/cpf1 system developed here to target multiple EBEs will be a valuable contribution to the scientific community. What are the agronomic characteristics of these edited lines? As the edited lines have not been tested against a diversity panel of Asian and African strains, I would be skeptical of the choice of the term "broad-spectrum" yet.

      Virulence of Xoo depends critically on the recruitment of at least one of the three SWEETs (11a, 13 or 14). Single R genes, such as xa13 can be overcome by using SWEET13 or 14. All strains that are virulent carry at least one TALe that targets a SWEET. Thus, by blocking all known EBEs, we obtain broad spectrum resistance. We have not observed a single case yet where this is not working. Note that in the case of EBE edited Kitaake, we tested about 100 different strains from a world-wide collection, for IR64 and Ciherang-Sub 1 also many representative strains, and we now show data for Komboka and additional varieties. Thus, based on the current knowledge, including the information gained from Xoo genome sequences that have been published, e.g., recently from India, there is at present no strain known that can overcome this resistance.
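
      The reasoning behind blocking all EBEs can be written as a simple logical model, sketched below. A strain is treated as virulent if at least one of its TALes can still bind an intact EBE in a SWEET promoter. This is a deliberately simplified illustration, not an analysis from the manuscript: only the PthXo1/SWEET11a pairing is taken from the text, while the other TALe labels, the strain names and the reduction of virulence to a yes/no outcome are hypothetical.

```python
def is_virulent(strain_tales, tale_targets, blocked_ebes):
    """True if any TALe carried by the strain can still induce a SWEET gene.

    strain_tales: set of TALe names carried by the strain
    tale_targets: dict mapping a TALe name to the SWEET gene whose promoter EBE it binds
    blocked_ebes: set of TALe names whose EBE is edited or naturally absent in the host
    """
    return any(
        tale in tale_targets and tale not in blocked_ebes
        for tale in strain_tales
    )


# PthXo1 -> SWEET11a is stated in the text; the other two labels are hypothetical.
tale_targets = {
    "PthXo1": "SWEET11a",
    "TALe_B_hypothetical": "SWEET13",
    "TALe_C_hypothetical": "SWEET14",
}

strains = {
    "strain_using_SWEET11a_only": {"PthXo1"},
    "strain_with_second_TALe": {"PthXo1", "TALe_B_hypothetical"},
}

single_gene_host = {"PthXo1"}            # xa13-like: only the SWEET11a EBE is blocked
fully_edited_host = set(tale_targets)    # all known EBEs are blocked

for name, tales in strains.items():
    print(name,
          is_virulent(tales, tale_targets, single_gene_host),
          is_virulent(tales, tale_targets, fully_edited_host))
```

      In this model a host with a single blocked EBE is overcome by any strain carrying a TALe for another SWEET gene, whereas a host with all known EBEs blocked is not. The model says nothing about TALes with EBEs that are not yet known.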

      Regardless of my comment earlier on Komboka being moderately resistant under field conditions and thus a questionable choice for EBE-editing here, the genome-edited lines in any variety imparting resistance to bacterial blight remain to be a valuable contribution to managing disease outbreaks.

      We commented on the interpretation of moderate resistance above, but appreciate the comment that these lines will be valuable.

      5) As this manuscript utilizes the diversity of African strains to generate edited lines, it would be good to make diagnostics and path data for 833 strains available to the scientific community (instead of select strains as indicated in the supplementary table), especially for the fact stated here in the manuscript about scarce data on Xoo in Africa and the goal of systematic comparison of the pathogen population.

      We are currently preparing a manuscript that will include an extensive analysis of these strains, and focus on the diversity of African Xoo strains, i.e., MLVA-based diversity of the collection. This manuscript, which is in preparation, will include the requested data.

      Reviewer #2 (Public Review):

      This study describes the emergence of virulent strains of the rice bacterial blight pathogen Xanthomonas oryzae pv. oryzae in the Morogoro rice-growing region in Tanzania. The aims of the study were to describe the virulence features of the emerging population, as compared to previous bacterial blight outbreaks in Africa, and generate an elite rice variety that is resistant to both pathogen populations. To achieve these aims, the authors characterized the genetic basis of the virulence of these new strains by sequencing the genomes of three representative strains and phenotyping using excellent genetic resources for identifying the susceptibility gene targets of this pathogen in rice. They then used two rounds of hybrid CRISPR-Cas9/Cpf1 to successfully edit six targets of the pathogen in an East African rice variety, which conferred resistance to all strains tested.

      The strengths of this paper are the systematic analysis of the virulence of emerging pathogen strains relative to strains from previous outbreaks and the successful creation of edited lines that will form the basis for continued efforts to gain regulatory approval for the introduction of resistant rice in East Africa. The creation of the edited line is a substantial and important contribution; indeed, the authors include strains collected in 2021 and include disease severity data from 2022 in the supplementary data, illustrating the urgent need for solutions.

      The weaknesses of the study are largely related to the quick turnaround between data collection and manuscript submission.

      1) Different strains are used for different experimental work and sequence analysis, making relationships between different parts of the work unclear and also more challenging for the reader to follow because of changing strain designations. CIX4457, CIX4458, and CIX4462 were virulent on rice near-isogenic-lines, CIX4457 and CIX4505 were used for identifying SWEET targets and phenotyping edited lines, while whole genome sequencing was conducted with CIX4462, CIX4506, CIX4509.

      We added new information which demonstrates that the strains isolated in 2019 in Dakawa and the strains from Lukenge (2021) are very closely related and differ by only 1 to 4 core genome SNPs (see new supp Fig. 3A). We added a new Figure 2-supplementary Figure 1 and expanded Table 1 to show that the strains from Lukenge and Dakawa behave in a similar manner. We are aware of the differences in the figures but hopefully have now addressed them in an acceptable manner; we did not want to combine data from different experiments. The differences in strain use are due to i) the different timing of strain sampling and isolation (those from 2019 were isolated first, and the long and tedious work of leaf-clipping the whole set of NILs with the entire diversity strain panel therefore did not include Tanzanian strains from 2021, which were isolated much later, in part because of a long delay in having the infected leaf material sent out; including them in the NIL testing would have taken us another year given the volume of this experiment), and ii) the variable quality of whole genome sequencing of the strains. Overall, we have sequenced the genomes of 8 newly introduced strains, including 3 from Dakawa_2019 and 5 from Lukenge_2021 (see new suppl. Table 3, which gives a detailed overview of the genomic analysis of these strains). The best genome sequences were obtained for strains CIX4462, CIX4506 and CIX4509 (renamed, for the sake of clarity, in the revised version of this MS as iTzDak19-3, iTzLuk21-1 and iTzLuk21-2), for which a circularized chromosome could be generated. Unfortunately, these were not the strains that we had selected for SWEET characterization and phenotyping of edited lines, for which one representative strain of each collection had been randomly picked, namely CIX4457 and CIX4505 (now iTzDak19-1 and iTzLuk21-3, respectively). 
To reconcile these two sets of data and show that strains from Dakawa and Lukenge are actually extremely similar, we have performed a SNP-based phylogenetic analysis of the 8 strains demonstrating that they all cluster as one homogeneous genetic lineage, in line with a scenario whereby all these strains result from a single introduction event from Asia. Careful analysis of these additional genomes also confirmed the presence of a pthXo1-like allele (pthXo1B) and iTALes in all Tanzanian strains introduced from Asia. One exception is strain iTzLuk21-3 (CIX4505), where the poor quality of the pthXo1B sequence, with potential frameshifts, prevented any confirmatory analysis. Taken together, these data support the hypothesis that all new isolates, irrespective of the year of sampling, are genetically very close and share the same virulence characteristics.

      2) Disease survey results from 2022 are listed in Supplementary Table 2, but it is challenging for the reader to summarize across many lines of data, which appear to represent individual samples.

      We agree that this was not the best way to show the data. In addition to the new suppl. Tables 1 and 3 we have now generated a new Figure 1 which contains maps of the disease distribution and severity across Tanzania in the different years as well as photos from the fields in Dakawa from 2019 and Lukenge in 2021 that highlight the massive infections.

      3) The focus of the editing is Komboka but bacterial blight in 2022 was mostly on other varieties. It would be helpful to have more context on this variety and what has prevented adoption by the growers in the Morogoro region to date.

      The variety was chosen several years ago after extensive consultations with breeders from IRRI, IRRI Africa, and India, since it is high yielding, and was specifically generated for Kenya where it has a high level of adoption. Tanzania has apparently not yet adopted this variety as you can see from Table 2. Also, Tanzania does NOT have any regulations for genome edited crops and we can thus NOT provide the lines to Tanzania. By contrast, Kenya has established a regulatory framework by which the local government authorities can import transgene-free edited lines. We are currently segregating the transgenes out and have established a thorough set of measures to validate whether the lines still contain transgenes (including vector backbone and T-DNA remnants). Tanzania will have to establish suitable guidelines. We would like to note that establishing protocols for different elite varieties is challenging and time consuming and we had early on, in 2019, decided to initiate work on transformation protocols for this variety. If Tanzania also adopts regulations, it would be possible to provide the lines to Tanzania as well, and possibly by then Tanzania will have a higher level of adoption of Komboka. If you look at the maps we show, it is very likely that the disease will spread to all neighboring countries, including Kenya. Thus, our lines may become one possible measure to try to address the outbreak.

      Reviewer #3 (Public Review):

      One key finding of this work is the identification of Xanthomonas oryzae pv. oryzae (Xoo) strains in Africa that, based on their genome sequences and their TALE repertoires, have high similarity with Asian strains. Asian Xoo strains typically overcome NLR-mediated recognition of TALEs in rice by so-called iTALEs. Moreover, some Asian strains contain a TALE resembling PthXo1, a TALE protein that was shown to overcome xa5 resistance.

      The authors now found that some of the newly identified African strains have iTALEs and PthXo1-like TALEs. Such newly evolved African strains were found to be fully virulent on the African rice elite variety Komboka, which is resistant to a broad panel of African Xoo strains.

      Previous studies have shown that TALEs bind to effector binding elements (EBEs) present in promoters of rice SWEET genes to promote disease. Work from the lab of the authors and other labs has shown that TALEs can no longer promote the disease if matching EBEs are changed or deleted by CRISPR or TALEN-mediated mutagenesis. In fact, Bing Yang, one of the authors of this article, published a pioneering Nature Biotechnology article about ten years ago in which he showed that rice plants with mutated EBEs are resistant to Xoo. Recently, a combined effort of the Yang and Frommer labs resulted in two further Nature Biotechnology publications (2019), in which they described, along with other useful tools, rice lines where multiple EBEs were mutagenized in parallel and that provide broad spectrum resistance.

      The work under review describes now CRISPR mutagenesis of an African elite cultivar resulting in a line that mediates resistance to Asian and newly evolved African strains.

      Overall, the work is technically sound. Yet, the approach that has been described - mutagenesis of multiple EBEs - has been used before and is a routine procedure for labs that are focused on such undertakings. While such approaches do not provide new insights for fundamental research, they nevertheless are certainly important and useful in translational research, as demonstrated here.

      We thank the reviewer for the comments. If we may, we would like to add aspects of novelty. We detected an outbreak that is spreading. We determined the disease mechanism, and we used Cpf1 to obtain ‘optimal’ mutations at all sites (a massive improvement over the 2019 publication, which used Cas9), and we try to provide a solution for the outbreak when it spreads to Kenya, or when Tanzania and neighboring countries adopt similar guidelines. This seems highly urgent, as Reviewer 2 points out.

    1. Author Response

      Reviewer #1 (Public Review):

      This study used intersectional genetic approaches to stimulate a specific brainstem region while recording swallow/laryngeal motor responses. These results, coupled with histology, demonstrate that the PiCo region of the IRt mediates swallow/laryngeal behaviors, and their coordination with breathing. The data were gathered using solid methods and difficult electrophysiological techniques. This study and its findings are interesting and relevant. The analysis (and/or the presentation of the analysis) is incomplete, as there are analyses that need to be added to the manuscript. The interpretation of the data is mostly valid, but there are claims that are too speculative and are not well-supported by the results. The introduction and discussion would benefit from more citations and a deeper exploration of how this study relates to other work - especially a thorough accounting of and comparison to other studies concerning putative swallow gates.

      General/major concerns:

      The field of respiratory control is far from unified regarding the role of PiCo in breathing or any other laryngeal behaviors. If anything, the current consensus does not support the triple-oscillator hypothesis (in which PiCo is one of 3 essential respiratory oscillators). The name "PiCo", short for "post-inspiratory complex", suggests a function that has not been well-supported by data - it is a putative post-inspiratory complex, at best. I suggest putting this area in context with other discussions, e.g., of the IRt (such as in Toor et al., 2019) or of Dhingra et al. 2020, who showed broad activation of many brainstem sites at the post-I period (including pons, BotC, NTS).

      The reviewer’s comment refers to our previous publication and not the present one. With all due respect to the reviewer, the submitted study investigates PiCo’s involvement in swallow and laryngeal activation and its coordination with breathing.

      We did not feel that it is appropriate for us to critique the Dhingra paper in the present study. However, since this seems to be important to this reviewer, we would like to clarify: Because of filter characteristics, and the low temporal and spatial resolution of these field recordings, the approach used by Dhingra is inappropriate for providing insights into the presence or absence of PiCo. We therefore developed an alternative approach, which provides more detailed insights into population activity, the Neuropixel approach. This Neuropixel recording from PiCo (black trace) exemplifies how field recordings (yellow) fail to pick up post-I activity. We could provide many more examples, but as stated above, addressing the study by Dhingra is tangential to the present study.

      We would also emphasize that the study by Dhingra was never designed to provide negative evidence, and Dhingra et al. never claimed that their study demonstrates the absence of PiCo. Unfortunately, the data by Dhingra were misinterpreted by Swen Hülsmann in his Journal of Physiology editorial, which created considerable confusion, but also sensation, in the field. Objectively, Toor et al. reproduced the Anderson study in rats, as we will elaborate below. Unfortunately, Toor et al. added to the confusion by renaming the PiCo area as IRt. The field of respiration would also have been confused if the first study reproducing the Smith et al. 1991 study in a different rodent species had refused to call this area preBötC and instead had called it, e.g., the ventrolateral reticular field.

      Did you perform control experiments in which the opto stimulations were done on animals without the genetic channels (for example, WT or uncrossed ChAT-ires-cre, etc.), or in mice with the genetic channels that weren't crossed (uncrossed Ai32 mice)? If so, please include. If not, why?

      Yes, we performed many control experiments. Aside from many recordings in which viral injections were targeted outside PiCo, we also performed optogenetic stimulations in mice lacking channelrhodopsin. We have now added the following statements and supplemental figure.

      Optogenetic stimulation in mice lacking channelrhodopsin

      Stimulation of PiCo, across all stimulation durations, in 3 Ai32+/+ mice and 4 ChATcre:Vglut2FlpO:ChR2 mice where the ChR2 did not transfect ChATcre:Vglut2FlpO, as confirmed by a post-hoc histological analysis, resulted in no response (Fig. S3).

      How do you know that your opto activations simulate physiological activation? First, the intensive optical activation at the stim site does not occur in those neurons naturally.

      This seems like a generic critique of the optogenetic approach. In none of the 10,000+ published optogenetic studies is it known to what extent optogenetic activation stimulates exactly the same neurons and the same degree of activity as during a natural behavior. What we know is that PiCo neurons are activated during postinspiration (Anderson et al. 2016), that optogenetic activation stimulates these neurons, and that this activation recruits the same muscles in the same temporal sequence as a water-evoked swallow. We assume that the reviewer’s comment does not intend to imply that “swallows” evoked by nonspecifically stimulating the SLN are more physiological than the optogenetically-evoked swallows of a specific neuron population? From the reviewer’s other comments, it is obvious that the reviewer has no problems with the results of the Toor study that used exclusively SLN stimulations, an approach which is known to be very non-specific.

      Doing a natural (water) stim for comparison is good, but it cannot necessarily be directly compared to the opto stim. The water stim would activate many other brainstem regions in addition to PiCo.

      Can the reviewer provide any hard evidence that “many other brainstem regions” are activated by water stimulation in comparison to optogenetic stimulation?

      A caveat is that opto PiCo stim =/= water stim (in terms of underlying mechanisms) should be included. Second, in looking at the differences between water vs opto swallows in Table S2: it appears that the ChAT animals (S2A) have something weaker than a swallow with opto stim. For the Vglut2 and ChAT/Vglut2 (S2B&C), the opto swallows also aren't as "strong" as the water swallows (the X and EMG amplitudes are smaller). The interpretation/discussion attributes this to the lack of sensory input during opto stim, but does not mention the strong possibility that there is a difference in central mechanisms occurring. It also seems to be dismissed with the characterization of the swallow as "all-or-none" (see note on Fig 3 results).

      With all due respect, we are somewhat surprised that the reviewer dismisses the entire paragraph in the discussion that specifically addresses the comparison between water-swallows and PiCo-stimulated swallows. We discussed the possibility that PiCo stimulated swallows may not activate the full pathway/mechanism as does the water swallow. We carefully compared and confirmed that PiCo-stimulated swallows have the same temporal motor sequence of the same muscles as those activated in water swallows. As already stated, it is surprising that the reviewer has no problem with accepting the validity of previously published methods like electrical non-specific stimulations of the cNTS or SLN, a frequently used and accepted model to produce and study swallow.

      The writing needs extensive copy editing to improve clarity and precision, and to fix errors.

      Thank you for this comment, we have revised and reviewed the writing.

      Results/Fig 1: What proportion had no/other motor response (non-swallow, non-laryngeal) to the opto stim? I can extrapolate by subtraction, but it would be nice to see the "no/other response" on the plot.

      With all due respect to this reviewer, it is not possible to address this question. Specifically, it is not possible to know whether a “no response” (meaning that no behavioral output occurred in response to PiCo stimulation) would have resulted in a swallow or laryngeal activation. However, figure 2 contains responses other than swallows, i.e. “non swallows”, which include both laryngeal activation and “no responses”, meaning no behavioral response to PiCo stimulation. This was determined to assess how the respiratory rhythm is affected when a swallow is not produced by PiCo stimulation.

      The explanation of genetics is too spread out and confusing. There needs to be more detail about all the genetic tools used, using the standard language for such tools, in one spot. Please also provide a clear explanation of what those tools accomplish. Include a figure if necessary.

      We apologize for creating confusion. We added more explanations to the text.

      Pick a conventional designator/abbreviation for the different strains, define them in the methods and in the first paragraph of the results section, and use those abbreviations throughout. I think that using ChAT as an abbreviation for your ChAT-ires-cre x Ai32 mice is confusing because it makes it sound like you're talking about the enzyme rather than the specific strain/neurons. Saying "ChAT stimulated swallows... swallows evoked by water or ChAT" makes it sound like the enzyme choline acetyltransferase itself is stimulating swallow. As is convention, pick a more precise abbreviation like ChAT-cre/Ai32 or ChAT:Ai32 or ChAT-ChR2 or ChAT/EYFP. This goes for the other strains as well.

      Thank you for pointing this out. To avoid confusion the strains/neurons are now referred to as: ChATcre:Ai32, Vglut2cre:Ai32, and ChATcre:Vglut2FlpO:ChR2

      For Fig S2C&D, why does it say mCherry? Isn't it tdTomato? Is it just an anti-ChAT antibody and then the tdTomato Ai65 is only labeling Vglut2? I don't see this in the methods section.

      Thank you for pointing this out. We apologize for our mistake, and we have corrected the manuscript to say tdTomato.

      I also don't see methods for all the staining in Fig S3. The photomicrograph says Vglut2-cre Ai6, but there's no mention of Ai6 anywhere else. Which mice are these? Did you cross Vglut2-cre with an Ai6 reporter mouse? How can you image an Ai6 mouse (which I assume expresses ZsGreen? and that you excited at 488?) and a 488 anti-goat in the same section (that's the only secondary listed in the methods that would work with your goat anti-ChAT)? Is there an error in listing the fluorophores in the methods? Please give more details on the microscopy including which filters were used for the triple staining.

      We have decided to remove the CTb data from the manuscript.

      Regarding the staining: I would expect the staining/maps for the 2 different ChAT/Vglut2 intersectional strains to be similar (Fig 5A/B and S2C/D). The photomicrographs look very different to me, while the heat maps (this goes for all the heat maps in the paper) have barely distinguishable differences. In Fig 5, the staining looks much stronger than in Fig S2C. Why does it look like there are so many more transfected neurons in Fig 5A2 than there are red neurons in the corresponding panel Fig S2C2? And for Fig 5A4 and Fig S2C44? The plot and results text for Fig 5 says the avg number of neurons was 123 ± 11. The plot for Fig S2D says 112 ± 15, but the results text says 242 ± 12 (not sure which is the correct number).

      Thank you for your comments. Previously the heat maps had different scale bars if you compare Fig 5A/B and S2C/D (now figure S4C/D). We changed the heat maps, keeping the same scale for all of them. Regarding the representative photomicrographs, Fig 5A/B and Fig S4C/D both represent the same cluster of cells (PiCo ChAT/Vglut2+). Figure S4D states 242 ± 12 neurons (also stated in the results section).

      However, we want to point out that there are several technical differences between the two: 1) figure 5A represents the transfection produced by the virus injection, which affects the number of cells stained/transfected (133 ± 16 neurons); 2) figure S4C/D represents an intersectional mouse, ChATcre: Vglut2FlpO: Ai65 (242 ± 12 neurons). In this case, we have more tdTomato positive cells because this genetic approach is able to detect most of the ChAT and Vglut2 cells. The difference between figures is considered normal for anatomical studies; in some studies the same bregma level can show different numbers of cells. Thus, the differences are due to the differences in the type of approach (viral expression vs. intersectional approach).

      We have also added additional experiments to figure 5 (now N=7) which has been reflected in the text and figures.

      The results text for Fig S2C also says the staining is "similar to the previous ChAT staining...", which I assume refers to S2A/B. The plot and results text for Fig S2B reports 403 ± 39 neurons, while S2D is either 112 or 242 (not sure?). The plots have different Y scales, which should be changed to be the same. But why do the photomicrographs and the heat maps look so similar? I would expect far fewer neurons to be stained in the intersectional mice (Fig 5 and Fig S2C/D) than in the ChAT staining (Fig S2A/B). I am having trouble reconciling the different presentations/quantifications and making sense of the data in these histology figures.

      We removed “similar to the previous ChAT staining” and we have reviewed the heat maps. Since the original submission, we performed more experiments and have now added more animals to the analysis (now N=7); each heat map represents the correct number of neurons in PiCo for the respective experiment.

      The Y scales have been adjusted to better compare the ChAT staining with the triple-conditioned intersectional mice.

      How can you distinguish PiCo from non-PiCo in the histology, especially in the ChAT-only staining? It seems that you have arbitrarily defined the PiCo region, and only counted neurons within that very constrained area.

      Even in ChAT-only staining, the N.ambiguus is very distinct from the cholinergic neurons located more medial to the N.ambiguus. This can be unambiguously confirmed by combining ChAT with glutamatergic in situ staining as done in the Anderson et al. study, or unambiguously demonstrated with the viral approach as done in the present study. Thus, we don’t see why it is arbitrary to define the distribution of PiCo neurons. What is arbitrary is the definition of the preBötC, yet the field of respiration seems to have no problem with this. We assume that the reviewer knows that Dbx1 neurons are spread along the entire ventral respiratory column and dorsal portion of the PreBötzinger Complex up to the level of the XII nucleus. Yet it is commonly accepted for authors to refer to the PreBötzinger Complex by counting Dbx1 neurons within a constrained area of what is believed to be the PreBötzinger Complex, even though the borders are arbitrary. It is e.g. known that some of the ventrally located preBötC neurons are presumed rhythmogenic while the more dorsally located Dbx1 neurons are premotor. The transition from rhythmogenic to premotor is gradual. Similarly, NK1 staining, or SST staining is not restricted to the preBötC and it is arbitrary to define where preBötC begins and what to include. Indeed, our PNAS paper indicates that inspiratory bursts can be generated by optogenetically stimulating Dbx1 neurons along the entire VRC column – so it is not clear where the rhythmogenic portion of the preBötC begins rostrocaudally and dorsoventrally and where the rhythmogenic portion and preBötC itself ends. Thus, we want to re-iterate and emphasize that, for the present study, we developed a method using the cre/FlpO approach to unambiguously define the PiCo region. It is surprising that this reviewer does not acknowledge this technical advance, which added significantly more specificity to the anatomical and physiological characterization of PiCo than the Toor et al. study, and also the Anderson et al. study.

      I can see stained neurons in the area immediately outside of PiCo, and I'd like to see lower-magnification images that show the staining distribution in a broader region surrounding PiCo as well, especially in the rest of the reticular formation.

      We characterized the PiCo area based on the histological phenotype and in vitro and in vivo experiments performed by Anderson et al., 2016. PiCo is an area located close to the NAmb, presenting the same ChATcre phenotype. As stated above, the distribution and agglomeration of the NAmb is clearly very compact, and different from that of the ChATcre: Vglut2FlpO: Ai65 neurons observed outside of the NAmb. It is also important to emphasize that, as is the case for the preBötC, other transmitter phenotypes of neurons are also present in the PiCo region (i.e. GABA or Dbx1). However, the study by Anderson et al., 2016, described only the functions of cholinergic neurons located in PiCo, and we always planned to publish a paper on the other neurons within PiCo – this area e.g. contains pacemaker neurons etc. But we hope that the reviewer acknowledges that many investigators have studied the preBötC for the past 30 years. Hence, much more information has been accumulated on this region (which btw was at least as controversial at the beginning), and it will likely take at least another 30 years to fully identify and characterize PiCo.

      Similarly, how can you be sure you're stereotaxically targeting PiCo precisely (600um in diameter?) with your opto fiber (200um in diameter). Wouldn't small variations in anatomy put the fiber outside the tiny PiCo area?

      We assume the reviewer means “stereotactically”. And yes, the reviewer is correct, it is necessary to position the laser at a consistent anatomical location. Placement of the optical fibers outside of this area does not result in activation of PiCo. We have added an additional supplemental figure (Figure S6) to address this.

      Please put N's and stats results in Table S1 for both swallow and laryngeal activity. I took what I assume to be the Ns (10, 11, and 4) and did some stats like the ones you presented for the laryngeal duration. The differences between vagus duration for 40 and 200 ms pulse durations are all significant for each strain, by my calculations. Also, I think there must be an error in the orange swallow plot in Fig 3A. The orange dots don't correspond to the table values. I plotted all the Table S1 values for each strain. Each line looks similar to the blue laryngeal activation plot in Fig 3A. The slopes of the Vglut2 were less than the other strains, and the slopes for the swallow behavior were less than the laryngeal behavior for all strains. Otherwise, they all look similar. Please double-check your values/stats to address these discrepancies. If it is indeed true that the stim pulse duration affects swallow duration, revise the interpretations and manuscript accordingly.

      We thank the reviewer for the diligence in reviewing our manuscript. But, with all due respect, the reviewer is incorrect and misunderstood the data. To clarify: Table S1 presents data only for laryngeal activation; swallow data are presented in Table S2. The orange data points in Fig 3A are not detailed in Table S1 or S2. Table S2 is the average of all swallows across all laser pulse durations, since the laser pulse duration does not affect swallow behavior duration. All data will be publicly available after publication of the manuscript.

      Figure 3A represents only the ChATcre:Vglut2FlpO:ChR2 column of Table S1.

      The N’s have been added to Table S1.

      Please add more details on stats in general, including the specific tests that were performed, F values and degrees of freedom, etc.

      Thank you, this has been added throughout the results section. Please refer to the results section for this addition. Below we provide an example.

      An example: A two-way ANOVA revealed a significant interaction between time and behavior (p<0.0001, df= 4, F= 23.31) in ChATcre:Vglut2FlpO:ChR2 mice (N=7).

      How do you know that you're not just activating motoneurons in the NA when you stimulate your ChAT animals, especially given the results in Fig 1B? In this case, the phase-specific results could be explained by inhibitory inputs (during inspiration) to motoneurons in the region of the opto stim.

      As stated in this paper as well as the Anderson et al 2016 paper (and for that matter also the Toor et al study) this is a caveat. This major caveat motivated the development and use of the ChATcre:Vglut2FlpO:ChR2 experiments (specifically targeting the PiCo neurons that co-express ChAT and Vglut2, not laryngeal motor neurons), which yield mostly the same response as the ChATcre:Ai32 mice. We cannot say this is due to inhibitory inputs to laryngeal motoneurons, since the cre/FlpO specific experiments are not directly activating laryngeal motoneurons. But we do not want to entirely exclude that some premotor mechanisms may also occur in PiCo. The reviewer may know that there is overlap of rhythmogenic and premotor functions for the Dbx1 neurons in the preBötC. But addressing this issue is beyond the scope of this study. In fact, we are working on a separate connectivity study using novel, still unpublished anterograde and retrograde vectors that do not reveal any direct connections to laryngeal motoneurons. Hence, we expect that the connectivity from PiCo to laryngeal motoneurons is more complex, and addressing this question cannot be done as a simple add-on to an already complex study. Again, we would refer to the PreBötzinger complex, where nobody expects that one study can resolve all the physiological and anatomical characterizations that have been accumulated over 30 years. We would argue that in some ways, our cre/FlpO approach is more specific than the Dbx1 stimulations, which activate not only rhythmogenic PreBötzinger complex neurons, but also premotoneurons as well as glial cells, and many neurons rostral and caudal to the PreBötzinger complex. We are aware of these caveats, and we have discussed this in the original submission, and also in the revision.

      While the study from Toor et al is cited, there needs to be a much more thorough discussion of how their findings relate to the current study.

      Many thanks for asking for a more thorough discussion of Toor et al., which we are happy to provide here. Perhaps we were too polite in our original manuscript to emphasize all the problems in that study.

      They demonstrated that PiCo isn't necessary for the apneic portion of swallow. Inhibiting this region also didn't affect TI.

      Please note – the fact that Toor et al. did not find an effect on TI confirms Anderson et al. 2016: In Figure 3G, 3F of the Nature paper, the reviewer will find that injections of DAMGO and SST into PiCo inhibited post-I activity without affecting inspiratory duration. This figure also shows that the inspiratory burst can terminate in the absence of postinspiratory activity.

      The reviewer states: “They demonstrated that PiCo isn't necessary for the apneic portion of swallow”. With all due respect to this reviewer, this is NOT correct. Toor et al. showed that inhibiting PiCo did block SLN-evoked fictive swallows but not the apnea caused by SLN stimulation. This is not the apnea caused by swallows (which was never studied by Toor et al.), but the apnea caused by the SLN stimulation. The apnea evoked by SLN stimulation has most likely nothing to do with the apnea caused by swallows. Unfortunately, Toor et al. make the same misleading claim as the reviewer.

      PiCo cannot be the sole source of post-I timing, and the evidence overwhelmingly favors the major involvement of other regions such as the pons.

      This comment seems to be unrelated to the main thrust of this paper, which studies PiCo’s involvement in swallow and laryngeal activation in coordination with breathing. However, since this comment seems to discredit the Ramirez lab in general, we would like to clarify that inhibiting PiCo with DAMGO and SST inhibits post-I activity (Anderson et al. 2016, Fig. 3G, 3F). Thus, we don’t understand the rationale or actual data for the reviewer’s conclusion that PiCo cannot be the sole source of post-I timing. We also don’t understand the basis for the reviewer’s conclusion that “the evidence overwhelmingly favors the major involvement of other regions such as the pons”. We also want to add that nowhere in the Anderson et al. study did we state that the pons plays NO role. Indeed, we specifically stated: “In this context it will be interesting to resolve the role of the PiCo in specific postinspiratory behaviors and to identify how the PiCo interacts with other neural networks such as the Kolliker-Fuse nucleus, a pontine structure that has been hypothesized to gate postinspiratory activity and the periaqueductal grey a structure involved in vocalization and the control of postinspiration”.

      They also showed that inhibition of all neurons (not just ChAT/Vglut) in the PiCo region suppresses post-I activity in eupnea. This suppression was overcome by the increased respiratory drive during hypoxia.

      Before comparisons are made with Toor et al., it is important to note the species and methodological differences. Toor et al. used an anesthetized, vagotomized, paralyzed and artificially ventilated rat model, which evaluated fictive swallows (deafferented and paralyzed). By contrast, this study uses an anesthetized, vagal-intact, freely breathing mouse model and evaluates natural physiologic swallows evoked by water and by central stimulation. It seems that the reviewer does not acknowledge one of the main innovations of this study. For this study we introduced a genetic approach to specifically target and activate ChATcre/Vglut2FlpO PiCo neurons. This has never been done before, and developing this approach took more than 4 years of breeding, crossing and testing different options.

      As for Toor et al., these authors pharmacologically, bilaterally inhibited neurons in the area of PiCo with isoguvacine, a specific GABA-A agonist. Even though this pharmacological intervention does not specifically inhibit cholinergic/glutamatergic neurons in PiCo, these authors essentially confirm the study by Anderson et al. We do not find this finding controversial. Perhaps the reviewer finds the definition of PiCo “controversial” because Toor et al. called the identical area IRt instead of PiCo, even though they exactly reproduce the finding by Anderson et al. Toor et al. not only arrive at the same conclusion as Anderson et al. but also added more details, none of which contradicts the results by Anderson et al. Here are excerpts from the Toor study: “We therefore conclude that the ongoing activity of neurons in the IRt contributes to eupneic respiratory and sympathetic post-I activities without exerting significant control on other respiratory or cardiovascular parameters”; “IRt significantly inhibited the post-I components of VNA”; “IRt inhibition was also associated with a reduction in PNA”; “increase in respiratory cycle frequency”; “due to a reduction in TE”; “with no effect on TI observed”; “Bilateral microinjection of isoguvacine selectively reduced the magnitudes of post-I VNA and rSNA, but not PNA responses to acute hypoxemia”.

      In this statement the reviewer probably refers to one particular aspect, i.e. the fact that Toor et al. did not significantly block some of the post-I activity. They state: “had no significant effect on the AUC of post-I rSNA (305+/- 24 vs 230+/- 28,p=0.16,n=6)”. Please note that there is a tendency, a reduction from 305 to 230. Perhaps the Toor study was not sufficiently powered to fully block the effect; perhaps the drug did not inhibit the entire PiCo. These are all open questions that a critical reader should know. The reviewer will agree that it is as difficult, if not more difficult, to demonstrate the absence of an effect as to demonstrate its presence. To arrive at a negative conclusion, experiments should be done with the same scrutiny as those demonstrating a positive result. We also assume that the reviewer is familiar with animal experiments and will understand that pharmacological injections are often difficult to interpret, in particular in the case of local in vivo injections. It is possible that Toor et al. were also inhibiting, e.g., parts of the Bötzinger complex.

      We have added to the manuscript the following statement: It is important to note that SLN stimulation does not only trigger swallows, but also changes in the overall stiffness and tension of the vocal cords (Chhetri et al., 2013) as well as prolonged hypoglossal activation independent of swallowing (Jiang, Mitchell, & Lipski, 1991). It has been hypothesized that inhibition of the IRt blocks fictive swallow but not swallow-related apnea. Yet this apnea was generated by SLN stimulation and not by a natural swallow stimulation (Ain Summan Toor et al., 2019). It is known that SLN stimulation causes endogenous release of adenosine that activates 2A receptors on GABAergic neurons resulting in the release of GABA on inspiratory neurons and subsequent inspiratory inhibition (Abu-Shaweesh, 2007), suggesting that the SLN evoked apnea may not be the same as a swallow related apnea. Moreover, microinjections of isoguvacine into the Bötzinger complex attenuated the apneic response but not the ELM burst activity (Sun, Bautista, Berkowitz, Zhao, & Pilowsky, 2011), suggesting the Bötzinger complex, not PiCo, could be involved in modulating apnea.

      We would also like to add that our current study characterized swallow-related specific muscles and nerves in both water-triggered and PiCo-triggered swallows to better characterize the physiological properties of this swallow behavior. By contrast, Toor et al. only characterized nerve activities that are involved in multiple upper airway activities and breathing. It is somewhat surprising that the reviewer did not consider the fact that Toor et al. characterized putative swallows that were triggered by SLN stimulation, and that Toor et al. were content with nerve recordings and failed to confirm that the behavior they evoked is actually a physiological swallow. According to the comments from this reviewer (see above), this leaves open the possibility that central mechanisms differ between fictive and physiological swallows.

      While we have cited Toor et al. and their truly excellent work in the broad IRt, we did not feel it is appropriate to critique them for the fact that they are confusing the field by using a different anatomical term for the area that was clearly defined by us as an area containing cholinergic-glutamatergic neurons. We also did not feel it is appropriate to compare results when doing so amounts to comparing apples and oranges. Toor et al. never specifically manipulated glutamatergic-cholinergic neurons; thus their entire results rest on indirect stimulation affecting this general area, which will unavoidably also include laryngeal motoneurons. We don’t want to criticize this approach, since PiCo is heterogeneous, which is another misunderstanding that we find in the reviewer’s critique. We used cholinergic-glutamatergic neurons to define this area. However, like the preBötC, PiCo is also heterogeneous. This region contains inhibitory neurons; it also contains glutamatergic neurons that are not cholinergic, and cholinergic neurons that are not glutamatergic. Because of this heterogeneity we compared the effects of stimulating glutamatergic neurons and cholinergic neurons as well as cholinergic-glutamatergic neurons. This is an approach that is generally accepted in the field. As already stated, there is not a single marker that uniquely characterizes the PreBötC. Thus, stimulating Dbx1 neurons, glutamatergic neurons, or somatostatin neurons only captures subpopulations of this region. The recently published study by Menuet et al. in eLife used even more indirect methods to inhibit the preBötC. They used a pan-neuronal CBA promoter that targets neurons irrespective of phenotype. It is not our intention to discredit this very elegant study, but we object to the statement that we “have arbitrarily defined the PiCo region”.

      This study has not demonstrated some of the things that are depicted in Fig 7 and included in the discussion. While swallow can inhibit inspiration, there are many mechanisms by which this can happen other than a direct inhibitory connection from the DGS to PreBotC. You cite Sun et al., 2011 findings of "a group of neurons that inhibits inspiration" during SLN stim, but don't mention that it is the BotC and that the paper shows that swallow apnea is dependent on BotC. That is also supported by the Toor study. I don't understand how post-I (aka E) can be discussed without discussion of the BotC - this is a glaring omission.

      We have removed Figure 7, which was only meant as a hypothetical schematic.

      Why is it necessary for PiCo to innervate the cNTS?

      This was a hypothesis based on CTb data that we have now removed.

      That is true if the conjecture that PiCo gates swallowing is true, as the cNTS is the only known region for central swallow gating. However, PiCo could influence afferent input to the NTS less directly, and therefore not function as a gating hub per se. The experimental evidence does not warrant the claim that PiCo gates swallowing. The definition of a swallow gate(s) is a topic of much debate and no conclusive experimental evidence has emerged for swallow gating regions to exist anywhere except in the NTS. The current study's evidence also does not meet the criteria necessary to conclusively call PiCo a swallow gate. The authors should soften this claim and language throughout the manuscript.

      Although we do not know of any study that has optogenetically gated swallow in the cNTS, it seems the reviewer objects to our use of the word “gate”. We have revised the manuscript and removed any wording stating PiCo is a swallow “gate”. It would be interesting to know whether the reviewer has the same objections to the use of the word “relay”, as done by Toor et al.

      It is also unclear that PiCo acts directly on the swallow pattern generator to gate swallowing. It is not just "conceivable that the gating mechanism involves" the pons, but nearly certain. Swallow gating by respiratory activity may not be able to be ascribed to one particular location. At a minimum, it likely involves the NTS/DSG, pons, and possibly IRt (inclusive of PiCo). The authors are correct that "further studies are necessary to understand the interaction between PiCo and the pontine respiratory group on the gating swallow and other airway protective behaviors." This is why it shouldn't be stated that "this small brainstem microcircuits acts as a central gating mechanism for airway protective behaviors."

      We have removed all language stating PiCo is a swallow gate.

      PiCo is likely part of the VSG (and thus the swallow pattern generator). PiCo, as part of the IRt/VSG could indeed be surveilling afferent information and providing output that affects swallow or other laryngeal activation and the coordination of these behaviors with breathing. However, this is not the responsibility of PiCo alone. This role is likely shared by other parts of the SPG, and may require distributed SPG network participation to be functional. If one were to stim other regions of the distributed SPG, similar results might be expected. When Toor et al silenced the PiCo area (and locations that lie at least slightly beyond the borders of what the present study defines as PiCo), stim-evoked fictive swallows were greatly suppressed. However, swallow-related apnea was unaffected. This supports the role of PiCo as a premotor relay for swallow motor activation, but not as the site that terminates inspiration. Therefore, it cannot be called a gate.

      We already addressed the issue that Toor never demonstrated that the “swallow-related” apnea was unaffected. Toor et al only demonstrated that the SLN-evoked apneas were unaffected, and their conclusions were only based on nerve recordings under fictive conditions (deafferented and paralyzed). Also, to the best of our knowledge, many aspects of the putative swallow pattern generator that this reviewer mentions are purely hypothetical. However, to avoid further arguments, we have removed the word gate and Figure 7 from this manuscript.

      Similarly, Fig 7 does not accurately depict things that are already well-supported by evidence. PiCo should be included as part of the swallow pattern generator (VSG), not as a separate entity acting on it. The BotC and pons are glaring omissions. This study has not demonstrated the labeled inhibitory connection from DSG to PreBotC. The legend states speculations as fact and needs to be dialed way back to either include statements with solid experimental evidence or to clearly mark things as putative/speculative.

      We have removed Figure 7.

      The discussion of expiratory laryngeal motoneurons needs to be expanded and integrated better into the discussion of swallow, post-I, and other laryngeal motor activation. Why can't PiCo just be premotor to ELMs?

      If PiCo were “only” or “just” premotor to ELMs, then it would not be expected to trigger an all-or-none swallow response with a temporal activity pattern similar to that of a water-evoked swallow. We would also not expect the evoked activity pattern to be independent of the laser stimulation duration, as demonstrated in Figure 3. This was our reasoning for originally calling PiCo a “gate”: at the correct phase it will gate/trigger a complex swallow sequence. But, as stated above, we avoid the word gate in the revised manuscript.

      Concerning the discussion of "PiCo's influence as a gate for airway protective behaviors is blurred...": The incomplete swallow motor sequence didn't seem super different in timing or duration compared to the fully transfected animals (comparing plots from Fig 6 to Fig S1, and Table S2 to Table S3). The values for swallow durations (XII and X) for each group for water and opto seem within similar ranges, as do the differences between water & opto-evoked swallows between strains. While the motor pattern is distinctive from the normal swallow, with laryngeal activity rather than submental activity leading, one might not even be able to call that a swallow. Is it evidence against a classic all-or-nothing swallow behavior any more than the graded swallow results from (fully transfected) Table S1?

      We fully agree that it is possible that this unidentified behavior may not be a swallow. We have changed the name of this behavior to “upper airway motor activity.” However, we also cannot rule out the possibility that this is some portion of a graded swallow, in which case a graded swallow response would itself be evidence against the classic all-or-nothing swallow behavior.

      Please expand on this point and put it into context with others' results: "This brings into question whether this is the first evidence against the classic dogma of swallow as an "all or nothing" behavior, and/or whether this is an indication that activating the cholinergic/glutamatergic neurons in PiCo is not only gating the SPG, but is actually involved in assembling the swallow motor pattern itself."

      This point has been expanded and now includes citations of other studies. The following paragraph can be found in the Discussion:

      Swallow has been thought of as an “all or nothing” response since as early as 1883 (Meltzer, 1883). Whether spinal or vagal feedback (Huff A, 2020b) or central drive for swallow/breathing (Huff, Karlen-Amarante, Pitts, & Ramirez, 2022) was modulated, or swallow-related areas of the brainstem were lesioned (Car, 1979; Robert W Doty, Richmond, & Storey, 1967; Wang & Bieger, 1991), swallow either occurred or did not. Swallows are thought to be a fixed action pattern, with duration of stimulation having no effect on behavior duration (Fig. 3) (Dick, Oku, Romaniuk, & Cherniack, 1993). Thus, it was particularly interesting that in instances when few PiCo neurons were transfected, either unilaterally or bilaterally, an unknown activation of upper airway activity occurred. Motor activity no longer outlasted the laser stimulation but was contained within it, and the timing of the motor sequence was reversed in comparison to a water- or PiCo-evoked swallow (Fig. 6). Thus, if insufficient numbers of neurons are activated, PiCo’s influence to specifically activate swallow or laryngeal activation is blurred, resulting in the uncoordinated activation of muscles involved in both behaviors. This provides possible evidence against the classic dogma of swallow as an “all or nothing” behavior, or indicates the presence of an entirely different behavior. We are not the first to present possible evidence against the classic dogma: “small swallows” were described, but it was not determined whether these were in fact partial or incomplete swallows (Miller & Sherrington, 1915). The SPG is thought to consist of bilateral circuits (hemi-CPGs) that govern ipsilateral motor activities, but receive crossing inputs from contralateral swallow interneurons in the reticular formation, thought to coordinate synchrony of swallow movements (Kinoshita et al., 2021; Sugimoto, Umezaki, Takagi, Narikawa, & Shin, 1998; Sugiyama et al., 2011).
Incomplete activation of PiCo activates the muscular components of a swallow without establishing the coordinated timing and sequence of the pattern. It is possible that PiCo is involved in assembling the swallow motor pattern itself, and that unilateral activation of PiCo could either desynchronize swallow interneurons or activate only one side of the SPG. Since we did not record bilateral swallow-related muscles and nerves, this question needs to be further examined.

      Reviewer #3 (Public Review):

      Huff et al. further characterise the anatomy and function of a population of excitatory medullary neurons, the Post-inspiratory Complex (PiCo), which they first described in 2016 as the origin of the laryngeal adduction that occurs in the post-inspiratory phase of quiet breathing. They propose an additional role for the glutamatergic and cholinergic PiCo interneurons in coordinating swallowing and protective airway reflexes with breathing, a critical function of the central respiratory apparatus, the neural mechanics of which have remained enigmatic. Using single allelic and intersectional allelic recombinase transgenic approaches, Huff et al. selectively excited choline acetyltransferase (ChAT) and vesicular glutamate transporter-2 (VGluT2) expressing neurons in the intermediate reticular nucleus of anesthetised mice using an optogenetic approach, evoking a stereotyped swallowing motor pattern (indistinguishable from a water-induced swallow) during the early phase of the breathing cycle (within the first 10% of the cycle) or tonic laryngeal adduction (which tracked tetanically with stimulus length) during the later phase of the breathing cycle (after 70% of the cycle).

      They further refine the anatomical demarcation of the PiCo using a combination of ChAT immunohistochemistry and an intersectional transgenic strategy by which fluorescent reporter expression (tdTomato) is regulated by a combinatorial flippase and cre recombinase-dependent cassette in triple allelic mice (Vglut2-ires2-FLPO; ChAT-ires-cre; Ai65).

      Lastly, they demonstrate that the PiCo is anatomically positioned to influence the induction of swallowing through a series of neuroanatomical experiments in which the retrograde tracer Cholera Toxin B (CTB) was transported from the proposed location of the putative swallowing pattern generator within the caudal nucleus of the solitary tract (NTS) to glutamatergic ChAT neurons located within the PiCo.

      We would like to thank the reviewer for acknowledging the technical advances of the present study and for the positive statements in general.

      Methods and Results

      The experimental approach is appropriate and at the cutting edge for the field: advanced neuroscience techniques for neuronal stimulation (virally driven opsin expression within a genetically intersecting subset of neurons) applied within a sophisticated in vivo preparation in the anaesthetized mouse with electrophysiological recordings from functionally discrete respiratory and swallowing muscles. This approach permits selective stimulation of target cell types and simultaneous assessment of gain-of-function on multiple respiratory and swallowing outputs. This intersectional method ensures PiCo activation occurs in isolation from surrounding glutamatergic IRt interneurons, which serve a diverse range of homeostatic and locomotor functions, and immediately adjacent cholinergic laryngeal motor neurons within the nucleus ambiguus (seen by some as a limitation of the original study that first described the PiCo and its role in post-I rhythm generation; Anderson et al., 2016, Nature 536, 76-80). These experiments are technically demanding and have been expertly performed.

      Again, we would like to thank the reviewer for these positive comments acknowledging the advances of the present study.

      The supplemental tracing experiments are of a lower standard. CTB is a robust retrograde tracer with some inherent limitations, paramount of which is the inadvertent labelling of neurons whose axons pass through the site of tracer deposition, commonly leading to false positives. In the context of labelling promiscuity by CTB, the small number of PiCo neurons labelled from the NTS (maybe 5 or 6 at most in an optical plane that features 20 or more PiCo neurons) is a concern. Even assuming that only a small subset of PiCo neurons makes this connection with the presumed swallowing CPG within the cNTS, interpretation is not helped by the low contrast of the tracer labelling (relative to the background) and the poor quality of the image itself. The connection the authors are trying to demonstrate between PiCo and the cNTS could be solidified using anterograde tracing data the authors should already have at hand (i.e. EYFP labelling driven by the con-fon AAV vectors from PiCo neurons (shown in Fig5), which should robustly label any projections to the cNTS).

      We fully agree with the reviewer that the CTB staining is of a lower standard and have removed this approach.

      The retrograde labelling from laryngeal muscles seems unnecessary: the laryngeal motor pool is well established (within the nAmb and ventral medulla), and it would be unprecedented for a population of glutamatergic neurons to form direct connections with muscles (beyond the sensory pool).

      The authors support their claim that PiCo neurons gate laryngeal activity with breathing through the demonstration that selective activation of glutamatergic and cholinergic PiCo neurons is sufficient to drive oral/pharyngeal/laryngeal motor responses under anaesthesia and that such responses are qualitatively shaped by the phase of the respiratory cycle within which stimulation occurs. Optical stimulation within the first 10% of the respiratory cycle was sufficient to evoke a complete, stereotyped swallow that reset the breathing cycle, while stimuli within the later 70% of the cycle evoked discharge of the laryngeal muscles in a stimulus length-dependent manner. Induced swallows were qualitatively indistinguishable from naturalistic swallows induced by the introduction of water into the oral cavity. The authors note that a detailed interpretation of induced laryngeal activity is probably beyond the technical limits of their recordings, but they speculate that this activity may represent the laryngeal adductor reflex. This seems like a reasonable conclusion.

      We thank the reviewer for this comment. Unfortunately, we felt compelled to remove the word “gating” based on the statements by reviewer 1.

      The authors propose a model whereby the PiCo impinges upon the swallowing CPG (itself a poorly resolved structure) to explain their physiological data. The anatomical data presented in this study (retrograde transport of CTB from cNTS to PiCo) are insufficient to support this claim. As suggested above, complementary, high-quality, anterograde tracing data demonstrating connectivity between these structures as well as other brain regions would help to support this claim and broaden the impact of the study.

      We fully agree with this reviewer. We have been working on a thorough anatomical characterization for more than 3 years using cutting edge anterograde and retrograde viruses in collaboration with vector experts at the University of Irvine. But these are partly novel, unpublished techniques that are in development, and require many careful controls and characterization. We feel that this is a separate study as it doesn’t relate to swallowing coordination and also includes partly different authors. We hope to submit this as a separate study later this year.

      This study proposes that the PiCo in addition to serving as the site of generation of the post-I rhythm also gates swallowing and respiration. The scope of the study is small, and limited to the subfields of swallowing and respiratory neuroscience, however, this is an important basic biological question within these fields. The basic biological mechanisms that link these two behaviors, breathing and swallowing, are elusive and are critical in understanding how the brain achieves robust regulation of motor patterning of the aerodigestive tract, a mechanism that prevents aspiration of food and drink during ingestion. This study pushes the PiCo as a key candidate and supports this claim with solid functional data. A more comprehensive study demonstrating the necessity of the PiCo for swallow/breathing coordination through loss of function experiments (inhibitory optogenetics applied in the same transgenic context) along with robust connectivity data would solidify this claim.

      Thanks again for the positive assessment of our study.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper explores the potential regulatory role of a previously unstudied phosphorylation site in the Src kinase SH3 domain. The data presented conclusively demonstrate that a phosphomimetic mutation of this site, src90E, causes an elevation in Src kinase activity, changes the structure of the Src catalytic domain as determined with a FRET sensor, disrupts certain SH3 domain interactions, causes changes in kinase intracellular dynamicity, and promotes cell invasiveness. Based on the behavior of the phosphomimetic mutant, the idea that constitutive phosphorylation of Y90 could have all of these effects is well-supported by the data. However, in wild-type cells or cells transformed by activated forms of Src, there is no constitutive phosphorylation of this site. Therefore, the question remains whether Y90 phosphorylation occurs to any significant extent in cells, and the data suggesting that it could do so is limited. It also remains to be conclusively established whether Y90 phosphorylation occurs via autophosphorylation.

      Major comments:

      1) Y90 was identified as a site of phosphorylation in Luo et al. It would be helpful if more information were provided about its significance relative to other sites identified in that study. Was it detected in non-transformed cells? Was it a major site? How does it relate to Y416 in abundance? The reference to the identification of the site in a different study from the White lab is made in the discussion but not in the introduction (this should be corrected). How abundant was it in that study? A fuller description of its detection would strengthen the rationale for this study. Any additional phosphoproteomics studies that identified it should also be included.

      As indicated in the manuscript (Figure 3C and new 3D), the amount of Y90 phosphorylation increases with the level of Src activation. Standard proteomic/phosphoproteomic data cannot be quantified in absolute values for technical reasons; only relative quantification is possible to some extent. To overcome this issue and address the question of the amount of Y90 phosphorylation, we newly prepared the corresponding stable isotope-labeled phosphopeptides and used them as internal standards. To the best of our knowledge, this allowed us to quantify for the first time the amount of specific tyrosine phosphorylation of Src kinase in cells. We found that in the case of WT Src, the major phosphorylation site, Y416, localized in the activation loop of the kinase domain, is phosphorylated in 22% of molecules. In activated Src, this pool of Y416-phosphorylated molecules increases 2.5 times to 57%. Y90 is phosphorylated in approximately 1% of WT Src molecules but becomes 5 times more abundant in the case of the activated kinase (5.3% of phosphorylated molecules). These newly added data on absolute Src tyrosine phosphorylation (Figure 3D) are consistent with values we obtained from relative MS quantification of Src variants differing in catalytic activity (Figure 3C). Although the enrichment of Y90 phosphorylation in the catalytically active kinase is lower compared to Y416 phosphorylation in terms of percentage of phosphorylated molecules, its increment with respect to the basal state is significantly higher. We believe that this broader dynamic range of Y90 phosphorylation is in agreement with the demonstrated regulatory function of Y90 phosphorylation. We incorporated these new results and the methodological approach into the revised manuscript. We also extended the original description of the MS protocol to include a description of relative quantification, which was included in the original manuscript.
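
      The fold changes quoted above follow directly from the reported occupancy percentages. A minimal Python sketch of that arithmetic is given here for illustration only; the dictionary layout and the `fold_change` helper are hypothetical names, not part of the manuscript, and the Y90 basal value is the approximate 1% stated in the response.

```python
# Percent of Src molecules phosphorylated at each site, as quoted in the response.
# The Y90 wild-type value is approximate ("approximately 1%").
occupancy = {
    "Y416": {"wt": 22.0, "activated": 57.0},
    "Y90": {"wt": 1.0, "activated": 5.3},
}


def fold_change(basal: float, activated: float) -> float:
    """Return the ratio of activated to basal occupancy (hypothetical helper)."""
    return activated / basal


for site, pct in occupancy.items():
    ratio = fold_change(pct["wt"], pct["activated"])
    print(f"{site}: {pct['wt']}% -> {pct['activated']}% ({ratio:.1f}-fold)")

# Basal-state ratio between the two sites (Y416 relative to Y90).
basal_ratio = occupancy["Y416"]["wt"] / occupancy["Y90"]["wt"]
print(f"Y416/Y90 in WT Src: {basal_ratio:.0f}-fold")
```

      With the rounded inputs above, the ratios come out close to the "2.5 times" and "5 times" quoted in the text; small differences reflect rounding of the reported percentages.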

      Phosphorylation of Y90 was only detected in the Luo et al. and Johnson et al. phosphoproteomic screens. However, phosphorylation of tyrosines homologous to Src Y90 has been described in a large number of proteins. Some of them are mentioned in the discussion, e.g., Btk, Abl, p130Cas or the Src family kinases Yes and Fyn. The presence of phosphorylation on homologous tyrosines and the evolutionarily conserved nature of Y90 in SH3 domains support the relevance of Src Y90 phosphorylation despite the small number of studies that were able to identify it. In our opinion, this small number can be attributed to its low abundance in the basal state and the technical difficulties of its detection, as discussed below in response to point 2.

      We emphasize the Luo et al. study in the introduction because it was the only study reporting Y90 phosphorylation at the time of the project’s initiation and led us to study Y90 further. Both studies are then mentioned in the discussion, which we believe is appropriate and sufficient.

      2) Related to point 1, is there evidence from the literature indicating a significant site of phosphorylation in Src has been overlooked? Or, was this site only identified because of the recent advances in MS technology and increased sensitivity of this methodology? An introduction to these points would also enhance the rationale for the study.

      In the manuscript discussion, we mention an early study (Erpel et al., 1995) which mapped conserved residues within the binding surface of the Src SH3 domain. It showed that mutation of Y90 to alanine led to partially deregulated Src and reduced affinity of the SH3 domain. Although the authors acknowledged the importance of Y90 for SH3 domain binding ability, they did not probe or discuss the effect of Y90 phosphorylation status. Furthermore, the level of Src Y90 phosphorylation in untransformed cells is relatively low (20-fold lower than Y416 phosphorylation). It is therefore not surprising that it has not been identified in most general phosphoproteomic studies performed on untransformed cells. In fact, in many of these studies, Y416 phosphorylation was not detected either. The detection of Y90 phosphorylation by Luo et al. likely reflects the fact that it was performed in Src527F-transformed cells; similarly, Johnson et al. used HGF-activated cells. Last, we also cannot exclude that the tryptic peptides with Y90/pY90 are less detectable in MS depending on the experimental conditions. In fact, the "heavy" Y90 peptide was consistently much less (10-80 times less) detectable in our hands than the Y416 peptide. This could be because of its worse ionizability, stability in vacuum or some other technical reasons.

      In our approach, we used immunoprecipitated Src molecules to maximize the amount of Src in the sample and targeted MS, which allowed us to specifically detect even low abundant ions/peptides. This represented the critical technical approach that allowed us to consistently detect Y90 phosphorylation in untransformed cells.

      3) The explanation of the MS experiment designed to show that Y90 phosphorylation happens in cells is insufficient in the text. It is not clear why the SYF cells were not used and not clear why the FRET sensor constructs were used. It is also not clear whether or how the proteins were purified before MS analysis. Also, rather than showing the MS data as a relative level, it would be preferable to provide the number of spectra obtained for each peptide/phosphopeptide and compare this also to Y416. A fuller comparison between the phosphorylation of Y90 to that of Y416 is necessary in order to place the potential Y90-mediated phosphoregulation in context.

      We are sorry for the confusing description. With the new quantification data, we have rewritten this section and hopefully made it clearer. We kept the original relative quantification data as they nicely show that abundance of Y90 phosphorylation increases with enhanced activity of Src. However, we added new MS analysis of Src tyrosine phosphorylation performed with labeled peptides as internal standards that provides absolute numbers of Y416 and Y90 phosphorylation in cells. The new measurements confirm the original data showing increased Y90 phosphorylation in activated Src variants and suggest that Y90 phosphorylation is not a rare event but represents an important regulatory element in Src activation. Our approach of MS quantification of phosphorylation events using labeled peptides as standards allowed us, to the best of our knowledge, for the first time, to measure absolute quantities of Y416 and Y90 phosphorylation and therefore also the amount of activated Src molecules in cells.

      For technical reasons, the SrcFRET biosensor was used in all these experiments. We attempted to analyze endogenous Src in several cell lines to assess its Y90 phosphorylation. However, in our hands, the amount of Src efficiently precipitated was never sufficient to detect the "very elusive" phosphopeptide containing Y90. We believe this was not caused by low amounts of Src in the cells, but rather because the anti-Src antibody performed much worse than the anti-GFP antibody used for SrcFRET biosensor (two high affinity epitopes) immunoprecipitation. We have previously shown that the SrcFRET biosensor functions in the same way as endogenous Src (Koudelková et al., 2019), and therefore we presume that it is phosphorylated in a similar manner and rate as endogenous Src.

      4) I would like to see conclusive evidence that Y90 phosphorylation is due to autophosphorylation. This would involve relatively simple experiments. As one possibility, an IP kinase assay followed by immunoblotting with a site-specific antibody or MS or other types of phosphopeptide visualization/identification.

      We further addressed the question of Y90 autophosphorylation using a kinase-dead version of Src527F bearing the K295M substitution. To quantify the amount of phosphorylated Src, we applied the identical approach with labeled standards and measured phosphorylation levels of Y416 and Y90. Compared to Src WT and Src527F, phosphorylation of both tyrosines in the kinase-dead variant was negligible despite the presence of endogenous Src and other SFKs in the U2OS cells we used for the experiments. These results suggest that phosphorylation of Y90 does indeed depend on the intrinsic kinase activity of Src and is therefore very likely autophosphorylation.

      We have tried to address the question of Src autophosphorylation on Y90 by analyzing the level of Y90 phosphorylation in cells expressing a kinase-inactive SrcFRET construct with open conformation (527F-KD) by quantitative MS. Despite the open conformation, SrcFRET527F-KD did not display any significant phosphorylation of either Y90 or Y416, even though we used U2OS cells which express endogenous Src and other SFKs. These results suggest that phosphorylation of Y90 depends on catalytic activity of the kinase rather than on compactness of its conformation and is therefore very likely autophosphorylation.

      5) A few other mutations would be interesting to examine in both kinase and transformation assays for comparison to the mutants that were: Y527F Y416F; Y527F Y416F Y90E. The first is a low activity control and the second is for understanding whether Y90E could overcome the lack of Y416 phosphorylation.

      Due to lack of time, we did not perform these experiments. However, we analyzed our new kinase-dead 527F mutant for FRET and found that despite its inactive kinase domain and lack of Y416 phosphorylation, it still retains an open conformation. We believe that this is a strong indication that the Y90E kinase-dead mutant would behave the same way, maintaining an open conformation although the kinase domain is inactive.

      6) I recommend that the results are discussed in a more circumspect manner. The results presented in Figure 7 on the double mutant, Y527F Y90F, suggest that phosphorylation of Y90 is not a very significant component of Src kinase regulation, at least in these biological contexts. That Y90 phosphorylation isn't a major regulatory factor does not diminish the value of the work describing Y90 phosphorylation. However, it does alter the interpretations. I encourage a more conservative discussion of its importance and a broader discussion of why it isn't a major site of Src phosphorylation, particularly if it is due to autophosphorylation.

      We believe that given our new quantifications showing that Y90 phosphorylation is indeed considerably present and utilized in cells, the original discussion is consistent with the new data and does not need to be changed.

      Reviewer #2 (Public Review):

      The manuscript "Phosphorylation of tyrosine 90 in SH3 domain is a new regulatory switch controlling Src kinase" describes efforts to understand how phosphorylation of tyrosine (Y90) in the SH3 domain of Src affects the activity and function of this multi-domain kinase. The authors find that an Src variant containing a phospho-mimetic mutation (Glu) at position 90 demonstrates elevated activation levels in lysates and cells (Figure 1) and adopts a less compact autoinhibited conformation within the context of a SrcFRET biosensor in lysates (Figures 3A, 3B). A series of pulldown experiments with an isolated SH3 domain (Figure 2A, 2B) or full-length Src (Figure 2C, 2D) that contain the phospho-mimetic Y90E mutation demonstrates that phosphorylation of Tyr90 would likely disrupt the interaction of Src's SH3 domain with intermolecular binding partners and the linker that couples SH2 domain/C-tail binding to autoinhibition, which provides a mechanistic basis for the observed elevated kinase activity of Src Y90E. By performing a series of imaging experiments with a SrcFRET biosensor, the authors show that the Y90E mutation does not show enhanced localization at focal adhesions like a hyperactivated Src mutant (Y527F) that contains a non-phosphorylatable C-tail (Figure 4A). However, using ImFCS combined with TIRF microscopy (Figure 4B), the authors demonstrate that Src Y90E shows similarly reduced mobility (relative to the WT SrcFRET biosensor) at the plasma membrane (especially at focal adhesions) as Src Y527F. Consistent with the elevated kinase activity of Src Y90E, the authors go on to demonstrate that the Src Y90E variant shows an ability to transform fibroblasts, at levels that are intermediate between wild-type Src and the hyperactive Src mutant Y527F (Figure 5). Similarly, Src Y90E confers an intermediate level (between wild-type Src and Src Y527F) of invasiveness and ability to form spheroids.

      Together, these comprehensive experiments with a Y90 phospho-mimetic strongly support a model where phosphorylation of Src's SH3 domain at Tyr90 would lead to a more intramolecularly disengaged SH3 regulatory domain and enhanced kinase activity in cells.

      Most of the conclusions in this paper are well supported by solid data, but confidence in several assays would be higher if additional technical detail or controls were provided and the biological significance of these findings would be higher if the role that Y90 phosphorylation plays in Src regulation and function were better delineated.

      1) The kinase activity assays in Figures 1C,1D, and 7A need to be scaled to the Src variant levels present in the lysate (quantification of relative Src levels is not provided).

      For kinase activity measurements, we used lysates of equal protein concentrations prepared from cell lines stably expressing Src variants. These cell lines were sorted and repeatedly tested for equal expression of Src constructs using immunodetection of Src on Western blots. We corrected the methods section and added this information to the description of the kinase assay experimental setup.

      2) More details are required for the experiments quantifying Y90 phosphorylation levels in Figure 3C. The experimental section states that equal amounts of IP'd proteins were used for these analyses but there are no details on how this was confirmed. In addition, the experimental section states that normalized intensities were used for quantifying the Y90 phospho-peptide but no details are provided on how normalization was performed (the legend states that a base peptide was used but it is unclear what this means).

      The paragraph on mass spectrometry analysis in the Materials and Methods section has been updated with the required information.

      3) A key question is whether Y90 phosphorylation serves a regulatory role in Src's cellular activity and, if so, what is the regulatory network that mediates this phospho-event. Using a mass spectrometry readout with three Src variants (wild type vs. Y527F vs. E381G) that possess differing kinase activities, the authors demonstrate that Y90 phosphorylation levels correlate to Src's kinase activity (Figure 3C), which they suggest is an indication that this residue is an autophosphorylation site (or phosphorylated by another Src family kinase). However, as Src's kinase activity correlates with SH3 domain disengagement (which leads to a more accessible Y90), it is also entirely possible that another tyrosine kinase is responsible for this phosphorylation event. More importantly, it is unclear under which signaling regime Y90 phosphorylation would play a significant regulatory role. This phospho-event was observed in a previous phospho-proteomic study but it is unclear whether the phosphorylation levels of this site occur at high enough stoichiometry to modulate the intracellular function of Src and whether there is a regulatory signaling network that influences Y90 phosphorylation levels.

      We have tried to address the question of Src autophosphorylation on Y90 by analyzing the level of Y90 phosphorylation in cells expressing a kinase-inactive SrcFRET construct with open conformation (527F-KD) by quantitative MS. Despite the open conformation, SrcFRET527F-KD did not display any significant phosphorylation of either Y90 or Y416, even though we used U2OS cells which express endogenous Src and other SFKs. These results suggest that phosphorylation of Y90 depends on catalytic activity of the kinase rather than on compactness of its conformation and is therefore very likely autophosphorylation.

      To further support our data on the relevance of Y90 phosphorylation in cells, we performed a new MS analysis of Y90 and Y416 phosphorylation in WT and activated Src. This time we used corresponding stable isotope-labeled peptides and phosphopeptides as internal standards for MS quantification. This allowed us to measure absolute amounts of phosphorylated molecules and changes in their numbers, which is information that cannot be acquired by standard biochemical or proteomic approaches and is usually lacking for the majority of known phosphorylation sites. We found that in the case of WT Src, the major phosphorylation site localized in the activation loop of the kinase domain, Y416, is phosphorylated in 22.6% of molecules. In activated Src, this pool of Y416-phosphorylated molecules increases 2.5 times to 57%. Y90 is phosphorylated in approximately 1% of WT Src molecules but becomes 5.1 times more abundant in the case of the activated kinase (5.3% of phosphorylated molecules). These newly added data on absolute Src tyrosine phosphorylation (Figure 3D) are consistent with the values we obtained from relative MS quantification of Src variants differing in catalytic activity (Figure 3C). Although the enrichment of Y90 phosphorylation in the catalytically active kinase is lower compared to Y416 phosphorylation in terms of percentage of phosphorylated molecules, its increment with respect to the basal state is significantly higher. We believe that this broader dynamic range of Y90 phosphorylation is consistent with and reflects the demonstrated regulatory function of Y90 phosphorylation. We incorporated these new results and methodological approach into the revised manuscript.
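
      The arithmetic behind this kind of absolute quantification can be sketched as follows. This is only a minimal illustration, assuming that each endogenous ("light") peptide is quantified from its intensity ratio to a spiked heavy standard of known amount, and that occupancy is the phosphorylated fraction of the phosphorylated plus unphosphorylated peptide. The function names, intensities and spiked amounts are hypothetical and are not taken from the manuscript; only the 22.6% and 57% Y416 values come from the response above.

```python
def absolute_amount(light_intensity, heavy_intensity, heavy_spiked_fmol):
    """Endogenous (light) peptide amount from its ratio to a spiked heavy standard.

    Assumes light and heavy forms respond identically in the mass spectrometer.
    """
    if heavy_intensity <= 0:
        raise ValueError("heavy standard intensity must be positive")
    return light_intensity / heavy_intensity * heavy_spiked_fmol


def occupancy_percent(phospho_fmol, unphospho_fmol):
    """Percentage of molecules phosphorylated at one site."""
    total = phospho_fmol + unphospho_fmol
    if total <= 0:
        raise ValueError("total peptide amount must be positive")
    return 100.0 * phospho_fmol / total


def fold_change(basal_percent, activated_percent):
    """Increase in occupancy relative to the basal state."""
    if basal_percent <= 0:
        raise ValueError("basal occupancy must be positive")
    return activated_percent / basal_percent


if __name__ == "__main__":
    # Hypothetical intensities and spiked amounts, for illustration only.
    phospho = absolute_amount(light_intensity=2.0e5, heavy_intensity=1.0e6,
                              heavy_spiked_fmol=50.0)
    unphospho = absolute_amount(light_intensity=6.0e5, heavy_intensity=1.0e6,
                                heavy_spiked_fmol=50.0)
    print(f"occupancy: {occupancy_percent(phospho, unphospho):.1f}%")

    # Y416 occupancies quoted in the response (WT vs. activated Src).
    print(f"Y416 fold change: {fold_change(22.6, 57.0):.2f}")
```

      Expressing each site both as an occupancy and as a fold change over the basal state separates the two quantities contrasted above: Y416 has the higher occupancy, whereas Y90 shows the larger relative increase.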

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript focuses on a set of neurons from the border between the central and medial amygdala (AMGc/m-PAG) that project to neurons in the periaqueductal gray (PAG) that gate ultrasonic vocalizations (USVs). These neurons suppress vocal production and are active in contexts where vocalizations would be inappropriate (e.g. in the presence of predator cues, or aggressive encounters with conspecifics). They then further characterized these neurons, demonstrating that like in males, these neurons are GABAergic in females and in both sexes, half of these neurons express estrogen receptor alpha (Esr1). To examine the inputs into these neurons, the authors performed monosynaptically-restricted transsynaptic rabies tracing and identified numerous cortical and subcortical projections. Of particular interest, neurons from the preoptic area of the hypothalamus (POA) in addition to terminating on PAG-USV neurons also project to AMGc/m-PAG neurons. Imaging the terminals of these neurons revealed elevated activity during vocalization-promoting contexts and optogenetically stimulating them resulted in evoking USVs. Together, these experiments further identify and quantify a circuit incorporating external factors (e.g. predatory factors, social interactions) in the drive to produce vocalizations.

      The authors are commended for use of male and female mice, demonstrating that even though they produce USVs in different social contexts, AMGc/m-PAG neurons share a function in suppressing USV production in both sexes. They do this convincingly with a variety of methodologies while incorporating appropriate controls (e.g. light-only and GFP-control in optogenetic experiments). The experiments are performed in a logical order and the data generated is elaborate.

      We thank the reviewer for their commendations and for their appreciation of the insights provided by our study. We provide detailed responses to their recommendations in the following section. We hope the reviewer finds that these revisions satisfactorily address their concerns.

      Reviewer #2 (Public Review):

      The existence of PAG-USV-producing neurons has been recently established, alongside two independent pathways, POA->PAG, and AMG->PAG, that promote and inhibit the production of ultrasound vocalizations in female and male mice, respectively. Because vocalizations can be modulated in a variety of contexts, such as in the presence of a predator, the authors first show that the AMG->PAG pathway is activated in situations where mice stop vocalizing, such as in the presence of a predator or aggressive conspecifics, and can inhibit natural vocalizations in contexts where females vocalize (extending their previous findings in male mice). Interestingly, AMG->PAG neurons also receive input from POA neurons that are known to promote vocalizations via their connection to PAG interneurons that inhibit PAG-USV-producing neurons. This POA->AMG and PAG pathway is inhibitory and therefore its capacity to promote vocalizations via these two parallel pathways might be achieved by its inhibition of AMG and PAG neurons that inhibit the PAG-USV-producing neurons. While these results hint at possible mechanisms that could underlie the hierarchical control of vocalization, and how different external signals impinge on existing pathways to produce behavioral flexibility, the study is missing important elements to draw such conclusions. Overall, the study is also missing important information on how experiments were performed.

      We appreciate the reviewer’s efforts to evaluate our manuscript and provide constructive feedback. In the following section, we have responded to all the reviewer’s comments and concerns and provide all but one of the previously missing elements and information. We also maintain that the results and additional analysis we provide in this manuscript go beyond merely hinting at possible mechanisms, and instead provide explicit synaptic mechanisms by which vocal-promoting and vocal-suppressing signals interact in the mouse’s brain.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors worked towards a better understanding of the functional diversification of flavodoxins among diatoms, and this represents a quantum contribution building on the initial findings of Whitney, Lins, Hughes, Wells, Chappelle, and Jenkins (2011), with the inclusion of metatranscriptomic and other data from field collections and on-deck incubation experiments, relatively new genomic and transcriptomic datasets, and the adoption of reverse genetics tools that are not yet widely used in T. pseudonana. They hypothesize that clade I flavodoxins play a role in mitigating oxidative stress, while additional clade II flavodoxins would respond according to canon, in response to low iron availability.

      The authors embarked on several field campaigns across environmental gradients where iron-responsive and oxidative stress-responsive flavodoxins were expected to show differential expression. The use of metatranscriptomics allowed taxa-specific assignment of relative transcript expression levels, and the results of both measurements across the environmental gradient and manipulative incubation experiments show the widespread taxonomic distribution of iron-responsive clade II flavodoxin. The fieldwork was well thought out, and biogeochemical trends comported to expectations. It's worth noting that the concomitant inclusion of geochemical data such as dissolved iron further strengthened the work. The authors also found clade I flavodoxins were not iron-responsive (as expected), but rather exhibited diel patterns in transcript abundance that suggest responses to photo-oxidative stress. Taken together, these field data are stunning.

      We thank the reviewer for this kind assessment.

      Lab experiments with five diatom species grown under varied iron and induced oxidative (H2O2) stress and transcript abundances for flavodoxin genes are reported. One reservation concerns the untoward and unknown effects of inducing outright iron starvation with the strong chelator, DFB (as opposed to achieving steady-state growth rate limitation from low iron by use of weak chelators such as EDTA). With DFB it is also difficult to predict sample timing (when cells have hit that "correct" and reproducible iron-limited space) when independent replicates are collected on different dates. Similarly, the use of DFB also makes it difficult to sample low and high iron cells at the same density or to maintain densities among replicate samples collected on different dates. pH and CO2 availability change with density unless special measures are taken.

      We agree with the reviewer that DFB is a strong iron chelator that may affect diatom physiology in inadvertent ways. We designed the DFB experiments to allow us to screen multiple diatoms for whether they transcribed clade I and II flavodoxins in a short-term response to iron limitation.

      We added the logic behind this experimental design (L177-179):

      “In order to screen multiple diatoms for whether they transcribed clade I and II flavodoxins in response to iron limitation, we used the strong iron-chelator desferrioxamine B (DFB) and enhanced short-term iron limitation.”

      Additionally, we now discuss the possible effect of DFB in our discussion (L395-410):

      “Notably, we used the strong iron chelator DFB to enhance iron limitation in a variety of diatoms, as previously described (Andrew et al., 2019; Kranzler et al., 2021; Lampe et al., 2018; Timmermans et al., 2001; Wells, 1999), while recognizing that undesirable effects of DFB, that are not related to iron-limitation per se cannot be ruled out. Here, DFB was used in experiments designed to test whether transcription of the two flavodoxin clades differentially responded to iron limitation. The results from T. oceanica, and T. pseudonana agree with the literature, in which DFB was not added. In T. oceanica only the expression of one clade II flavodoxin was induced (Figure 2B-C, as in Lommer et al., 2012). The minor induction in mRNA of T. pseudonana clade I flavodoxin in response to iron limitation was detected in both long- and short-term adaptation to low iron, without added DFB (Goldman et al., 2019; Thamatrakoln et al., 2012). This flavodoxin seems to have diel regulation, and the observed induction might be specific to the circadian time and the setting of the diel cycle (Goldman et al., 2019).”

      Based on the reviewer's comments, we realized that our transcriptome sampling protocol was not clear. Because the diatom species have different growth rates, as well as different rates of growth inhibition by iron limitation, we adjusted the sampling day for each species based on cell counts and photosynthetic efficiency. Importantly, the 9 samples (triplicates of 3 conditions) of each species were sampled together, on the same date and at the same time. We also ensured that the biological replicates of each species and treatment had similar cell density at the time of harvest.

      We clarified these concerns in the Results section (L188-206):

      “For each diatom 6 replicates were grown in iron-replete conditions and 3 replicates in iron-limiting conditions until the low-iron cultures displayed a decrease in maximum photochemical yield of photosystem II (Fv/Fm), 3-6 days (depending on species, Figure 2 -figure supplement 1A-C, Figure 2A, supplementary file 1c), indicative of iron limitation, at which point transcriptome samples were collected for both the iron-limited and iron-replete conditions. Three of the iron-replete replicates were exposed to oxidative stress, mimicked by a lethal dose of H2O2, and transcriptome samples were collected about 1.5 h after exposure, when the cell phenotype (Fv/Fm or cell abundance) was unaltered from control.”

      In the Materials and Methods section (L542-545):

      “Cells were harvested by filtration onto 0.22 µm filters. Full details of the number of cells harvested per treatments, per species, and samples that failed library preparation are indicated in supplementary file 1c. The 9 samples of each diatom species were sampled together, at the same date and time. Filters were snap-frozen…”

      A second set of lab experiments involved the (non-trivial) establishment and use of "knock out" clones of the clade I flavodoxin gene in the model diatom T. pseudonana to test the oxidative stress hypothesis. This is an exciting idea and the data suggest this flavodoxin may confer resistance to oxidative stress. The conclusion would be greatly strengthened if different phenotypes could be observed between WT and KO clones in response to environmentally relevant oxidative stress (such as supra-optimal irradiance), rather than exogenous H2O2 addition.

      Based on the reviewer's suggestion, we conducted a preliminary experiment with irradiation of up to 500 µE. As with the light level originally tested, there were no differences in growth rate or Fv/Fm between the WT and KO lines. We agree that future study of these knockout lines under a series of much higher irradiation levels, photosynthetic inhibitors, and other environmental stresses would be interesting, but it is beyond the scope of the current study.

      We now also mention this in the revised manuscript (L417-419):

      “Future studies in which the oxidative stress is driven by other environmental conditions as supra-optimal irradiation, UV radiation or biotic interactions are needed to further support the role of clade I flavodoxins in oxidative stress.”

      We clarify that our use of exogenous H2O2 additions was based on previous studies with Phaeodactylum and T. pseudonana, which indicate that exogenous addition of H2O2 in the micromolar range is representative of other oxidative stress responses (Graff van Creveld, 2015, Volpert 2018, Mizrachi 2019) (L185-188):

      “Oxidative stress was induced by the lowest lethal dose of H2O2 (200-250 µM), as similar treatment was shown to be representative to other environmentally-relevant oxidative stressors in T. pseudonana and Phaeodactylum (Graff van Creveld et al., 2015; Mizrachi et al., 2019; Volpert et al., 2018).”

      The relationship between the experimental conditions and results in Figure 3C and Supplemental Figure 3H was not clear.

      Figure 3C summarizes part of the information in Figure S3H. Figure S3D-I presents the individual clones, whereas Figure 3 shows only WT vs. Flav-KO.

      Following the reviewer's comments, we modified Figure S3H (it is now Figure S3I) and specified this relationship in the legend:

      “H-I. Percentage of Sytox Green-positive (dead) cells, measured by flow cytometry 24 h after treatment with H2O2 treatment. Orange and gray box plots represent a Flav-KO and WT respectively, single measurements are marked, color-coded by the individual colonies. H. Results of a single dose-response experiment. I. Results from additional experiments, experiments marked with an asterisk are summarized in main Figure 3C.”

      In the introduction, the authors suggest that Fe-S-containing proteins are particularly sensitive to damage via oxygen and ROS and that reliance on ferredoxin (Fd) for electron shuttling carries an enhanced sensitivity to the ROS generated during photosynthesis. References would be helpful here. Fe-S cluster-containing proteins are not monolithic regarding their behavior or susceptibility towards ROS. My limited understanding is that (i) several 4Fe-4S cluster proteins (such as aconitase, isopropylmalate isomerase) are particularly sensitive but that (ii) this is less so for canonical 2Fe-2S cluster ferredoxins; (iii) in some phototrophs Fd catalyzes the reduction of molecular oxygen to superoxide, as part of a mechanism that keeps the electron transport chain less reduced under extremely high light. Thus, ferredoxins may not necessarily be susceptible to in vivo ROS-mediated damage.

      Thank you for these comments.

      We modified our original sentence (L37-39):

      “Moreover, iron-sulfur-containing proteins are particularly sensitive to damage via oxygen and reactive oxygen species (ROS).”

      Corrected sentence:

      “Moreover, iron containing proteins are sensitive to damage via oxygen and reactive oxygen species (ROS), and Fd is down-regulated in response to oxidative stress (Singh et al., 2010, 2004).”

      Reviewer #2 (Public Review):

      In their manuscript, Van Creveld et al. set out to demonstrate divergent functions for two clades of flavodoxin in diatoms. To achieve their goals, the authors combined metatranscriptomic results originating from three separate research cruises in the North Pacific Ocean with laboratory experiments with a clade I flavodoxin knock-out mutant in the diatom Thalassiosira pseudonana. Overall, their field study confirmed that clade II flavodoxin is mostly up-regulated under iron limitation in most diatoms that were represented in their metatranscriptomic data (Figure 5 A-F). Their field study also demonstrated that clade I flavodoxin is expressed at levels that are several orders of magnitude lower than clade II flavodoxin (Figure 5H). The lower expression of clade I flavodoxin was also observed in laboratory culture experiments (Figure 2). The laboratory experiments also demonstrated that the clade I flavodoxins were responsive to iron limitation in some of the species studied (their Figure 2C), such that the assignment of function based solely on the clade I and clade II flavodoxin classification may not always be straightforward, and that exceptions will likely be found as more diatom species are studied.

      In their quest to determine whether clade I flavodoxin plays a role in adaptation to oxidative stress, the authors created several knock-out mutants where the clade I flavodoxin is not functional. These mutant strains responded to iron limitation in the same way as the WT strains. However, the mutant strains defective in the clade I flavodoxin were slightly more sensitive to oxidative stress (created by exposure to lethal doses of hydrogen peroxide) than the wild-type strains. The results of the oxidative stress challenges would have been stronger if a broader concentration range of hydrogen peroxide had been used in the experiments leading to a dose-response curve for both the mutant and wild-type strains.

      Thank you for this suggestion. We have now tested a broader range of H2O2 concentrations on the WT and KO strains and added a new Figure S3H, which includes responses to 0, 25, 50, 75, 100, 150, 200, and 250 µM H2O2.

      The supplemental information provided in the main manuscript holds a lot of important information. Take for example Figure S4 showing the placement of reads for Clade I and Clade II in a Maximum-likelihood tree for flavodoxin in the North Pacific Ocean. The results show that clade II flavodoxin is much more commonly found in the transcripts than clade I flavodoxin.

      Perhaps different results would have been obtained by conducting a similar sampling of metatranscriptome in the Atlantic Ocean that is less subject to iron limitation.

      We agree completely and would love to analyze metatranscriptomes from the Atlantic Ocean in the future.

      Overall, the authors have provided results that support a role for Clade I flavodoxin in alleviating oxidative stress in Thalassiosira pseudonana, however, whether or not this role is universal for clade I flavodoxin in other diatom species will require further studies.

      We agree with this assessment that additional experiments with additional diatoms are a fruitful area for future research.

    1. Author Response

      Reviewer #1 (Public Review):

      In their study Mas Sandoval and colleagues estimate, from human genomic data, two important parameters that measure how intermarriages have been affected by social stratification in the Americas: sex-biased admixture (SB), which refers to sex differences in the chances to intermarry with another ethnic group, and ancestry-based assortative mating (AM), which refers to the higher probability of partners to intermarry when they carry similar genetic ancestries. To do so, the authors train a deep neural network (DNN) with simulations of admixture with non-random mating and use ancestry tract length distributions to infer the two parameters. They show that their approach estimates SB and AM parameters with a relatively good accuracy in a number of scenarios. When applying the DNN to empirical data, they find solid evidence that social stratification has constrained the admixture processes in the Americas for the last centuries.

      In contrast with the vast majority of population genetic studies, which assume random mating, this study assesses if mating has been random or not in American populations. Furthermore, the study is very valuable because it leverages, for the first time, a deep learning approach and local ancestry inference to co-estimate the extent of SB and AM from genomic data.

      One limitation of the study, however, is that it assumes that (i) the admixture date in the simulations is known and equals 19 generations and (ii) admixture started at the same time in all admixed American populations. The authors also implicitly assume that the variance of the difference between male and female ancestry proportions only depends on AM, and not admixture timing. This may be problematic, as it has been shown that linkage disequilibrium between local ancestry tracts depends both on AM and admixture timing (Zaitlen et al., Genetics 2017).

      To clarify the assumption of a fixed admixture date, we have added the following sentence in the results section (line 170) where the model is first described: “In both models we assume a continuous admixture process that starts 19 generations ago, knowing that the populations analysed trace the first contact of Native American and European populations in the first half of the 16th century and assuming a generation time of 26.9 years (Wang et al., 2023). In contrast with the approaches that aim to find an admixture date assuming random mating, we assume that the admixture process starts with the contact, and it is continuous and modulated with the mating parameters.”

      We thank the reviewer for such an important reference, which we had not included in our manuscript and whose findings support the basis of our approach. It is now included on line 70 to justify the analysis of the length of the ancestry tracts: Herein, we argue that the tract length information can measure the non-randomness of mating associated with genetic ancestry and, therefore, it can also monitor the permeability of socioeconomic and cultural barriers between subpopulations with different genetic ancestries (Zaitlen et al., 2017).

      This is also suggested by the authors' results, showing that AM estimates are much lower in admixed Americans under the two-pulse model, relative to the one-pulse model, i.e., when admixture extends over time. Estimates of AM in admixed Americans may thus be biased, if admixture actually started less (or more) than 19 generations ago.

      We evaluated the resemblance of the footprints left by either assortative mating or gene flow by testing how a neural network trained on models with gene flow due to a second migration pulse predicts migration size on data generated by models without a second migration pulse but with assortative mating only. We then tested how neural networks trained on models with assortative mating detect assortative mating from data with no assortative mating but only migration. Results are summarised in Figures 4 – supplement 1 – supplement 2 and show a strong correlation between the predicted size of the second migration pulse and the simulated level of assortative mating. In parallel, there is also a strong correlation between the predicted assortative mating level and the size of the second migration pulse. Below, we respond to the reviewers in more detail regarding this question.

      Another potential limitation concerns local ancestry inference. The authors assume that RFMix makes no errors when inferring ancestry tracts. This can be a concern, as recent studies have shown that RFMix has reduced accuracy compared to other methods (Hilmarsson et al., bioRxiv 2022).

      In response to this comment, we performed a local ancestry analysis with Gnomix and generated the tract length profile according to the results obtained. One possible issue shared by Gnomix and RFMix is that they may infer a higher fraction of short tracts (at the expense of breaking longer ones). This issue was reported by Gravel et al. (2012). In that study, the authors decided to filter out the short tracts because these tracts showed a high rate of false positives and false negatives. Therefore, we conducted an experiment to test whether filtering out the shortest tract length window (i) improves the accuracy of the predictions of the simulated test data, as measured by the Mean Squared Error (MSE), (ii) decreases the uncertainty of the estimations, and (iii) increases the correlation between Gnomix- and RFMix-based estimates, as measured by the generalised variance.

      We also tested a modification of the tract length profile by dividing (or not) the tract length profile by the total number of tracts in either the Autosomes or the X chromosome. Our goal was to force the neural network to focus on the profile shape rather than on the absolute number of tracts in each window, to mitigate the possible bias in the tract length profile. Our experimental set-up consisted of three combinations of modifications of the tract length profile, in addition to the non-modified one.
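
      The normalisation described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function name `normalise_profile` and the example counts are hypothetical, and it assumes the profile is a simple list of tract counts per length window.

```python
def normalise_profile(counts):
    """Convert per-window tract counts into proportions that sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("profile contains no tracts")
    return [c / total for c in counts]


# Hypothetical tract counts per length window (shortest to longest).
autosome_counts = [120, 60, 15, 5]
x_chrom_counts = [30, 15, 10, 5]

# Each profile is divided by its own total, so only its shape remains.
autosome_profile = normalise_profile(autosome_counts)
x_chrom_profile = normalise_profile(x_chrom_counts)
```

      After this step the two profiles are on the same scale regardless of how many tracts were called in total, which is the property the response relies on when it says the network should focus on the shape of the profile.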

      In Figures 4 supplement 3 – supplement 7, we show the predicted mating parameters using the modifications of the tract length profile resulting from the local ancestry inference. Each point represents a prediction using RFMix and Gnomix tract length profiles (x and y axis, respectively) as input for each of the 1000 trained neural networks with the same architecture. We evaluated the uncertainty of the estimations for both Gnomix and RFMix and the correlation between them through the Generalised Variance. The Generalised Variance is the determinant of the covariance matrix, which increases with low values of covariance of the bivariate distribution and high values of the respective variances. The estimations of the parameters based on the tract length profile normalised by dividing by the total number of fragments in the Autosomes or the X chromosome had both low values of Generalised Variance in the Gnomix-RFMix bivariate distribution of predicted parameters and low values of MSE in the prediction of simulated test data. These results indicate that, when the tract length profile is normalised by the total number of fragments, the distribution is still informative and less sensitive to possible biases introduced by errors in the local ancestry analysis.
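
      To make the Generalised Variance concrete, the sketch below computes it for paired predictions as the determinant of the 2x2 sample covariance matrix, i.e. var(x) * var(y) - cov(x, y)^2. The function name and the example values are hypothetical and only illustrate the definition given above; they are not taken from the study.

```python
from statistics import mean


def generalised_variance(xs, ys):
    """Determinant of the 2x2 sample covariance matrix of paired values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mx, my = mean(xs), mean(ys)
    var_x = sum((x - mx) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - my) ** 2 for y in ys) / (n - 1)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return var_x * var_y - cov_xy ** 2


# Hypothetical predictions of one parameter from two local ancestry tools.
rfmix_pred = [0.1, 0.2, 0.3, 0.4]
gnomix_agree = [0.1, 0.2, 0.3, 0.4]     # perfectly correlated with rfmix_pred
gnomix_disagree = [0.4, 0.1, 0.3, 0.2]  # same spread, weak correlation

gv_agree = generalised_variance(rfmix_pred, gnomix_agree)
gv_disagree = generalised_variance(rfmix_pred, gnomix_disagree)
```

      With identical variances, the perfectly correlated pair gives a value near zero, while the weakly correlated pair gives a larger one, which matches the statement that the quantity grows as covariance falls and as the individual variances rise.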

      Therefore, we present the results obtained from this RFMix profile in the main figures and tables, while showing the other predictions in the supplementary figures.

      In addition, the authors do not report a measure of uncertainty for the estimation of SB and AM, which is another important weakness. Interpretation of parameter estimates is limited if no measures of uncertainty are provided.

      We now provide the 95% CI for each parameter obtained from the distribution of predicted parameters from the 1000 trained neural networks, for both RFMix and Gnomix for the tract length profile.
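
      One common way to obtain such an interval from a set of predictions is an empirical percentile interval. The sketch below assumes that approach, which the response does not spell out; the function name and the evenly spaced example values are hypothetical stand-ins for the 1000 network outputs.

```python
def percentile_interval(values, level=0.95):
    """Empirical central interval using linear interpolation between order statistics."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values)

    def quantile(q):
        # Fractional position of the requested quantile in the sorted sample.
        pos = q * (len(ordered) - 1)
        lower_idx = int(pos)
        upper_idx = min(lower_idx + 1, len(ordered) - 1)
        frac = pos - lower_idx
        return ordered[lower_idx] + frac * (ordered[upper_idx] - ordered[lower_idx])

    tail = (1 - level) / 2
    return quantile(tail), quantile(1 - tail)


# Hypothetical predictions of one parameter from 1000 trained networks.
predictions = [i / 999 for i in range(1000)]
ci_low, ci_high = percentile_interval(predictions)
```

      For these evenly spaced values the interval runs from about 0.025 to about 0.975, i.e. it discards 2.5% of the predictions in each tail.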

      Finally, the authors compare the likelihood of two competing models, assuming a single or two admixture pulses, but do not determine the accuracy of their model choice procedure.

      We now include the confidence intervals of the composite likelihood by replicating the test for each of the 1000 bootstrapped tract length profiles for each population. None of the 95% confidence intervals includes both negative and positive results and all of them support either the one-pulse or the two-pulse model, except for the sub-Saharan ancestry in the Colombian (CLM) population.

      Overall, besides these methodological limitations, I expect that the study by Mas Sandoval and colleagues could be of great and broad interest for the scientific community studying population genetics, anthropology, sociology and history.

      Reviewer #2 (Public Review):

      This paper introduces a method to quantify how genetic ancestry drives non-random mating in admixed populations. Admixed American populations are structured by racial, gender, and class hierarchies. This has the potential to cause both ancestry-related assortative mating, in which the ancestry of mates tends to be correlated, and ancestry-related sex bias, in which individuals have a preference for mates with a particular ancestry composition. By applying their method to several African American and Latin American populations, Sandoval et al. further our understanding of ancestry-based population structure in this region more broadly.

      Strengths

      As many others have recently done, Sandoval et al. leverage the ability of a neural network to predict demographic parameters from high-dimensional population genomic data. Sandoval et al. first develop a clever probabilistic model of mating by defining the probability of a male and female mating as a function of the difference in ancestry between the individuals. They use this model to simulate population genomic data under various demographic scenarios, and then train a neural network on these simulated data. Finally, they apply the neural network to empirical data and learn the parameters of the underlying probability distribution, which can be related back to assortative mating and sex bias.

      One clear strength of this paper is their ability to jointly assess assortative mating and sex bias, as well as their ability to apply their model to multiple contemporary admixed populations.

      Importantly, the authors couch their results in an intersectional understanding of populations and consistently refer to research from historians and other social scientists throughout their paper, which reflects a very thoughtful awareness of the interdisciplinary nature of this research.

      Weaknesses

      The definition of assortative mating is conceptually confusing - in the text, assortative mating is introduced as genetic similarity between mates, i.e. positive assortative mating. However, based on the definition of assortative mating in their model, a population can have high assortative mating for a particular ancestry component even when there is non-zero sex bias for that component (e.g. males with low Native American ancestry are more likely to mate with females with high Native American ancestry). Fundamentally, this scenario cannot reflect positive assortative mating; rather, it reflects negative assortative mating (i.e. there is structured genetic dissimilarity between mates). However, the authors do not discuss the fact that the interpretation of the assortative mating parameter changes with the value of the sex bias parameter.

      We acknowledge that our definition of assortative mating requires more clarity. We now define it on line 155 as: The AM parameter measures the non-randomness of mating associated with a genetic ancestry. This includes both positive assortative mating (genetic similarity between mates, when SB is zero) and negative assortative mating (genetic dissimilarity between mates, when SB is not zero). This approach allows accounting for the male-female direction of negative assortative mating through the SB parameter.

      In addition, the results of the inference in ASW are difficult to interpret. They find that males of high African ancestry are more likely to mate with females of low African ancestry. This result seems counterintuitive given the body of literature that suggests sex-biased admixture in African Americans has greater male European and female African contributions. The authors do not suggest potential explanations for this observation.

      We agree that results regarding the ASW population can be confusing. Our hypothesis to explain such results is that the sex bias parameter captures both sex-biased migrations and sex-biased admixture. Therefore, it is difficult to accommodate the complex genetic history of ASW. We have extended the discussion on this aspect as follows on line 380:

      In addition, African American populations might have a complex genetic history involving, on the one hand, male-biased sub-Saharan migration and, on the other hand, admixture that is female-biased in the sub-Saharan ancestry. However, our current model can only accommodate this demographic scenario with a single sex-bias parameter, and the results regarding this population should be interpreted with caution.

      Lastly, the authors have not done any simulations to assess how accurate parameter estimates are if the demographic model is misspecified, which weakens the interpretability of the results.

      We have performed a new analysis where we vary AM to generate tract length profiles to predict GFR, and vice versa. The results of this analysis are shown in the new Figure 4 supplement 1. They show that assortative mating and multiple-pulse migration leave similar footprints in the genomes of the admixing populations. In the discussion we argue that both the one-pulse and two-pulse models must be considered because they are supported by results obtained using the X chromosome and the Autosomes, respectively. We discuss how accounting for migration reduces AM values and how the resulting admixture dynamics are similar in both cases.

    1. Author Response

      Reviewer 1 (Public Review):

      1) In Figure 2, electron microscopy images represent n=1 cell, making it hard to know how generalizable the mitochondrial phenotypes are. It would be useful to see a quantitative summary of a larger dataset indicating how frequently the mitochondrial defects are seen.

      As requested, we performed quantitative analysis of mitochondrial ultrastructure in a larger dataset (n=163 analyzed in WT and n=206 in the KO) confirming that this finding is very consistent. This additional quantitative analysis that we included in the revised manuscript confirms a very significant and diffuse alteration of mitochondrial ultrastructure in Parl-/- vs WT spermatocytes (p=0.0002).

      2) In Figure 3, representative images are shown for a single field from n=1 animal. It is hard to decisively conclude that the phenotype of Pink1-/-;Pgam5-/- and Ttc19-/- testes is completely normal based on this limited data. There may be other tubules outside the field of view that are abnormal, or more subtle changes in cell ratios. This conclusion would be significantly strengthened by cell counting (e.g. # round spermatids per Sertoli cell per tubule and # spermatocytes per Sertoli cell per tubule) or other quantitation. Likewise, the similarities in phenotype between Parl-/-, Parl-/-;Pink2-/-, and Parl-/-;Pgam5-/- should be more thoroughly documented. At least some additional images should be shown.

      The goal of Figure 3 is to indicate that WT, Pink1-/-;Pgam5-/- and Ttc19-/- have no gross morphological abnormality and have preserved sperm production, in sharp contrast with Parl-/-, Parl-/-;Pink1-/-, Parl-/-;Pgam5-/- and the TKO, which show a total lack of sperm in the tubular lumen, indicating that the loss of Parl alone or in combination drives this phenotype. To strengthen these conclusions we performed additional work. We stained testis sections from all strains with an antibody for AIF-1, a marker of post-mitotic spermatids/spermatozoa; these data are included in Fig3-figure supplement 1. This additional experiment clearly confirms that production of differentiated germ cells occurs only in WT, Pink1-/-;Pgam5-/- and Ttc19-/-, but not in Parl-/-, Parl-/-;Pink1-/-, and Parl-/-;Pgam5-/-. These results are consistent with the reproductive capacity of these mouse lines (the first group is fertile, the second is infertile). We acknowledge we cannot rule out minimal subclinical differences in reproductive fitness between the fertile mouse groups, but this is beyond the goal of our study.

      3) In Figure 4, it looks like there is a significant decrease in CIV-driven respiration in Parl knockouts, but the text describes this as "did not significantly enhance" - that is, the absence of an increase. This result is difficult to interpret without further explanation.

      We recognize this might be confusing, but the text specifies that CIV-driven (TMPD + ascorbate) respiration, which relies on endogenous cytochrome c, is diminished in Parl-/- testis mitochondria (line 195). This test reflects cytochrome c oxidase respiratory capacity/activity. Immediately afterwards, we performed an additional experiment in which we added exogenous cytochrome c to the cuvette and checked whether CIV-driven respiration increased relative to its value before the addition. Exogenous cytochrome c does not cross intact mitochondrial outer membranes, so this test assesses the quality of the mitochondrial preparations and/or pathological changes in outer membrane integrity, not the function of CIV. CIV-driven respiration increased only modestly after the addition of cytochrome c, and to a similar extent in WT and Parl-/-, indicating that the mitochondrial preparations are of good quality and that the outer mitochondrial membrane is overall well preserved in both WT and KO.

      4) In Figure 5B, there is some variation in band intensity between replicates. Quantifying the band intensity relative to the loading control would help to increase confidence in the conclusion that coQ levels are reduced.

      We performed this quantification, as suggested by the reviewer, and added the quantification in figure 5B. Quantification of the band intensity relative to the loading control confirms a significant difference between WT and KO. Moreover, we performed quantitative immunofluorescence of COQ4 in SCP-1 positive cells included now in Fig 5-figure supplement 1, which confirms a significantly decreased expression of COQ4 in Parl-/- primary spermatocytes.

      5) GPX4 is not a Parl substrate, and no explanation is provided for why it might be reduced in Parl-/- testes. This makes the result and model difficult to interpret.

      We thank the reviewer for pointing this out. We acknowledged this limitation in the discussion. We mentioned in the discussion that decreased GPX4 levels have been observed in other conditions (chemical inhibition, pathological conditions, etc.) and no mechanism has so far been demonstrated to our knowledge, but some evidence raises a possible link with CoQ deficiency that we discussed. Potential mechanisms including protein degradation are likely although unproven. This remains an important and intriguing issue to address in future studies.

      6) Since Parl knockout induces necrosis in the brain, necrosis could be a contributing factor to cell death in spermatocytes alongside ferroptosis. No data is presented that can exclude this possibility.

      Ferroptosis is actually considered by some authors to be a form of regulated necrosis (Seibt TM FRBM 2019). Therefore, we can affirm that PARL deletion leads to regulated necrosis in testis via ferroptosis, through specific ferroptosis pathways that do not appear to be activated in the brain, or at least not overtly. Importantly, there is no recognized marker or specific molecular pathway for generic "accidental" necrosis that can be tested to differentiate between the two cell death modalities.

      7) The severe spermatogenesis phenotype implies that Parl knockout males should be infertile, but the fertility status is not described in the manuscript. It may be difficult to test fertility in these animals due to the neurodegeneration phenotype; if so, this can be clarified. If it is feasible to test fertility, demonstration of a fertility phenotype would significantly strengthen the conclusion that loss of Parl leads to spermatogenic arrest.

      We specify in the text that Parl-/- mice are sterile due to a total lack of sperm production caused by arrested spermatogenesis, as evidenced by detailed histological analysis and AIF1 staining. This is not due to the neurodegeneration, since Parl-Ncre knockout mice have normal production of sperm, as presented in the paper. Fertility in Parl-/- mice cannot be tested in vitro, since these mice have no sperm due to the complete block of spermatogenesis, nor in vivo, since they die young due to neurodegeneration. Within these limitations, Parl-/- males and WT females are housed together, and without exception no pregnancy has been observed since the colonies were established. Parl-/- mice are sterile.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors tried to measure the accuracy of the decision-making of honey bees by carrying out behavioural experiments in which they trained the bees to forage on artificial flowers of 5 different colours that offered different levels of reward. Subsequently, the bees' decision-making behaviour was tested with flowers of the same or different colours, with no reward present. The authors found that bees tend to approach a flower only when they are highly certain of a reward, and these decisions are made quickly. The majority of flowers were rejected by the bees. Based on the results of the tests, the authors created a model to identify what circuit elements or connections would be necessary to mimic the bees' decisions. This model could be potentially used for robotics.

      The study is well supported by the signal detection theory and the experiments are well designed which is a major strength. However, the methods are not completely clear, so would be better to make a clearer description. Another weakness is the lack of clear explanations of the importance and relevance of the model.

      Given the experimental design was optimal, the authors could potentially achieve the aims of this study.

      Thank you for expressing your interest and providing constructive inputs. Based on your suggestions, we have thoroughly revised our manuscript to offer a more comprehensive explanation of the rationale behind our approach, as well as its comparison to existing knowledge and methods in the field. We believe that these revisions will significantly enhance the comprehensibility of our study and facilitate a better understanding of our findings.

      Reviewer #2 (Public Review):

      By elegantly designing experiments, MaBouDi et al. elucidated honeybee's behavioral strategy to quantitatively associate sensory cues with valences. The description is simple and concise enough to understand the logic. Particularly, the authors clearly demonstrated how sensory evidence and reward likelihood quantitatively affect the decision-making process and animals' response time. Their behavioral characterization approach and proposed model could also be helpful for studies using higher animal species. I have a few doubts regarding the definition of rejection behavior and the structure of the model that is critical to lead their main conclusions.

      Thank you for your interest and valuable feedback. We greatly appreciate your input, and as a result, we have thoroughly reviewed your comments and implemented significant revisions to our manuscript. We have taken care to provide more comprehensive explanations of our methods, results, and the proposed model in order to enhance the overall comprehensibility of our study. Our intention is to ensure that readers can better understand our findings through these revisions.

    1. Author Response

      Reviewer #1 (Public Review)

      Using in vitro assays that take advantage of thymic slices, with or without the ability to present pMHC antigens, the authors define an early period in which CCR4 expression is induced, which induces their migration to the medulla and likely encounter with cDC2 and other APCs. Notably, the timing for CCR4 expression precedes that of CCR7 and illustrates the potential role for this early expression to initiate the movement of post-positive selection thymocytes to the medulla. The evidence for supporting a role for CCR4, as well as CCR7, in sequential tolerance induction is provided using multiple approaches, and although the observed changes amount to small percent changes, the significance is clear and likely biologically relevant over the lifespan of a developing T cell repertoire. Overall, the model provides a holistic view of how tolerance to self-antigens is likely induced during T cell development, which makes this work highly topical and influential to the field.

      We thank the reviewer for their comments and for highlighting the significance of identifying distinct roles for CCR4 and CCR7 in promoting medullary localization and inducing self-tolerance of thymocytes at different stages of T-cell development.

      Reviewer #2 (Public Review)

      This manuscript describes that CCR4 and CCR7 differentially regulate thymocyte localization with distinct outcomes for central tolerance. Overall, the data are presented clearly. The distinct roles of CCR4 and CCR7 at different phases of thymocyte deletion (shown in Figure 6C) are novel and important. However, the conclusion that expression profiles of CCR4 and CCR7 are different during DP to SP thymocyte development was documented previously. More importantly, the data presented in this manuscript do not support the conclusion that CCR7 is uncoupled from medullary entry. Moreover, it is unclear how the short-term thymus slice culture experiments reflect thymocyte migration from the cortex to the medulla.

      We thank the reviewer for pointing out the significance of our finding that CCR4 and CCR7 regulate different phases of thymocyte deletion. We agree that prior reports, including our own (Cowan et al., 2014; Hu et al., 2015), have shown that CCR4 and CCR7 are expressed by different post-positive selection thymocytes. However, the expression data we present here provide a higher-resolution perspective on the specific thymocyte subsets that express these two receptors, as well as the different timing with which the receptors are expressed after positive selection. These data, coupled with chemotaxis assays of the granular thymocyte subsets responding to CCR4 versus CCR7 ligands, and 2-photon imaging data showing that CCR4 and CCR7 are required for medullary accumulation of distinct thymocyte subsets, are critical for delineating the unexpectedly distinct roles of these two chemokine receptors in promoting medullary entry and central tolerance.

      The reviewer raises an important question about our conclusion that CCR7 is “uncoupled” from medullary entry. We think there was likely a misunderstanding of our intended meaning, as we did not mean to imply that CCR7 does not promote medullary entry of thymocyte subsets; we have modified the wording of the abstract to replace “uncoupled” to clarify. As we detail in the Introduction, the role of CCR7 in directing chemotaxis of single-positive thymocytes towards the medulla and inducing their medullary accumulation is well established (Ehrlich et al., 2009; Kurobe et al., 2006; Kwan & Killeen, 2004; Nitta et al., 2009; Ueno et al., 2004). Instead, our data demonstrate that 1) the most immediate post-positive selection thymocyte subset (DP CD3loCD69+) does not require CCR7 for medullary entry, and 2) the next stage of post-positive selection thymocytes (CD4SP SM) express CCR7, but CCR7 recruits these cells only modestly into medulla. In contrast, CCR7 promotes robust medullary accumulation of more mature thymocyte subsets (CD4SP M1+M2), in keeping with the well-known role of CCR7 in promoting thymocyte medullary localization. We think these findings are highly significant for the field because currently, there is a widely held assumption that post-positive selection thymocytes that do not express CCR7 are located in the cortex, while those that express CCR7 are located in the medulla. Our data show that neither of these assumptions is true: CCR4 drives medullary accumulation of cells that do not yet express CCR7, and the earliest post-positive selection cells that express CCR7 continue to migrate in both the cortex and medulla. These findings form the basis of our statement that CCR7 expression is “not synonymous with” medullary localization. 
      The finding that thymocytes do not robustly accumulate in the medulla in a CCR7-dependent manner until the more mature SP stages has important implications for central tolerance, as localization of thymocytes in the cortex versus medulla will impact which APCs and self-antigens they encounter when testing their TCRs for self-reactivity.

      The reviewer also raised concerns about whether short-term thymus slice cultures reflect physiological thymocyte migration. Short-term live thymic slice cultures have been widely used to investigate the development, localization, migration, and positive and negative selection of thymocytes, as they have been shown to faithfully reflect these in vivo processes, including confirming the role of CCR7 in inducing chemotaxis of mature thymocytes from the cortex into the medulla (Au-Yeung et al., 2014; Dzhagalov et al., 2013; Ehrlich et al., 2009; Lancaster et al., 2019; Melichar et al., 2013; Ross et al., 2014). However, we acknowledge that thymic slices are not equivalent to intact thymuses and have now discussed limitations of this system in our revised Discussion.

      Comment 1: Differential profiles in the expression of chemokine receptors, including CCR4, CCR7, and CXCR4, during DP to SP thymocyte development were well documented. Previous papers reported an early and transient expression of CCR4, a subsequent and persistent expression of CCR7, and an inverse reduction of CXCR4 (Campbell, et al., 1999, Cowan, et al., 2014, and Kadakia, et al. 2019). The data shown in Figures 1, 2, and 3 are repetitive to previously published data.

      The expression profile of CCR4, CCR7 and CXCR4 on thymocytes has been documented previously in the studies cited above and in our prior publication (Hu et al., 2015). Campbell et al. (Campbell, Haraldsen, et al., 1999) investigated chemotactic effects of chemokines, but did not directly address expression of chemokine receptors by thymocyte subsets. Cowan et al. (Cowan et al., 2014) examined the expression of CCR4 versus CCR7 on DP and CD4SP thymocytes. However, our data provide a more detailed analysis of expression of these distinct chemokine receptors by DP, CD4SP, and CD8SP thymocyte subsets along the trajectory of differentiation after positive selection, using a gating scheme inspired by a study published after the above-cited papers (Breed et al., 2019). Our more nuanced evaluation of CCR4 versus CCR7 expression sets the stage for finding that they play distinct roles in promoting medullary entry and central tolerance of early- versus late-stage post-positive selection thymocytes. Without examining CCR4 and CCR7 expression patterns by distinct thymocyte subsets in detail, we would not have made the unexpected observation that although CCR7 is expressed at high levels by many CD4SP SM thymocytes, it does not induce strong chemotaxis or medullary accumulation of this subset, relative to its role in more mature SP thymocyte subsets. This finding has important implications for which APCs thymocytes encounter as they are tested for self-reactivity to enforce central tolerance. As we were working on these studies, Kadakia et al. reported that extinguishing CXCR4 expression was important for enabling medullary entry (Kadakia et al., 2019).
Thus, we thought it was important to place CXCR4 in the context of CCR4 and CCR7 expression on thymocyte subsets in our study, and in doing so found another example of asynchronous chemokine receptor expression and function, further indicating that expression of a chemokine receptor alone is not a reliable marker of functional activity or thymocyte localization, as cells migrate dynamically between the cortex and medulla.

      Through more extensive gating and simultaneous investigation of chemokine receptor expression and function, our data have provided new insights into how thymocytes respond to chemokine cues at different time points during their post-positive selection development. Moreover, our refined gating scheme (Figure 1) can be used to distinguish thymocyte subsets at different development stages without relying on chemokine receptor expression, thus providing an unbiased way of investigating chemokine receptor expression at different developmental stages.

      Comment 2: The manuscript describes the lack of CCR7 at early stages during DP to SP thymocyte development (Figure 1-3). However, CCR7 expression is detected insensitively in this study. Unlike CCR4 detection with a wide fluorescence range between 0 and 2x10^4 on the horizontal axis, CCR7 detection has a narrow range between 0 and 2x10^3 on the vertical axis (Figure 1C, 1D, 4B, 4C, 6B, S2, S3), so that flow cytometric CCR7 detection in this study is 10-times less sensitive than CCR4 detection. It is therefore likely that the "CCR7-negative" cells described in this manuscript actually include "CCR7-low/intermediate" thymocytes described previously (for example, Figure S5A in Van Laethem, et al. Cell 2013 and Figure 6 in Kadakia, et al. J Exp Med 2019).

      We provide new data to address the possibility that we were failing to detect low levels of CCR7 expression on early post-positive selection DPs (CD3loCD69+). We agree that CCR7 immunostaining of mouse cells is known to be more challenging than immunostaining of other chemokine receptors, including CCR4 and CXCR4. CCR7 immunostaining needs to be carried out at 37°C, which we did throughout our studies. We provide new data comparing CCR7 expression by Ccr7+/+ versus Ccr7-/- thymocyte subsets (Figure 1—figure supplement 2A-B), which confirm that CCR7 is not expressed at detectable levels by CD3loCD69+ DP cells above the background seen in CCR7-deficient cells. As thymocytes transition to the CD4SP SM stage, low/intermediate to high expression of CCR7 can be detected (Figure 1—figure supplement 2A). To further test whether we were failing to detect low levels of CCR7 by post-positive selection DPs, we incubated thymocytes at 37°C for up to 2 hours prior to immunostaining for CCR4 and CCR7, as a prior study indicated in vitro culture would enable increased cell surface expression of CCR7 by alleviating ligand-mediated CCR7 internalization (Britschgi et al., 2008). However, we did not observe increased CCR7 (or CCR4) expression by any thymocyte subset incubated at 37°C (Figure 1—figure supplement 2C-D). Lack of expression of CCR7 by CD3loCD69+ DP cells is consistent with their failure to undergo chemotaxis to CCR7 ligands in vitro, and initial expression of CCR7 by CD4SP SM is consistent with their chemotaxis towards CCR7 ligands in vitro (now shown in greater detail in Figure 2—figure supplement 1), albeit at a much lower migration index than subsequent thymocyte subsets.

      Comment 3: Low levels of CCR7 expression could be functionally evaluated by the chemotactic assay as shown in Figure 2. However, the data in Figure 2 are unequally interpreted for CCR4 and CCR7; CCR4 assays are sensitive where a migration index at less than 1.5 is described as positive (Figure 2A and 2B), whereas CCR7 assays are dismissive of such a small migration index and are only judged positive when the migration index exceeds 10 or 20 (Figure 2C and 2D). CCR7 chemotaxis assays should be carried out more sensitively, to equivalently evaluate the chemotactic function of CCR4 and CCR7 during thymocyte development.

      We thank the reviewer for his insight about the possibility that we could have overlooked CCR7-mediated chemotaxis at lower migration indexes. When data from the chemotaxis assays were evaluated separately for each thymocyte subset, CCR7-mediated chemotaxis of CD4SP SM and subsequent DP CD3+CD69+ co-receptor reversing thymocytes could be detected. However, DP CD3loCD69+ thymocytes still did not undergo CCR7-mediated chemotaxis, but were responsive to the CCR4 ligand CCL22 (Figure 2—figure supplement 1).

      We did not detect CCR7-mediated chemotaxis of CD4SP SM and DP CD3+CD69+ subsets in our previous analysis because their lower-level chemotactic index relative to mature thymocytes did not reach statistical significance when chemotaxis of all subsets was compared simultaneously (Figure 2D). We note that the magnitude of difference in the responsiveness of CD4SP SM cells compared to mature CD4SP and CD8SP M1 & M2 thymocytes (Figure 2D) is likely physiologically important, as CCR7 deficiency results in severely reduced medullary accumulation of CD4SP M1+M2 cells, but only a very mild reduction in medullary accumulation of CD4SP SM cells, which is only detected with our new paired analyses in Figure 5C. We feel these new analyses provide important new insights and thank the reviewer for this suggestion.

      Comment 4: Together, this manuscript suffers from the poor sensitivity for CCR7 detection both in flow cytometric analysis and chemotactic functional analysis. Conclusions that CCR7 is absent at early stages of DP to SP thymocyte development and that CCR7 is uncoupled from medullary entry are the overinterpretation of those results with the poor sensitivity for CCR7. The oversimplified scheme in Figure 3D is misleading.

      We agree that the scheme in Figure 3D, as previously constructed, did not ideally display the difference in scale between thymocyte responses to CCR7 ligands versus CCR4 and CXCR4 ligands (as detected in vitro). Thus, we have now modified the schematic to include the mild response to CCR7 ligands that we observed in CD4SP SM thymocytes (comment 3) and to emphasize the higher chemotactic response of mature thymocytes to CCR7 ligands than of DPs and CD4SP SM to CCR4 ligands. Likewise, we have modified the manuscript to clarify the importance of CCR7 expression in the medullary entry and accumulation of mature thymocyte subsets.

      We respectfully disagree that the sensitivity of CCR7 detection was poor in our flow cytometry and chemotactic analyses. Our CCR7 stains identified a range of CCR7 expression levels, from no expression by pre- and post-positive selection DP cells to high expression by CD4SP M1 cells, and we now provide new data confirming our ability to detect CCR7 expression (Figure 1—figure supplement 2), as described in response to Comment 2. Our chemotaxis assays detected CCR7 responses over a range of migration indexes from ~2 up to 100, showing our sensitive ability to detect CCR7-mediated chemotaxis in vitro (Figure 2 and Figure 2—figure supplement 1). In live thymic slices, we were also able to capture a range of biologic activities of CCR7, from mediating modest medullary accumulation of CD4SP SM cells to robust medullary accumulation of CD4SP M1+M2 cells (Figure 5A-C). Importantly, our results demonstrate that CCR7 is not the only chemokine receptor responsible for medullary entry and accumulation of thymocytes. Complex spatiotemporal regulation of thymocytes at distinct stages of development is achieved through tight orchestration of expression and signaling through multiple chemokine receptors, including CCR4, as shown by our data. However, our study does not negate an important role for CCR7 in mediating medullary entry of thymocytes, which we have clarified in the text.

      Comment 5: The short-term thymus slice culture experiments should be described more carefully in terms of selection events during DP to SP thymocyte development, which takes at least 2 days for CD4 lineage T cells and approximately 4 days for CD8 lineage T cells (Saini, et al. Sci Signal 2010 and Kimura, et al. Nat Immunol 2016). The slice culture experiments in this manuscript examined cellular localization within 12 hours and chemokine receptor expression within 24 hours (Figures 4, 5) even for the development of CD8 lineage T cells (Figure S2), which are too short to examine entire events during DP to SP thymocyte development and are designed to only detect early phase events of thymocyte selection.

      Experiments in Figures 4 and 5 were indeed designed to capture behaviors of thymocytes relatively early after introduction onto thymic slices. Figure 4 (and Figure 4—figure supplement 1) shows that the timing of CCR4 versus CCR7 expression after positive selection is dramatically different: CCR4 is expressed within hours of positive selection, concomitant with medullary entry, while CCR7 expression takes several days in the slices (sufficient time for CD8SP development, Figure 4—figure supplement 1). Figure 5 shows that medullary accumulation of CD4SP M1+M2 cells occurs robustly in the medulla of thymic slices within a couple of hours after introduction into the slices, and this localization is CCR7 dependent, while CCR4 induces more mild medullary accumulation of post-positive selection DPs. As indicated by the reviewer, it has been shown that it takes days for DP thymocytes to develop into mature CD4SP and CD8SP cells (Kimura et al., 2016; Lutes et al., 2021; Saini et al., 2010), as recapitulated in the thymus slice system (Figure 4—figure supplement 1) (Lutes et al., 2021). The relatively short time frame of our time-course experiments (up to 12 hours after addition of pre-positive selection thymocytes to positively selecting thymic slices) allowed us to detect expression of CCR4 within a few hours after positive selection and to determine that this timing correlated with medullary entry. Thus, the 12-hour time-course was important for temporal resolution of chemokine receptor expression and medullary localization after initial stages of positive selection.

      Comment 6: It is unclear what the medullary density alteration measured in the thymus slice culture experiments represents. Although the manuscript describes that the increase in the medullary density reflects the entry of cortical thymocytes to the medulla (Figure 4E and S2E), this medullary density can be affected by other mechanisms, including different survival of the cells seeded on the top of different thymus microenvironments. Thymocytes seeded on the medulla may be more resistant to cell death than thymocytes seeded in the cortex, for example, because of the rich supply of cytokines by the medullary cells. So, the detected alterations in the medullary density may be affected by the differential survival of thymocytes seeded in the cortex and the medulla. Also, the medullary density is measured only within a short period of up to 12 hours. The use of MHC-II-negative slices and CCR4- or CCR7-deficient thymocytes in the thymus slice cultures may verify whether the detected alteration in the medullary density is dependent on TCR-initiated and chemokine-dependent cortex-to-medulla migration.

      We thank the reviewer for pointing out these possibilities. The purpose of the positive selection timing experiment (Figure 4) was to establish the early correlation between receiving a positive selection signal, upregulating CCR4, and migrating into the medulla. In this system, cells enter only the cortex in the first hour after migration in the slice, consistent with prior studies of localization of pre-positive selection thymocytes to the cortex (Ehrlich et al., 2009; Ross et al., 2014); subsequently, they move into the medulla. Because CCR7 is widely accepted to be essential for medullary entry, we feel it is important to demonstrate the disconnect between the timing of medullary entry and CCR7 expression in multiple ways. The timing experiment design utilized MHCII-/- and β2m-/- slices to show that positive selection was necessary for expression of CCR4. To test whether CCR4 or CCR7 were required for medullary entry of early post-positive selection DPs, we evaluated medullary accumulation of this subset from WT, Ccr4-/-, Ccr7-/-, and Ccr4-/-Ccr7-/- mice. This experiment provided a more robust means of determining the extent to which CCR4 deficiency impacted medullary localization of a large cohort of cells that had passed positive selection (Figure 5), and again showed that the post-positive selection thymocytes, which express CCR4 but not CCR7, accumulate in the medulla in a CCR4-dependent manner. We note that in Figure 5, we show that all Ccr4-/-Ccr7-/- thymocyte subsets imaged have medullary:cortical density ratios of ~1, indicating an even distribution across cortex and medulla, which is highly consistent with an essential role for these two chemokine receptors in cooperating to mediate medullary accumulation of different stages of developing T cells.

      The reviewer makes an interesting point that survival cues could differ in the cortex versus medulla. However, if thymocytes lacking one or both chemokine receptors had impaired survival because they didn’t enter a region of the thymus efficiently to receive survival cues, we would expect to detect increased apoptosis in Ccr4-/-, Ccr7-/-, and Ccr4-/-Ccr7-/- thymocytes. Instead, we found that chemokine receptor deficiencies resulted in diminished apoptosis of different thymocyte subsets (Figure 6). This finding is more consistent with reduced negative selection of these subsets due to reduced clonal deletion. We nonetheless discuss this possibility in our revised manuscript, as it is important to consider that chemokine-mediated migration of thymocytes into different microenvironments could alter their access to cytokines and other pro-survival cues.

      Reviewer #3 (Public Review)

      In this manuscript, Li et al. examine how the expression of the chemokine receptor CCR4 impacts the movement of thymocytes within the thymus. It is currently known that the chemokine receptor CCR7 is important for developing thymocytes to migrate from the cortical region into the medullary region and CCR7 expression is therefore often used to define medullary localization. This is important because key developmental outcomes, like enforcing tolerance to self-antigens amongst others, occur in the medullary environment. The authors demonstrate that the chemokine receptor CCR4 is induced on thymocytes prior to expression of CCR7 and thymocytes exhibit responsiveness to CCR4 ligands earlier in development. Using elegant live confocal microscopy experiments, the authors demonstrate that CCR4 expression is important for the entry and accumulation of specific thymocyte subsets while CCR7 expression is needed for the accumulation of more mature thymocyte subsets. The use of cells deficient in both CCR4 and CCR7 and competitive migration/accumulation experiments provide strong support for this conclusion. The elimination of CCR4 expression results in decreases in apoptosis of thymocyte subsets that have been signalled through their antigen receptor and are responsive to CCR4 ligands. As expected, more mature thymocyte subsets show decreased apoptosis when CCR7 is absent. Distinct antigen-presenting cells in the thymus express CCR4 ligands supporting a model where CCR4 expressing thymocytes can interact with thymic antigen-presenting cells for induction of apoptosis. The absence of CCR4 results in an increase in peripheral T cells that can respond to self-antigens presented by LPS-activated antigen-presenting cells providing further support for the model. Collectively, the manuscript convincingly demonstrates a previously unappreciated role for CCR4 in directing a subset of thymocytes to the medulla.

      We thank the reviewer for appreciating the novelty of the finding that CCR4 directs distinct subsets of thymocytes into the medulla relative to CCR7, as supported by multiple lines of evidence.

    1. Author Response

      Reviewer #1 (Public Review):

      The sustainability of vaccination programs is subject to multiple threats, from a pandemic like COVID-19 to political changes. The present study assesses different strategies, including gender-neutral vaccination, to better respond to threats in HPV national immunization programs. The authors showed that vaccinating boys against HPV (compared to vaccinating girls alone), would not only prevent more cases of cervical cancer but also limit the impact of disruptions in the program. Moreover, it would help attain the goal set by the World Health Organization of eliminating cervical cancer as a public health problem sooner, even in the case of disruptions.

      Strengths and weaknesses: I found the manuscript well-written and easy to read. Decision-makers may find the results helpful in policy development and other researchers may use the study as an example to investigate similar scenarios in their local contexts. Nevertheless, there are some limitations. First, it should be considered that the present study is only applicable to India and other countries with a similar HPV context. Second, because it is a study based on a mathematical model, errors might arise from the assumptions considered for its construction. It also relies on the quality of the data used to construct and calibrate the model.

      Models are important tools for decision-making; they allow us to assess different scenarios when obtaining real-world data is not feasible. They also allow multiple sensitivity analyses to be carried out to test the strength of the results. The study carries out a necessary assessment of different vaccination strategies to minimize the impact on cervical cancer prevention due to disruptions in the HPV immunization program. By using a mathematical model, the authors are able to assess different scenarios regarding vaccination coverage rates, disruption time, and cervical cancer incidence. Therefore, decision-makers can consider the scenario which best represents their current situation.

      The present study is valuable not only for decision-making, but also from a methodological point of view, as future research can explore in more depth the impact of vaccination disruptions and prevention measures.

      The conclusions of this paper are mostly well supported by data, but some aspects of the methodology need clarification; furthermore, some aspects of the calculations can be improved. It would be more informative, and better for comparisons between the four scenarios, to have relative measures instead of the absolute numbers of cases prevented.

      We thank the reviewer for the kind acknowledgement of the merits of the paper. We have tried to address the suggestions and questions as much as possible in the revised manuscript.

      We agree with the weaknesses raised by the reviewer regarding the limited applicability of our study results to other countries and the possible errors arising from using a mathematical model. We have added a more elaborate discussion of these points to the manuscript, as follows: - Page 15 lines 310-312: “Extrapolation of the results of this study to other populations will be limited to those sharing similar patterns of demography, social norms, and cervical cancer epidemiology as India.” - Page 17 lines 361-363: “…, within the limitations of our model, the model-based estimates show that shifting from GO to GN vaccination may improve the resilience of the Indian HPV vaccination programme while also enhancing progress towards the elimination of cervical cancer.”

      Furthermore, we have tried to clarify the rationale, advantages, and limitations of the measure of resilience we have adopted.

      Reviewer #2 (Public Review):

      This study evaluated the effect of population-based HPV vaccination programs in India which is suffering from the disease burden of cervical cancer. The authors used model simulations for estimating the outcomes by adopting the latest available data in the literature. The findings provide evidence-based support for policymakers to devise efficient strategies to reduce the impacts of cervical cancer in the country.

      Strengths.

      The study investigated the potential impact of cervical cancer elimination when HPV vaccination was disrupted (e.g., during the COVID-19 pandemic) and for meeting the WHO's initiatives. The authors considered several settings from the low to high effects of vaccination disruption when concluding the findings. The natural history was calibrated to local-specific epidemiological data which helps highlight the validity of the estimation.

      Weaknesses.

      Despite the importance and strengths, the current study may likely be improved in several directions. First, the study considered the scenario of using a recently developed domestic HPV vaccine but assuming vaccine efficacy based on another foreign HPV vaccine that has been developed and used (overseas) for more than 10 years. More information should be provided to support this important setting.

      Second, the authors are advised to discuss the vaccine acceptability and particularly the feasibility to achieve high coverage scenarios in relatively conservative countries where HPV vaccines aim to prevent sexually transmitted infection. Third, as the authors highlighted, the health economics of gender-neutral strategies, which is currently missing in the manuscript, would be a substantial consideration for policymakers to implement a national, population-based vaccination program.

      We thank the reviewer for the kind acknowledgement of the merits and strengths of the paper.

      We have tried to address the reviewer’s three points of weaknesses as comprehensively as possible in the revised manuscript.

      Regarding the first two points of weaknesses, we have provided more background information about the current situation of HPV introduction and screening in India (see the more specific replies below for where changes have been made), and some data of observed coverage in India in the states where HPV vaccination has been introduced.

      Regarding the reviewer’s third point about the health economics of gender-neutral strategies, we agree fully that it is an important aspect for local policymakers to consider. However, a health economic assessment is outside the scope of the present paper, in which we are interested in highlighting the potential health benefits of GN HPV vaccination. Given the current context of HPV vaccination in India, we think it is too early to provide a realistic assessment of the health-economic balance of GN vaccination. Please note that one manuscript (de Carvalho et al., MedRxiv, doi: https://doi.org/10.1101/2023.04.14.23288563) based on the same modelling exercise and reporting a health economic assessment of girls-only (routine and catch-up) HPV vaccination in India is currently submitted for peer-review.

      Reviewer #3 (Public Review):

      The authors put together a rigorous study to model the impact of HPV vaccine programme disruptions on cervical cancer incidence and meeting WHO elimination goals in a low-income country, using India as an example. The study explores possible scenarios by varying HPV vaccination strategies for 10-year-old children between a) increasing vaccine coverage in a girls-only vaccination programme and b) vaccinating boys in addition to girls (i.e., a gender-neutral vaccination programme).

      The main strength of this study is the strength of the modelling methodology in helping to make predictions and in contingency planning. The study methodology is rigorous and uses models that have been validated in other settings. The study employs a high level of detail in calibrating and adapting the model to the Indian context despite poor data availability. The detailed methodology allows future studies to employ the model and techniques with locally-contextualised parameters to study the potential impact of HPV vaccine programme disruptions in other countries.

      The work in this field can begin to help lower-income countries explore varying HPV vaccination strategies to reduce cervical cancer incidence, keeping in mind the potential for future supply chains or other related disruptions. However, the scenarios could be better sculpted to model potentially realistic scenarios to guide policymakers to make decisions in situations with limited vaccine supplies - in other words comparing scenario alternatives based on a fixed number of vaccines being available. Using comparative alternatives will help policymakers grapple with the decisions that need to be made regarding planning national HPV vaccination programmes. The results could afford to provide readers with a clearer measure of vaccine strategy 'resilience'.

      In all, the authors are able to successfully explore the potential impact of varying HPV vaccination strategies on cervical cancer cases prevented in the context of vaccine disruptions, and make valid conclusions. The results produced are rich in information and are worthy of deeper discussion.

      We thank the reviewer for the kind acknowledgement of the merits and strengths of the paper.

    1. Author Response

      Reviewer #3 (Public Review):

      The strongest aspects of this study are the structural analysis of the 90 residue KER domain. This is an important advance, discovering a founding member of a novel class of DNA binding motifs, termed a SAH-DBD (single alpha helix-DNA binding domain). Interestingly, they define a subregion of KER (termed "middle-A", residues 155-204 of Cac1) that has nearly the same DNA binding affinity and confers similar in vivo phenotypes as the full KER domain.

      This study also shows that the biological role of KER partially overlaps compensatory factors in vivo, both within the same Cac1 protein subunit (e.g. the WHD domain) and also with other proteins acting in parallel (e.g. Rtt106). That is, the presence of either WHD or Rtt106 renders the drug-resistance and silencing assays employed here insensitive to loss of the KER domain.

      However, the drug resistance and gene silencing phenotypes are inherently indirect measures of the most important claim of this work, that KER is a molecular ruler for DNA for the purpose of ensuring sufficiently large templates for deposition of histone H3/H4 cargoes. Therefore, this study would be of greater impact if the authors more directly tested this measurement idea in assays that directly assess histone deposition. There are multiple options. Since the authors have in hand recombinant wild-type and mutant CAF-1 complexes, one could examine the number and/or spacing of nucleosomes formed during in vitro deposition reactions. Complementary in vivo experiments using the authors' existing mutant strains could be based on the finding that CAF-1 is particularly important for histone deposition onto nascent Okazaki fragments during DNA replication (Smith and Whitehouse, 2012; pmid: 22419157), and that the spacing pattern of nucleosomes on this DNA is greatly perturbed in cac1-delete cells.

      Thank you for the suggestion of approaches to obtain data that more directly addresses changes in nucleosome assembly due to CAF-1 KER mutants. We considered using an in vitro nucleosome assembly assay, such as the reconstitution of nucleosomes onto gapped DNA using purified components developed by Kadyrova et al., 2013 (doi: 10.4161/cc.26310). However, they found defects only in the amount of nucleosome assembly and not changes in nucleosome spacing without CAF-1. In addition, we didn’t have the system set up and knew that it would be unlikely to produce data in the time needed for a revision of the manuscript, or even show spacing changes in nucleosomes at all. Therefore, we chose an assay system in yeast that already has been used to assess the impact of CAF-1 DNA binding mutants on nucleosome assembly (Smith and Whitehouse, 2012; pmid: 22419157 and Mattiroli et al., 2017 doi: 10.7554/eLife.22799). This approach, developed by Smith and Whitehouse, uses a degradable Ligase I system in yeast, which reveals Okazaki fragment lengths, and shows a defect when CAF-1 activity is knocked out (Smith and Whitehouse, 2012). This assay also showed that mutations or deletions in the Cac1 WHD DNA binding domain, led to increased lengths of Okazaki fragments (Mattiroli et al., 2017). As the WHD DBD impacts Okazaki fragment lengths, we reasoned that mutations in the KER DBD might also.

      We generated numerous new yeast strains that included the degradable Ligase I system and collaborated with Dr. Duncan Smith (Smith and Whitehouse, 2012; pmid: 22419157) to detect nascent Okazaki fragments in various CAC1 mutants in strains that were RTT106 or rtt106∆. We found that the Okazaki fragments from cac1∆ yeast were longer and less discrete than those from CAC1 yeast (as Dr. Smith published previously) and that the Okazaki fragments from the cac1∆ rtt106∆ strain were barely detectable, presumably because they were too long to be resolved on the gel. However, the assay did not have sufficient resolution to detect differences in the Okazaki fragment length distribution between wild-type CAC1 and the ∆KER, ∆middle-A and 2xKER mutants of CAC1, in either the RTT106 or rtt106∆ background. Therefore, we were unable to detect direct effects of the KER mutants on Okazaki-fragment lengths. We considered combining the KER mutants with the WHD mutants, but as this would not directly assess the effects of the KER mutants, and as CAF-1 proteins lacking the KER and the WHD don’t bind to DNA (Figure 3 in Mattiroli et al., 2017), we didn’t pursue it. As the complete deletion of the KER, shortening of the KER and lengthening of the KER did not give detectable changes in this assay, we also did not pursue the other mutants tested in the manuscript. Although we are disappointed that the experiment did not reveal the effects we had hoped for, it provides support for the redundant functions of CAF-1 and Rtt106 in nucleosome assembly, which has not been shown using this assay. As such, we have added Figure 1-figure supplement 1g and text to the results section, methods section and strain table. We have included Prof. Duncan Smith and his student Anne Seck as authors.

      Added text lines 195 to 207: “Finally, to assess the impact of deleting the KER more directly on nucleosome assembly in vivo, we examined histone deposition onto nascent Okazaki fragments during DNA replication, as we have shown previously that the lengths of Okazaki fragments are determined by histone deposition into nucleosomes and are disrupted upon deletion of CAC1 (Smith and Whitehouse, 2012). We compared CAF-1 mutants in the WT yeast background and in yeast lacking Rtt106. We found that the Okazaki fragment length distribution of the ∆KER mutant was indistinguishable from that of WT while that of cac1∆ was disrupted (Figure 1-figure supplement 3g). That we did not detect effects on Okazaki-fragment lengths for the yCAF-1 mutants lacking the intact KER is consistent with the results of the viability and silencing assays for KER mutants, which also retained the WHD. Strikingly, the Okazaki fragments from rtt106∆ cac1∆ yeast were highly disrupted (Figure 1-figure supplement 3g), further highlighting the redundancy between Rtt106 and Cac1 for assembling histones onto newly replicated DNA. Therefore, t”

    1. Author Response

      Reviewer #3 (Public Review):

      The authors investigated the mechanism of transport of the GLUT5 sugar porter using enhanced sampling molecular dynamics simulations and biochemical analysis. The results suggest a possible general mechanism by which binding to a transported substrate stabilizes an occluded intermediate conformation between outward and inward-facing states of the alternating access conformational change of the protein, thereby enabling transport.

      The authors also identified key elements of this transition, associated with residues involved in sugar binding, and through elegant biochemical experiments demonstrated how mutations of the latter affect the protein function, including mutations of gating residues that can recover the function of inactive mutants.

      The general computational methodology used by authors is appropriate for addressing these questions and compared to other techniques has the advantage of bringing forth an unbiased molecular description of the transport process. The results are overall qualitatively in line with the proposed conclusions.

      A major weakness of this work is that, in contrast to previous studies with the same type of methodology, the authors do not report error analysis or careful statistical assessment of the computational results. Therefore, it is not clear whether the latter are solid or whether they support the proposed conclusions. The computational data might generally benefit from an improved methodological design, such as including more degrees of freedom (or collective variables) in the description of the minimum free energy pathway, e.g. the salt bridges.

      This has now been addressed in the essential revisions above.

      Another weakness is that some of the details of the computational analysis are not reported, therefore other investigators would not know how to reproduce the results.

      We have extended the methods section to include much more detail about the MSM construction and other computational analysis. Data files needed for reproduction are now found in a public repository with links provided in the Methods section.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript presents an inference technique for estimating causal dependence between pairs of neurons when the population is driven by optogenetic stimulation. The key issue is how to mitigate spurious correlations between unconnected neurons that can arise due to polysynaptic and other network-level effects during stimulation. The authors propose to leverage each neuron's refractory period (which begins at approximately random times, assuming Poisson-distributed spikes and conditional on network state) as an instrumental variable, allowing the authors to tease apart causal dependence by considering how the postsynaptic neuron fires when the presynaptic neuron must be muted (i.e., is in its refractory period). The idea is interesting and novel, and the authors show that their modified instrumental variable method outperforms similar approaches.

      We wish to thank the reviewer for this positive assessment.

      However, the scope of the technique is limited. The authors' results suggest that the proposed technique may not be practical because it requires considerable amounts of data (more than 10^6 trials for just 200 neurons, resulting in stimulation of more than 5000 times per neuron). Even with such data sizes, the method does not appear to converge to the true solution in simulations. The method is also not tested on any experimental data, making it difficult to judge how well the assumptions of the technique would be met in real use cases. While the manuscript offers a unique solution to inferring causal dependence, its applicability to experimental data has not yet been convincingly demonstrated, and it would therefore primarily be of interest to those looking to build on these theoretical results for further method development.

      We thank the reviewer for this assessment and agree that the requirement for this many trials makes the estimators practically unsuitable for identifying causal interactions in large systems. However, in the revised manuscript, we observe that the IV estimator can be beneficial after even a few thousand trials under a new, improved error measure (which we discovered thanks to these reviews). Moreover, although we agree that this work will be of interest to the more theoretically oriented community for methodological improvements, we also believe that the methods and causal inference framework will be interesting and useful for the wider neuroscience community. For example, considering the first (new) example in the introduction, even under two-photon single-neuron stimulation, the IV framework should be used to avoid bias amplification.
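      The instrumental-variable idea summarized by Reviewer #1 can be illustrated with a toy simulation. The sketch below is not the estimator from the manuscript: it assumes a hypothetical linear probability model in which a hidden confounder `u` (standing in for shared stimulation-driven input) raises the spiking probability of both a presynaptic neuron `x` and a postsynaptic neuron `y`, while a binary instrument `z` (the presynaptic neuron being refractory at stimulus onset) silences `x` independently of `u`. All names and parameter values are illustrative assumptions. Under these assumptions, the textbook Wald ratio recovers the causal effect `beta`, whereas the naive difference in means is biased by the confounder.

```python
import numpy as np


def simulate(n=200_000, beta=0.3, seed=0):
    """Toy data: z = refractory (instrument), x = presynaptic spike, y = postsynaptic spike."""
    rng = np.random.default_rng(seed)
    u = rng.random(n) < 0.5   # hidden confounder, hypothetical shared drive
    z = rng.random(n) < 0.3   # instrument, independent of u by construction
    # A refractory neuron cannot spike; otherwise the confounder raises its spike probability.
    p_x = np.where(z, 0.0, 0.4 + 0.4 * u)
    x = rng.random(n) < p_x
    # The postsynaptic neuron is driven causally by x (beta) and by the confounder.
    p_y = 0.1 + beta * x + 0.4 * u
    y = rng.random(n) < p_y
    return z, x.astype(float), y.astype(float)


def wald_iv_estimate(z, x, y):
    """Wald ratio: effect of z on y divided by effect of z on x."""
    z = np.asarray(z, dtype=bool)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    numerator = y[z].mean() - y[~z].mean()
    denominator = x[z].mean() - x[~z].mean()
    return numerator / denominator


def naive_difference(x, y):
    """Difference in postsynaptic spike rate with and without a presynaptic spike."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return y[x == 1].mean() - y[x == 0].mean()


if __name__ == "__main__":
    z, x, y = simulate()
    print("true effect   :", 0.3)
    print("naive estimate:", round(naive_difference(x, y), 3))
    print("IV estimate   :", round(wald_iv_estimate(z, x, y), 3))
```

      In this toy setting the instrument is valid by construction; in real recordings, whether refractoriness is independent of network state is exactly the assumption under discussion.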

      Reviewer #2 (Public Review):

      Lepperød et al. consider the problem of inferring the causal effect of a single neuron's activity on its downstream population. While modern methods can perturb neuronal activity, the authors focus on the issue of confounding that arises when attempting to infer the causal influence of a single neuron while stimulating many neurons together. The authors adapt two basic methods from econometrics that were developed to address causal inference in purely observational data: instrumental variables and difference-in-differences, both of which help correct for unobserved correlations that confound causal inference. The authors propose an experimental procedure where neurons have spike times measured with millisecond precision and a subset of neurons are optogenetically activated. As an instrumental variable, the authors propose using the refractoriness of a stimulated neuron, resulting in absent or delayed spiking which can be used to infer its causal effect in otherwise matched conditions.

      Based on this, they develop a collection of estimators to measure the pairwise causal relationship of one neuron on another. By simulating a variety of small networks, the authors show that, provided enough data is present, the proposed causal methods provide estimates that better match underlying connectivity than methods based on ordinary least squares or naive cross-correlograms (CCHs). However, the methods proposed require extensive data and highly targeted stimulation to converge.

      Strengths:

      The value of the paper comes from its attempt to find neuroscience applications for methods from fields where causal analysis of observational data is required. Moreover, as the field develops improved methods of measuring anatomical neuronal connectivity using molecular, physiological, and structural approaches, the question of the causal influence of one neuron's spiking on another remains vital. The authors thoughtfully lay out the necessary conditions - and difficulties - required to establish this type of causal functional influence and suggest one potential approach. The collection of models tested highlighted both the strengths and difficulties of the suggested approaches.

      We wish to thank the reviewer for the positive feedback. We are delighted that the reviewer shares our view that obtaining methodology for estimating causal influence is vital.

      Weaknesses:

      1) I found the paper's introduction to its analysis techniques to be very confusingly written, particularly as it is designed to bridge fields. It is vital that the ideas are communicated more clearly. Some topics are explained multiple times, even after being used previously, other ideas and notations are introduced and immediately dropped (e.g. the "do operator", the ratio of covariances in the introduction to instrumental variables), and still others are introduced with no clear explanation (e.g. the weight term w, the "|Y->Y-Y*" notation, and the notation in the methods with "Y(Z=0)").

      We thank the reviewer for pointing out this lack of clarity, and we have extensively rewritten the paper to make it more accessible. The do operator is used in the methods to define Y(Z=0), but is now removed from the introduction to reduce the number of concepts introduced early in the text. The w term is now defined from the generative model. The difference-in-differences notation is written out in full for clarity, and a sketch of the intuition behind the method has been added to Figure 1.

      1) Of particular importance, where the Z, X, and Y variables are introduced in the first full paragraph on page five, it could be made much more clear that this method is pairwise: Z and X reference the spiking of one specific stimulated neuron at two time points and Y references one specific downstream neuron. 2) In the third paragraph of the same page, the authors refer to the "refractoriness of X" and "spiking of X onto Y", but this language confuses the neurons with variables in a way that took considerable time to unpack. 3) This was not helped by Figure 1b, which suggested that Z_i, X_i, and Y_i applied to all neurons and merely reflected time points around stimulation. 4) Similarly, the introduction of the Y* variable in the difference of differences method, which the authors view as one of the main contributions, is given little clear explanation or intuition. I assume "shifted on window-size left" means measuring the presence of spiking at the same time step as X, but I see no clear definition of this. 5) The confusion about variables remains when, in Figure 1d, a "transmission probability" goes below 0 and above 1.

      1) Thank you for pointing out this lack of clarity. We have adopted the suggested explanation of the variables X, Y, and Z.

      2) The language is clarified such that variables and neurons are separated.

      3) Figure 1b has been revised so that the variables refer to the neurons they represent.

      4) We have now improved the explanation of DiD with a figure for intuition.

      5) We have now renamed the “transmission probability” as “effective connectivity” to reduce confusion.

      I also found the network models studied after the first section and the relevant variables difficult to understand with the detail necessary to interpret the results. For example, the cartoon in Figure 2a does not seem to match the text description. I see no explanation for the external "excitatory confounder" and "inhibitory confounder" terms, nor what is done to control the (undefined) \sigma_max/\sigma_min term. I don't see anything in the methods about distinct inhibitory and excitatory neurons either. Further, the violin plots (e.g. Fig 2d) seem quite noisy (e.g. is Br, DiD really bimodal?), and it is not clear what distribution is being covered by them. If this is computational simulations, I would imagine more samples could be generated. The same vagueness issues hold for the networks in section 2.4 and 2.7.

      We have now clarified the implementation of the excitatory and inhibitory confounder and how we distinguish between excitatory and inhibitory neurons and defined the condition number. The violin plots were removed in Fig 2 since the large variance represented changes across external drive which produced largely incomparable statistics. To illustrate variance, we now show the standard deviation of the absolute error in line plots 2e and 2g.

      2) Broadly speaking, the causal estimates appear better in the sense of having smaller errors, but it's not clear to me if they are actually good or not. What does an error of 0.4 mean in terms of your ability to estimate the causal structure, and what exactly does the Error(w ≥ 0) notion refer to? It would be useful to see actual reconstructions of ground truth versus causally inferred connectivity to better understand the method's strengths and weaknesses.

      To improve clarity, we have added a paragraph in the text before Figure 2 explaining a new error measure. Since the estimators give the transmission probability and not the inferred connection strength directly, we previously computed a regressed error as in Das & Fiete 2020. This error measure is equivalent to the sine of the angle between $W$ and $\hat{W}$. It is not ideal because it gives an indirect population measure, with deviations scaled during the error regression. Upon further reflection, we realized that we could define the error directly using our definition of effective connectivity on the generative model to obtain a much cleaner and more interpretable measure. This further led us to remove one of the proposed methods (brew), as it did not perform well under this new error measure. All error measurements are updated in all figures. Error(w ≥ 0) means that we only look at positive weights; this is now clarified in the text.
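      The equivalence between a regressed error and the sine of the angle between $W$ and $\hat{W}$ can be checked numerically. The sketch below assumes one plausible reading of "regressed error", namely the relative residual after fitting a single least-squares scale factor from $\hat{W}$ to $W$; the exact definition used in Das & Fiete 2020 is not given here, so the function names and this definition are illustrative assumptions. It also shows why such a measure is indirect: an estimate that is wrong by a constant factor still scores zero.

```python
import numpy as np


def regressed_error(w_true, w_hat):
    """Relative residual after rescaling w_hat by its least-squares fit to w_true (assumed definition)."""
    w = np.ravel(w_true).astype(float)
    v = np.ravel(w_hat).astype(float)
    scale = (v @ w) / (v @ v)  # least-squares scale factor
    return np.linalg.norm(w - scale * v) / np.linalg.norm(w)


def sine_of_angle(w_true, w_hat):
    """Sine of the angle between the flattened weight matrices."""
    w = np.ravel(w_true).astype(float)
    v = np.ravel(w_hat).astype(float)
    cosine = (w @ v) / (np.linalg.norm(w) * np.linalg.norm(v))
    return np.sqrt(max(0.0, 1.0 - cosine**2))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_true = rng.normal(size=(20, 20))
    w_hat = w_true + 0.5 * rng.normal(size=(20, 20))
    print(regressed_error(w_true, w_hat), sine_of_angle(w_true, w_hat))
    # A uniformly rescaled estimate is penalized by neither measure.
    print(regressed_error(w_true, 5.0 * w_true))
```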

      3) I found the section on optogenetic modeling to be unsatisfying in its realism. The general result that 1 photon excitation hits a wide collection of neurons is undisputed, but the simulation does not account for a number of key factors - optogenetic receptor expression is distributed across the axons and dendrites of a cell, not only soma, scattering in tissue greatly affects transmission, etc. Moreover, experiments that attempt to do highly targeted activation have other methods for exactly this reason, such as multiphoton activation or electrophysiology. The message of decreasing performance as a function of stimulus size is important, but I struggle with the idea of the model being "realistic".

      We thank the reviewer for pointing out this unsatisfactory comparison with realistic scenarios. To mitigate this, we have changed the wording but kept the simulation as is. As the reviewer pointed out, optogenetic receptor expression is distributed; here we have assumed an expression that only affects the soma (experimentally plausible according to Grødem et al 2023 (10.1038/s41467-023-36324-3)). Scattering in tissue is included according to the Kubelka-Munk model.

      4) The authors spend a great deal of analysis on stimulation, but little time on measurement. It seems like this approach demands a highly precise measure of spike time to know whether a neuron is firing or not at a given millisecond specifically because it is in a refractory state. A stimulated but refractory neuron will still likely spike as soon as it can after the momentary delay, and given the noise in the network this difference might not be easily detectable in the delay-to-spike of the downstream neuron, even assuming one spike in the presynaptic neuron is likely to cause a spike in the downstream neuron. It would be useful to see this aspect considered with the same detail as the rest of the study.

      We thank the reviewer for pointing this out. We have now added a paragraph discussing this: “As outlined in \citep{ozturk2000ill}, ill-conditioning can affect statistical analysis in three ways and therefore similarly in inverse connectivity estimates from measured activity. First, measurement errors such as a temporal shift in spike time estimate e.g. due to low sampling frequency, inaccurate spike sorting, or general noisy measurement due to animal movement etc. In the presence of ill-conditioning the outputs will be sensitive (unstable) to small input changes. If errors are included in some variables, the inference procedures will require information about the distributional properties of these errors. Second, optimized inference can give misleading results in the presence of ill-conditioning, caused by bad design or sampling.

      There will always exist a natural variability in the observations which necessitates the assessment of ill-conditioning before performing statistical analysis. Third, rounding errors can lead to small changes in input under ill-conditioning. This numerical problem is often not considered in neuroscience but will become evermore relevant when large-scale recordings require large-scale inferences.”
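      The sensitivity to small input changes described in the quoted paragraph can be made concrete with a two-variable linear system. The sketch below is a generic numerical illustration, not the inference procedure from the manuscript; the matrices and perturbation are arbitrary choices made for the example. It solves the same kind of system with a well-conditioned and a nearly singular matrix and compares how much a tiny change in the input moves the solution.

```python
import numpy as np


def relative_output_change(a, b, delta_b):
    """Relative change in the solution of a @ x = b when b is perturbed by delta_b."""
    x = np.linalg.solve(a, b)
    x_perturbed = np.linalg.solve(a, b + delta_b)
    return np.linalg.norm(x_perturbed - x) / np.linalg.norm(x)


# Well-conditioned system (condition number 1).
A_WELL = np.eye(2)
B_WELL = np.array([1.0, 1.0])

# Nearly singular system: the two rows are almost identical.
A_ILL = np.array([[1.0, 1.0], [1.0, 1.0001]])
B_ILL = np.array([2.0, 2.0001])

# The same small "measurement error" is applied to both inputs.
DELTA = np.array([0.0, 1e-4])

if __name__ == "__main__":
    for name, a, b in [("well", A_WELL, B_WELL), ("ill", A_ILL, B_ILL)]:
        print(name, "cond =", np.linalg.cond(a),
              "relative change =", relative_output_change(a, b, DELTA))
```

      Checking `np.linalg.cond` before inverting a design or covariance matrix is one simple way to perform the assessment of ill-conditioning that the quoted paragraph calls for.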

    1. Author Response:

      We would like to thank the reviewers for their thorough evaluation of the presented manuscript and herewith would like to address their comments and suggestions.

      This study was funded by an NSF grant awarded to Prof. Celio. The animal experimentation license (covering animal husbandry, breeding and experiments) that is required by law to perform animal experiments was also issued to Prof. Celio. Therefore, with the retirement of Prof. Celio, the funding for the project was discontinued and the animal license was terminated. We are thus unable to answer the reviewers’ open questions with follow-up experiments. We would, however, like to discuss some of the reviewers’ open questions and concerns, and hope this will be insightful to the interested reader.

      Reviewer #1 (Public Review):

      “First, they reported that chemogenetic activation of Foxb1 hypothalamic cell groups led to tachypnea. The authors tend to attribute this effect to the activation of hM3Dq expressed in the parvofox Foxb1 but did not rule out the participation of the PMd Foxb1 cell group which may as well have expressed hM3Dq, particularly considering the large volume (200 nl) of the viral construct injected. It is also noteworthy that the activation of the Foxb1 hypothalamic cell groups in this experiment did not alter the gross locomotor activity, such as time spent immobile state.”

      Because an AAV2 serotype was used for expression of the chemogenetic tools, the spread of viral infection was much more restricted to the injection site in chemogenetic animals than was observed with the AAV5-based expression of optogenetic tools. The more restricted spread of viral infection with AAV2 serotypes has previously been shown by a range of other groups (e.g. see https://doi.org/10.3389/fnana.2019.00093). This limited spread of the AAV2 serotype in our chemogenetic animals, together with the absence of the very strong locomotor phenotype observed during optogenetic stimulation experiments, leads us to hypothesize that the respiratory phenotype is largely attributable to the ParvafoxFoxb1 neurons.

      “In the second experiment, the authors applied optogenetic ChR2-mediated excitation of the Foxb1+ cell bodies' axonal endings in the dlPAG leading to freezing […]. Here it is important to consider that optogenetic ChR2-mediated excitation of the axonal endings is likely to have activated the cell bodies originating these fibers, and one cannot ascertain whether the behavioral effects are related to the activation of the terminals in the PAGdl or the cell bodies originating the projection.”

      We did not consider the possibility of backpropagation induced by optogenetic axon terminal stimulation at the time of the experiments. We acknowledge that this is the major limitation of our optogenetic experiments, and that it would have to be investigated with further animal experiments.

      Reviewer #2 (Public Review):

      “3) Fig. 5, a great effort has been made to illustrate the point that CCK and Foxb1 are differentially expressed. Why not just perform a double in situ experiment to directly illustrate the point?”

      We came across the publication showing that Cck-expressing PMd neurons control escape behaviors only when we were drafting the manuscript. Because this was already after the retirement of Prof. Celio and we were not able to conduct further experiments involving animals, we leveraged in silico methods and the publicly available high-quality dataset on gene expression in the posterior hypothalamic area. The applied in silico method of dimensionality reduction and cluster assignment is well established and widely accepted. We believe in the quality of the dataset and the reliability of these in silico results, but we agree with the reviewer that an alternative would have been to illustrate the expression patterns of Cck and Foxb1 by in situ hybridisation.

      “4) Fig. 7 data on optogenetic stimulation on immobility and breathing, since not all mice showed the same phenotype, what is the criterion for allocating these mice to hit or no hit groups?"

      We defined the group allocation criteria in the section titled “Optogenetic modulation of Foxb1 terminal in the dlPAG induces immobility” as follows:

      “OnTarget_antPAG animals had the tip of the optic fiber implant located above the dlPAG at an anterior-posterior level AP -4.04mm (from bregma) or proximal. The OffTarget group contains animals with fiber tips located below (i.e. ventral to) the dlPAG and/or located more distal than AP -4.04mm.”

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript of Parab et al. reports a beautiful phenotype analysis of the vascular brain/meningeal anatomy in a variety of reporter lines and mutants for Wnt/β-catenin signaling and angiogenic cues (Vegfaa, Vegfab, Vegfc, Vegfd) during zebrafish development. The present study extends the previous work of the same Parab, Quick, and Matsuoka, that focused on fenestrated vessel formation in the zebrafish myelencephalic choroid plexus (mCP). Vegfs were shown to regulate fenestrated vessel formation in combination, but not individually, and with only little effect on neighboring non-fenestrated brain vessel development. The fenestrated endothelium is thus known to have specific angiogenic requirements.

      The scale of investigation has now changed, and fenestrated vessel formation has been examined throughout the brain, in both circumventricular organs (organum vasculosum of lamina terminalis) and other choroid plexuses (CPs) including the diencephalic CP and its interface with the pineal gland, the eye choroid (choriocapillaris), and the hypophysis vasculature. The original finding is that a region-specific code of angiogenic cues controls fenestrated vessel formation. The authors show that fenestrated vessels form independently of Wnt/β-catenin signaling and BBB vascular development but require different combinations of Vegfa and Vegfc/d-dependent angiogenesis within and across brain regions. A previously unappreciated function of autocrine and paracrine Vegfc signaling is demonstrated in this brain region-specific regulation of fenestrated capillary development.

      Twenty-one different fish lines, accurately genotyped and characterized and including a new Reck mutant, have been instrumental in conducting vascular pattern analysis, using confocal and stereomicroscopy imaging combined with transmission EM. High-quality illustration and robust quantification methods, previously validated, have been used. The study is well organized and reflects the high expertise and strong methodology of the investigators. Data are presented in nine dense figures and the contribution of angiogenic ligands to fenestrated vessel formation can hardly be studied more in-depth.

      However, and this will be my only main concern, no information is provided on the regional diversity of angiogenic receptor expression that may correlate with the regional angiogenic factor code. Without asking for a spatial transcriptomic study, the combination of Vegfr-reporter lines or in situ hybridization with a combination of receptor probes would allow for generating a comprehensive set of ligand/receptor data relative to the regional angiogenic signaling pattern involved in fenestrated vessel formation.

      We appreciate this reviewer’s positive and encouraging comments highlighting both the quality and significance of our study. As we commented in response to the Essential Revisions point #1, we anticipate that a detailed expression analysis of all four Vegf receptors at different developmental stages during CP and CVO vascularization will be best addressed with new technologies combined with optimizations of existing tools/protocols. Thus, we have provided a paragraph of discussion on our perspectives for potential Vegf receptors involved in CP and CVO vascularization in the current study.

      We address each of the points raised by the reviewer below.

      Reviewer #2 (Public Review):

      Building on their previous studies, Parab et al used a larger collection of genetically modified zebrafish lines to map the precise expression domains of different VEGF isoforms in the brain and demonstrated that different combinations of VEGF isoforms differentially control the formation of fenestrated vessels at different locations in the brain.

      The authors used three Wnt signaling mutants to convincingly show that Wnt signaling is essential for parenchymal angiogenesis, but not required for fenestrated vessel development, such as in the choroid plexus, suggesting that fenestrated vessels and barrier vessels are differentially regulated. The previous work from this group has established that VEGF isoforms are critical for myelencephalic choroid plexus development. In this study, they carefully documented the developmental vessel patterning in the diencephalic choroid plexus/pineal gland interface. They also documented the local expression pattern of VEGF isoforms with a set of BAC transgenic fish. Together with the phenotypes of a series of VEGF mutant fish, the data well support that different combinations of VEGF isoforms regulate fenestrated vessel development at different brain locations.

      Given that, across a larger temporal and spatial domain, VEGFs are critical for all forms of vessel development, there are potentially redundant mechanisms that maintain homeostasis of VEGF signaling. In this study, no data are provided to address whether LOF of one form of VEGF affects the expression of other isoforms.

      This work provided detailed evidence that different isoform combinations of VEGF regulate formation/patterning of the fenestrated vessels at the CP, OVLT, and NH in zebrafish. It will be interesting to follow up in the mammalian system to see how well these findings are conserved. For example, which isoform of VEGF is critical for vascular patterning during the developmental stages of the pineal gland? How do VEGF isoforms participate in choroid plexus development at different ventricle regions and in the subsequent maintenance of secretory function? However, these tasks are challenging without a good genetic tool to locally manipulate VEGF isoform expression during mammalian brain vessel development.

      We appreciate this reviewer’s favorable and encouraging comments highlighting both the quality and impact of our study. We also acknowledge the great importance of the points raised by the reviewer, including the Vegf redundancy mechanisms and also our results’ conservation in mammals.

      Reviewer #3 (Public Review):

      Parab et al. investigate the requirement of specific Vegf ligands during the embryonic development of new blood vessels in different brain regions. The authors implement their previously published experimental paradigm (Parab et al 2021 eLife) combined with new transgenic and mutant zebrafish lines to show that vegf ligands (vegfaa, vegfab, vegfc, and vegfd) are required in various combinations to drive angiogenesis in distinct brain regions. Specifically, they show that individual loss of different vegf ligands causes either undetectable or partial effects in angiogenesis, while combined loss of vegf ligands results in severe defects in brain region-specific angiogenesis. As different blood vessel types (i.e. arteries, veins, lymphatics) require specific angiogenic cues, this study provides interesting new data on how the combination of these signals drives brain region-specific vascular development.

      While the conclusions of the paper are generally well supported by the data, the authors overstate some of their findings, particularly with respect to the development of fenestrated capillaries. In this study, the authors use the zebrafish transgenic reporter line, plvap:EGFP, as an indicator of fenestrations. However, the authors do not provide any evidence of fenestrations of the blood vessels of the choroid plexuses or the cranial vessels used for quantification (Figures 1, 3, and 4). While expression of Plvap protein is often used as a marker for non-blood brain barrier endothelial cells, as Plvap is the major component of the diaphragms of fenestrated capillaries, plvap:EGFP expression alone does not indicate fenestrations. This is an important point because previous work has demonstrated that targeted deletion of Plvap does not cause a loss of fenestrations, but instead a loss of the diaphragms associated with fenestrations (Stan et al 2012 Dev Cell; Gordon et al 2019 Development). Similarly, Plvap expression alone does not necessarily indicate fenestrations, as expression of Plvap is not sufficient for fenestration formation. In fact, Plvap is initially expressed in brain endothelial cells during initial angiogenesis to the brain without evidence of fenestrations, and subsequently, Plvap expression disappears during the maturation of the BBB. Thus, to conclude that specific vegf ligands are required for the development of fenestrated capillaries, transmission electron microscopy (TEM) should be used on the capillaries examined in this study or the language describing the results should be modified accordingly. Conversely, the authors did show TEM for the choriocapillaris (Figure 5A-C) but did not show plvap:EGFP expression in these vessels.

      Additionally, the authors' usage of the phrase "development of fenestrated vessels" suggests that the study was examining signals that regulate the formation of fenestrations and not angiogenesis of vessels that may become fenestrated as demonstrated here. Therefore, as Plvap expression does not necessarily equate to fenestrations (and vice-versa), the title and some of the major claims of the study are somewhat overstated.

      We appreciate this reviewer’s constructive comments and suggestions to improve this study. We agree with the reviewer that the descriptions of our findings in the original manuscript were not strictly accurate in some aspects. We have now addressed the concern of the Tg(plvap:EGFP) reporter specificity by conducting additional molecular and functional characterizations of Tg(plvap:EGFP)+ vs Tg(glut1b:mCherry)+ brain vasculature, as we have commented in response to the Essential Revisions point #2. In addition, we have made substantial revisions in describing our findings, including 1) the change of the phrase "development of fenestrated vessels" into a more appropriate phrase and 2) the clarification of the primary focus of this manuscript on “angiogenesis/vascularization”. We believe that our revised manuscript now more clearly conveys the finding of signals involved in angiogenesis/vascularization of CP and CVO vascular beds.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We are very glad that the editor and reviewers found our paper of broad interest to the community of population, evolutionary, and ecological genetics. We thank them for their positive feedback and insightful comments and suggestions. We have revised our manuscript to address some of the issues raised in the reviews. The main change we made was providing a detailed discussion of the limitations of simulated genomes, focusing on considerations one needs to make when selecting a demographic model. This can be found in a new section “Limitations of simulated genomes” (pages 9-10). We made a few additional adjustments in other parts of the text based on the reviewers’ suggestions. They are all listed in the detailed point-by-point response to the reviewers’ comments and questions below.

      Editor:

      1) It was noted that demographic models (or genomic parameters) that are inferred based on certain aspects of the genomic data (e.g., site frequency spectrum, haplotype structure) may not recapitulate other aspects of the data. In other words, any inferred demographic models are expected to reliably reproduce only some aspects of the genetic variation data but not necessarily all. It would be helpful to emphasize this limitation in the manuscript and to include a table summarizing the types of variation that the demographic models for the catalogued species were based on.

      This is a very important point, which we addressed in the revision by adding a section entitled “Limitations of simulated genomes”. This section discusses the considerations that one should make when selecting an inferred demographic model to implement in simulation. This includes the samples used in analysis, the method used for inference, as well as various filters. In this section we also point to the documentation page of the stdpopsim catalog, which provides information about each demographic model that can help users decide whether it is appropriate for their needs. We decided not to summarize this information in a succinct table in the manuscript because it is not straightforward to summarize the strengths and potential limitations of each model in a table. Instead, we will expand the summary provided for each demographic model in the documentation page to provide additional information. See response to the second reviewer’s comment on this topic for more details.

      2) It will make stdpopsim more user-friendly to include an automated module that can visualize a demographic model given the corresponding parameters (or simulation scripts).

      As mentioned in the response to the first reviewer’s comment on this subject, the documentation page of the stdpopsim catalog provides a brief summary for each demographic model, including a graphical representation. See response below for more details.

      Reviewer #1:

      In the introduction, the authors cite numerous efforts to generate high-quality reference genomes. That's not an issue in itself, but leading with this might send the message to some readers that it is these reference genome efforts that are driving the need for population genomics analysis and simulation tools, which is not really the case - why not instead give some citation attention to actual population genomics projects aiming to address the types of evolutionary questions this paper is concerned with? The reference genome citations would fit better in the section dealing with reference genomes, where they already appear.

      Indeed, the desire to answer complex evolutionary questions is the main motivation for sequencing these genomes and also for generating realistic genome simulations. The reason we chose to lead with the genome-sequencing efforts is that high-quality genome data is an important prerequisite for obtaining parameters for chromosome-scale simulations. So, with that perspective, these efforts which we cite are the driving force behind expansion of stdpopsim in the near future. Thus, we decided to leave these citations in the introduction. To balance things out, we now start the introduction with a statement about broad questions in population genetics. Moreover, after we list the genome sequencing efforts, we added a list of specific types of questions that can be addressed by these newly emerging genomes, with relevant citations. The beginning of the introduction now reads:

      “Population genetics allows us to answer questions across scales from deep evolutionary time to ongoing ecological dynamics, and dramatic reductions in sequencing costs enable the generation of unprecedented amounts of genomic data that can be used to address these questions (Ellegren, 2014). Ongoing efforts to systematically sequence life on Earth by initiatives such as the Earth Biogenome (Lewin et al., 2022) and its affiliated project networks, such as Vertebrate Genomes (Rhie et al., 2021), 10,000 Plants (Cheng et al., 2018) and others (Darwin Tree of Life Project Consortium, 2022), are providing the backbone for enormous increases in the amount of population-level genomic data available for model and non-model species. These data are being used, among other things, in inference of population history and demographic parameters (Beichman et al., 2018), studying adaptive introgression (Gower et al., 2021), distinguishing adaptation from drift (e.g. Hsieh et al., 2021), and understanding the implications of deleterious variation in populations of conservation concern (e.g. Robinson et al., 2023).”

      Something that would be useful for the stdpopsim resource in general, though not necessarily something for the paper, would be some kind of more human-friendly representation of the demographic models implemented in the curated library. Perhaps I'm not looking in the right place, but as far as I can tell, if I want to study the curated demographic models, I need to go into the Python scripts on the stdpopsim GitHub page (e.g. https://github.com/popsim-consortium/stdpopsim/tree/main/stdpopsim/catalog/BosTau). Here the various parameters and demographic events are hard-coded into the scripts. To understand the model being implemented, one thus needs to go dig into these scripts - something which is not necessarily very accessible to all researchers. Visual representations, such as the one for Anopheles gambiae in Fig 2. in the paper, are more widely accessible. I wonder if such figures could be produced for all the curated models and included in the GitHub folders alongside the scripts, perhaps aided by an existing model visualization software such as POPdemog. Again, I would not suggest that this is necessary for the paper, but if practically feasible I think it would be a useful addition to the resource in the longer term.

      This is a very good point. The stdpopsim catalog actually has a documentation page that provides a brief summary for each demographic model, including a graphical representation. This graphical representation is generated using demesdraw applied to the demographic model object implemented in the code. Thus, potential users do not have to dig through the Python code to figure out the details of the demographic model. We used a similar approach to generate the image of the demographic history of A. gambiae for Fig. 2 of the paper. The documentation page is an important part of the stdpopsim catalog, and we now added a link to it in section “Data availability”, and we mention it in key places in the manuscript, such as the caption of Fig 2.

      Reviewer #2:

      An important update to the stdpopsim software is the capacity for researchers to annotate coding regions of the genome, permitting distributions of fitness effects and linked selection to be modeled. However, though this novel feature expands the breadth of processes that can be evaluated as well as is applicable to all species within the stdpopsim framework, the authors do not provide significant detail regarding this feature, stating that they will provide more details about it in a forthcoming publication. Compared to this feature, the additions of extra species, finite-site substitution models, and non-crossover recombination are more specialized updates to the software.

      It would be helpful to provide additional information regarding the coding annotation (and associated distribution of fitness effects and linked selection) that is implemented in the current version of stdpopsim, but will be detailed in a forthcoming paper. This is not to take away from the forthcoming paper, but I believe this is the most important update to the software, and the current manuscript only brushes over it.

      We agree that implementation of selection in simulations is a significant addition to stdpopsim. However, our intention in this manuscript is to focus on the separate effort we made in the last two years to expand the utility of stdpopsim to a more diverse set of species. We think the manuscript stands firmly even without discussing in detail the new features that allow modeling selection. The main reason we briefly mention these features in sections “Additions to stdpopsim” and “Basic setup for chromosome-level simulations” is because the released version of stdpopsim contains implemented DFEs for a few species, and we did not want to completely ignore this. We thus added a brief comment at the end of the “Basic setup” section (page 8) mentioning the three model species for which the stdpopsim catalog currently has annotations and implemented DFE models. We think that a more detailed description of how these features and how they should be used is best left to the manuscript that the PopSim community is currently writing (preprint expected later this year).

      When it comes to simulating realistic genomic data, the authors clearly lay out that parameters obtained from the literature must be compatible, such as the same recombination and mutation rates used to infer a demographic history should also be used within stdpopsim if employing that demographic history for simulation. This is a highly important point, which is often overlooked. However, it is also important that readers understand that depending on the method used to estimate the demographic history, different demographic models within stdpopsim may not reproduce certain patterns of genetic variation well. The authors do touch on this a bit, providing the example that a constant size demographic history will be unable to capture variation expected from recent size changes (e.g., excess of low-frequency alleles). However, depending on the data used to estimate a demographic history, certain types of variation may be unreliably modeled (Beichman et al. 2017; G3, 7:3605-3620). For example, if a site frequency spectrum method was used to estimate a demographic history, then the simulations under this model from stdpopsim may not recapitulate the haplotype structure well in the observed species. Similarly, if a method such as PSMC applied to a single diploid genome was used to estimate a demographic history, then the simulations under this model from stdpopsim may not recapitulate the site frequency spectrum well in the observed species. Though the authors indicate that citations are given to each demographic model and model parameter for each species, this may not be sufficient for a novice researcher in this field to understand what forms of genomic variation the models may be capable of reliably producing. A potential worry is that the inclusion of a species within stdpopsim may serve as an endorsement to users regarding the available simulation models (though I understand this is not the case by the authors), and it would be helpful if users and readers were guided on the type of variation the models should be able to reliably reproduce for each species and demographic history available for each species. It would be helpful to include a table with types of observed variation that the current set of 21 species (and associated demographic histories) are likely and unlikely to recapitulate well.

      This is a very important point, which we now address in the section “Limitations of simulated genomes”, which we added to the manuscript. In this section, we expand on this topic and discuss various things that will affect the way simulated genomes reflect true sequence variation. This includes the choice of demographic inference method, but also the analyzed samples, and various filters. The main message of this section is that one should consider various things when deciding to implement a demographic model in simulation (or selecting a model among those implemented in stdpopsim). We also cite studies (including Beichman, et al. 2017), which compared different approaches to demography inference. However, we note that the conclusions of these comparisons are not as straightforward as the reviewer suggests. In particular, methods that make use of the site frequency spectrum (such as dadi) should be able to capture some aspects of haplotype structure, because this information is encoded in the demographic history. Furthermore, a demographic history inferred from a single genome (e.g., using PSMC) should do a reasonable job approximating some aspects of the site frequency spectrum. In other words, the aspects of genetic variation not modeled well by a given demographic inference method are not always predicted in a straightforward way. This is why we avoid summarizing this information in a table in the manuscript. The 2nd paragraph of the “Limitations of simulated genomes” section addresses some of these subtle considerations. In particular, we suggest that considering a demographic model for simulation requires some familiarity with the inference method and the way it was applied to data. Regarding the demographic models currently implemented in stdpopsim, we provide some information about each model in the documentation page of the catalog. When selecting a demographic model from the catalog, users should make use of this documentation to guide their decision. This is mentioned in the 3rd paragraph of the “Limitations of simulated genomes” section. Following-up on this issue, we intend to review the documentation and make sure it provides sufficient information for each demographic model. See this GitHub issue.

      Reviewer #3:

      - p5, 2nd paragraph: I think many Biologists, myself included, will think of horizontal gene transfer mostly as plasmids being transferred among bacteria and adding extra genetic material, not as homologous bacterial recombination. This made me confused about modelling horizontal gene transfer in the same way as gene conversion. It may be helpful for some readers if you specify that you are modelling this particular type of horizontal gene transfer. Some explanation along the lines of what is in Cury et al (2022) would be enough.

      This is a good point. We modified the text in that sentence in the 2nd paragraph on page 5 to clarify that we are modeling non-crossover homologous recombination, and not incorporation of exogenous DNA (e.g., via plasmid transfer). The relevant part of the text now says:

      “In bacteria and archaea, genetic material can be exchanged through horizontal gene transfer, which can add new genetic material (e.g., via the transfer of plasmids) or replace homologous sequences through homologous recombination (Thomas and Nielsen, 2005; Didelot and Maiden, 2010; Gophna and Altman-Price, 2022). However, the initial version of stdpopsim used crossover recombination to stand in for these processes. Although we cannot currently simulate varying gene content (as would be required to simulate the addition of new genetic material by horizontal gene transfer), the msprime and SLiM simulation engines now allow gene conversion, which has the same effect as non-crossover homologous recombination.

      Following (Cury et al., 2022), we use this to include non-crossover homologous recombination in bacterial and archaeal species.”

      - p5, 3rd paragraph: When you say gene conversion is turned off by default, you could refer to table 1 and briefly mention the consequence of ignoring gene conversion.

      We agree that it is important to note that not modeling gene conversion may lead to faulty lengths of shared haplotypes across individuals. This is implied by the statement we make at the beginning of the 3rd paragraph on page 5, where we lay out the motivation for modeling gene conversion in simulation. Following the reviewer’s suggestion, we now added a statement about this at the end of that paragraph:

      “Note that ignoring gene conversion may result in a slightly skewed distribution of shared haplotypes between individuals (see Table 1)”

      -  p7, item 1 and p9, 1st paragraph: I am not sure what you mean by genetic map here, can you define this term? I am not sure if it is synonymous with gene annotations, a recombination map, or something else. The linkage map doesn't seem to make sense to me here.

      The term ‘genetic map’ referred to the recombination map whenever it was used in the manuscript. To avoid any confusion, we now removed all mentions of ‘genetic map’, and use ‘recombination map’ instead. The recombination map is relevant in item 1 of page 7 because in species with poor assemblies you will not be able to reliably estimate recombination maps, making chromosome-scale simulations less effective. In the 1st paragraph of page 9, we discuss the issue of lifting over coordinates from one assembly to another, and if you have a recombination map estimated in one assembly, you might need to lift it over to another assembly to apply it in your simulation.

      -  Table 1, last row, middle column: when you say "simulated population", I think it is a bit ambiguous. You mean "the true population that we are trying to simulate", but could be read as "the population data that was generated by simulation". I would delete the word simulated here.

      What we mean here is that the selected effective population size should reflect the observed genetic diversity in real genomic data. We realize that the previous wording was confusing, and changed this to the following:

      “Set the effective population size (Ne) to a value that reflects the observed genetic diversity”

      -  Figure 2, and other places when you refer to mutation and recombination rate (e.g., p11, last paragraph), can you include the units (e.g. per base pair, per generation)?

      Throughout the manuscript, rates are always specified per base per generation. In Figure 2, this is specified in the caption (3rd line). We added units in other places in section “Examples of added species” on pages 12-13, where they were indeed missing.

      -  p11, "default effective population size": can you use a more descriptive word instead of the default? Maybe the historical average? Also, what is this value used for in the simulations when there is a demographic model specified (as in the case of Anopheles)?

      We think that “default effective population size” is the most appropriate term to use here, since we are referring to the parameter in the species model in stdpopsim. It is correct that the value of this parameter should reflect the historical average size in some sense, but it is really unclear what this should be in the case of a species like Bos taurus, which experienced a very dramatic bottleneck in the recent past. We address this subtle, yet important, issue in the sentence preceding this one. If a demographic model is specified in simulation, it overrides the default effective population size, and its value is ignored (which is why we refer to it as ‘default’). We added a short sentence clarifying this in the 2nd paragraph of the “Bos Taurus” section (now page 12).

      “Note that the default Ne is only used in simulation if a demographic model is not specified.”

      -  p8, when you say "Such simulations are useful for a number of purposes, but they cannot be used to model the influence of natural selection on patterns of genetic variation.": You may want to bring up the discussion that many of these neutral parameters taken from the literature could have been estimated assuming genome-wide neutrality, and thus ignoring the effect of background selection. Therefore the parameter values might reflect some effect of background selection that was unaccounted for during their estimation.

      This is an important subtle point, which we now address in the section “Limitations of simulated genomes”, which we added to the revised manuscript. In that section, we discuss various limitations of simulations, focusing on inferred demographic models. We address the potential influence of the segments selected for analysis toward the end of 2nd paragraph in that section (page 9):

      “... all methods assume that the input sequences are neutrally evolving. This implies that technical choices, such as the specific genomic segments analyzed and various filters, may also influence the inferred model and its ability to model observed genetic variation.”

      Interestingly, background selection in itself typically does not have a strong effect on the inferred model. This is something that is examined in the forthcoming publication that presents simulations with natural selection in stdpopsim.

      -  Why are some concepts written in bold (e.g., effective population size, demographic model)? Were you planning to make a vocabulary box? I think this is a good idea given that you are aiming for a public that can include people who are not very familiar with some population genetics concepts.

      In the “Examples of added species” section, we use boldface fonts to highlight the model parameters that were determined for each species. We added a statement clarifying this in the beginning of this section (page 11), and made sure that all the relevant parameters were consistently highlighted throughout this section. In other sections, we use boldface fonts only for titles. A few cases that did not conform to this rule were removed in the current version. We did not intend to add a vocabulary box, but considered this when revising the manuscript, due to the reviewer’s suggestion. However, we found it difficult to converge on a small (yet comprehensive) set of terms with accurate and succinct definitions. We think that the important terms are adequately defined within the text of the manuscript, providing sufficient information also for readers who are not expert population geneticists.

      - p4, 2nd paragraph: Are these automated scripts that are used to compare models publicly available? If you are suggesting that people use this approach generally when coming up with a simulation model (p8, penultimate paragraph), it would be helpful to have access to these automated scripts.

      The scripts are part of the public stdpopsim repository on GitHub, and may be used by anyone. Some components of these scripts are easier to apply in general, such as comparing a demographic model with one implemented separately by the reviewer. This step, for example, is achieved by application of the Demography.is_equivalent method in msprime. Other parts of the comparison depend on the specific structure of Python objects used by stdpopsim, so they are not likely to be useful when implementing simulations outside the framework of stdpopsim.

      -  p9, 1st paragraph, and p.12 2nd paragraph: instead of adjusting the mutation rate to fit the demographic model (and using an old estimate of the mutation rate), would it be ok to adjust the demographic model to fit the new mutation rate? E.g. with a new mutation rate that is the double of a previous estimate, would it be ok to just divide Ne by 2 such that Ne*mu is constant (in a constant population size model)? I imagine this could get complicated with population size changes.

      In principle, this could be done if you were simulating neutrally evolving sequences (without modeling natural selection). Since the coalescent is scale-free, you can scale down all population sizes and divergence times by a multiplicative factor, and scale up migration rates and the mutation rate by the same factor, and you get the exact same distribution over the output sequences. However, getting the scaling right is tricky and error-prone, especially since you have to do this every time the mutation rate of a species is updated. Moreover, once you start modeling natural selection, this scale-free property no longer holds. Thus, the simple solution we came up with in stdpopsim is to attach to each demographic model the mutation rate used in its inference.
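
      As a minimal sketch of the rescaling argument above, the following Python snippet checks that dividing sizes and times by a factor, while multiplying rates by the same factor, leaves the compound parameters of a neutral model unchanged. The class `NeutralModel` and the functions `rescale`, `compound_parameters`, and `same_compound_parameters` are hypothetical names introduced only for illustration; they are not part of stdpopsim or msprime, and the parameter values are arbitrary. The sketch assumes a neutral model with diploid population sizes; the recombination rate, which is omitted here, would be treated like the other per-generation rates.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutralModel:
    """Hypothetical container for a few parameters of a neutral model."""
    pop_size: float        # diploid effective population size (Ne)
    split_time: float      # time of a demographic event, in generations
    mutation_rate: float   # per base per generation
    migration_rate: float  # per generation


def rescale(model, factor):
    """Divide sizes and times by `factor`; multiply rates by `factor`."""
    return NeutralModel(
        pop_size=model.pop_size / factor,
        split_time=model.split_time / factor,
        mutation_rate=model.mutation_rate * factor,
        migration_rate=model.migration_rate * factor,
    )


def compound_parameters(model):
    """Scaled quantities that are unchanged by the rescaling."""
    return (
        4 * model.pop_size * model.mutation_rate,   # theta = 4 * Ne * mu
        model.split_time / (2 * model.pop_size),    # time in units of 2 * Ne
        4 * model.pop_size * model.migration_rate,  # M = 4 * Ne * m
    )


def same_compound_parameters(a, b):
    """True if two models agree on all compound parameters."""
    pairs = zip(compound_parameters(a), compound_parameters(b))
    return all(math.isclose(x, y) for x, y in pairs)


# Doubling the mutation rate while halving Ne and the event time.
original = NeutralModel(pop_size=10_000, split_time=2_000,
                        mutation_rate=1e-8, migration_rate=1e-4)
rescaled = rescale(original, 2)
print(same_compound_parameters(original, rescaled))  # prints True
```

      Halving only the population size, without also rescaling the event time and the migration rate, changes the compound parameters, which illustrates why the manual adjustment is easy to get wrong.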

    1. Author Response:

      The following is the authors' response to the original reviews.

      We sincerely thank all the editors and reviewers for taking the time to evaluate this study. Here is our point-by-point response to the reviewers’ comments and concerns.

      Reviewer #1 (Public Review): The study by Oikawa and colleagues demonstrates for the first time that a descending inhibitory pathway for nociception exists in non-mammalian organisms, such as Drosophila. This descending inhibitory pathway is mediated by a Drosophila neuropeptide called Drosulfakinin (DSK), which is homologous to mammalian cholecystokinin (CCK). The study creates and uses several Drosophila mutants to convincingly show that DSK negatively regulates nociception. They then use several sophisticated transgenic manipulations to demonstrate that a descending inhibitory pathway for nociception exists in Drosophila.

      […]

      Weaknesses:

      A minor weakness in the study is that it is unclear how DSK negatively regulates nociception. An earlier study at the Drosophila nmj shows that loss of DSK signaling impairs neurotransmission and synaptic growth. In the current study, loss of CCKLR-17D1 in Goro neurons seems to increase intracellular calcium levels in the presence of noxious heat. An interesting future study would be the examination of the underlying mechanisms for this increase in intracellular calcium.

      We thank the reviewer for the kind and very positive evaluation of our manuscript. We agree that this study has not elucidated the intracellular molecular pathway(s) downstream of CCKLR-17D1 that are involved in the regulation of the activity of Goro neurons, and we think that it would definitely be an interesting topic for future research.       

      Reviewer #1 (Recommendations For The Authors):

      The response latencies for the control yw larvae seem large, with many larvae appearing to be insensitive to the thermal stimulus. Is this just an effect of the yw genetic background? A brief discussion of this might be helpful.

      We thank the reviewer for pointing this out. We have also noticed that the yw control larvae tend to show longer response latencies than the other control strains, and in the revised manuscript, we have added the following sentence in the Results section (Lines 91–94):

      “We have noticed that the yw control strain, which was used by us to generate the dsk and receptor deletion mutants, showed relatively longer response latencies to the 42 °C probe compared to the other control strains used in this study. This may be attributed to the effect of the genetic background, although, presently, the cause for this difference is unknown.”

      Reviewer #2 (Public Review):

      This is an exceptional study that provides conclusive evidence for the existence of a descending pathway from the brain that inhibits nociceptive behavioral outputs in larvae of Drosophila melanogaster. […] The study raises many interesting questions for future study such as what behavioral contexts might depend on this pathway. Using the CAMPARI approach, the authors do not find that the DSK neurons are activated in response to nociceptive input but instead suggest that these cells may be tonically active in gating nociception. Future studies may find contexts in which the output of the DSK neurons is inhibited to facilitate nociception, or contexts in which the cells are more active to inhibit nociception.

      Reviewer #2 (Recommendations For The Authors): I have no recommendations for the authors as this is a very complete and thoroughly executed study. The writing is crystal clear.

      We thank the reviewer for the kind and very positive evaluation of our manuscript. We are happy to know that our current manuscript was deemed to be clear and convincing by the reviewer.

      Reviewer #3 (Public Review): […] Overall the authors use clean logic to establish a role for DSK and its receptor in regulating nociception. I have made a few suggestions that I believe would strengthen the manuscript as this is an important discovery.

      Major comments:

      1. It's not completely clear why the authors are staining animals with an FLRFa antibody. Can the authors stain WT and DSK KO animals with a DSK antibody? Also, can the authors show in supplemental what antigen the FLRFa antibody was raised against, and what part of that peptide sequence is retained in the DSK sequence? This overall seems like a weakness in the study that could be improved on in some way by using DSK-specific tools.

      We thank the reviewer for this query. We would like to clarify that we first tried the FLRFa antibody to visualize an RFamide-type neuropeptide other than DSK in Drosophila and found that the staining pattern is quite similar to that of anti-DSK, as shown by Nichols et al. [1]. According to the original paper describing the anti-FLRFa antisera [2] (already cited in the reviewed manuscript), the antigen used to raise it was the Phe-Met-Arg-Phe-NH2 peptide conjugated with succinylated thyroglobulin, and the study experimentally shows that the antibody binds well to peptides containing the Met-Arg-Phe-NH2 or Leu-Arg-Phe-NH2 sequence and has 100% cross-reactivity to FLRFa. As DSK contains the Met-Arg-Phe-NH2 sequence [3], the cross-reaction of this antibody to DSK is consistent with the description of the original study.

      Although we were unable to use an antibody specific to DSK, our staining data with dsk deletion mutants and the expression pattern of DSK-2A-GAL4 corroborate each other (Figure 2 and Figure 2-figure supplement 1), which we believe provides compelling evidence for the specific expression of DSK in MP1 and Sv neurons, and indicates that DSK-2A-GAL4 is a reasonably effective tool to specifically manipulate DSK-expressing neurons.

      2. What is the phenotype of DSK-Gal4 x UAS-TET animals? They should be hyper-reactive. If it's lethal maybe try an inducible approach.

      We thank the reviewer for this question. Unfortunately, we have not attempted this experiment, although we agree that this would be a nice addition to further strengthen the study if TET worked well in the DSKergic neurons.

      3. Figure 9. This was not totally clear, but I think the authors were evaluating spontaneous (i.e. TRPA1-driven) rolling at 35C. The critical question is "does activating DSK-expressing neurons suppress acute heat nociception" and this hasn't really been addressed. The inclusion of PPK Gal4 + DSK Gal4 in the same animal kind of clouds the overall conclusions the reader can draw. The essential experiment is to express UAS-dTRPA1 in DSK-Gal4 or GORO-Gal4 cells, heat the animals to ~29C, and then test latency to a thermal heat probe (over a range of sub and noxious temperatures). Basically prove the model in Figure 10 showing ectopic activation or inhibition for each major step, then test heat probe responses.

      We thank the reviewer for suggesting ideas for alternative experiments to potentially strengthen our conclusion. Regarding experiments using heat probes, previous studies have demonstrated that (i) Blocking ppk1.9-GAL4-positive C4da neurons almost completely abolishes the larval nociceptive response to local heat stimulations [4]; (ii) Local heat stimuli above 39 °C readily activate C4da neurons and larval nociceptive rolling [5-9]; and (iii) Thermogenetically or optogenetically activating these neurons is sufficient to trigger Goro neurons and larval rolling [4, 10-12]. Thus, it has now been made clear that heat probes induce larval nociceptive rolling via excitation of the C4da pathway, and we believe that our experiments using thermogenetic activation of C4da neurons can be safely interpreted as an alternative to experiments using heat probes. Using heat probes demands a more complicated experimental set-up to be combined with CaMPARI imaging experiments, and this is another reason why we preferred to take the thermogenetic approach.

      We have also considered the experiment using Goro-GAL4 instead of ppk-GAL4. However, if dTRPA1 artificially activates Goro neurons far downstream of the neuronal mechanism by which MP1 activation suppresses Goro neuron activity, the effect of MP1 activation may be bypassed and masked. As we currently do not know the epistasis between dTRPA1 function and the effect of MP1 activation in modulating the activity of Goro neurons, we instead chose to activate C4da neurons by using ppk-GAL4, which likely resulted in more natural activation of Goro neurons than dTRPA1-triggered direct activations.

      4. It would also then be interesting to see how strong the descending inhibition circuit is in the context of UV burn. If this is a real descending circuit, it should presumably be able to override sensitization after injury.

      Indeed, this is an interesting avenue to explore in future studies to understand the type of situation in which the DSKergic descending system functions to control nociception.

      Reviewer #3 (Recommendations For The Authors): Overall this is a good story and the claims are generally supported with experimental evidence. The way to really improve this study would be to use more precise and definitive tools, like specific antibodies, specifically targeted genes, and better temporal control of the descending circuit to prove this is inducible sufficient to suppress acute thermal nociception and this occurs only via a descending pathway, etc. However this would be exponentially more work, and so the authors I guess need to weigh the cost-benefit of definitive proof vs. strong evidence for their claims. Overall I think this study will be the beginning of a new line of inquiry in the field that has the potential to guide our understanding also of mammalian descending pathways, and as such, this study is of value to the community.

      We appreciate the reviewer’s multiple interesting ideas for experiments that could have been performed to further reinforce our findings. We agree that some experiments that the reviewer suggested would potentially strengthen this work if supplemented. However, as aforementioned, in our humble opinion, we think that the experiments that the reviewer suggested are either outside the scope of this paper or have no significant benefits over the experiments that were already conducted, and hence are not essential to the present study.

      References

      1. Nichols, R. and I.A. Lim, Spatial and temporal immunocytochemical analysis of drosulfakinin (Dsk) gene products in the Drosophila melanogaster central nervous system. Cell Tissue Res, 1996. 283(1): p. 107-16.

      2. Marder, E., et al., Distribution and partial characterization of FMRFamide-like peptides in the stomatogastric nervous systems of the rock crab, Cancer borealis, and the spiny lobster, Panulirus interruptus. J Comp Neurol, 1987. 259(1): p. 150-63.

      3. Nassel, D.R. and M.J. Williams, Cholecystokinin-like peptide (DSK) in Drosophila, not only for satiety signaling. Front Endocrinol, 2014. 5.

      4. Hwang, R.Y., et al., Nociceptive neurons protect Drosophila larvae from parasitoid wasps. Curr Biol, 2007. 17(24): p. 2105-2116.

      5. Tracey, W.D., Jr., et al., painless, a Drosophila gene essential for nociception. Cell, 2003. 113(2): p. 261-73.

      6. Xiang, Y., et al., Light-avoidance-mediating photoreceptors tile the Drosophila larval body wall. Nature, 2010. 468(7326): p. 921-6.

      7. Burgos, A., et al., Nociceptive interneurons control modular motor pathways to promote escape behavior in Drosophila. eLife, 2018. 7.

      8. Honjo, K. and W.D. Tracey, Jr., BMP signaling downstream of the Highwire E3 ligase sensitizes nociceptors. PLoS Genet, 2018. 14(7): p. e1007464.

      9. Im, S.H., et al., Tachykinin acts upstream of autocrine Hedgehog signaling during nociceptive sensitization in Drosophila. eLife, 2015. 4: p. e10735.

      10. Ohyama, T., et al., A multilevel multimodal circuit enhances action selection in Drosophila. Nature, 2015. 520(7549): p. 633-9.

      11. Honjo, K., R.Y. Hwang, and W.D. Tracey, Jr., Optogenetic manipulation of neural circuits and behavior in Drosophila larvae. Nat Protoc, 2012. 7(8): p. 1470-8.

      12. Zhong, L., et al., Thermosensory and non-thermosensory isoforms of Drosophila melanogaster TRPA1 reveal heat sensor domains of a thermoTRP channel. Cell Rep, 2012. 1(1): p. 43-55.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We’d like to thank the three reviewers for reviewing in depth our work and providing insightful comments and suggestions.

      Reviewer #1 (Recommendations For The Authors):

      1) The evidence that MS023 is actually working in vivo in their last experiment (Fig 6) needs to be strengthened. This could be due to the timing of the experiment. Tail tips were collected 48 h after the final injection and analyzed by Western for ADMA and SDMA levels. They do see subtle changes, in the right directions, of SDMA and ADMA (but these changes are really not very obvious). Perhaps the inhibitor has already been largely metabolized two days after injection. Have they looked at MMA levels?

      We have quantified the ADMA and SDMA levels of Fig. S6. We have not measured MMA levels. The text has been edited as follows:

      “The average ADMA relative expression was 0.95 for vehicle treated mice and 0.83 for MS023 treated mice (p < 0.00041). The average SDMA relative expression was 0.92 for vehicle treated mice and 0.94 for MS023 treated mice (p < 0.17). These whole-body measurements as measured by tail biopsies show MS023 promotes the decrease of proteins with ADMA and a slight increase in proteins with SDMA. It is known that inhibition of type I PRMTs or PRMT1 deletion diminishes ADMA and increases SDMA due to substrate scavenging (Dhar et al, 2013).”

      2) The authors need to explain why they would expect an increase in SDMA levels in these mice after MS023-treatment. 

      We have edited the text as follows:

      “It is known that inhibition of type I PRMTs or PRMT1 deletion diminishes ADMA and increases SDMA due to substrate scavenging (Dhar et al, 2013).”

      3) In the discussion, it would be valuable to address the types of CRISPR-screens that could be performed in these MS023-expanded MSCs. They mention this as a benefit in the introduction, but to expand on this idea in the discussion.

      The idea here was not necessarily to perform a CRISPR screen on the MS023-treated cells (although it is an interesting idea), but rather to correct the genetic mutation by CRISPR-Cas9 to enhance the success of genetically corrected autologous cell transplantation. The addition of MS023 to MuSCs in vitro would allow the cells to be expanded while maintaining their self-renewal potential, thereby providing the opportunity to correct the mutation in the dystrophin gene using technologies such as CRISPR prime editing (Mbakam et al., 2022 Mol Ther Nucleic Acids 30:272-285). Our results demonstrating that MS023 enhances cell engraftment suggest that this method could be used to improve autologous cell transplantation efficiency. We have edited the text in the discussion as follows:

      “Our findings suggest that type I PRMT inhibitors may have therapeutic potential for treating certain skeletal muscle diseases. For instance, to improve the efficacy of autologous cell therapy, the dystrophin-deficient MuSCs collected from DMD patient and corrected by CRISPR prime editing (Happi Mbakam et al, 2022) could be treated with MS023 to maintain their stemness and enhance their cell engraftment capacity.”

      4) Also, could they address the potential value of MSC culture and expansion using a combination of SETD7 inhibition and PRMT1 inhibition?

      Agreed. We have edited the text as follows:

      “These findings suggest that inhibiting methyltransferases can affect MuSC fate and perhaps a combination of Setd7 and MS023 inhibitors would provide a more favorable combination to promote the expansion of MuSCs while maintaining their stem cell-like properties.”

      Reviewer #2 (Recommendations For The Authors): 

      In figure 2 the authors show that upon removal of MS023, the cells differentiate more efficiently. In figure 5E-F they show that the mice that received MS023-treated cells had more GFP mature muscle fibers. However, in figure 5C-D these cells have the same capacities to terminally differentiate. This reviewer was wondering if these cells would differentiate faster? Have the authors looked into this?

      The reviewer raises an interesting point. Our in vitro experiments shown in Supplemental Figure S1 indicate that MS023-treated cells are more actively cycling (more Ki67+ cells) and are less committed to differentiation (fewer Pax7-MyoD+ cells), which would suggest that they would need to exit the cell cycle and differentiate faster to reach the same fusion capacity after 3 days of differentiation without MS023. Future experiments with a time course including earlier time points will be needed to confirm whether these cells differentiate faster.

      Reviewer #3 (Recommendations For The Authors): 

      1) MS023 is a non-selective inhibitor of type I PRMTs. It has comparable IC50 values for PRMT1 and PRMT4 (CARM1), and lower IC50 values for PRMT6 and PRMT8. The authors argue that the cellular phenotype caused by MS023 is solely mediated via PRMT1, since the specific PRMT4-inhibitor TP-064 has no effects on MuSC expansion. TP-064 treatment was not used as a control for the transplantation and muscle strength measurement experiments. Are PRMT6 and PRMT8 expressed in MuSC and are they inhibited by the applied concentrations of MS023? Kawabe et al reported that CARM1 methylates Pax7, thereby inducing Myf5 transcription during the asymmetric division of MuSC (PMID: 22863532). Is the expression of Myf5 reduced upon MS023 treatment? scRNAseq of MuSC 4-day after culture is too late to address this question, since the majority of the cells are already committed to differentiation. Staining for Myf5 using ex vivo cultured fibers or regenerating muscles in vivo should be used.

      Indeed, we mention throughout the text that MS023 is a type I PRMT inhibitor. We have edited the text as follows to suggest that the effects are most likely mediated by inhibition of PRMT1 in vivo.

      “Treatment of MuSCs with MS023 resulted in metabolic reprogramming of MuSCs, supporting a role for type I PRMTs as metabolic regulators. In vitro, MS023 has been shown to inhibit several type I enzymes at nM concentrations (Eram et al., 2016). It is well-documented that PRMT1 is the major cellular type I enzyme (Pawlak et al, 2000) and this is why PRMT1, but not the other type I PRMTs are embryonic lethal in mice (Guccione & Richard, 2019). The numerous published data in cellulo with MS023 are thus far only reproduced by PRMT1-deficiency by siRNA or knockout, suggesting that MS023 actions in vivo are predominantly mediated by inhibiting PRMT1 (Gao et al, 2019; Plotnikov et al, 2020; Wu et al, 2022; Zhu et al, 2019). Thus, the effects of MS023 on MuSCs are most likely mediated by inhibition of PRMT1.”

      Moreover, we investigated the expression of other type I PRMTs as suggested by the reviewer. We examined their expression in a publicly available single cell RNAseq dataset (Oprescu SN et al, iScience 2020, 23:100993), which analyzed skeletal muscle at different time points post-cardiotoxin injury (uninjured, and 0.5, 2, 3.5, 5, 10, 21 days post-injury). The findings show that Prmt1 is by far the most expressed type I PRMT in MuSCs at every time point tested. Carm1 (Prmt4) is expressed at a high level in a small/moderate subset of cells, especially during regeneration. Prmt6 is expressed at a low level in a small proportion of cells, while Prmt8 expression was not detected. These findings are consistent with our observation that Prmt1 is the predominant type I Prmt in MuSCs, which further supports our hypothesis that it is the main target of MS023. These findings were added in Suppl. Fig 1B.

      The expression of Myf5 during asymmetric division is indeed well characterized on muscle fiber-associated MuSCs (Dumont et al., 2015 Nat Med 21:1455; Kawabe et al., 2012 Cell Stem Cell 11:333). As the reviewer states, the 4-day time point is too late to investigate Myf5 expression. Additionally, these cells were cultured ex-vivo and were not fiber-associated. Therefore, scRNAseq is not an ideal method to address the question of whether MS023 treatment modulates Myf5 expression, and further experiments would be required to examine Myf5 in an appropriate context (i.e. on ex-vivo cultured myofibers).

      2) Figure 2 is not very informative, while the second paragraph of the result parts is excessive and too complicated. The extensive description of differential gene expression in each potential subpopulation is neither very informative nor helpful to convince the reader that the M3/M5 population has acquired more stemness-like features due to the MS023 treatment. From my point of view, the data just reflect the increased proliferative capacity of MS023-treated cells with elevated cell cycle markers, ribosomal protein, and metabolic state. Do the M1-M5 populations show any different distribution along the trajectory? The authors need to show cell trajectories for each sample and cluster in Figure S3A. It is also imperative to present the distribution of signature genes for each individual cluster. Essentially, M1-M5 all located together in one cloud. What justifies segregation into different subclusters? The color code for the different clusters (including the trajectories) to allow better distinction. 

      MS023 treated MuSCs contain a subpopulation with higher Pax7 expression (Supplementary Figure S2F, S2G), which is consistent with the IF results in Figure 1 and emphasized in the abstract. Why are these data in the supplements and not in a main figure (e.g. in figure 2)?

      We appreciate the thoughtful and detailed comments on our single-cell data. Please see below for a response to each point:

      To address the concern that the results section is excessive, our intention was to simply provide the reader with a descriptive overview of the identity of each subcluster that the software identified. In fact, to ensure clarity and conciseness, we elected to provide only the names of a select few cluster markers rather than list all of the significant cluster markers that were generated. We kindly refer the reviewer to Supplementary Table S1 for a more extensive list of markers.

      In response to the reviewer’s comment: “The color code for the different clusters (including the trajectories) to allow better distinction,” we agree that colour-coding is helpful, please refer to Figure 2A for a colour-coded map of the clusters.

      To address the reviewer’s question regarding what justifies segregation into different subclusters for M1-M5, refer to Supplementary Table S1 for a list of uniquely enriched markers for each cluster. This list was filtered to include marker genes that were present only in a given cluster, thus contributing to its uniqueness and explains why that cluster was identified as being distinct from another given cluster.

      Lastly, since the elevated Pax7 levels in MS023-treated MuSCs were already presented and discussed thoroughly in Figure 1, we elected to avoid repetition in the main Figures and presented the ridge plots showing elevated Pax7 in the Supplementary Material for Figure 2.

      3) The same group has reported previously that PRMT1-deficient MSCs show reduced expression of MyoD due to disruption of Eya1/Six1 recruitment to the MyoD promoter (PMID: 27849571). However, the scRNAseq result does not reflect this finding. MyoD levels are not significantly changed in d4 MS023 compared with d4 (Supplementary figure S2G). The authors need to provide an explanation. Furthermore, the authors previously described that "the majority of PRMT1-deficient MSCs repressed Pax7 expression at day 3 while being Ki67 positive (Fig. 5B)". How does that fit to the current observations, which indicate an increase of Pax7+ cells after MS023 treatment? This discrepancy needs to be resolved.

      While the scRNAseq does not show a reduction in overall MyoD expression in MS023-treated MuSCs, there is indeed a reduction in the proportion of MyoD+ myofiber-associated MuSCs (Figure 1C, 1D). Supplemental Figure S2G further shows a subpopulation in the d4MS023 group with lower MyoD expression that was not present in the d4 group, reflective of the findings in Figures 1C and 1D. Therefore, although the average expression was not shifted significantly with MS023, there was indeed a subpopulation of MuSCs with lower MyoD expression.

      The reviewer additionally points out that Fig. 5B from a previous study (Blanc et al., 2017 MCB 37:e00457) performed by our group, shows that Pax7 expression was repressed at day 3 of culture in PRMT1-null MuSCs. However, this quantification was based on immunofluorescence staining where cells are marked positive or negative for Pax7 expression and does not look at the intensity of Pax7 expression levels. In our current study, we examine the expression levels of Pax7 in discrete subpopulations of MuSCs and found that there is a subpopulation of MuSCs that emerges with MS023 treatment that has higher Pax7 expression than untreated counterparts. Therefore, the results of the two experiments are not directly comparable. 

      4) I do have a major problem with the interpretation of the metabolic changes in MS023-treated MuSC. In the abstract, the authors wrote, "These findings suggest that type I PRMT inhibition metabolically reprograms MuSCs resulting in improved self-renewal and muscle regeneration fitness." There is simply no causal evidence to support this claim, which is solely based on a correlation. If the authors want to maintain this claim they either need to stimulate OXPHOS and glycolysis by other means to see whether such a manipulation recapitulates the effects of MS023 or attenuate OXPHOS and glycolysis to see whether this abrogates the effects of MS023. To prove whether increased OXPHOS is a cause for improved self-renewal, the authors might simply co-treat MuSC with MS023 and an OXPHOS inhibitor and analyze consequences for the Pax7+/MyoD- population.

      We thank the reviewer for the excellent suggestions of experiments that would solidify a causal relationship between increased metabolism and increased self-renewal. We will certainly consider them for future studies. We agree that the relationship in the present study is correlative, and the text has been modified in the abstract as follows:

      “Single cell RNA sequencing (scRNAseq) of ex vivo cultured MuSCs revealed the emergence of subpopulations in MS023-treated cells which are defined by elevated Pax7 expression and markers of MuSC quiescence, both features of enhanced self-renewal. Furthermore, the scRNAseq identified MS023-specific subpopulations to be metabolically altered with upregulated glycolysis and oxidative phosphorylation (OxPhos). Transplantation of MuSCs treated with MS023 had a better ability to repopulate the MuSC niche and contributed efficiently to muscle regeneration following injury. Interestingly, the preclinical mouse model of Duchenne muscular dystrophy had increased bilateral grip strength 10 days after a single intraperitoneal dose of MS023. Our  findings show that inhibition of type I PRMTs increased the proliferation capabilities of MuSCs with altered cellular metabolism, while maintaining their stem-like properties such as self-renewal and engraftment potential.”

      5) Ryall et al reported that MuSCs undergo a metabolic switch from fatty acid oxidation to glycolysis with reduced intracellular NAD+ levels and reduced activity of SIRT1, leading to elevated H4K16 acetylation. Here, both OXPHOS and glycolysis are increased after treatment of MuSC with MS023. Are the NAD+ and H4K16ac levels changed in MS023-treated MuSC? 

      This is another excellent suggestion for an experiment that would help to support a causal relationship between MS023 treatment and increased OXPHOS and glycolysis, and it could certainly be addressed in future studies.

      6) In Ryall et al.'s results, there was no difference in the basal mitochondrial OCR between freshly isolated MuSCs and cultured MuSCs. Importantly, stimulation of OXPHOS will increase ROS concentration, resulting in premature differentiation of MuSC (PMID: 30106373). Furthermore, increased ROS levels will most likely enhance DNA damage rather than improve self-renewal. The authors have to address these issues and also monitor ROS and DNA damage levels. 

      The lack of cell death upon treatment with MS023 in the present study would indicate that there is no major ROS-induced DNA damage occurring. Additionally, the propensity of MS023-treated MuSCs to retain their stemness while in long-term culture (Supplemental figure S1E) would indicate that in this context, premature differentiation is not a concern.

      7) The authors used FACS-analysis of MuSCs three weeks after transplantation to demonstrate that MS023 treatment enables better engraftment into the MuSC niche. The six-fold increase of transplanted cells in the MuSC niche is difficult to understand, Why shall transplanted cells compete so efficiently with endogenous MuSC for repopulation of the niche? Is it possible that some of the transplanted MuSC are still lingering within the interstitium and erroneously counted as bona fide MuSC? The authors have to determine the localization of transplanted MuSC. Are all transplanted cells indeed situated in the proper niche or are they also present outside the basal lamina of muscle fibers? 

      The hindlimbs which received the engraftment were irradiated 24h prior to engraftment, therefore the ability of endogenous MuSCs to compete is compromised. Additionally, Figure 5E shows that the regenerated muscle indeed has GFP negative fibers that would have been generated from endogenous MuSCs, indicating that MS023-treated MuSCs did not fully outcompete endogenous MuSCs.

      8) The authors reported that an only 3-day treatment with MS023 is sufficient to dramatically improve muscle function in mdx mice even 30 days later, which is hard to swallow. What is the evidence that such strong effects are primarily mediated by stimulation of MuSC expansion? Are there other pathways or cells that respond to MS023 treatment and stimulate muscle strength? To support the claim of a 'better' stem cell function as the major cause for MS023-dependent stimulation of muscle strength in mdx mice, the authors need to determine the total number of Pax7+ cells, Pax7+/Ki67+, Pax7+/MyoD+, Pax7+/MyoD-, Pax7-/MyoD+ and myonuclei. It is also absolutely mandatory to include wildtype controls in the muscle strength measurements. Does MS023 treatment also increase muscle strength in wild-type controls? 

      Agreed. We cannot determine whether the effect is mediated by an expansion of the MuSC pool or by an effect on other cell types, such as a direct impact on the myofibers. The manuscript has been modified to include the following text:

      “Furthermore, our findings show that injection of MS023 in the dystrophic mouse model mdx led to enhanced muscle strength with effects lasting up to 30 days.  We cannot exclude if the effect of MS023 was mediated by an expansion of the MuSC pool or by an effect on other cell types, such as a direct impact on the myofibers. The goal of this experiment was to provide a therapeutic perspective for the possible use of type I PRMT inhibitor for the treatment of DMD.”

      The goal of this figure was to provide a therapeutic perspective for the use of type I PRMT inhibitors for the treatment of DMD. Muscle wasting/weakness in DMD is a complex and multifactorial process (e.g., myofiber fragility, MuSC defects, chronic inflammation, fibrofatty accumulation). If MS023 can target multiple aspects of the physiopathology of the disease, it would increase its therapeutic applicability. Further studies will be needed to determine the exact mechanism by which MS023 mediates its beneficial effect. These future studies could include the use of wild-type controls, as the reviewer suggests, to investigate the role of MS023 in a non-muscle degenerative context.

      9) Ideally, a genetic inactivation-reactivation of PRMT1 should be done to validate the results with MS023 and to make sure that indeed the transient inhibition of PRMT1 is responsible for the beneficial effects of MS023. Of course, this would be a major effort when done in genetically manipulated mice and therefore is not adequate to ask for. However, it should be possible to use PRMT1-deficient MuSC, which the authors have in hand, and re-express PRMT1 in these cells with an AAV or a lentivirus. 

      We agree that genetic ablation of PRMT1 is a key experiment to validate MS023 results. Please refer to previous work from our group, which shows that PRMT1-KO MuSCs have an enhanced self-renewal phenotype (Blanc et al., 2017 MCB 37:e00457), similar to what was observed in the present study with MS023 treatment.

      10) Some claims are overstated and/or too aggressive. E.g.: "Therefore, through repression of type I PRMTs with MS023, we have reprogramed MuSCs to acquire a unique and previously uncharacterized identity." I do not see clear evidence that MS023 treatment 'reprograms' MuSC to a 'unique identity'. The observed changes are in large parts compatible with a simple stimulation of proliferation.

      The unique finding in our data is that treatment with MS023 resulted in a shift in identity as compared to the DMSO-treated proliferating MuSCs (M1, M2 and M4), creating transcriptionally distinct M3 and M5 clusters. M3 and M5 had elevated markers for metabolism (e.g., Eno1, Atp5k) and early activation (e.g., Fos, Jun), while the untreated MuSCs in clusters M1, M2 and M4 did not. Furthermore, M3 and M5 had higher baseline levels of Pax7 expression when compared to untreated cells. Together, these findings describe a transitional subpopulation of MuSCs unique to MS023 treatment which not only harbours the stem-like/early activation markers Pax7, Fos and Jun, but also shows elevated proliferative markers related to cell cycle and energy metabolism. This particular combination of characteristics is unique to the MS023-treated MuSCs, thus identifying a unique subtype of MuSC identity. In accordance with our scRNAseq data, we validated experimentally that MS023-treated cells have higher energy metabolism and increased self-renewal potential, thereby confirming that the unique transcriptomic signature of these cells also leads to a different cell fate decision.

    1. Author Response:

      The following is the authors' response to the original reviews.

      1) l. 80: "evolved from a fourth domain of cellular life": I am worried a little bit about putting together what I believe are two distinct hypotheses: (i) NCLDV deriving from a complex (ancestral) cellular life form (possibly proto-eukaryotic) by reductive evolution, and (ii) NCLDV forming or deriving from a fourth domain of cellular life. To clarify for non-expert readers, I would suggest rephrasing as "evolved by reductive evolution, possibly from a fourth domain of cellular life...".

      Following the reviewer’s recommendation, we have clarified the sentence by writing: “These observations are at odds with the suggestion that NCLDVs originated by reductive evolution, possibly from a fourth domain of cellular life (Colson et al., 2018; Legendre et al., 2012; Patil and Kondabagil, 2021).”

      2) l. 187-198: Please provide more information on which tool (with version number and parameter) was used to search genomes for MCPs. When I downloaded the HMM model and the faa file for the MCP from the figshare repository and tried to match the two, only a small number (4) of the MCP sequences actually matched the MCP HMM model with significant e-value, but I am not sure why? (for reference, I was using hmmsearch 3.3.2, default parameters)

      We used HMMER version 3.3.2 using the default parameters (hmmbuild and hmmsearch algorithms). We now include this information in the relevant section of the Methods: “Next, we constructed a set of Hidden Markov Models (HMMER version 3.3.2, hmmbuild/hmmsearch using the default parameters) for each of the 4 core proteins involved in virion morphogenesis”.

      We were able to reproduce the reviewer’s observation that the Major capsid curated HMM model returns 4 significant hits when used on the Major capsid multiple alignment file provided in FigShare (significant matches: 1. maverick2_NW_021681489.1_105940131438, 2. ncbincldv_NC_011335.1, 3. ncbincldv_NC_038553.1, 4. yutin_PLVACE1). This curated HMM model was one of the models used for searching homologous protein sequences and was built from a preliminary multiple sequence alignment comprising a different set of taxa (N. taxa = 48). In contrast, the multiple sequence alignment provided in Figshare is the final multiple sequence alignment of major capsid proteins that was used in phylogenetic analyses (N. taxa = 54). Therefore, we should not expect an exact match between the two files.

      We have updated the Figshare repository with a compressed file containing all the HMMs used for searching protein homologues (n = 38), which can be validated with hmmsearch on the European Bioinformatics Institute’s website (https://www.ebi.ac.uk/Tools/hmmer/search/hmmsearch). A separate compressed file contains the final multiple sequence alignments that were used in phylogenetic inference and hypothesis testing.
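      As a hedged illustration of how the number of significant matches from such a search could be tallied, the sketch below parses the tabular output that hmmsearch writes with its --tblout option, where the first column is the target name and the fifth column is the full-sequence E-value. The function name, the 0.01 cutoff, and the sample records are assumptions made for illustration only; they are not taken from the analysis described here.

```python
from typing import List, Tuple


def significant_hits(tblout_text: str, max_evalue: float = 0.01) -> List[Tuple[str, float]]:
    """Return (target name, full-sequence E-value) pairs at or below max_evalue.

    Expects the text of an hmmsearch --tblout file. The 0.01 cutoff is an
    illustrative assumption, not a value stated in the response above.
    """
    hits = []
    for line in tblout_text.splitlines():
        # Skip blank lines and the '#' comment/header lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Column 1: target name; column 5: full-sequence E-value.
        target, evalue = fields[0], float(fields[4])
        if evalue <= max_evalue:
            hits.append((target, evalue))
    return hits


# Hypothetical records (made-up names and scores) in --tblout column order.
sample = """# target name  accession  query name  accession  E-value  score  bias
seq_A  -  MCP_model  -  1.2e-30  105.3  0.1
seq_B  -  MCP_model  -  3.5      8.1    0.0
"""
print(significant_hits(sample))
```

      Running the same function on the output of a search against the final alignment file would make it easy to check how many sequences match a given model at a chosen threshold.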

      3) Figure 4: The acronyms should be explained in the legend (pPOLB, MCP, mCP, pro, atp, int, TIRs, etc)

      We now provide an explanation of the acronyms used for the traits matrix on Figure 4: “Acronyms refer to genes and genomic features present in the viral genomes: pPOLB (protein-primed DNA polymerase B), MCP (major capsid protein), mCP (minor capsid protein), int (rve-type integrase), pro (adenoviral-like protease), atp (FtsK/HerA DNA packaging ATPase), TIRs (terminal inverted repeats).”

      4) Figure 4: I believe that "TIRs" should be "Present in some members" for the virophages, based on https://doi.org/10.1186/s13062-015-0054-9? Interestingly, this group is typically the one that branches the deepest within virophages, which would be consistent with TIRs being an ancestral trait of the Maveriviricetes class (formerly Lavidaviridae family).

      As suggested, we updated the terminal inverted repeats (TIRs) trait for virophages to “Present in some members” to account for the Rumen virophages described by Yutin, Kapitonov and Koonin (2015, doi: 10.1186/s13062-015-0054-9).

      Additional changes:

      1) Figure 1 has been updated and now shows a polytomy between Mavericks 1/2 and PLVs. This reflects more closely the conceptual framework for our analyses since the specific branching of these groups was not specified in the phylogenetic models.

      2) We have added an Acknowledgements section to the end of the manuscript:

      Acknowledgements

      We wish to thank Peter Simmonds and Alexander Suh for their critical reading and comments on the manuscript, which served to improve this work. We also thank the reviewers for their recommendations and feedback. This work was supported by a doctoral scholarship (Dr. Jose Gregorio Hernandez Award) to JGNB made by the National Academy of Medicine of Venezuela and Pembroke College, Oxford.

    1. Author Response

      “It is unclear whether the Ter sites integrated by a single copy plasmid have any effect on the replication of this region but the data show that the observed effects are dependent on expression of the Tus protein.”

      -The lack of perturbation of fork progression by the TerB sequence has been studied extensively in both Willis et al., 2014 and Larsen et al., 2014. Furthermore, as the detection of the SMARD signal at the TerB sites is dependent on the 7.5kb probe that spans the TerB sites (orange probe, Fig 2B & 2D), it would be impossible to study the effect on replication in this region, with and without the integration of the single copy plasmid.

      “The SMARD data do not reveal what proportion of forks are arrested at Tus/Ter, or how long the fork delay is imposed.”

      -The percentage of fork stalling at the TerB sites, with and without Tus expression, has been quantified in Figure 2E & 2F. Essentially, 36% of forks stall at the TerB block, i.e. 18% of the forks stall in each of the 5’ to 3’ (orange) and 3’ to 5’ (blue) directions when the Tus-TerB block is active.

      “It is not shown whether the replication inhibitor HU leads to the same widely spread gamma H2AX response.”

      -While we have not shown gH2AX accumulation via ChIP after HU treatment, Supplementary Figure 5A & 5B clearly show increased gH2AX foci when the cells are treated with HU, suggesting a global replication stress response that is in stark contrast to the response to Tus-TerB.

    1. Author Response

      First, we would like to thank eLife and the reviewers for the positive assessment of our manuscript and for providing the medium to expand the scientific dialogue on the present study. We greatly appreciate the reviewers' assessment and will revise the manuscript considering their input. Please see our provisional response below.

      Reviewer #1’s main concern is centered on the evidential strength of the study’s conclusion that age-specific effects of birth weight on brain structure are more localized and less consistent across cohorts than age-uniform, stable effects. More specifically, the reviewer points out the evidence (or lack thereof) for age-specific effects. The reviewer specifies four methodological concerns: that #1 no direct statistical comparisons are conducted between samples (beyond the spin-tests) and that #2 the differential composition of samples in terms of age distribution leads to the possibility that lack of results is explained by methodological differences. Further, he/she adds that #3 some datasets have a narrow age range precluding detection of age-related effects and #4 that the modeling strategy does not allow for non-linear interaction between age and BW, suggesting the use of spline models instead in a mega-analytical fashion. Reviewer #1 also asks for greater clarity regarding the statistical models and the provision of effect-size maps.

      As a response to the reviewer’s comments, we will submit a revised version of the manuscript with a revised description of the statistical models and submit effect-size maps in an accompanying repository. We will tackle the first two concerns by performing additional statistical analyses. The planned analyses will include across-sample reliability and within-sample reliability for the time*birth weight analyses. These analyses will address the concerns that the lack of age-specific effects is due to sample differences rather than a lack of biological effects (#2) and provide further explicit statistical comparisons across samples (#1).

      We do not believe concern #3 is critically problematic since time*birth weight refers to a within-subject contrast, i.e. a longitudinal-only contrast. Birth weight, even when self-reported, is a highly reliable measure and the sample sizes are relatively large (n = 635, 1759, and 3324 unique individuals). Note that the smallest dataset does have longer follow-up times and more observations per participant, increasing the reliability of estimates of individual change. Structural MRI measures have very high reliability (often ICC > .95). Clearly, longitudinal brain change is less reliable, yet the present sample size and the high reliability of birth weight should provide enough statistical power to capture even small time-varying effects of birth weight on brain structure.

      We agree that some - if not most - brain structures follow non-linear trajectories throughout life (#4). In the present study, age regressors are used only for accounting for variance in the data rather than capturing any effect of interest. Instead, it is the time*birth weight regressor that captures age-varying changes in brain structure. Time reflects within-subject follow-up time. In any case, the inclusion of non-linear regressors is superfluous in ABCD given the small age range and will not regress out additional variance. Likewise, published data on UKB has shown that most age trajectories of cortical data follow grossly linear fits (age range in the UKB: 45 – 85 years). In LCBC, however, we recognize that non-linear modeling may be able to capture additional variance since it is a lifespan dataset. This may provide better estimates for birth weight and time*birth weight though it is unlikely that it has a big effect on the slope. In any case, #4 is a valid concern and we will repeat the main analysis using spline models to fit age. We will not follow the reviewer's suggestion for a mega-analytical approach. It is a valid approach, but it is best suited to other configurations of data than those used here where we used longitudinal data only. The use of cross-sectional data for deriving differential age trajectories is also not without problems.
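      As an illustration only, the following minimal sketch shows what a model with a spline term for age and a time*birth weight regressor could look like. It is not the analysis pipeline used in the study: it uses ordinary least squares on simulated data rather than a longitudinal mixed model, and the function names (`spline_basis`, `fit_time_by_bw`), the truncated-power cubic basis, the knot positions, and all simulated effect sizes are assumptions made for the example.

```python
import numpy as np


def spline_basis(age, knots):
    """Cubic truncated-power spline basis for (standardized) age, without intercept."""
    cols = [age, age**2, age**3]
    cols += [np.clip(age - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)


def fit_time_by_bw(y, age, time, bw, knots=(-1.0, 0.0, 1.0)):
    """Least-squares fit of y ~ 1 + spline(age) + bw + time + time*bw.

    Age enters only as a nuisance (variance-absorbing) spline term;
    the time*bw coefficient is the effect of interest.
    """
    X = np.column_stack(
        [np.ones_like(age), spline_basis(age, knots), bw, time, time * bw]
    )
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"bw": beta[-3], "time": beta[-2], "time_x_bw": beta[-1]}


# Simulated (hypothetical) data with a non-linear age trajectory.
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(-2.0, 2.0, n)   # standardized baseline age
time = rng.uniform(0.0, 5.0, n)   # follow-up time since baseline
bw = rng.normal(0.0, 1.0, n)      # standardized birth weight
y = (-0.3 * age**2 + 0.5 * bw - 0.2 * time
     + 0.1 * time * bw + rng.normal(0.0, 0.1, n))

coefs = fit_time_by_bw(y, age, time, bw)
print(coefs)
```

      In this toy setting the spline term absorbs the curved age trajectory, so the birth weight and time*birth weight coefficients are estimated without a linear-age assumption.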

      We see Reviewer #2’s concerns as mainly being: #1 The translation of birth-weight effects on brain volume to years of aging is inadequate and not fully supported by the data. #2 Lack of data regarding the functional relevance of birth weight effects on brain structure; suggesting the inclusion of cognitive data and (birth weight – brain structure – cognition) mediation analyses. #3 The degree to which the association between birth weight and cortical area/volume is explained by overall somatic growth. #4 The lack of non-linear regressors for fitting age.

      #1 We will modify the section on translation of birth weight effects into years of aging. While it is often used - see for example the literature on brain age - and is relatable, it can be easily misinterpreted and does not reflect any objective measure of effect size. #2 The degree to which birth weight relates to cognition throughout the lifespan through individual variations in brain structure is a key question in the field for which currently we only have indirect evidence. Unfortunately, the study of this triple relationship is out of the scope of the current study. This, we agree with the reviewer, warrants further research. We disagree, however, that mediation analyses are able to provide a satisfactory answer to this question. Also, we believe that the conclusions of the present study are not affected by the omission of cognitive data beyond what has been commented on in #1. The relationship between birth weight, structure, and cognition will be further discussed in the revised version of the manuscript.

      #3 Indeed, part of birth weight effects on brain structure can be explained by overall somatic growth. This study does not provide - and is not designed to provide - a specific mechanism by which birth weight is associated with brain structure. Yet, research has also shown that differences in birth weight partially reflect factors associated with variations in neurodevelopmental processes in gestation. Mechanisms are thought to be partially genetic (including genetically influenced potential for somatic growth) and partially environmental prenatal influences. The amount of variance explained by such mechanisms is unknown and may differ across samples (e.g. population samples vs. monozygotic twins). In our view, the association is no less meaningful if mostly attributed to somatic growth if it is associated with partly modifiable variance in brain growth. The revised manuscript will include further considerations on the possible mechanisms explaining the association between birth weight and cortical structure throughout the lifespan.

      Concern #4 is equivalent to Reviewer #1, concern #4. In short, the age regressor is used only to account for variance. Linear modeling in the ABCD and UKB datasets has been shown to fit the different cortical regions relatively well due to the samples’ narrow age range. In LCBC, however, we recognize that non-linear modeling may be able to capture additional variance since it is a lifespan dataset. This may provide better estimates for birth weight and time*birth weight though it is unlikely that it has a big effect on the slope of the effects. We will repeat the main analysis using spline models to fit age.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We thank the reviewers for their positive remarks, which we have addressed in detail below and which we have considered in our revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      The authors claim several times to have documented electrogenic chloride/oxalate exchange mediated by human SLC26A6. However, they fail to detect whole cell currents in SLC26A6-expressing HEK293 cells in oxalate bath, despite robust, saturable Cl- efflux from proteoliposomes into extracellular oxalate solution, as detected by ACMA fluorescence decay.

      We interpret the low, and essentially non-detectable currents for Cl-/oxalate exchange as a consequence of the slow kinetics of transport. This lack of sensitivity is not unusual for electrogenic secondary-active transport processes recorded by patch-clamp electrophysiology in mammalian cells, which renders the recording in large X. laevis oocytes by two-electrode voltage clamp the preferred method for such investigations. In contrast to the non-detectable activity in electrophysiology, the pronounced signal in the ACMA assay reflects the influx of H+ as a consequence of the negative membrane potential established by the influx of the divalent anion oxalate, which we assume to occur in exchange with the monovalent Cl-.

      Instances in the manuscript include:

      Abstract Line 17 overstates the paper's findings as "we have characterized SLC26A6 as a strictly coupled exchanger of chloride with either bicarbonate or oxalate". To the extent that "strictly coupled" implies 1:1 stoichiometry, the authors conclude Cl-/bicarbonate exchange is electroneutral based on its lack of exchange current. In contrast, the lack of Cl/oxalate exchange current does not lead the authors to the same conclusion of electroneutrality for Cl-/oxalate exchange. The data presented do not measure the stoichiometry of Cl-/oxalate exchange.

      We agree that our ACMA experiments do not strictly discriminate between coupled and uncoupled oxalate transport. However, it should be emphasized that, assuming that transport proceeds by an alternate access mechanism, uncoupled oxalate transport would require the change of the unloaded transporter between inward- and outward-facing conformations, which was shown to be unfavorable in Figure 1D.

      We have reworded the sentence in the abstract to:

      “Here we have determined the structure of the closely related human transporter SLC26A6 and characterized it as a coupled exchanger of chloride with bicarbonate and presumably also oxalate.”

      Line 264 claims that "the paper's functional data has defined SLC26A6 as a coupled transporter that exchanges Cl- with either HCO3- or oxalate at equimolar stoichiometry."

      We have changed the sentence to:

      “Whereas our functional data has defined SLC26A6 as a coupled antiporter that exchanges Cl- with HCO3- and presumably also oxalate with equimolar stoichiometry…”

      In lines 299-302, the authors claim to have "detected strict equimolar exchange of anions"...leading to the reasonable conclusion of electroneutral Cl-/HCO3- exchange and the reasonable but unsupported conclusion of coupled Cl/oxalate exchange.

      We have reworded the sentence to:

      “In the case of Cl-/HCO3- transport, we detect a strict equimolar exchange of anions binding to a conserved site in the mobile core domain of the transmembrane transport unit (Figure 4B, H). Although not shown unambiguously, we assume an analogous mechanism also for Cl-/oxalate exchange.”

      Lines 505-508 in Methods claim that the ACMA proteoliposome assay "measured electrogenic oxalate transport." However, the assay documented extracellular oxalate-dependent anion transport that was most simply interpreted as coupled exchange.

      The assay has detected H+ uptake into proteoliposomes as a consequence of electrogenic anion influx. In these experiments, oxalate is the only anion on the outside of vesicles and it must be transported for any fluorescence change to be observed. The claim of electrogenic oxalate transport is thus justified. As described above, the assay does not, under the applied conditions, discriminate between uncoupled and coupled oxalate transport; however, uncoupled oxalate transport would require the conformational change of an unloaded transporter, which was shown to be kinetically disfavored.

      In contrast, other parts of the manuscript acknowledge that the evidence presented falls short of documenting stoichiometric chloride/oxalate exchange.

      Results Line 151 sets out to "investigate a potentially electrogenic Cl-/oxalate exchanger." Similarly, results line 160 conservatively claims that Cl-/oxalate exchange occurs "presumably" with a 1:1 stoichiometry. This more accurate language needs to be used throughout the paper, replacing the more absolute but unjustified descriptions summarized earlier above.

      We have now introduced the requested clarifications throughout.

      I have otherwise only minor points to suggest.

      Abstract:

      "Among the eleven paralogs in humans.... ". This should be "at least 10," as the original status of human SLC26A10 as a transcribed pseudogene vs. a truncated protein-expressing gene remains unresolved. The authors recognize this in the introduction, where on p. 3 they acknowledge "ten functional SLC26 paralogs in humans."

      We have changed to ‘ten functional paralogs’

      Introduction:

      p. 4 line 45: membrane-inserted

      We have introduced the correction.

      Methods:

      Construct Generation:

      p. 25 lines 380-2: Add a sentence describing any C-terminal sequence extension added after C3 cleavage product, and whether/how it modified the PDZ-binding domain sequence. Has the modification been tested for PDZ-binding activity?

      We have introduced the following sentence:

      “As a consequence of FX cloning, expressed constructs include an additional serine at the N-terminus and an alanine at the C-terminus. Following C3-cleavage, SLC26A6 carries a seven residues long C-terminal extension (of sequence ALEVLFQ).”

      We have not tested PDZ-domain binding but expect that the added residues interfere with interaction with the C-terminal binding motif.

      Liposome Reconstitution:

      p.28: lines 453-4: Please clarify the meaning of: "absorbance at 540 nm was used to detect liposome destabilization," followed immediately by "After the formation of stabilized liposomes".... Does destabilization mean liposomal leak of Eu.L1+ chromophore, with decline of absorbance? What is practically meant in terms of the number of 10 µL additions of 10% TTX-100 routinely added to generate stabilized liposomes without generating destabilized liposomes? Did this number vary from trial to trial? How did you know when to stop adding aliquots of TTX-100?

      We have added the following sentence:

      “For protein incorporation, 10 µl aliquots of 10% Triton X-100 were added in order to destabilize liposomes and permit protein incorporation. After reaching a plateau of the light scattering measured at 540 nm, 4 additional aliquots of Triton X-100 were added. The number of additions required for destabilization did not vary between reconstitutions.”

      p.28 line 463: "dissolved" should be "suspended."

      We have introduced the correction.

      Bicarbonate Transport Assay

      p. 29 line 480-1. How many repetitions are represented by the phrase "sequential ultracentrifugation steps"? Please provide a number or a range, as applicable.

      We have defined the number of ultracentrifugation steps (two).

      Pp 29-30, lines 485-7: define "cycles" - are these fluorometric excitation-emission cycles?

      We have defined cycles as fluorometric excitation/emission cycles.

      p. 30 line 489: delete "by"

      We have deleted ‘by’

      Name the fluorimeter used.

      We have named the fluorimeter used as Tecan Infinite M1000 Pro microplate reader.

      ACMA assay

      Pp 30-31, lines 505-8: Add composition of extraliposomal oxalate-containing buffer. In Fig 1 Suppl Fig. 1 panels H and I, and Methods lines 505-508, with 150 mM oxalate substituting for 150 mM Cl-, how was osmotic balance maintained in the external chloride solution?

      We have added the composition of the oxalate-containing buffer. The osmolarity of the extracellular solution was not balanced.

      Electrophysiology

      p.32 line 532: What fold-increase of SLC26 protein levels was produced by inclusion of 3 mM valproic acid?

      We consistently see an increase of expression upon addition of valproic acid for different membrane proteins but did not quantify it in this case.

      Results:

      Functional characterization of SLC26A6

      Line 91: "comparably" to what? Otherwise, perhaps, "comparatively" was intended here?

      We have changed to ‘comparatively’

      Fig 1E legend: line 763 "time- and concentration-dependent". Same for line 791, line 799

      We have introduced the correction.

      Fig. 1G: Change Y axis legend to "Normalized [Eu.L1+] emission." Add bath ion composition for "neg" condition (black trace).

      We have corrected the label on the Y-axis and added ion composition for neg.

      Fig. 1H legend sentence 2 "in a concentration-dependent manner for liposomes (Mock) in 75 mM oxalate (n=5) and for SLC26A6 proteoliposomes in extracellular oxalate concentrations of 9.4 mM (n=3)" etc.

      We have reworded the sentence:

      “Traces show mean quenching of ACMA fluorescence in a time- and concentration-dependent manner for SLC26A6 proteoliposomes with outside oxalate concentrations of 9.4 mM (n = 3), 37.5 mM (n = 5), 75 mM (n = 6), 150 mM (n = 8, all from two independent reconstitutions). Neg. refers to liposomes not containing SLC26A6 assayed upon addition of 75 mM oxalate as defined in Figure 1-figure supplement 1G.”

      Fig. 1 Fig Suppl. 1. p.45 line 790: change "chemical formulas" to "2-D chemical structures"

      We have introduced the change.

      lines 799: Time-

      We have introduced the change.

      Fig 1 Fig Suppl. 1. p 46 lines 809-810: dashed lines indicating 0 pA are indeed red in panels A and B, but black in panels H and I.

      We explicitly refer to recordings, where dashed lines at 0 pA are consistently in red.

      Fig 2 Fig Suppl.1 p. 50, line 832: Two additional multi-class.

      We have introduced the change.

      Fig 2 Suppl Fig 2B, p. 51: Please label the residue numbers of the side chains coordinating the chloride binding site. Can those residues be indicated in Fig 2 Suppl Fig 2A in the appropriate helices? These residues might also be asterisked in the primary sequence alignment of Fig 2 Suppl Fig 3A.

      We have labeled the residues in Fig. 2-figure supplement 2B but not 2A where the focus is on the general quality of the density in different parts of the protein. We have also labeled the same residues in Fig. 2-figure supplement 3A.

      Fig.4 legend p. 59 line 882 – Deviating residues in SLC26A9 (typo A6) are highlighted in violet.

      We have introduced the correction.

      p. 60 lines 888-9: Please clarify the individual meaning of green and purple asterisks on defining the substrate cavity diameter; How do purple and green asterisks relate to the yellow and green lines in the graph? Should the asterisks be two green and two yellow asterisks, or should they be black? What is the meaning of the purple and green asterisks at the two upper corners of panel G with respect to the substrate cavity radii?

      Please specify if y axis label "radius" refers to substrate cavity radius, and whether X axis label "distance" refers to axial distance along helix alpha10, alpha1, or of the helical pair. Is value "0 A" on the X axis anchored at the top of the helices as depicted in panels D-F? Is X-axis value 10.5A sited at the bottom of the helices? Please indicate on the panel G curves the x-y value range depicted in the inset images- or clarify that the inset images present the entire curves of panel G.

      We have clarified these remarks in a revised legend:

      “The radius of the substrate cavity of either protein is mapped along a trajectory connecting a start position at the entrance of each cavity (distance 0 Å) and an end position located outside of the cavity in the protein region (distance 10 Å). Both points are defined by asterisks in insets showing the substrate cavities for either transporter and they are indicated in the graph (green, cavity entrance towards the aqueous vestibule; violet, protein region).”

      Fig. 4 p . 61-2 panel H and lines 907-8: Addition to the panel of the A5 "buried Cl- binding site" would be helpful, if possible to do without obscuring the A6 and A9 Cl-s.

      Panels H and I show the cavity harboring the ion binding site in two orientations, including the surrounding residues. We prefer to show all surrounding residues for both orientations, even if this somewhat obscures the view on the ion in the left panels. An unobscured view of the ion in its cavity is provided in panels D and E.

      p. 12. Results line 234: Please specify that "both proteins" here refers to A6 and A5 vs A6 and A9

      We have specified this.

      p. 13 Results lines 268-72: R404 is "ubiquitous in other mammalian paralogs..." should be changed to "shared by most but not all mammalian paralogs".

      We have changed the text accordingly.

      Fig 4C should have a red or purple asterisk placed under the yellow column corresponding to R404 of SLC26A6, so that the discussion can refer to it. It would also be helpful to remind the reader that R404 corresponds to conserved position 6 in Fig 4 Fig Suppl. 1 panels D-G.

      Here the authors might note that sulfate -transporting SLC26A1 and -A2 have the shorter side chain K residue.

      We have marked the position in Figure 4C with an asterisk and added the following sentence to its legend:

      “Asterisk marks position that harbors a basic residue in all family members except for SLC26A9 where the residue is replaced by a valine. Whereas most paralogs, including the ones operating as bicarbonate exchangers, have an arginine at this site, the sulfate transporters SLC26A1 and 2 contain a smaller lysine.”

      We have added the following statement to the legend of Figure 4-figure supplement 1:

      “‘6’ indicates the position which contains a basic residue in all family members except for SLC26A9.”

      Fig 5B legend p. 63 line 916. Please specify if the 14 independent experiments include both the symmetric Cl- conditions and the asymmetric Cl-/HCO3- conditions or only one condition.

      The 14 independent experiments were only recorded in symmetric chloride conditions. We have changed the legend accordingly.

      Fig 5C. It would be useful for readers to add the I-V trace of WT SLC26A6 taken from Fig 1 Suppl 1B (perhaps in gray), to document the specificity of the very low magnitude R404V whole cell current. Alternatively, please note (if the case) that WT SLC26A6 currents (Fig 1 Suppl 1B) are indistinguishable from the black dashed zero current density line.

      We have now displayed the I-V trace of WT SLC26A6 as a grey dashed line for comparison and added a new panel that shows the differences between the currents of R404V and WT recorded at 100 mV (Fig. 5D). Although the currents for R404V were consistently larger than for WT, the difference is not statistically significant. We have explicitly mentioned this in the text and the figure legend.

      Fig 5E depiction of decline in ACMA fluorescence is missing from the legend. Legend references to panels E and F seem to correspond to Fig 5 panels F and G (lucigenin fluorescence), as noted in Results p 14 lines 280-3.

      We have added the legend.

      Chernova et al (2005) reported electroneutral human and mouse A6-mediated Cl/HCO3- exchange in Xenopus oocytes. They also observed electrogenic Cl-/oxalate exchange by mouse SLC26A6, but detected no current generated by human SLC26A6-mediated Cl-/Oxalate exchange. That paper (already cited) might be referred to more explicitly in connection to the authors' current findings of electroneutral Cl-/HCO3- exchange by human SLC26A6 as well as their inability to detect human SLC26A6-mediated Cl-/oxalate exchange current in HEK-293 whole cell recordings.

      We now have included the reference in the discussion:

      “Consequently, transport would be electroneutral in case of the monovalent HCO3- and electrogenic in case of the divalent oxalate (Figure 1E-H), which was already proposed in a previous study (Chernova et al., 2005). We also want to re-emphasize that the inability to measure discernable currents does not necessarily imply that the transport might not be electrogenic as, due to their slow kinetics, transport-mediated currents might be below the detection limit of patch-clamp electrophysiology.”
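      The charge bookkeeping behind this statement can be written out as a small sketch. It assumes strict 1:1 exchange, which for oxalate is presumed rather than measured, and the helper name `net_charge_per_cycle` is hypothetical and introduced only for this illustration.

```python
def net_charge_per_cycle(z_in, z_out, n_in=1, n_out=1):
    """Net charge translocated inward per transport cycle.

    z_in / z_out: valence of the imported / exported ion.
    n_in / n_out: ions moved per cycle (1:1 is an assumption here).
    """
    return n_in * z_in - n_out * z_out


# Ion valences
CL = -1        # chloride
HCO3 = -1      # bicarbonate (monovalent)
OXALATE = -2   # oxalate (divalent)

# Cl- exported in exchange for the imported substrate
print("Cl-/HCO3- :", net_charge_per_cycle(HCO3, CL))
print("Cl-/oxalate:", net_charge_per_cycle(OXALATE, CL))
```

      A result of zero corresponds to electroneutral exchange, whereas a non-zero result means each cycle moves net charge; this is what builds up the negative membrane potential that drives H+ influx in the ACMA assay.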

      Reviewer #2 (Recommendations For The Authors):

      -  It would be helpful if the authors briefly clarify the depiction/scheme of the hypothetical SLC26A6 outward-facing conformation. Is this gleaned from a prior structure of a related SLC family or distant homolog? Functional data? Biochemical/biophysical data? As well, I would also recommend labeling this within the figure (Figure 3-figure supplement 1D labeling, for instance - inward, hypothetical outward).  

      As mentioned in the legend of Figure 3-figure supplement 1D, the outward conformation is hypothetical. We have also now mentioned this in the title. The displayed outward conformation was constructed by manually moving the area depicting the core domain relative to the fixed gate domain.

      -  Have the authors attempted to block SLC26A6-mediated transport with the addition of a known inhibitor, such as niflumic acid? I understand that this may be technically challenging, but it would strengthen the transport assay data, especially in Figure 5D with the ACMA assay testing the SLC26A6 R404V mutant.  

      We have not attempted to block the currents by addition of niflumic acid.  

      -  It could be helpful to the reader to move the schematics in Figure 1-figure supplement 1C into Figure 1.  

      We have now displayed the schematics illustrating the principle of the respective transport assays next to the data in Figure 1, but kept Figure 1-figure supplement 1C for a more detailed description of the assays.  

      -  Figure 3-figure supplement 1D legend, should be "hypothetical" instead of "hypothetic."  

      We have introduced the correction.

      -  I might consider coloring the Cl- ion something that is distinct from the model colors that are used in the figures (see Figure 4 and Figure 4-figure supplement 1). This would help to clarify Figure 4-figure supplement 1H, where at first glance I believed that the Cl- ions were from the SLC26A6 model.

      We have used the green color for chloride throughout the manuscript and would prefer to keep it that way for consistency.

      -  Labeling in Figure 5 legend (E, F) does not match the Figure (F, G). The description of the ACMA assay is absent from the figure legend (the real Figure 5E).

      This has been corrected.

      Reviewer #3 (Recommendations For The Authors):

      None. This is a well-done manuscript and I have no further suggestions.

    1. Author Response

      Reviewer #3 (Public Review):

      In this manuscript, Man et al. describe a new signaling pathway for regulation of the voltage-gated calcium channel Cav1.2 and show that it can modulate synaptic plasticity in the hippocampus. Studies with specific inhibitors, phosphospecific antibodies, and gene knockdown show that activation of alpha-1 adrenergic receptors induces downstream activation of the serine/threonine protein kinase PKC and the tyrosine protein kinases Pyk2 and Src, which bind to the Cav1.2 channel through its large intracellular segment connecting domains II and III. This signaling complex leads to tyrosine phosphorylation of Cav1.2 and increased channel activity. Block of this novel signaling pathway in hippocampal slices with specific inhibitors of Pyk2 and Src reduced a specific component of long-term potentiation whose induction requires Cav1.2 channel activity.

      This work is an important advance, as it presents a novel signaling pathway through which the ubiquitous neurotransmitter norepinephrine and the neurohormone epinephrine can regulate synaptic plasticity, attention, learning, and memory. The experiments are comprehensive, carefully done, and clearly presented. The authors should consider revisions and responses to the points below.

      1) Figure 2B, D. Inhibitors reduce Ica below control. Is there endogenous stimulation of this regulatory pathway under control conditions?

      We now explicitly state in the Discussion: “Inhibitors of PKC, Pyk2, and Src reduce under nearly all conditions Cav1.2 baseline activity and also tyrosine phosphorylation of Cav1.2, Pyk2, and Src even when activators for alpha1 AR and PKC were present. Especially notable is the strong reduction of channel activity way below the control conditions by the Src inhibitor PP2 as well as the PKC inhibitor chelerythrine in Figure 2C. This effect is consistent with PP2 strongly reducing down below control conditions tyrosine phosphorylation of Src (Figure 8J), Pyk2 (Figure 8L), and Cav1.2 (Figure 9E) even with the PKC activator PMA present. These findings suggest that Pyk2 and Src experience significant although clearly by far not full activation under basal conditions as reflected by their own phosphorylation status, which translates into tyrosine phosphorylation of Cav1.2 under such basal conditions.” Because there are multiple ways Pyk2 and Src can be activated including Ca influx and cell-matrix interactions, defining the cause of this baseline activity has to remain beyond the scope of the current work.

      2) As noted by the authors, it would be interesting to know if peptides from the linker between domains II and III block this signaling pathway. This would be an important result because, without this information, it is not clear if this is the correct functional site of interaction for this regulatory complex.

      Briefly, we were not able to identify shorter loop II/III-derived peptides that would constitute the Pyk2 binding site and thus cannot displace Pyk2 from loop II/III either acutely with peptides or through mutagenesis of the binding site.

      3) Figure 4B. The Brain IP for Src has a weak signal. The authors should replace this panel with a more convincing immunoblot.

      We now provide the uncropped version in the raw dataset, which clearly illustrates clean, monospecific detection of the Src band over the full length of the blot. Also, please note that earlier work already showed that Src binds to the C-terminus of Cav1.2 (Bence-Hanulec et al., 2000).

      4) Scatter plots are provided for the electrophysiological results but not immunoblots. For immunoblots that are quantified, it would be valuable to add a scatter plot of the replicates.

      We now also provide scatter plots for the biochemical analysis.

    1. Author Response

      Reviewer #1 (Public Review):

      The paper addresses why and how odor discrimination ability achieved after learning occurs in select contexts. The finding is that two related odors trigger near identical Kenyon cell responses when tested in isolation, but trigger different responses to the second odor if these are experienced in sequence within a small temporal window. The authors argue that this template comparison requires some activity downstream of Kenyon cells, that is recruited by MBONs. Overall, the experiments provide very nice physiological evidence for a neural mechanism that underlies a contextual basis for the precision of memory recall.

      The experiments were well designed and done. The findings are interesting, but the pitch (e.g. the last paragraph of the discussion and the title of the paper) seems to both ignore the main finding of the paper and overstate the novelty of the idea that memory recall can be flexibly regulated by context. There should be more space dedicated to clearly articulated statements/descriptions of hypotheses and candidate mechanisms to explain the interesting phenomenon described here. For instance, explaining "enhanced template mismatch detection" by potential "real-time and delay line summation" of MBON activity is not super useful for the reader as it seems to use one abstraction to explain another. The authors cite Lin et al, 2014 from Miesenbock's lab which shows a key role for GABAergic APL neurons in discrimination. Is there increased activation of APL neurons when similar odourants are being compared and discrimination is required? This seems like a simple physically embodied mechanism that could/should be examined.

      Overall, I think the idea that memories are recalled with high precision (less generalisation) only when increased precision is demanded, is a fact that surely is well appreciated by behavioral biologists even beyond the two papers cited here (Campbell et al., J Neurosci 2013; Xu and Südhof, Science 2013). The new findings fill in a physiological gap in this phenomenology. I think the paper would be greatly improved if the authors highlighted and focused on the physiological correlate uncovered, and tried to communicate (or test) possible mechanistic origins for this in more physically accessible terms.

      We thank Reviewer #1 for their appreciation of our findings. We are grateful for this extremely constructive feedback on re-focussing the pitch of the paper and have extensively revised the manuscript along these lines, particularly changing the Title, Introduction and Discussion. As suggested, we now highlight how similar stimuli can be categorized together, or apart, depending on the stimulus choices animals are presented during recall.

      Reviewer #2 (Public Review):

      One of the key questions in circuit neuroscience is how learned information guides behavior. Modi et al. investigated this question in Drosophila's mushroom bodies (MBs), where olfactory memory traces are formed during pavlovian olfactory conditioning. They have used optogenetics to restrict the formation of memory traces in selective output compartments of the Kenyon cell (KC) axon terminals, the principal intrinsic neurons of the MB, and tested how flies use these 'minimal memories' during learned olfactory discrimination. They found that memory traces formed in some compartments support discrimination between similar odor pairs, whereas others do not. They then investigated the neural basis of this difference by comparing the responses of relevant output neurons (MBONs) to similar and dissimilar odor pairs. They discovered that MBONs' responses could predict behavioral outcomes if odor presentation profiles during calcium imaging mimic olfactory experience during behavior. This paper and previous works support the idea that flies use olfactory memory templates flexibly to suit their behavioral needs. However, one key difference between this paper and the previous works is the site of discrimination. While previous studies using intensity discrimination have pointed towards spike-latency and on and off responses of the KCs as the main mechanism behind discrimination, Modi et al. have not detected any response difference for similar odor pairs among the KCs. Therefore, they concluded that a hitherto unknown mechanism creates these context-specific responses at the MBONs. The findings will advance our understanding of how memories are recalled during behavior. However, the authors need to bolster their data by including some critical controls that are currently missing.

      We thank Reviewer #2 for highlighting how our work contributes to the literature and for pointing out the gaps in our discussion of previous work, as well as the missing controls.

      Reviewer #3 (Public Review):

      This manuscript by Modi et al represents a novel and significant advance in the neurobiology of memory retrieval. The authors employ a novel behavioral paradigm in order to investigate memory generalization and discrimination. They investigate the role of two different populations of dopamine neurons (DANs) targeting different compartments involved in aversion learning, i.e. α3 (MB630B) and γ2α'1 (MB296B).

      The behavioral platform is clear and convincing but lacks natural reinforcement comparisons. The entire paper uses optogenetic reinforcement of said DAN populations.

      The authors identify that the gamma DANs can enable both easy and hard odor discrimination, while the alpha DANs can only enable easy discrimination.

      The odors can be separated by calcium imaging analysis of Kenyon cells. Subsequent calcium imaging of the gamma DANs themselves showed that a single training event was insufficient to enable easy odor discrimination at the gamma DAN level, but strangely not for the hard discrimination that gamma DANs can mediate. Seemingly, this is due to the lack of the temporal contiguity of odors (present in behavioral experiments but not in the initial imaging experiments). However, in gamma DANs, odour transitions enabled discrimination of odors in hard discrimination, based on the depression of calcium activity in DANs after training that was odor-specific. The same was not true for alpha DANs, though the authors used natural electric shock pairings instead of optogenetic stimulation of DANs for the alpha experiment. However, statistical comparisons are done within group and also need to be provided between the groups for both pre- and post-training. The authors persuasively show that hard discrimination can only happen in transitions. They also argue that the same engram can be read in two different ways. This is convincing overall, but they claim it is happening downstream of the Kenyon cells just because they do not see it in the Kenyon cells, and I cannot comment on the modeling in Figure 5 (outside my expertise).

      Experimental methods used are appropriate, as are data analysis strategies.

      The manuscript itself is well written in parts, though at times paragraphs are quite patchy, especially in the discussion. There are also a visible number of typos. The figures are well constructed, and generally well organized. The overall document is concise and has sufficient detail.

      We appreciate the reviewer’s comments on the novelty and significance of our study.

    1. Author Response

      Reviewer #2 (Public Review):

      In Rey et al., the authors' goal was to characterize the development of a myelin-like (lacunar) expansion of glial membrane in Drosophila. Although myelin is largely considered a vertebrate innovation, there are a handful of invertebrate models that have been described with glial-derived "myelin," though these systems are not amenable to the same genetic control as Drosophila. To that end, the authors first used newly developed genetic tools and antibodies to characterize the presence of an axon initial segment (AIS) for adult Drosophila motor neurons that is present at the border between the central and peripheral nervous systems. They show that both sodium (Para) and potassium (Shal) channels, which are typically enriched at the AIS in mammalian neurons, are enriched at this border specifically on motor neurons. They then used multiple types of transmission electron microscopy to visualize this region and found that along with clustering of channels, there is an expansion of membranes from wrapping glia that is reminiscent of myelin. At times, this expansion spirally wraps around larger axons. Finally, they show that genetic ablation of wrapping glia results in an upregulation and redistribution of Para.

      Major strengths of this manuscript include the creation of new genetic tools for visualization of subcellular features (e.g. channels) by both light microscopy and electron microscopy.

      This manuscript provides an interesting set of data but suffers from a lack of quantification and annotation that would allow the reader to judge whether this is a robust phenomenon. To increase the reader's confidence in these studies, substantially more quantification of the data is required.

      Furthermore, to improve the accessibility of this manuscript, I have the following suggestions:

      1) Please label the panels throughout the figures with an abbreviated genotype and what the fluorophores signify. Similarly, the presence of scale bars is uneven across the figures.

      This was all corrected.

      2) For panels where only one channel is shown, please show these in black and white, which is easier for the visually-impaired.

      We have not done this, since the color adds another layer of information (e.g. paramCherry is in magenta, whereas anti-Para staining is in green) which in our view helps to make the complex figures easier to understand.

    1. Author Response

      Reviewer #1 (Public Review):

      Inhibition of translation has been found as a conserved intervention to extend lifespan across a number of species. In this work, the authors systematically investigate the similarities and differences between pharmacological inhibition of protein synthesis at the initiation or elongation steps on longevity and stress resistance. They find that translation elongation inhibition is beneficial during times when proteostasis collapse is the primary phenotype such as proteasome dysfunction, hsf-1 mutants, and heat shock, but this intervention does not extend the lifespan of wt worms. In contrast, translation initiation inhibition extends the lifespan of wt worms and protects from heat shock, but in an HSF-1-dependent manner. This work shows that a simple explanation of just inhibiting total protein synthesis and reduced folding load cannot explain all of the phenotypes seen from protein synthesis inhibition, as initiation and elongation inhibition repress overall translation similarly, but have different effects depending on the experiment tested. Using multiple interventions that target both initiation and elongation lends further support to their findings. These experiments are important for conceptualizing how translation inhibition actually extends lifespan and promotes proteostasis.

      Major Comment:

      The authors acknowledge that lifespan extension must not necessarily arise just from reducing protein synthesis, as elongation inhibition reduced protein synthesis but did not extend lifespan. Yet for the converse effects from elongation inhibition they seem to suggest that it arises from reducing protein synthesis. For example, regarding how elongation inhibition extends lifespan in an hsf-1 mutant, the authors suggest that "inhibition of elongation lowers the production of newly synthesized proteins and thus reduces the folding load on the proteostasis machinery", even though initiation inhibitors do not extend lifespan in an hsf-1 background (while presumably lowering the production of newly synthesized proteins).

      Thank you for this excellent comment. It led us to conduct a crucial experiment with a new finding that is now Figure 6. As suggested, we asked if initiation inhibitors lower the concentration of newly synthesized protein in the hsf-1(sy441) background. The surprising answer is that initiation inhibitors lower the concentration of newly synthesized proteins in N2 but dramatically increase it in hsf-1(sy441). The failure to lower the concentration of newly synthesized proteins was true for the pharmacological inhibitors as well as RNAi against ifg-1. Therefore, inhibition of initiation requires HSF1 to lower the protein concentration. These new findings enable us to make a much more precise statement now added to the discussion:

      Line 372: “The inability of translation-initiation inhibitors to reduce the concentration of newly synthesized proteins in hsf-1(sy441) mutants and the inability to extend their lifespan shows that lowering the concentration of newly synthesized proteins is necessary for the beneficial effects. On the other hand, the finding that elongation inhibitors protect from proteotoxic stress but do not extend lifespan shows that lowering the concentration of newly synthesized proteins is sufficient to protect from proteotoxic stress but is not sufficient to extend lifespan in wild-type, which appears to require selective translation.”

      Reviewer #2 (Public Review):

      In this manuscript, Clay et al. investigate the mechanisms underlying the beneficial effects of reduced mRNA translation on protein aggregation and aging. They aim to test two pre-existing hypotheses: The selective translation model proposes that downregulation of overall translation increases the capacity of ribosomes to translate selected factors that in turn increase stress resistance against toxicity. The reduced folding load model suggests that during high mRNA translation rates, newly synthesized peptides and proteins can overwhelm the protein folding capacity of the cell and therefore cause protein toxicity. By generally lowering mRNA translation, lower loads of newly synthesized proteins should cause less protein folding stress and hence protein toxicity.

      To understand how reduced mRNA translation mediates its beneficial effects in the context of the proposed models, the authors use different drugs established previously in other in vitro and in vivo systems to inhibit selected steps of translation. The systemic effects of translation initiation versus elongation inhibition in C. elegans are compared during heat shock, specific protein aggregation stresses and aging. These phenotypes are further tested for dependence on hsf-1, as contradictory data on the effect of translation inhibition during thermal stress in the context of hsf-1 dependency exist.

      The data show that inhibition of translation initiation protects from heat stress and age-associated protein aggregation but on the contrary further sensitizes animals to protein toxicity induced by a misfunctioning proteasome. Further, inhibition of translation initiation increases lifespan in WT animals. The survival phenotypes observed during heat shock and regular lifespan assays are dependent on HSF-1, supporting the selective translation model. As stated in the manuscript, these findings themselves are not new, given that similar observations were made before using genetic models. Interestingly, the inhibition of translation elongation protects from heat stress, but, unlike initiation inhibition, also from proteasome-misfunction-induced protein toxicity. Both phenotypes were observed to be independent of hsf-1. The authors further find that inhibiting elongation does not reduce protein aggregation in aged worms and does not prolong lifespan in wild-type animals. It does increase lifespan in short-lived hsf-1 mutants, where protein homeostasis is compromised. To a degree, these findings support the reduced folding load model. Overall, from these observations the authors summarize that the systemic consequences of lowering translation depend on the step in which translation is inhibited as well as the environmental context. The authors conclude that different ways to inhibit translation can protect from different insults by independent mechanisms.

      Impact, strengths and weaknesses:

      mRNA translation and its regulation is one of the most studied mechanisms connected to lifespan extension. However, gaps behind the protective effects of translation inhibition are so far unresolved, as stated by the authors. Therefore, testing existing hypotheses explaining the beneficial effects of translation inhibition is of great interest, not only for C. elegans researchers but a broad community working on the effects of misregulated translation during aging and disease. Overall, the conclusions made by the authors are generally supported by the data shown in this manuscript. However, some major gaps remain and need to be clarified and extended.

      Thank you for your generous comments and thorough review.

      Reviewer #3 (Public Review):

      Clay and colleagues investigate the proteostasis and longevity benefits derived from translation inhibition in C. elegans by examining the impacts of chemical translation initiation inhibitors (IIs) and translation elongation inhibitors (EIs) on thermotolerance, protein folding stress, aggregation and longevity. They observe somewhat distinct impacts by the two chemical groups. IIs increased longevity in wild-type animals in an hsf-1 dependent manner, whereas, EIs only extended hsf-1 mutants' lifespan. Only EIs protected against proteasome dysfunction. Both protected against heat stress but with differing hsf-1 dependence. The authors utilize these observations to derive conclusions regarding two dominant points of view on the mechanism by which translation inhibition improves lifespan and proteostasis.

      The study is based on interesting observations and several promising avenues of further investigation can be identified. However, the manuscript appears somewhat preliminary in nature, with many of the observations, while interesting, only explored superficially for mechanistic insights. The rationale behind some of the interpretations was also difficult to follow. For example, the authors make conclusions about 'selective translation' being adopted upon IIs treatment without directly testing this. Protein aggregation, while possibly predictive, is not a reliable readout for selective translation of some mRNAs. Similarly, the evidence for a reduction in 'newly-synthesized protein load' by EIs is thin, being based on one reporter. Previous studies from the Blackwell lab have identified differential impacts of SKN-1 on select cytoprotective genes' expression and proteasomal gene expression based on inhibition of translation initiation or elongation. So there is precedent for both the differential impact of initiation vs. elongation inhibition as well as genetic background. There are several other such studies that reduce the impact of the observations presented here. With limited novelty and mechanistic insight, the impact of the study on the field is likely to be moderate.

      We thank the reviewer for the thorough analysis and candid summary. Some of the criticisms rang true, and we have made considerable efforts to address them, both increasing the thoroughness of our study by establishing that these inhibitors inhibit initiation and elongation (new Figure 1) and by providing a novel mechanism showing that inhibition of the initiation machinery requires HSF1 to lower the concentration of newly synthesized proteins.

      Before we go into the specific criticisms, we would like to note that of the 30-40 eukaryotic translation inhibitors used in cell culture and yeast, very few have been validated in C. elegans. The go-to inhibitor was cycloheximide which, in our hands, is reliable in cell culture but unreliable in C. elegans, most likely due to its poor pharmacokinetics (data now added to the supplementary figures). To our knowledge, no C. elegans study investigating translation has made sure to equalize the concentration of newly synthesized proteins or could have because of a lack of validation of the chemical tools used in other organisms. Thus, the comment of reviewer #3 that we did not go far enough with the validation struck home, and in the revised version of Figure 1 we added more validation.

      We are of the opinion that it is essential to ensure that both mechanisms reduce the concentration of newly synthesized proteins to the same degree to study mechanistic differences. Otherwise, one cannot deconvolute if any phenotypic difference is caused by the mechanistic difference or the degree of translation inhibition. The importance of monitoring the level of inhibition became evident in our new Figure 6, which shows that inhibition of the translation initiation machinery no longer reduces the concentration of newly synthesized proteins but increases it in the absence of HSF1.

    1. Author Response

      Reviewer #1 (Public Review):

      The role of HCO3 (or possibly CO2) in regulating sACs is well established yet its physiological context is less clear. The heart is indeed an excellent choice of organ to study this. Isolated mitochondria offer a tractable model for studying this, although they are not without limitations. The quality of recordings is very high, as judged by the consistency of results (i.e. lack of clustering between biological repeats). My primary concern is about distinguishing the effect of pH and HCO3. A rise in HCO3 will also raise pH unless this had been compensated by CO2. It is unclear, from the legend or results, if the bicarbonate effect is due to HCO3 or pH. Was pH controlled by matching the rise in HCO3 with an appropriate level of CO2? The swings in pH are likely to be very large and, potentially, a confounding factor. Certainly, there will be an effect on the proton motive force. A more informative test would compare the effect of 0 CO2/0 HCO3 at a pH set to say 7.2, 2.5% CO2/7.5 mM HCO3, and then 5% CO2/15 mM HCO3, etc. Control experiments would then repeat these observations over a range of pH (at zero CO2/HCO3) and over a range of CO2 (at constant HCO3). Data for zero bicarbonate are not informative, as this will never be a physiological setting (results claim 0-15 mM bicarb to represent physiology). Importantly, there seems to be no significant difference in 2A between 10 v 15 mM bicarb, i.e. the physiological range.

      Thank you for your clear discussion and suggestions. We agree that pH must be controlled in these experiments to avoid the confounding situation you describe. In fact, pH was carefully controlled but this was not described adequately. To make this clearer the methods section was modified.
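      To make the pH concern concrete, the CO2/HCO3 pairs proposed by the reviewer can be checked with the Henderson-Hasselbalch relation. The sketch below is only an illustration and not the buffer calculation used in the study: the apparent pKa (6.1), the CO2 solubility (0.03 mM per mmHg) and the total pressure (760 mmHg, water vapour ignored) are textbook values assumed here for 37 °C, and the function name is hypothetical.

```python
import math

# Assumed textbook constants (not taken from the manuscript).
PKA = 6.1           # apparent pKa of the CO2/HCO3- buffer at 37 °C
SOLUBILITY = 0.03   # CO2 solubility in mM per mmHg
P_TOTAL = 760.0     # total gas pressure in mmHg, water vapour ignored


def bicarbonate_buffer_ph(co2_percent, hco3_mm):
    """Henderson-Hasselbalch pH for a given CO2 fraction and [HCO3-] in mM."""
    pco2 = co2_percent / 100.0 * P_TOTAL      # partial pressure in mmHg
    dissolved_co2 = SOLUBILITY * pco2         # dissolved CO2 in mM
    return PKA + math.log10(hco3_mm / dissolved_co2)


# Matched pairs keep the ratio, and hence the pH, constant;
# changing HCO3- alone at fixed CO2 shifts the pH.
for co2, hco3 in [(2.5, 7.5), (5.0, 15.0), (5.0, 7.5)]:
    print(f"{co2}% CO2, {hco3} mM HCO3-: pH = {bicarbonate_buffer_ph(co2, hco3):.2f}")
```

      Under these assumptions, the two matched pairs give the same pH of about 7.2, whereas halving HCO3 at a fixed 5% CO2 lowers the pH by about 0.3 units, which is the size of confound the reviewer is asking about.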

      We agree that bicarbonate concentration is above 0 in living tissue. We used that value only as a reference in Figure 2A to examine which type of adenylyl cyclase (AC) is inside mitochondria, i.e., bicarbonate-activated soluble AC as opposed to transmembrane AC which is not bicarbonate activated. The wording has been corrected to better describe this. The question of what the physiological consequences are requires an assay with higher signal-to-noise ratio. In effect, this is achieved in the experiments of Fig. 2B-C which show that physiologically relevant changes in bicarbonate have a large and significant influence on mitochondrial ATP production.

      There is also a question on the validity of the model. A rise in respiratory rate will produce more CO2 in the matrix. This may raise matrix HCO3, and stimulate sACs therein, but the authors claim sACs are in the IMS, rather than the matrix. Since HCO3 is impermeable, it is unclear how sACs would detect HCO3 beyond the IMM. CO2 escaping the matrix will enter the continuum of the cytoplasmic space, which has finely controlled pH. Since membranes (including IMM) are highly permeable to CO2, the gradient between matrix and cytoplasm will be small (i.e. you only need a small gradient to drive a big flux, if the permeability is massive). Since CO2 can dissipate over a large volume, it is unlikely to accumulate to any degree. CO2 will be in equilibrium with HCO3 and pH (because there are carbonic anhydrases available). Since the cytoplasm has near-constant pH, [HCO3] must also be close to constancy. It is therefore difficult to imagine how HCO3 could change dramatically to meaningfully affect sACs and hence cAMP. Evidence for major changes in IMS pH in intact cells during swings of respiratory activity would be required to make this point. Indeed, for that reason, it would be more sensible to anchor sACS in the matrix, as there, HCO3 could rise to high levels, as it is impermeable, i.e. could be confined within the mitochondrion. I am therefore not convinced the numbers are favorable to the proposed mechanism to be meaningful physiologically.

      The question of how CO2/bicarbonate signaling can work in the intermembrane space (IMS) is explicitly addressed in the revisions in the results and discussion sections. CO2, a membrane permeable gas, can easily cross the IMM (permeability coefficient of 0.01 to 0.33 cm/s). Once in the IMS, CO2 can combine with water to produce bicarbonate, a fast reaction in a physiological context (i.e. mM/s in physiological saline) that can be accelerated to nearly diffusion limits by carbonic anhydrase if present. The assumption that all the “CO2 escaping the matrix will enter the continuum of the cytoplasmic space” is not supported by the structure of cardiomyocyte mitochondria. As illustrated in new Fig 2H, only a small fraction (9-15%) of the IMS occurs along the mitochondrial periphery adjacent to the outer membrane and cytosol. Most of the IMS is contained inside the cristae (the intracristal space or ICS), which in interfibrillar mitochondria are composed of closely packed extended flat membranes that create scores of alternating layers of matrix and ICS ideal for rapid gas exchange between compartments. The cristae are connected to the peripheral IMS through narrow “crista junction” openings that restrict solute diffusion between the peripheral IMS and ICS. Thus, rather than being dominated by the ionic equilibria of the cytosol, the crista compartments are functionally distinct from the peripheral IMS region and cytosol. We cite recent publications using super-resolution light microscopy in which gradients of 0.3-0.4 pH units (dependent on metabolic state) have been detected along the cristae of respiring cells and between the peripheral IMS and crista interiors. These diffusion effects would likely be even more pronounced for cardiac muscle mitochondria, which have larger, more densely packed lamellar cristae than other cell types. Thus, the microenvironment of the cristae provides confined spaces in close communication with the matrix CO2 production, and ideal for operation of a sAC signaling system.

      Reviewer #2 (Public Review):

      The authors explore the role of bicarbonate-regulated soluble adenylate cyclase in modulating cardiac mitochondrial energy supply. In isolated rat mitochondria, they show that cyclic AMP (but not the permeable cAMP analog 8-Br-cAMP) increases ATP production via a Ca-independent mechanism at a location in the intermembrane space of the mitochondria, rather than in the matrix, as previously reported. Moreover, they show that inhibition of EPAC, but not PKA, inhibits the response. The effect required supplementing the mitochondria with GTP and GDP to facilitate activation of the EPAC effector GTPase Rap1. The study provides interesting new information about how the heart might adapt to changes in energy supply and demand through complementary regulatory processes involving both Ca and cyclic AMP.

      The authors nicely demonstrate that soluble adenylate cyclase is localized to mitochondria. They argue, based on the effects of cyclic AMP, which is accessible to the mitochondrial intermembrane space (IMS) but not the matrix, that the signalling pathway is located in the IMS. They also find that EPAC/Rap1 is the likely downstream effector of cyclic AMP, through yet unknown targets regulating oxidative phosphorylation.

      A weakness is that the components of signaling (sAC, EPAC, and rap1) are not definitively localized to a specific mitochondrial compartment using the superresolution imaging methods employed.

      Thank you for the concise summary of key findings. While the super-resolution data indicates sAC and CA are localized to the interior of the mitochondria (possibly co-localized in the same subspace), identification of the particular microcompartment is not possible from the imaging data alone. We explain clearly in the manuscript that the functional experiments are critical to the conclusion that the sAC signaling pathway most likely operates in the IMS.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript is interesting because of the exploration of a novel model organism utilizing next-generation sequencing approaches, such as single-cell-RNA-seq. Despite the authors' efforts the manuscript lacks a cohesive narrative and suffers from being extremely preliminary in nature. For example, most of the figures are cut and pasted directly from the computational programs with very little formatting or thought to creating new knowledge from the data generated. Essentially the manuscript consists of 2-3 experiments where the authors performed single-cell-RNA-seq on different anatomical locations in the pig and also on a couple of different pig types (The Chenghua and Large White). The authors used standard computational pipelines consisting of Seurat, Monocle, Cell Chat, and others to characterize differences in their data.

      There is potential in this manuscript but the authors should improve upon the manuscript by mining the data better and generating a better understanding of anatomical positions of pig skin by evaluating the Hox genes.

      (1) We thank the reviewer for the positive evaluation of our article and for providing valuable feedback to improve the quality of our manuscript. To provide a more cohesive narrative, we have edited the text throughout the manuscript.

      (2) Meanwhile, we also modified and formatted some figures including Figures 2-6, Figure 4—figure supplement 1 and 2, Figure 5—figure supplement 1 and 2, and Figure 6—figure supplement 1.

      (3) We have analyzed the data on regional- or species-based differences more extensively, and the added content is in the Results sections “Heterogeneity of skin FBs in different anatomic sites” and “Heterogeneity of skin cells in different pig populations”.

      (4) However, in our study, we did not identify any Hox gene among these differentially expressed genes in skin fibroblasts from both different anatomical sites and different pig populations. The differences of Hox code expression patterns might come from the heterogeneity of different species.

      Reviewer #2 (Public Review):

      The authors aimed to analyze different dermal compositions of various skin regions, focusing on fibroblast, endothelium and smooth muscle cells. They collect skin samples from six different skin regions of adult pig skin including the head, ear, shoulder, back, abdomen, and leg skins. After dissociating the tissues into single cells, they perform single-cell RNA analyses. A total of 215 thousand cells were analyzed. The authors identified distinct cell clusters, enriched molecules within each cell cluster, and the dynamic of cell cluster transition and interactions. Based on their findings, they conclude that tenascin N, collagen 11A1, and inhibin A are candidate genes for facilitating extracellular matrix accumulation.

      Strength:

      The methodology they used to prepare scRNA data is appropriate. Bioinformatic analyses are solid. The authors emphasize the heterogeneous phenotypes and composition ratios of smooth muscle cells, endothelial cells and fibroblasts in each skin region. They identify potential cell communication pathways among cell clusters. Expression of selected molecules was examined on tissue sections.

      Weakness:

      While tenascin, collagen and inhibin are highlighted as genes important for ECM accumulation, there is no functional evaluation data. The discussion section is a compilation of comparisons, and is somewhat fragmentary. More significance from this dataset could have been extracted.

      (1) We appreciate the reviewer's suggestions for evaluating the functional significance further. In our next research, we will perform some experiments in vitro and in vivo to explore the functions of these identified key genes.

      (2) The discussion section has been substantially revised and should now be more logical and readable.

    1. Author Response

      Reviewer #1 (Public Review):

      Wu et al. provide a powerful cross-species approach to better understand brain cell-type specific responses to mutant tau and aging. Therefore, they use scRNAseq of established Drosophila models that they had previously used for bulk RNAseq (Mangleburg et al., 2020) at 1, 10 and 20 days of age, which thus allows them to study the contribution of pathogenic tau (R406W-mutant) in isolation in an experimentally highly controllable manner. They find a large overlap between tau-induced and aging-induced deregulated genes, however different cell-types were primarily affected, suggesting that expression of tau does not simply induce accelerated aging. When assessing cell number abundance in response to tau expression the authors noted that certain excitatory neurons were preferentially lost. They then examined innate immune pathways downstream of NFkB, which they had already uncovered in their previous bulk studies to be associated with tau expression. Also at the scRNAseq level, they find these pathways to be deregulated after expression of tau. In addition, in control cell types that are lost when tau is expressed, they find an inverse correlation of the expression of these pathways and cellular loss, suggesting they might be predictors of neurodegeneration severity. Finally, they use this finding uncovered in Drosophila to reexamine human Alzheimer's disease snRNAseq datasets, where they also find the NFkB pathway to be deregulated.

      This study has several strengths. It demonstrates the power of studying tau effects in a tractable model and then using the obtained knowledge to pin-point relevant pathways in cross-sectional studies of human tauopathy, which are otherwise not easy to interpret given the overlayed effects of other disease triggers. By examining the single-cell level they uncover cell type specific effects, which would otherwise be hidden. This study also represents a valuable resource. Given that the authors have included multiple time points the dataset provides an opportunity to understand the evolution of cell-type specific tau effects over time. The authors have also included a replication dataset, which confirms the results of the primary analysis of neuronal loss. I also appreciate the efforts to understand the apparent increase in glia cell number after expression of tau. By combining computational and experimental methods the authors reach the well supported conclusion that in fact glial cell numbers remain constant but only appear increased due to the proportional nature of the scRNAseq data and profound loss of some neurons. Overall, it is interesting that the authors nominate the innate immunity and NFkB pathways in tauopathy, based on deregulated genes and also based on vulnerable neurons. Nevertheless, this is a correlative finding and as such does not prove that it is causal.
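      The point about the proportional nature of scRNAseq data can be illustrated with a toy calculation. The sketch below uses entirely hypothetical cell counts (they are not taken from the study) and a hypothetical helper name; it only shows how an unchanged absolute number of glia can appear as an increased fraction once a neuronal population is depleted.

```python
def cell_type_proportions(counts):
    """Convert absolute cell counts per type into fractions of all cells."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical absolute counts: glia are unchanged, Kenyon cells drop 4-fold.
control = {"glia": 1000, "kenyon_cells": 4000, "other_neurons": 5000}
tau = {"glia": 1000, "kenyon_cells": 1000, "other_neurons": 5000}

control_prop = cell_type_proportions(control)
tau_prop = cell_type_proportions(tau)

# The glial fraction rises although the glial count is identical.
print(f"glia, control: {control_prop['glia']:.3f}")
print(f"glia, tau:     {tau_prop['glia']:.3f}")
```

      In this toy case the glial fraction rises from 10% to roughly 14% with no change in glial number, which is why an independent measure of absolute cell number is needed to interpret an apparent increase.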

      As noted [R3], above, we agree that our findings of NFkB dysregulation are correlative. We have performed new experiments to directly test the hypothesis that neuronal immune pathways are causally linked to tau-mediated neurodegeneration; however, the results were negative. These data are included in the revision and we also carefully discuss published work from other fly models of aging and neurodegeneration as well as mouse tauopathy that strongly suggest NFkB can directly modulate neurodegeneration.

      The authors correctly point out the importance of aging as a risk factor for Alzheimer's disease. However, it is unclear whether their models actually capture age-dependent neurodegeneration. Alternatively, they might represent neurodevelopmental tau toxicity. In Figure 1B it can be seen that all vulnerable cell types are already lost at day 1, most notably a'/b'-KC, a/b-KC and G-KC with a >4-fold decrease. This raises the question whether the lost cells might developmentally have not correctly formed, as suggested by a study that the authors cite (Kosmidis et al., 2010). This distinction is important in order to strengthen the translational value of the study to human tauopathies.

      The elav>tauR406W model manifests both developmental toxicity and age-dependent neurodegenerative changes. Our revision includes new data highlighting specific examples and includes a more balanced discussion of these issues.

      The analysis of tau expression levels relative to its impact across cell types in Figure S8 is interesting, however has caveats. The profound neuronal loss makes the interpretation of the correlation analysis of tau levels vs. neuronal vulnerability difficult - since it might be that the individual surviving a'/b'KC, a/b-KC and G-KC cells are the ones that expressed little amounts of tau, while those that are missing used to express high tau. In addition, it is unclear from the methods whether the 3' UTR from the transformation vector to generate the models was included in the counting. The majority of reads would be expected to be there.

      As suggested, we have repeated the alignment and analysis of MAPT expression including the short SV40 3’UTR (135bp) from the transformation vector. The result appears very similar to that from the previous analysis, and we have updated Figure 3–figure supplement 4 with these data. Based on the feedback from Reviewer 2, we also include a new plot highlighting the non-relation between MAPT expression and cell abundance changes (Figure 3–figure supplement 4B). We acknowledge the caveat that “missing” / dead cells may have previously expressed high levels of tau, leaving behind survivors with low tau levels. We have added mention of this possible caveat in the Results (lines 197-198). However, while this scenario might impact our interpretation of cell abundance changes, it is less likely to confound our analyses of differential expression, in which the number of differentially expressed genes shows very poor correlation, if any, with MAPT transgene expression (Figure 3–figure supplement 4C).

      It would be relevant to know whether the animals were in the same genetic background. I.e. is UAS-TauR406W in the same background of the fly that was crossed to elav-Gal4 to serve as the control. This is not mentioned in the paper and also not in Mangleburg et al., 2020 which the authors refer to. There is a lot of tau-induced DEGs (~1/3 of the detected genes) and it would be relevant to know whether some of them might be due to genetic background.

      Our experimental design mitigates the possibility of a substantial impact from genetic background; however, we have added text in the revision noting that this is an important consideration and possible confounder.

      The finding of the authors that NFkB pathways are higher in cell types that degenerate more is interesting. However, in Figure 4D it is also apparent that multiple cell types that do not degenerate have comparably high expression. Therefore, it is not a sufficient factor to explain why some neurons are vulnerable vs. others are not, but rather predicts amongst the vulnerable neurons how much they will be lost. It would be helpful to make this distinction clear in the text.

      We agree with the reviewer’s interpretation, and we have tried to make this more clear in both the results and discussion (lines 286-288; 387-390). Indeed, the NFkB expression level seems to be a marker for the severity of tau-triggered cell loss among the vulnerable cells.

      Reviewer 2 (Public Review):

      Wu et al. conducted longitudinal single-nucleus RNA sequencing in a Drosophila transgenic line expressing pathogenic tau (Arg406 ->Trp) and control to study presenile degenerative dementia with bitemporal atrophy. Their data is consistent with previous findings on Tau neurotoxicity, which significantly affects excitatory neurons in human brain samples and transgenic mice. Authors identify aging-like signatures, and an innate immune glial response, including the NFKB pathway, in the transgenic animals.

      Strength: This is a great resource for the dissection of dynamic, age-dependent gene expression changes at cellular resolution for the fly community. The article's conclusions are largely supported by the data.

      We thank the reviewer for recognizing the value of this work as a resource for the field.

      Weakness: No additional orthogonal validation is done on the identified pathways using immunohistochemistry. Also, the authors hypothesized that innate immune signatures might serve as predictors of neuronal subtype vulnerability in tauopathies. Although their data support stronger immune responses in the mutant lines, these findings are not validated. Moreover, the Authors need to use appropriate control animals to compare the mutant Tau animals.

      Our original manuscript included experimental validation demonstrating that (1) the apparent increase in glial cell abundance is likely due to changes in cell proportions, and we also (2) confirmed the expression of Relish in both neurons and glia of the adult fly brain. For our revision, we were guided by the requested Essential Revision #3 [R3]. We have therefore performed additional experiments directly testing whether manipulation of Relish/NFkB in neurons alters tau-induced neurodegeneration. While the results of these experiments were negative, we have incorporated them into the results and discussion, along with discussion of related published studies that support a causal role for NFkB immune pathways in tauopathy. Lastly, in our response to Essential Revision #4 [R4], we address concerns about the conservation of neurotoxic mechanisms between mutant and wildtype forms of MAPT and the use of mutant MAPT transgenic models for investigations of AD, along with the caveats. New analyses have been performed and textual revisions have been made in response to this feedback.
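
      The point about cell proportions can be illustrated with a minimal sketch. All counts below are hypothetical and chosen only to show the arithmetic; they are not taken from the dataset. When one population shrinks, the share of every other population rises even if its absolute number is unchanged.

```python
def proportions(counts):
    """Convert absolute cell counts per type into fractions of the total."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}

# Hypothetical absolute counts per brain (illustrative values only).
control = {"vulnerable_neurons": 4000, "other_neurons": 4000, "glia": 2000}
tau = {"vulnerable_neurons": 1000, "other_neurons": 4000, "glia": 2000}

p_control = proportions(control)
p_tau = proportions(tau)

# Glia are unchanged in absolute number, yet their share of profiled cells rises.
print(f"glia fraction, control: {p_control['glia']:.3f}")
print(f"glia fraction, tau:     {p_tau['glia']:.3f}")
```

      In this toy case the glial count is identical in both conditions, so any apparent gain comes entirely from the smaller denominator. This is why an independent measure of absolute cell number is needed before concluding that a population has expanded.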

      Reviewer 3 (Public Review):

      Understanding the changes in the brain during the progression of neurodegenerative diseases may provide a critical entry point towards medical treatments. Many genes have been directly or indirectly implicated in an array of neurodegenerative diseases, including the microtubule associated protein tau (MAPT). Various studies have shown that misexpression of tau can cause behavioral, genetic as well as molecular phenotypes that display properties of human neurodegenerative diseases connected to tauopathies. Here the authors use the fruit fly as a model to assess phenotypic defects at single-cell resolution. Pan-neuronal misexpression of a mutant form of tau (R406W) and single-cell RNAseq at different time points provide the basis for the investigation.

      The authors assess which cell-types are affected (by comparing it with previously described brain cell atlas identities) and find that certain cell types are missing (or less abundant) while other appear unaffected. They do this comparison in relative abundance; both neurons and glia cells are affected.

      As a next step they compare this with the cell-cluster changes during aging and compare both types of analysis; the investigation here includes the analysis of differentially expressed genes in defined cell clusters. One particularly affected pathway in response to tau is the NFκB signaling pathway. The authors investigate the gene expression changes of the NFκB signaling pathway in the current dataset in more detail. In the last section the authors compare single-cell transcriptomic analyses between fly and human postmortem tissue, showing that the NFκB signaling pathway might be a conserved aspect of neurodegeneration.

      The manuscript is overall an elegant example of how single-cell RNAseq can be employed as a tool to study the impact of genetic modulators of neurodegeneration (in this case tau) and of how it allows direct comparison with human tissues. The results are clean, logically presented and accordingly discussed. It shows that such approaches are indeed powerful for genetic dissection of mechanisms at a descriptive level and for opening doors to functional studies.

      We thank the reviewer for this positive summary of our manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      The goal of the authors was to understand how the kinase, hpk-1, could regulate and interrogate different aspects of cellular stress resilience. To this end, the authors uncovered that hpk-1 is coexpressed with several transcription factors known to regulate different stress responses and this coregulation only appears to occur in the nervous system. Taking a deeper dive, they convincingly find that hpk-1 overexpression in either serotonergic or GABAergic neurons can protect animals from heat stress or toxic protein aggregates. Interestingly, it appears that hpk-1 functions in serotonergic neurons differently from GABAergic neurons in the induction of the heat shock response and autophagy.

      Overall, the experiments and results are solid and the conclusions drawn reflect the result. The model suggests that the receiving cell deciphers that either heat shock response or autophagy can be induced in the same cell, but the data suggest otherwise. Perhaps the model should be reworked to reflect this point.

      We thank the reviewer for their kind assessment and suggestion to refine our model. Indeed, we did not intend to imply that the receiving cells/tissues were the same after each stimulus, but were attempting to simplify the diagram and condense space. In the revised manuscript we have altered the model (Figure 9B) to reflect that the recipient tissues are distinct.

      Reviewer #2 (Public Review):

      Lazaro-Pena et al. investigated how a conserved kinase called homeodomain interacting protein kinase (HPK-1) helps to preserve neuronal function, motility and stress resilience during aging in the metazoan C. elegans. HPK-1 is a member of the HIPK kinases that, in mammalian systems, regulate the activity of transcription factors (TFs), chromatin modifiers, signaling molecules and scaffolding proteins in response to cellular stress. The group finds that in C. elegans, HPK-1 depletion causes a premature shortening of lifespan and decreases motility and stress resilience in the whole animal. Conversely, increasing active, but not enzymatically dead, HPK-1 levels in the nervous system alone is sufficient to extend lifespan and mitigate the accumulation of aging-associated protein aggregates. The authors then identify a subset of neurons and cell stress response pathways that could be responsible for the contribution of HPK-1 to lifespan and neuronal health. This leads the authors to propose a hypothesis whereby HPK-1 activity in specific neurons preserves protein homeostasis and neuronal integrity, and thus limits the aging-induced decline in organismal function.

      Overall, the authors test several functional readouts for neuronal activity to support their claim that HPK-1 activity limits functional decline during aging. These experiments are solid, and the use of a kinase dead HPK-1 in these experiments adds strong support to their claim that HPK-1 activity preserves organismal health. However, weaknesses in the experimental layout and rigor, and the statistical analyses of the publicly available data, limit the inferences that can be made, and further experimental evidence would be required to confirm the working model proposed by the authors.

      We thank the reviewer for their thoughtful and balanced assessment of our study.

    1. Author Response

      The authors would like to thank the reviewers and editors for this thorough and constructive assessment of our paper. We look forward to addressing their suggestions for improvement of our work in a revised manuscript. In particular: (i) Reviewer 1 raises interesting questions regarding the potential impact of intrinsic cortical and mesh morphology on interpolation, smoothing and the resultant patterns of gene expression. We will test these ideas by developing a null model framework. (ii) Reviewer 2 suggests re-creating dense expression maps using Gaussian processes as an alternative interpolation method. We will implement this suggestion and compare the resulting maps with those generated by the current interpolation method. Notwithstanding these helpful lines of further enquiry, we believe our study provides a meaningful step forwards in multiscale analysis of the human brain by generating, validating, describing and annotating dense gene expression maps which can accelerate translation between neuroimaging and genomic analysis of the human cortical sheet.

    1. Author Response

      Reviewer #1 (Public Review):

      First, we thank the reviewer for his instructive remarks. In the following we address the queries of Reviewer 1.

      1.1) At several points, the authors make claims that I believe extend beyond the data presented here. For instance, in the Abstract (line 27), the authors state "the development of adult songs requires restructuring the entire HVC, including most HVC cell types, rather than altering only neuronal subpopulations or cellular components." The gene ontology analyses performed do suggest that there is a progression from cellular transcriptional changes to organ-level changes, however caution should be taken in claiming that "most HVC cell types" exhibit transcriptional changes. In fact, according to Fig. 3D most of the transcriptional changes appear restricted to neurons. As the authors themselves note elsewhere, claims at this resolution are difficult without support from single-cell approaches. I do not suggest that the authors need to perform single-cell RNA-seq for this work, but strong claims like this should be avoided.

      We have revised our claim to more accurately reflect our findings. Our intended message is that testosterone treatment leads to extensive transcriptional changes in the HVC, likely affecting a majority of neuronal subpopulations rather than solely targeting specific cellular components. The revised text in lines 29-32 now reads: "Thus, the development of adult songs stimulated by testosterone results in widespread transcriptional changes in the HVC, potentially affecting a majority of neuronal subpopulations, rather than altering only specific cellular components."

      1.2) Similarly the Abstract states that parallel regulation "directly" by androgen and estrogen receptors, as well as the transcription factor SP8, "lead" to the transcriptional and neural changes observed after testosterone treatment of females. However, experiments that demonstrate such a causal role have not been performed. The authors do perform a set of bioinformatic analyses that point in this direction - enrichment of androgen and estrogen receptor binding sites in the promoters of differentially expressed genes, high coexpression of SP8 with other genes, and the enrichment of predicted SP8 binding sites in coexpressed genes. However, further support for direct regulation, at the level that the authors claim, would require some form of transcription factor binding assay, e.g. ChIP-seq or CUT&RUN. I am fully aware that these assays are enormously challenging to perform in this system (and again I don’t suggest that these experiments need to be done for this work); however, statements of direct regulation should be tempered. This is especially true for the role of SP8. This does appear to be a compelling target, but without some manipulation of the activity of SP8 (e.g. through knockdowns) and subsequent analysis of gene expression, it is too much to claim that this transcription factor is a regulatory link in the testosterone-driven responses. SP8 does appear to be a highly connected hub gene in correlation network analysis, but this alone does not indicate that it acts as a hub transcription factor in a gene regulatory network.

      We appreciate the reviewer's comment and have revised the statement concerning the role of SP8. Indeed, we document the coexpression of ESR2 and SP8, and our bioinformatics analysis suggests that SP8 might play an important role in transcriptomics. We have rephrased the statement in line 29-32 as follows: "Parallel gene regulation directly by androgen and estrogen receptors, potentially amplified by coexpressed transcription factors that are themselves steroid receptor regulated, leads to substantial transcriptomic and neural changes in specific behavior-controlling brain areas, resulting in the gradual seasonal occurrence of singing behavior." In addition, we have included discussions regarding limitations of promoter sequence analyses (lines 414 to 427).

      1.3. Along these lines, the in-situ hybridizations of ESR2 and SP8 presented in Figure 5 need significant improvement. The signals in the red and green channels, SP8 and ESR2, look suspiciously similar, showing almost identical subcellular colocalization. This signal pattern usually suggests bleed-through during image acquisition, as it’s highly unlikely that the mRNA of both genes would show this degree of overlap. I would suggest that control ISHs be run with one probe left out, either SP8 or ESR2, and compare these ISHs with the dual label ISHs to determine if signal intensity and cellular distribution look similar. Furthermore, on lines 354-356 the authors write, "The fact that the two genes were expressed nearby in the same cell may indicate physical interactions between the gene pair and warrant further investigation into the nature of their relationship.". Yet, even if the overlap between ESR2 and SP8 shown in Figure 5 is confirmed, close localization of transcripts does not imply that the protein products physically interact. The STRING bioinformatic analysis is more convincing that there is a putative regulatory interaction between ESR2 and the SP8 locus, and this suggestion of protein-protein interaction is weak and should be omitted. In addition, the authors note that ESR2 has not been detected in the songbird HVC in a previous study. To further demonstrate the expression of ESR2 (and SP8) in HVC, it would be useful to plot their expression from the microarray data across the different testosterone conditions.

      We repeated the coexpression study using confocal microscopy and fluorescent RNAScope in situ hybridization, which is now reflected in the revised Figure 5 and a new Figure 5 - Supplement Figure 1. We have also moderated our statement regarding the sparse co-expression of ESR2 and SP8 in HVC neurons. While the presence of co-expressing neurons may provide some anatomical basis for the bioinformatic findings, we have been cautious in our interpretation and have stated that "SP8 and ESR2 mRNAs exhibited low expression levels in HVC, co-localizing in a subset of cells, predominantly GABAergic cells" (lines 369-370). We have removed the speculation about potential protein interaction based on mRNA distribution. Additionally, we have highlighted that SP8 and ESR2 were differentially upregulated at T14d (lines 362-363).

      1.4) My final concern lies in the interpretation of these results as generalizable to other sex hormone-modulated behaviors. On lines 452-455, the authors write, "This suggests that the testosterone (or estrogen)-triggered induction of adult behaviors, such as parental behavior and courtship, requires a much more extensive reorganization of the transcriptome and the associated biological functions of the brain areas involved than previously thought.". The experiments and argument likely apply to other neural systems that undergo large seasonal fluctuations in sex hormones and similar morphological changes. However, the authors argue that the large number of transcriptional changes seen here may generalize broadly to sex hormone-modulated adult behaviors. I think there are a couple of problems with this argument. First, as described here and in past work, testosterone drives major morphological changes in the song system of adult canaries; such dramatic changes are not seen, for instance, in sex hormone-receptive areas underlying mating behavior in adult mammals. Similarly, the study introduced testosterone into female birds, which drives a greater morphological change in HVC relative to similar manipulations in males, which again may account for the large number of differentially expressed genes. I would temper the generality of these results and note how the experimental and biological differences between this system and other sex hormone-responsive systems and behaviors may contribute to the observed transcriptional differences.

      We modified this statement in lines 473-478: “The testosterone-driven changes in female HVC morphology and function represent some of the most notable modifications known in the vertebrate brain. However, how this extensive, testosterone-induced gene regulation in the HVC applies to other seasonally testosterone-sensitive brain areas remains to be seen. Endpoint analysis of testosterone-induced singing in male canaries during the non-reproductive season also indicates considerable regulation of HVC transcriptomes (Frankl-Vilches et al., 2015; Ko et al., 2021)”.

      Reviewer #2 (Public Review):

      First, we would like to express our gratitude to Reviewer #2 for the constructive feedback. We have addressed the concerns in detail below:

      2.1). The bulk of the manuscript details WGCNA, GO terms, and promoter ARE/ERE motif abundance, using the initial pairwise comparisons for each timepoint as input lists. However, there are no p/adjp values provided for these pair-wise comparisons that form the basis of all subsequent analyses. Nor are there supplementary tables to indicate how consistent the replicates are within each group or how abundantly the genes-of-interest are expressed. With the statistical tests used here, and the lack of relevant information in the supplementary tables, I cannot determine if the data support the authors’ conclusions. These omissions mar what is otherwise a conceptually intriguing line of investigation.

      We appreciate the reviewer’s concerns. Please refer to our response addressing this point and the subsequent one (2.2) together in the section below.

      Reviewer #3 (Public Review):

      We appreciate the positive feedback from the reviewer and address below the issues the reviewer pointed out.

      3.1) My biggest concern is the sample size. Most of the time points only have 5 or 6 individuals represented, and I question whether these numbers provide sufficient statistical power to uncover the effects the authors are trying to explore. This is a particular problem when it comes to evaluating the supposed "transient" effects of testosterone on gene expression. There is currently little basis for distinguishing such effects from noise that accrues because of low power. This can be a major problem with studies of gene expression in non-model species, like canaries, where among-individual variability in transcript abundance is quite high. Thus, it is possible that one or two outliers at a given time point cause the effect of testosterone at this time point to become indistinguishable from the controls; if so, then a gene may get put into the transient category, when in fact its regulation was not likely transient.

      We acknowledge that our sample sizes may appear moderate. To address the concern regarding temporal regulation analysis, we followed Reviewer 3's suggestion and conducted a probe-level power analysis (point 2 of recommendations for the authors; labelled as point 3.9 below). We then excluded differentially expressed genes with a power less than 0.8 prior to conducting temporal classification. Consequently, 93% of our differentially expressed genes demonstrated a power ≥ 0.8 (9025/9710). Following further classification by temporal regulation pattern, we identified 29 constantly upregulated, 41 constantly downregulated, 39 dynamically regulated, and 8916 transiently regulated genes. If we apply a stricter constraint by requiring each differentially expressed gene to have at least two probe-sets with a power ≥ 0.8, 83% of differentially expressed genes (8033/9710) still have sufficient power.

      We recognize that our sample size may not be sufficient to detect weakly differentially expressed genes. However, we have intentionally excluded these genes from the beginning (those with |log2(fold change)| ≤ 0.5 were excluded).

      The scenario outlined by the reviewer, where outliers might cause the effect of testosterone to blend with controls, leading to misclassification, is indeed plausible. This could occur either because the genes are weakly regulated, or because the power to detect differential expression is insufficient, thus preventing these genes from surpassing the threshold to be deemed significantly differentially expressed. However, this also illustrates that testosterone does not regulate every gene in the same way.

      We have appended a column indicating high-power genes (≥ 0.8) in the DiffExpression.tsv file, available in the Dryad repository. The power analysis has been incorporated into the Methods section (lines 801-808) and the Results section (lines 188-192).
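
      The filtering logic described above can be sketched as follows. This is only an illustration under stated assumptions: the power function is a textbook normal approximation for a two-sided, two-sample comparison (it tends to overestimate power at n = 5 or 6 relative to an exact t-test calculation) and is not necessarily the procedure used in the manuscript. The gene names, fold changes and effect sizes are hypothetical. The final lines only check that the counts quoted in the response are internally consistent.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Normal approximation to the power of a two-sided two-sample test.

    effect_size is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

def keep(log2fc, effect_size, n_per_group=6, min_power=0.8, min_abs_log2fc=0.5):
    """Retain a gene only if it passes both the fold-change and power thresholds."""
    strong_enough = abs(log2fc) > min_abs_log2fc
    powered = approx_power(effect_size, n_per_group) >= min_power
    return strong_enough and powered

# Hypothetical genes: (name, log2 fold change, standardized effect size).
genes = [
    ("gene_a", 1.2, 2.5),  # passes both thresholds
    ("gene_b", 0.4, 2.5),  # excluded: |log2FC| <= 0.5
    ("gene_c", 0.9, 1.0),  # excluded: power below 0.8
]
retained = [name for name, lfc, d in genes if keep(lfc, d)]
print(retained)

# Consistency check of the counts quoted in the response.
total_de = 9710
high_power = 9025
by_pattern = {"constant_up": 29, "constant_down": 41, "dynamic": 39, "transient": 8916}
print(sum(by_pattern.values()) == high_power)
print(round(100 * high_power / total_de), round(100 * 8033 / total_de))
```

      The four temporal categories sum to the 9025 high-power genes, and the two quoted fractions round to 93% and 83% of the 9710 differentially expressed genes.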

      3.2) More on the transient categorization. Would a gene whose expression is not immediately upregulated (within 1 hour), but is upregulated later on (say in the 14d group) be considered transient? If so, this seems problematic. Aren’t the authors setting the null expectation of "non-transient" as a gene that does not increase immediately after 1 hour of treatment? The authors even recognize that it is quite surprising that gene expression changes after an hour. It may be that some genes whose regulation is classified as transient are simply slower to upregulate; but, really, would we say their expression is transient per se? Maybe I’m misunderstanding the categorizations?

      We appreciate the reviewer's insightful discussion regarding the transient categorization. We understand that it is indeed more challenging for a gene to be classified as constantly regulated than transiently regulated, due to smaller effects by testosterone or being undetectable owing to low power. To address this concern, we further dissected the transiently regulated category by reporting the number of time points at which a gene is differentially expressed in Figure 2 - Figure supplement 1. Approximately half of the transiently regulated genes were only regulated at one time point, further illustrating that the effect of testosterone on gene expression was not constant during the time window we examined (see lines 184 - 187).
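
      To make the categories concrete, the sketch below shows one way such a temporal classification could be coded. The rules are an assumption inferred from the category names used in this response (constant, dynamic, transient), not a reproduction of the manuscript's exact definitions, and the genes and the number of time points are hypothetical. Each gene carries one call per time point: +1 for upregulated, -1 for downregulated, 0 for not differentially expressed.

```python
from collections import Counter

def classify(calls):
    """Assign a temporal pattern from per-time-point calls (+1, -1, or 0)."""
    ups = sum(1 for c in calls if c > 0)
    downs = sum(1 for c in calls if c < 0)
    if ups == len(calls):
        return "constant_up"
    if downs == len(calls):
        return "constant_down"
    if ups and downs:
        return "dynamic"        # direction changes between time points
    if ups or downs:
        return "transient"      # regulated at some, but not all, time points
    return "not_regulated"

# Hypothetical calls across six time points.
calls_by_gene = {
    "gene_a": [1, 1, 1, 1, 1, 1],
    "gene_b": [-1, -1, -1, -1, -1, -1],
    "gene_c": [1, 0, -1, 0, 0, 0],
    "gene_d": [0, 0, 0, 1, 0, 0],   # late onset, a single time point
    "gene_e": [0, 0, 1, 1, 1, 0],
}
labels = {gene: classify(calls) for gene, calls in calls_by_gene.items()}

# For transient genes, tally at how many time points each one is regulated.
n_points = Counter(
    sum(1 for c in calls if c != 0)
    for gene, calls in calls_by_gene.items()
    if labels[gene] == "transient"
)
print(labels)
print(dict(n_points))
```

      Under these assumed rules a late-onset gene such as gene_d lands in the transient category, which is the situation the reviewer raises. Tallying the number of regulated time points separates such single-time-point genes from those regulated over a longer window.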

      3.3) The authors don’t fully explain the logic for using females in this study to measure a "male-typical" behavior (singing). My understanding is that females have the underlying circuitry to sing, and T administration triggers it; thus, this situation creates a natural experiment in which we can explore T’s effects on brain and behavior, unlike in males, which have fluctuating T. First, it might be good to clarify this logic for readers, unless perhaps I’m misunderstanding something. Second, I found myself questioning this logic a little. Our understanding of basic sex differences and the role that steroid hormones play in generating them has changed over the last few decades. There are, for example, a variety of genetic factors that underlie the development of sex differences in the brain (I’m especially thinking about the incredible work from Art Arnold and many others that harness the experimental power of the four core genotype mice). Might some of these factors influence female development, such that T’s effects on the female brain and subsequent ability to increase HVC size and sing are not the same as in males?

      Indeed, sex-chromosome dosage compensation is absent in birds leading to higher Z-chromosomal gene expression in males. We demonstrated substantial sex differences in gene expression in our earlier work [Ko, M.-C., Frankl-Vilches, C., Bakker, A., Gahr, M., 2021. The Gene Expression Profile of the Song Control Nucleus HVC Shows Sex Specificity, Hormone Responsiveness, and Species Specificity Among Songbirds. Frontiers in Neuroscience 15].

      We have revised the introduction (lines 96-98) to clarify our rationale for using female canaries as a model for adult behavioral development, not as a model for male canaries. After testosterone treatment, these females start to sing, with song structure developing over time, similar to male seasonal progression. This approach eliminates the confounding effect of fluctuating testosterone levels seen in males, supported by distinct HVC transcriptomes in testosterone-implanted singing female canaries compared to males (Ko et al., 2021).

      The revised paragraph reads as below: Female canaries (Serinus canaria) are typically non-singers, with their spontaneous songs displaying less complexity than their male counterparts (Hartley et al., 1997; Herrick and Harris, 1957; Ko et al., 2020; Pesch and Güttinger, 1985). Despite their infrequent singing, these females possess the necessary underlying circuitry that can be activated by testosterone. Following testosterone treatment, these females start to produce simple songs, which gradually evolve in structure over weeks—paralleling the seasonal progression of male singing (Hartog et al., 2009; Ko et al., 2020; Shoemaker, 1939; Vallet et al., 1996; Vellema et al., 2019). Moreover, testosterone induces the differentiation of song control-related brain nuclei in adult female canaries, a critical step for song development (Fusani et al., 2003; Madison et al., 2015; Nottebohm, 1980). In this study, we focus on these testosterone-treated female canaries as a model for adult behavioral development rather than a model for male canaries. This unique model allows us to examine transcriptional cascades in parallel with the differentiation of the song control system and the progression of song development, without the confounding impact of fluctuating testosterone levels seen in males, which often results in considerable individual differences in the non-reproductive season baseline singing behavior. This approach is backed by the observation that the HVC transcriptomes of testosterone-implanted singing female canaries are distinct from those of singing males (Ko et al., 2021).

      3.4) I was surprised by the authors assertion that testosterone would only influence several tens or hundreds of genes. My read of the literature says that this is low, and I would have expected 100s, if not 1,000s, of genes to be influenced. I think that the total number of genes influenced by T is therefore quite consistent with the literature.

      We apologize for any confusion caused by our statement. We did not mean to imply that testosterone only influences several tens or hundreds of genes, but rather that we did not expect such an extensive transcriptional regulation in the HVC by testosterone. We have clarified this in our revised manuscript, specifically in lines 450-451. Thank you for helping us to clarify this point.

      3.5) I found the GO analyses presented herein uncompelling. As the authors likely know, not all GO terms are created equally. Some GO terms are enriched by hundreds of genes and thus reflect broad functional categories, whereas other GO terms are much more specific and thus are enriched by only a few genes. The authors report broad GO terms that don’t tell us much about what is happening in the HVC functionally. This is particularly the case when a good 50% of the genome is being differentially regulated.

      We appreciate the reviewer's comment. We have added KEGG pathway enrichment analysis in Figure 3 - Figure supplement 1 as an alternative. However, we believe that the GO term enrichment results still provide valuable insights, and therefore we have retained them in Fig. 3.

      3.6) The Genomatix analyses are similarly uncompelling. This approach to finding putative response elements can uncover many false positives, and these should always be validated thoroughly. Don't get me wrong: I appreciate that these validations are not trivial, and I value the authors' response element analysis.

      We appreciate the reviewer's comment on the presence of AR or ER motifs in promoters and acknowledge that in mammals, AR and ER predominantly bind at distal enhancers rather than promoters. Our analysis focused on promoter regions due to the limitations of available tools and resources for our study species. We understand that this approach may not capture the full complexity of AR and ER regulation. We have revised our manuscript to note the limitations of our approach and clarify that the presence of AREs and EREs alone is not indicative of active receptor binding or direct regulation (lines 416-427).

      3.7) I’m sceptical about the section of the paper that speculates about modification of steroid sensitivity in the HVC. These conclusions are based on analyses of mRNA expression of AKR1D1, SRD5A2, and the like. However, this does not reflect a difference in the capacity to metabolize steroids, or at least there is little evidence to suggest this. Note that many of these transcripts have different isoforms, which could also influence steroidal metabolism.

      We agree that mRNA expression levels of AKR1D1, SRD5A2, and other transcripts involved in steroid metabolism do not necessarily reflect changes in steroid metabolizing capacity. However, we believe that these changes in mRNA expression are indicative of potential changes in steroid sensitivity in the HVC, which could affect the neural response to steroids. We acknowledge that isoform differences of these transcripts may influence steroid metabolism and further studies are necessary to confirm our findings and elucidate the mechanisms underlying the observed changes in gene expression. In response to this comment, we have amended the text in lines 245-249 to reflect this consideration.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors have compiled and analysed a unique dataset of patients with treatment-resistant aggressive behaviours who received deep brain stimulation (DBS) of the posterior hypothalamic region. They used established analysis pipelines to identify local predictors of clinical outcomes and performed normative structural and functional connectivity analyses to derive networks associated with treatment response. Finally, Gouveia et al. perform spatial transcriptomics to determine the molecular substrates subserving the identified circuits. The inclusion of data from multiple centres is a notable strength of this retrospective study, but there are current limitations in the methodology and interpretation of findings that need to be addressed.

      1) The validation of findings is heterogeneous and inconsistent across analysis pipelines. While the authors performed non-parametric permutation testing during sweet-spot mapping, structural and functional connectivity were validated using a 'four-fold consistency analysis'. The latter consists of a visual representation of streamlines and peak intensities after randomly dividing data into four groups; the findings were not validated quantitatively. If possible, the authors should apply permutation analysis in alignment with sweet-spot mapping and demonstrate the predictive ability of their identified networks in a LOO or k-fold cross-validation paradigm as carried out by similar studies. Given that the data has been derived from multiple centers, the prediction of left-out cohorts based on models generated by the remaining cohorts could be another means of validation. If validation is not possible, the authors should clearly state the limitations of their approach.

      We appreciate the comment. We have now improved the validation of our connectomics analyses and removed the four-fold consistency analysis. For the functional connectivity analysis, we performed a 1000-permutation test (p<0.05). Similar brain areas were detected in the corrected and uncorrected maps. For the structural connectivity analysis, we used False Discovery Rate (FDR) correction at a significance level of p<0.001, as it is not feasible to perform a 1000-permutation test with this data. The structural connectome is composed of 12 million fibres, and every single permutation takes approximately 4 hours to complete on our most powerful computational system. Performing 1000 permutations would therefore take at least 4000 hours (i.e. 167 days or 5.5 months) of uninterrupted analysis. However, it is important to highlight that an FDR correction at the level of p<0.001 is an extremely stringent method. This means that of the 23,000 fibres detected as being touched by the VATs, only 23 would be expected to be incorrect, while the remaining 22,977 are correct. Here again, we observed many similarities between the uncorrected and corrected maps, with the main anatomical structures being detected in both. The Methods section and Figures 4 and 5 were revised to reflect these changes.
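
      The runtime and false-discovery figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in this response; the function names and the 30.44-day average month are our own illustrative choices, not part of the analysis pipeline.

```python
# Figures quoted in the response above.
HOURS_PER_PERMUTATION = 4      # approximate cost of one structural permutation
N_PERMUTATIONS = 1000
N_FIBRES_DETECTED = 23_000     # fibres flagged as touched by the VATs
FDR_LEVEL = 0.001


def permutation_runtime(n_permutations, hours_each):
    """Return total runtime as (hours, days, months) for a serial run."""
    total_hours = n_permutations * hours_each
    days = total_hours / 24
    months = days / 30.44      # assumption: average calendar month length
    return total_hours, days, months


def expected_false_discoveries(n_discoveries, fdr_level):
    """Expected number of false positives among discoveries at a given FDR."""
    return n_discoveries * fdr_level


total_hours, days, months = permutation_runtime(N_PERMUTATIONS, HOURS_PER_PERMUTATION)
expected_false = expected_false_discoveries(N_FIBRES_DETECTED, FDR_LEVEL)

print(f"{total_hours} h = {days:.0f} days = {months:.1f} months")
print(f"expected false discoveries: {expected_false:.0f} of {N_FIBRES_DETECTED}")
```

      Note that FDR control bounds the expected proportion of false discoveries, so the count of 23 is an expectation rather than an exact number.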

      2) In addition to a 'four-fold consistency analysis', functional connectivity was evaluated using LOOCV in a priori identified ROIs. Their network analysis, however, revealed a far more extensive network encompassing cortical, subcortical, and cerebellar structures. To avoid selection bias the authors should incorporate identified structures into their analysis and apply appropriate means of validation.

      We thank the reviewer for this valuable suggestion. We originally did not explore the various significant areas but performed a more focused analysis intended to demonstrate that regions of the known ‘aggression network’ are indeed implicated in our findings. We performed a new analysis exploring the correlation between symptom improvement and the functional connectivity of all the areas described in Figure 5 (i.e., functional connectivity map). To this aim, we extracted individual connectivity values from the peak within each significant region and performed the same additive linear model, incorporating the functional connectivity of each area as well as the age of the patients to estimate individual symptomatic improvement. In addition, we performed a complete exploratory analysis considering the connectivity of any 2 brain structures and age. The resulting matrix shows to what extent functional connectivity to any two areas can be used to estimate clinical outcomes. Interestingly, this new analysis revealed the Periaqueductal Grey matter (PAG) to be the most important functionally connected area when investigated alone or in combination with brain structures critically involved in the regulation of emotional responses, namely the amygdala, anterior cingulate cortex, bed nucleus of the stria terminalis, nucleus accumbens, orbitofrontal cortex and fusiform gyrus. Also, the significance of the PAG connectivity was retained during leave-one-out cross-validation (LOOCV). The Methods, Results, Discussion and Figure 6 were revised. In addition, we added a new Table 2 and Supplementary File 1 to describe the new analysis and results.
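
      For readers unfamiliar with the validation step, the following is a minimal sketch of leave-one-out cross-validation for an additive linear model with two predictors (connectivity and age). It is not the code used in the study: the data are synthetic, the effect sizes are arbitrary, and the helper name `loocv_predictions` is hypothetical. Only the cohort size of 33 is taken from the manuscript.

```python
import numpy as np


def loocv_predictions(X, y):
    """Predict each patient's outcome from a linear model fit on all others.

    X: (n_patients, n_predictors) array; y: (n_patients,) outcomes.
    """
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i                      # hold out patient i
        design = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)
        preds[i] = np.concatenate(([1.0], X[i])) @ coef
    return preds


# Synthetic stand-in data (assumption: values are illustrative only).
rng = np.random.default_rng(0)
n_patients = 33
connectivity = rng.normal(size=n_patients)             # e.g. peak r-value per patient
age = rng.uniform(10, 50, size=n_patients)
improvement = 60 + 15 * connectivity - 0.5 * age + rng.normal(scale=5, size=n_patients)

X = np.column_stack([connectivity, age])
preds = loocv_predictions(X, improvement)
r = np.corrcoef(preds, improvement)[0, 1]
print(f"LOOCV predicted vs. observed improvement: r = {r:.2f}")
```

      Each prediction is made by a model that never saw the held-out patient, so the correlation between predicted and observed improvement is an out-of-sample estimate.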

      3) Functional connectivity mapping: how were R-maps generated? The authors mention that patient-specific R-maps were p-thresholded and corrected for multiple comparisons, but it is not clear how group-level maps were generated. How did the authors perform regression on these maps? Were voxels that did not survive thresholding excluded?

      This is a multiple-step analysis. First, it is necessary to localize the electrodes in each patient’s brain and estimate the volume of activated tissue (VAT) observed when stimulation parameters associated with symptomatic improvement are used. The VATs are then used as seeds for the next steps, during which we investigate how much functional influence the VATs have on the other areas of the brain (i.e., individual r-map). This is done by correlating the BOLD time course of the VAT’s seed with the BOLD time course of all other voxels in the brain. The individual r-maps are then corrected for multiple comparisons to exclude voxels with potentially spurious correlations, resulting in an individual r-map that only includes voxels surviving Bonferroni correction at the level of p<0.05. Finally, to create group-level maps, a voxel-wise linear regression analysis was performed to investigate whether each voxel of the map exerts more or less influence (corrected individual r-map with the functional connectivity of the patient’s VAT) or is more or less related to the clinical outcome (i.e. individual improvement). The last step is a permutation correction resulting in a significant group-level functional connectivity map (p_permute<0.05). We modified the Methods section and added a new Figure 1-figure supplement 1 illustrating this analysis.
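
      The individual r-map step can be illustrated with a small sketch. This is not the Lead-DBS implementation: the time courses are synthetic, the function names are hypothetical, and the p-values use a Fisher z normal approximation chosen here only to keep the example dependency-free.

```python
import math
import numpy as np


def seed_r_map(seed_ts, voxel_ts):
    """Pearson r between a seed time course (T,) and every voxel (T, V)."""
    s = (seed_ts - seed_ts.mean()) / seed_ts.std()
    v = (voxel_ts - voxel_ts.mean(axis=0)) / voxel_ts.std(axis=0)
    return (s @ v) / len(seed_ts)


def bonferroni_mask(r, n_timepoints, alpha=0.05):
    """Voxels whose correlation survives Bonferroni correction.

    Assumption: two-sided p-values from the Fisher z approximation.
    """
    z = np.arctanh(np.clip(r, -0.999999, 0.999999)) * math.sqrt(n_timepoints - 3)
    p = np.array([math.erfc(abs(zi) / math.sqrt(2)) for zi in z])
    return p < alpha / r.size


# Synthetic data: 200 time points, 500 voxels, first 10 coupled to the seed.
rng = np.random.default_rng(1)
seed = rng.normal(size=200)                 # stand-in for the mean VAT time course
voxels = rng.normal(size=(200, 500))
voxels[:, :10] += 2 * seed[:, None]

r_map = seed_r_map(seed, voxels)
mask = bonferroni_mask(r_map, n_timepoints=200)
corrected_r_map = np.where(mask, r_map, 0.0)  # zero out non-surviving voxels
print(f"{mask.sum()} of {mask.size} voxels survive correction")
```

      The corrected maps from all patients would then enter the voxel-wise regression against clinical improvement described above.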

      4) The authors determined that age was a significant predictor of the outcome, but it is unclear whether certain age groups presented with distinct etiologies underlying their aggressiveness. For example, aggression in epilepsy may show a better response to DBS as opposed to schizophrenia. How does patient outcome change when stratifying according to etiology? How does model performance change when controlling for etiology? The authors should include the etiology of aggressiveness in Table 1.

      This is an interesting point. We observed a similar distribution between the pediatric and adult populations in relation to the most common etiologies reported. Epilepsy was the most frequent diagnosis in both populations (pediatric: 50%, adult: 62%), followed by autism spectrum disorder (pediatric: 34%, adult: 24%). The remaining etiologies were largely composed of single cases. A similar proportion of intellectual disability was also observed in pediatric and adult populations. Severe cases were observed in 75% of pediatric and 85% of adult patients. Moderate disability was present in 25% of pediatric and 15% of adult patients. Since several diagnoses were unique to some patients, the addition of this information to Table 1 could result in the identification of the patient. Thus, to preserve anonymity, the diagnoses were added to the end of Table 1 from more to less frequent. We have also revised the Results and Discussion sections to address this concern.

      5) Stimulation parameters. The authors report average pulse widths of 219 µs and 142 µs respectively, which is up to 4-fold higher as compared to DBS settings used conventionally in movement disorders and will significantly alter the volume of activated tissue. Did the authors account for the drastic increases in pulse width during VAT modeling?

      We thank the reviewer for raising this point about the volume of activated tissue (VAT) modelled and the unusual pulse width observed in some patients in this cohort. These patients presented stimulation-induced sympathetic side effects when DBS was set with higher frequencies (e.g. increased heart rate and blood pressure). The chosen final parameters were the ones associated with a clinical benefit without generating side effects. There are a multitude of ways to estimate the VATs, from advanced axon cable models (the gold standard, which simulate axon membrane dynamics and require patient-specific diffusion-weighted imaging and tremendous computing power 1) to simple heuristics-based models that estimate the rough extent of a VAT based on stimulation parameters without constructing an actual spatial model 2–4. The model employed in our study (and in a number of previous publications by our group 5–10) was the FieldTripSimBio ‘E-field norm’ finite element method (FEM) model. This model, which was first described by Horn et al. 11 and is freely available in Lead-DBS (https://www.lead-dbs.org/), strikes a balance between the sophisticated axon cable models and the simpler heuristic models. In particular, it constructs an electric field (E-field) and, by applying an electric field strength threshold (or activation threshold), calculates the VAT associated with specific voltage settings and contact configurations, taking into account the conductivity of surrounding brain tissue and electrode components. Notably, studies comparing VAT modelling techniques 12 showed that ‘E-field norm’ FEM models closely approximate (<0.1 mm difference) the gold standard axon cable models in terms of the size of VATs constructed for monopolar stimulation settings. However, it should be acknowledged that the FieldTripSimBio model in Lead-DBS does not allow the user to specifically enter values for pulse width. Instead, it employs a standard activation/electric field strength threshold (0.2 V/mm) that reflects a combination of commonly modelled axon diameters (roughly 3.5 μm) and pulse width values (i.e., 60-90 μs). This threshold is based on work by researchers such as Astrom et al. 13 and reflects a ‘middle ground’ value that takes into account the fact that any VAT model will necessarily be an imperfect approximation of how electrical stimulation interfaces with brain tissue, depending heavily on aspects such as the diameter of local axons. Nonetheless, it is certainly understood that increased pulse width does meaningfully increase the effective range of stimulation (thus translating to a larger VAT) by lowering the activation threshold of nearby axons 12.

      Given that our patient cohort included a small number of patients who were stimulated with higher pulse widths than the values assumed by our model (90 μs), it is reasonable to wonder whether we underestimated the size of these patients’ VATs. To address this aspect, we modelled these patients’ VATs using a simpler heuristic model 2 that does allow specific pulse width values to be selected by the user. More specifically, we computed a range of VATs for these patients using varied pulse width values (ranging from 90 μs up to their actual values). Not surprisingly, this endeavour did yield larger VATs when higher pulse widths were used. On average, the absolute difference in VAT diameter between 90 μs and 450 μs (the largest pulse width observed in this cohort) versions of these patients’ VATs was 2 mm. To check whether or not this difference could have potentially impacted our results, we repeated our probabilistic mapping analysis using altered VATs (specifically, VATs that were enlarged by 2 mm in diameter) for the patients with higher pulse widths. This new repeat analysis yielded a very similar average map to the original analysis: the overall map pattern and location/values of the peak corresponding to the most efficacious area for maximal symptom alleviation remaining unaltered, and only a few voxels on the periphery of the map changing in value by a couple of percentage points. This new supplementary analysis indicates that our results were not meaningfully altered by the unusual pulse width observed in these patients. We modified the Methods section to address some of these aspects and added a new Figure 3-figure supplement 2 illustrating both voxel efficacy maps.

      6) Imaging transcriptomics. The methods described lack detail: How did the authors account for differences in expression across donors, samples, and regions during preprocessing of the Allen Human Brain Atlas? How was expression data collapsed into regions of interest? Did the authors apply any normalization? Recent publications have introduced reproducible workflows for processing and preparing the AHBA expression data for analysis that is publicly available.

      7) 'genes with similar patterns of spatial distribution to the TFCE map were compiled in an extensive list'. It is unclear why authors used TFCE maps for spatial transcriptomics as opposed to the functional connectivity map featured in Figure 5. How was similarity measured between the TFCE map and the AHBA? How were candidate genes identified? Please provide a more comprehensive description of the analysis pipeline.

      We apologize for the short description of this analysis. We performed a gene set analysis using the abagen toolbox (https://abagen.readthedocs.io/en/stable/index.html) to investigate genes with a spatial distribution similar to that of clinically relevant functional connectivity. For this analysis, we used the Allen Human Brain Atlas (https://alleninstitute.org/) microarray data describing the cortical, subcortical, brainstem and cerebellar localization of over 20,000 genes in the human brain (3702 anatomical locations from 6 neurotypical adult brains) 14–17, along with a cell-specific aggregate gene set 18. These data are provided preprocessed, with gene expression values normalized across all brains, and registered to standard MNI space, allowing for a direct comparison between the spatial pattern of gene expression and the functional connectivity map (https://human.brain-map.org/microarray/search) 15. The TFCE maps were used to create clusters of clinically relevant functional connectivity with a spatial extent that overlaps with the anatomical locations from which microarray data was obtained. We parcellated both datasets (the results of the functional connectivity analysis and the Allen Human Brain Atlas) according to the Harvard-Oxford brain atlas and correlated the spatial distribution of gene expression with the spatial distribution of the results of the functional connectivity mapping. The resultant list of candidate genes was used as input in gene ontology tools to investigate the associated biological processes and cell types. It is important to highlight that this process involves 2 corrections for multiple comparisons using FDR at q<0.005; one correction occurs at the level of the gene list to include only the most significant genes in the gene ontology analysis; a second correction occurs at the level of the gene ontology analysis to consider only the most significant biological processes. We have included some of these details in the revised Methods section.
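
      The gene-selection step (parcel-wise correlation followed by FDR correction) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: it does not call abagen, the parcel-by-gene matrix is synthetic, the p-values come from a naive parcel permutation that ignores spatial autocorrelation, and all function names are hypothetical. Only the threshold q<0.005 is taken from the text.

```python
import numpy as np


def spatial_correlation(expression_z, fc_map):
    """Pearson r between a parcel-wise map and each gene's z-scored expression."""
    f = (fc_map - fc_map.mean()) / fc_map.std()
    return (f @ expression_z) / len(fc_map)


def permutation_pvalues(expression, fc_map, n_perm=4999, seed=0):
    """Two-sided p-values from shuffling parcels (naive null, an assumption)."""
    rng = np.random.default_rng(seed)
    expr_z = (expression - expression.mean(axis=0)) / expression.std(axis=0)
    observed = np.abs(spatial_correlation(expr_z, fc_map))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        null_r = spatial_correlation(expr_z, rng.permutation(fc_map))
        exceed += np.abs(null_r) >= observed
    return (exceed + 1) / (n_perm + 1)


def benjamini_hochberg(pvals, q=0.005):
    """Boolean mask of hypotheses retained by the BH step-up procedure."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    keep = np.zeros(m, dtype=bool)
    if passed.any():
        last = np.nonzero(passed)[0].max()   # largest rank meeting its threshold
        keep[order[: last + 1]] = True
    return keep


# Synthetic example: 60 parcels, 100 genes, first 10 genes track the map.
rng = np.random.default_rng(2)
fc_map = rng.normal(size=60)                 # stand-in for parcellated connectivity
expression = rng.normal(size=(60, 100))      # stand-in for parcellated expression
expression[:, :10] += 2 * fc_map[:, None]

pvals = permutation_pvalues(expression, fc_map)
candidate_genes = benjamini_hochberg(pvals, q=0.005)
print(f"{candidate_genes.sum()} candidate genes retained at q<0.005")
```

      The retained gene indices would form the candidate list passed to the gene ontology tools, where the second FDR correction is applied.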

      8) What do the bar plots in Figure 7 (left) represent? P-values? The authors should label the axes to make this clear to the reader.

      9) Interpretation of imaging transcriptomics: The authors identify a therapeutic circuit associated with deep brain stimulation of the posterior hypothalamic area, however, it is unclear how to reconcile genes associated with hormones, inflammation, and plasticity in this context. The authors mention and discuss genes implicated in hormonal processing, specifically oxytocin. The results provided in Figure 7, however, do not support this finding and it is unclear how the authors identified genes linked to oxytocin. In addition, the authors identified reductions in the number of microglia and astrocytes, while oligodendrocytes were overexpressed relative to the expected distribution of genes per cell type. These findings were attributed to DBS effects, however, both connectomic and transcriptomic data are acquired from healthy subjects, which suggests a physiological deficit/enrichment in a therapeutic circuit. How do the authors interpret findings given that no electrode implantation and stimulation were performed?

      The analysis of normative datasets (functional and structural connectomics and spatial transcriptomics) is based on the idea of better understanding mechanisms of treatment considering our current knowledge of the average human brain. Unlike patient-specific studies in which imaging is acquired from a single patient or genetic profiles are extracted from tissue samples, these normative analyses rely on high-quality “atlases” derived from healthy subjects. In the case of functional and structural connectivity, these atlases are calculated from very large cohorts of subjects (around 1000 brain scans). Thus, imaging connectomics investigates the pattern of brain activity and structural connectivity related to a specific area of the brain (in this case, the volume of tissue activated (VATs) with DBS) and correlates these data with clinical outcomes to shed light on potential mechanisms of action. Similarly, the spatial transcriptomic analysis identifies spatial correlations between patterns of gene expression and brain characteristics detected by MRI 19 (in this case, the spatial pattern of functional connectivity) to investigate possible genetic underlying mechanisms. It is important to highlight that previous studies have shown that normative analyses yield results that are similar to the ones observed using patient-specific data 20–22. In the specific case of imaging connectomics, it has been shown that normative datasets can be used to create probabilistic models of optimal connectivity associated with patients’ outcomes that are meaningful to predict outcomes in patient-specific connectivity data 21. Thus, these exploratory data-driven approaches strive to simulate the presumed fingerprint that a particular patient’s individualized DBS intervention might modulate. They also allow the investigation of possible mechanisms of action in a large, previously inaccessible cohort of patients whose individual data are available. We apologize for the inaccuracy in Figure 7. Along with improving the Discussion section of the manuscript, we included the label for the bar plots in the left panel to improve the clarity of the graph and added the missing result from the KEGG 2021 Human Library that shows the oxytocin signalling pathway.

      10) Data availability. Code used for data processing should be made openly available or shared as source data along with the Figures that were generated using the code. Sweet-spot, structural, and functional connectivity maps should be shared openly.

      All tools and codes necessary for localizing the electrodes, estimating the volume of activated tissues, and analyzing imaging connectomics are freely available in Lead-DBS (https://www.lead-dbs.org/), a toolbox designed for DBS electrode reconstructions and computer simulations based on postoperative imaging. All codes for spatial transcriptomics are freely available in abagen (https://abagen.readthedocs.io/en/stable/), a toolbox designed to analyze the Allen Brain Atlas genetics data. Along with the codes, the websites for these tools provide manuals describing the step-by-step procedure for successful analysis. The datasets were made freely available at Zenodo (doi: 10.5281/zenodo.7344268). We improved our Data Availability Statement to address this concern.

      Reviewer #2 (Public Review):

      Deep brain stimulation (DBS) is an important, relatively new approach for treating refractory psychiatric illnesses including depression, addiction, and obsessive-compulsive disorder. This study examines the structural and functional connections associated with symptom improvement following DBS in the posterior hypothalamus (pHyp-DBS) for severe and refractory aggressive behavior. Behavioral assessments, outcome data, electrode placements, and structural and functional (resting-state) imaging data were collected from 33 patients from 5 sites. The results show structural connections of the effective electrodes (91% of patients responded positively) were with sensorimotor regions, emotional regulation areas, and monoamine pathways. Functional connectivity between the target, periaqueductal gray, and amygdala was highly predictive of treatment outcome.

      Strengths.

      This dataset is interesting and potentially valuable.

      Weaknesses.

      The figures seem to indicate that the electrodes and symptom improvement are located lateral to the hypothalamus, perhaps in the subthalamic nucleus (STN). This might explain why the streamlines from the tractography are strongest in motor regions. The inclusion of the monoaminergic pathways based on the tractography is not warranted, as the resolution is not sufficient to demonstrate the distinction between the MFB (a relatively small bundle) and others flowing through this region to the brainstem.

      This is an interesting point. The sweet spot identified in this work is indeed located in the posterior-inferior-lateral region of the posterior hypothalamic area, reaching the most superior part of the red nucleus, without including the STN. It is important to highlight that the voxel-efficacy mapping only shows voxels associated with a minimum of 50% symptomatic improvement following treatment. Thus, the areas not touching the red nucleus are also associated with excellent symptom alleviation. Although the structural connectivity mapping revealed tracts involved in motor and sensory information, it also showed tracts known to be involved in the regulation of emotions, such as the MFB, the Amygdalofugal Pathway and the ALIC. It is worth noting that this analysis is excellent for segregating the fibre tracts as relevant or not associated with a clinical improvement, but it is not capable of tearing apart the system to determine which of those are necessary for symptom alleviation. As a result, it is not possible to determine whether the motor projections are stronger or more relevant than others. However, the structural connectivity analysis presented here contributes to the body of knowledge on the network of aggressive behaviour and provides clinically relevant data that can be useful to improve future patient outcomes.

      We agree with the reviewer that the engagement of the motor system is indeed highly relevant for the reduction of aggressive behaviours, as we have previously shown that aggressive behaviour is highly correlated with motor agitation 23,24. Additionally, in the context of ASD, self-injury behaviour is defined as a type of repetitive/stereotypic behaviour that results in physical injury to the patient’s own body. In relation to the involvement of the monoaminergic system, we would like to apologize for not being clear in the discussion of our findings. Although the functional and structural connectivity maps are related, they provide different means of exploring distinct aspects of the connectivity profile of each VAT. While the structural connectivity map may elucidate symptom improvement via direct fibre modulation (i.e. fibres that touch vs fibres that do not touch the VAT), the functional connectivity map investigates the functional dynamics of the network via BOLD signals (functional MRI). In this manuscript, we showed the functional connectivity (not fibre tracts) of the VATs with areas known to regulate monoamine production, such as the Raphe nuclei and the Substantia Nigra. Both serotonin and dopamine are critically involved in the control of aggressive behaviours, being the target of the main medications used to treat these symptoms in several patient populations. To address all the raised concerns, we incorporated a few sentences in the discussion, highlighting the relevance of the motor system and some limitations of our analysis. We also added a new Figure 3-figure supplement 1 and a discussion on the position of the sweet spot in relation to the red nucleus and subthalamic nucleus.

      REFERENCES:
      1. Gunalan, K., Howell, B. & McIntyre, C. C. Quantifying axonal responses in patient-specific models of subthalamic deep brain stimulation. Neuroimage 172, 263–277 (2018).
      2. Dembek, T. A. et al. Probabilistic mapping of deep brain stimulation effects in essential tremor. Neuroimage Clin 13, 164–173 (2017).
      3. Kuncel, A. M., Cooper, S. E. & Grill, W. M. A method to estimate the spatial extent of activation in thalamic deep brain stimulation. Clin. Neurophysiol. 119, 2148–2158 (2008).
      4. Mädler, B. & Coenen, V. A. Explaining clinical effects of deep brain stimulation through simplified target-specific modeling of the volume of activated tissue. AJNR Am. J. Neuroradiol. 33, 1072–1080 (2012).
      5. Elias, G. J. B. et al. Probabilistic Mapping of Deep Brain Stimulation: Insights from 15 Years of Therapy. Ann. Neurol. 89, 426–443 (2021).
      6. Germann, J. et al. Brain structures and networks responsible for stimulation-induced memory flashbacks during forniceal deep brain stimulation for Alzheimer’s disease. Alzheimers. Dement. 17, 777–787 (2021).
      7. Elias, G. J. B. et al. Mapping the network underpinnings of central poststroke pain and analgesic neuromodulation. Pain 161, 2805–2819 (2020).
      8. Gouveia, F. V. et al. Case report: 5 Years follow-up on posterior hypothalamus deep brain stimulation for intractable aggressive behaviour associated with drug-resistant epilepsy. Brain Stimul. 14, 1201–1204 (2021).
      9. Coblentz, A. et al. Mapping efficacious deep brain stimulation for pediatric dystonia. J. Neurosurg. Pediatr. 27, 346–356 (2021).
      10. M Oliveira, L. et al. Probabilistic characterisation of deep brain stimulation in patients with tardive syndromes. J. Neurol. Neurosurg. Psychiatry (2021) doi:10.1136/jnnp-2020-324270.
      11. Horn, A. et al. Connectivity Predicts deep brain stimulation outcome in Parkinson disease. Ann. Neurol. 82, 67–78 (2017).
      12. Duffley, G., Anderson, D. N., Vorwerk, J., Dorval, A. D. & Butson, C. R. Evaluation of methodologies for computing the deep brain stimulation volume of tissue activated. J. Neural Eng. 16, 066024 (2019).
      13. Astrom, M., Diczfalusy, E., Martens, H. & Wardell, K. Relationship between neural activation and electric field distribution during deep brain stimulation. IEEE Trans. Biomed. Eng. 62, 664–672 (2015).
      14. Hawrylycz, M. J. et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391–399 (2012).
      15. Arnatkeviciute, A., Fulcher, B. D. & Fornito, A. A practical guide to linking brain-wide gene expression and neuroimaging data. Neuroimage 189, 353–367 (2019).
      16. Arnatkeviciute, A., Markello, R. D., Fulcher, B. D., Misic, B. & Fornito, A. Toward Best Practices for Imaging Transcriptomics of the Human Brain. Biol. Psychiatry 93, 391–404 (2023).
      17. Markello, R. D. et al. Standardizing workflows in imaging transcriptomics with the abagen toolbox. Elife 10, (2021).
      18. Seidlitz, J. et al. Transcriptomic and cellular decoding of regional brain vulnerability to neurogenetic disorders. Nat. Commun. 11, 3358 (2020).
      19. Fornito, A., Arnatkevičiūtė, A. & Fulcher, B. D. Bridging the Gap between Connectome and Transcriptome. Trends Cogn. Sci. 23, 34–50 (2019).
      20. Arnatkeviciute, A., Fulcher, B. D., Bellgrove, M. A. & Fornito, A. Imaging Transcriptomics of Brain Disorders. Biol Psychiatry Glob Open Sci 2, 319–331 (2022).
      21. Wang, Q. et al. Normative vs. patient-specific brain connectivity in deep brain stimulation. Neuroimage 224, 117307 (2021).
      22. Elias, G. J. B. et al. Normative connectomes and their use in DBS. in Connectomic Deep Brain Stimulation 245–274 (Elsevier, 2022).
      23. Gouveia, F. V. et al. Bilateral Amygdala Radio-Frequency Ablation for Refractory Aggressive Behavior Alters Local Cortical Thickness to a Pattern Found in Non-refractory Patients. Front. Hum. Neurosci. 15, 653631 (2021).
      24. Venetucci Gouveia, F. et al. Case report: 5 Years follow-up on posterior hypothalamus deep brain stimulation for intractable aggressive behaviour associated with drug-resistant epilepsy. Brain Stimul. (2021) doi:10.1016/j.brs.2021.07.062.
      25. Gouveia, F. V. et al. Amygdala and Hypothalamus: Historical Overview With Focus on Aggression. Neurosurgery 85, 11–30 (2019).
      26. Gouveia, F. V. et al. Longitudinal Changes After Amygdala Surgery for Intractable Aggressive Behavior: Clinical, Imaging Genetics, and Deformation-Based Morphometry Study-A Case Series. Neurosurgery 88, E158–E169 (2021).
      27. Yan, H. et al. Deep brain stimulation for extreme behaviors associated with autism spectrum disorder converges on a common pathway: a systematic review and connectomic analysis. J. Neurosurg. 1–10 (2022).
      28. Knotkova, H. et al. Neuromodulation for chronic pain. Lancet 397, 2111–2124 (2021).
      29. Gray, A. M. et al. Deep brain stimulation as a treatment for neuropathic pain: a longitudinal study addressing neuropsychological outcomes. J. Pain 15, 283–292 (2014).
      30. Miczek, K. A., Fish, E. W., De Bold, J. F. & De Almeida, R. M. M. Social and neural determinants of aggressive behavior: pharmacotherapeutic targets at serotonin, dopamine and gamma-aminobutyric acid systems. Psychopharmacology 163, 434–458 (2002).
      31. Gouveia, F. V. et al. Reduction of aggressive behaviour following hypothalamic deep brain stimulation: involvement of 5-HT1A and testosterone. bioRxiv (2023) doi:10.1101/2023.03.20.533520.

    1. Author Response

      Reviewer #1 (Public Review):

      The strength of the manuscript is highlighted by the application of fractal formalism, which is commonly used in colloidal systems, in conjunction with MD simulation to study the phase separation of an IDP. The weakness lies in the fact that this study does not provide any discussion on how our understanding of the network structure and dynamical behavior of biomolecular condensates and their biological significance improves through this study. The experimental part remains weak, without any measurements of the dynamics of the condensates. Whether and how the formalism can distinguish between phase-separated condensates (WT) and classical protein aggregates (Y to A variant) remains unclear.

      We thank the Reviewer for their careful reading of the manuscript and their appreciation of the link between IDP phase separation and colloid chemistry. Establishment of a quantitative framework behind this link, as given by the fractal formalism, and a multiscale model of the spatial organization of a biomolecular condensate, derived from MD simulations in combination with fractal scaling, are indeed two of our main contributions. In particular, to the best of our knowledge, ours is the first atomistically resolved model of the spatial organization of a biomolecular condensate at an arbitrary scale. The key features of the proposed model, as elaborated in the Discussion of the revised manuscript (p. 18, 20-21), are the coexistence of differently sized clusters inside a condensate, and a quantitative prediction of a particular scaling of mass with cluster size (Figure 5A), as further discussed below. Moreover, our results also point to the possible formation of pre-percolation clusters with sizes below the resolution limit of typical microscopy experiments, in agreement with recent observations (https://doi.org/10.1073/pnas.2202222119).

      We agree that the full understanding of biomolecular condensates also requires a detailed treatment of the dynamical aspects. Following the Reviewer’s comments, we have provided significant new results in this regard and included an experimental characterization of fusion behavior (Videos 1, 2) and condensate dynamics by FRAP (Figure 1D, E and Figure 1—figure supplement 2) as well as a detailed analysis of diffusion and viscosity in the simulated systems (Figure 4C and Figure 4—figure supplement 1D-F). The newly performed FRAP experiments provide a direct measure of the condensate dynamics. Importantly, the measured recovery half-times for WT and R>K condensates resemble those of other well-characterized in vitro condensates. We have occasionally observed elongated, amorphous Y>A precipitates, albeit in low number and only at 50-fold higher concentration than the wild-type (45 mM and above, Figure 1C). While this may be consistent with the predictions of the fractal model and hint at the differences in mesoscopic organization between the WT and R>K condensates and the Y>A precipitates, the latter are rare and we are reluctant to draw major conclusions.

      Furthermore, we could show that the WT diffusion coefficient is lower than for either mutant (Figure 4C, and Supplementary File 2). Clearly, this difference is not due to the effect of protein size or a higher solvent viscosity, but primarily indicates protein slow-down due to the more extensive interactions with partners (reflected also in higher average valency, Figure 2D, or probability of interactions Figure 2—figure supplement 1D). The fact that the WT diffusion coefficient drops by about 20% over the last 0.3 µs of the MD trajectory also correlates with the formation of a single percolating cluster in the system (Figure 2C). This is an expected effect on protein diffusion upon crossing the percolation threshold (https://doi.org/10.1038/ncomms11817, https://doi.org/10.1021/acs.jpcb.7b08785). Moreover, the difference in the recovery dynamics observed for WT and R>K mutant can be interpreted using the proposed model. Namely, accurate fitting of FRAP data was only possible if using at least two components (Figure 1—figure supplement 2). According to https://doi.org/10.1016/j.tcb.2004.12.001, these components indicate the contribution of particle diffusion and interaction (binding). Thus, recovery of the centrally bleached condensates is faster for WT than for the R>K mutant, which can be related to the higher compactness of the WT particles across scales as compared to R>K. On the other hand, the FRAP results for the condensates bleached in the peripheral area highlight the contribution of the binding component. Indeed, the recovery is about 3-fold faster for the R>K mutant, which could potentially be related to the lower valency of the interactions and the ease of the replacement of inactivated fluorescent species and/or exchange with proteins in the bulk. 
A further connection of the developed model and condensate dynamics concerns the multimodal description of diffusion in biomolecular condensates, together with multimodal fitting of FCS and FRAP data as used recently for interpreting single particle tracking results (https://doi.org/10.1016/j.bpj.2021.01.001). Namely, the polydisperse nature of the protein phase as suggested by the model translates to multimodal diffusion, reflecting the dynamics of protein clusters of different size. For instance, regularization fits used for DLS autocorrelation curves assume a multimodal character of the diffusion and are interpreted to reflect a multimodal distribution of cluster sizes in condensates (https://doi.org/10.1073/pnas.2202222119).

      Finally, a way of testing the model prediction, which would merit a study in its own right, would involve static light scattering (SLS): a linearly decreasing scattering intensity as a function of the scattering vector in a log-log representation, as frequently seen for different colloidal systems, is expected by the fractal model. In fact, fractal dimension dF could directly be estimated from SLS experiments (https://doi.org/10.1038/339360a0) from the limiting value of scattering intensity for high values of the product of the scattering vector q and the average cluster size <Rg>. As a direct test of the predictions of the model, the experimental value of dF could then be compared with the predicted one. Moreover, techniques such as DLS and MALS could be used to measure independently masses and sizes of biomolecular condensates in vitro at different scales in order to test the validity of the particular scaling predicted by the fractal model. Such experiments are not trivial and are out of scope of the present study.
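
      The proposed SLS test can be sketched in a few lines. The example below is only an illustration under stated assumptions: it takes the standard fractal-regime relation I(q) ∝ q^(-dF), fits a straight line to log I versus log q by ordinary least squares, and reads dF off as minus the slope. The function names and the synthetic data (generated with an assumed dF of 2.0) are hypothetical and do not come from the manuscript.

```python
import math

def fit_loglog_slope(q_values, intensities):
    """Least-squares slope and intercept of log(I) versus log(q)."""
    xs = [math.log(q) for q in q_values]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def fractal_dimension_from_sls(q_values, intensities):
    """In the fractal regime I(q) ~ q**(-dF), so dF is minus the log-log slope."""
    slope, _ = fit_loglog_slope(q_values, intensities)
    return -slope

# Synthetic, noise-free data generated with an assumed dF of 2.0 (illustration only).
q_synthetic = [0.01 * 1.5 ** k for k in range(10)]
i_synthetic = [5.0 * q ** -2.0 for q in q_synthetic]
d_f_estimate = fractal_dimension_from_sls(q_synthetic, i_synthetic)
```

      With real data, the fit would have to be restricted to the q range in which the power law holds, that is, where the product of q and the average cluster size is large.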

      Reviewer #2 (Public Review):

      A key aspect of the work is to use the simulations to explain differences between (i) dilute and dense phases and (ii) wild-type and mutant variants. Here, it would be important with a clearer analysis of convergence and errors to quantify which differences are significant.

      Following the Reviewer’s suggestion, we now provide an analysis of convergence and statistical significance. Specifically, in Supplementary File 1 “Technical summary” we now report the average value, standard deviation and a block-average measure of convergence for all the key observables analyzed, including radius of gyration (Rg), valency (n), and compactness (), for all modeled systems. Furthermore, in the revised manuscript, we now also include the analysis of protein translational diffusion constants and solution viscosity for all modeled systems to assess the ability of the simulations to capture protein dynamics realistically (Figure 4C, Figure 4—figure supplement 1D-F, Supplementary File 2, see also above). Moreover, we include in the revised version a new figure depicting time evolution of average compactness in the 24-copy systems (Figure 4—figure supplement 1C). Thus, it can be seen that the two key model parameters derived from MD simulations of the 24-copy system – protein valency and compactness – reach a stable plateau over the last 0.3 µs (Figure 2D and Figure 4—figure supplement 1C), which were used for final analyses, with block-averaged deviations of less than 10% throughout (see below for details). All the differences in these parameters between single-copy and 24-copy simulations, as well as those between WT and mutation simulations, were found to be significant with p-values < 2.2 × 10⁻¹⁶ according to the Wilcoxon rank sum test with continuity correction (details in Supplementary File 1). Finally, considering the sampling limitations implicit in most MD studies, we clearly recognize the possibility that with longer simulation times or more protein copies per simulation box, the simulated systems may show a qualitatively different behavior.
However, we emphasize that our derivation of the formalism that links the features of simulated ensembles on the scale of 10s of nanometers with their behavior on the scale of 100s of nanometers and beyond is independent of such limitations. Once longer, larger and more accurate simulations become available, one will be able to apply the formalism without alteration and obtain a model of the spatial organization of the condensate on an arbitrary scale, starting just from the local features of individual proteins. We now discuss these details on pp. 10, 11, 13 of the revised manuscript.
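
      A block-average convergence check of the kind described above could look roughly as follows. This is a minimal sketch, not the procedure used in the study: the number of blocks, the helper names and the synthetic valency trace are assumptions made for illustration. The idea is to split the analyzed part of the trajectory into contiguous blocks, average the observable within each block, and compare the spread of the block means with their overall mean.

```python
import statistics

def block_means(series, n_blocks):
    """Split a time series into n_blocks contiguous blocks and return each block's mean."""
    block_len = len(series) // n_blocks
    if block_len == 0:
        raise ValueError("series is shorter than the number of blocks")
    return [
        statistics.fmean(series[i * block_len:(i + 1) * block_len])
        for i in range(n_blocks)
    ]

def relative_block_deviation(series, n_blocks=5):
    """Standard deviation of the block means divided by their overall mean."""
    means = block_means(series, n_blocks)
    return statistics.stdev(means) / statistics.fmean(means)

# Hypothetical valency trace: a plateau at 4 with a small periodic fluctuation.
valency_trace = [4.0 + 0.1 * ((frame % 7) - 3) for frame in range(700)]
deviation = relative_block_deviation(valency_trace, n_blocks=5)
```

      A relative deviation well below a chosen threshold (for example 10%) would indicate that the observable has reached a plateau over the analyzed window; a drifting observable would produce block means that differ systematically.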

      It would also be useful with a clearer description of how the analytical model is predictive, of which properties, and how they have been/can be validated. Which measurable quantities does the model predict?

      As pointed out above, the model predicts the existence and provides a quantitative description of pre-percolation finite-size clusters (https://doi.org/10.1016/j.molcel.2022.05.018, https://doi.org/10.1073/pnas.2202222119). More generally, the model provides the fractal dimension (dF) of protein clusters and enables evaluation of different scale-dependent properties of clusters of arbitrary size, including protein density as a function of cluster size (Figure 5—figure supplement 1C, Figure 5C). Importantly, the fractal dimension can be used in combination with local MD simulations and cluster–cluster aggregation algorithms to derive a detailed model of the 3D organization of fractal clusters of a chosen size at atomistic resolution (Figure 5A, B, and Videos 4, 5, and 6). Such detailed structural understanding of the interior organization of a condensate can, for example, be used to evaluate cavity sizes and interpret partitioning experiments. Since the differences in the morphology of WT and mutant protein clusters propagate across length scales, they can even be qualitatively characterized by the analysis of microscopic images (e.g. circularity, Figure 1—figure supplement 1C, see also discussion above). Finally, static light scattering (SLS) experiments give the possibility to test the model directly, which will be the subject of our future work. Namely, the fractal formalism predicts linear behavior in the log-log representation of the SLS intensity vs. scattering vector curves, while dF, which can be evaluated directly from such experiments, provides a quantitative point of comparison between theoretical predictions and experiment (see above).
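
      To make the scale dependence concrete, the generic mass-radius relation for a fractal object, N(R) = (R / r0)^dF, implies that the number density inside a cluster of radius R falls off as R^(dF - 3) whenever dF < 3. The sketch below uses this textbook relation only; the unit radius r0, the value of dF and the omitted prefactor are assumed illustration values and are not the parameters derived in the manuscript.

```python
import math

def proteins_in_cluster(radius_nm, unit_radius_nm, d_f):
    """Mass-radius fractal scaling: N = (R / r0) ** dF."""
    return (radius_nm / unit_radius_nm) ** d_f

def number_density(radius_nm, unit_radius_nm, d_f):
    """Proteins per nm^3 inside a sphere of radius R."""
    volume = 4.0 / 3.0 * math.pi * radius_nm ** 3
    return proteins_in_cluster(radius_nm, unit_radius_nm, d_f) / volume

# Assumed illustration values, not taken from the manuscript.
R0_NM = 2.0
D_F = 2.5
densities = {r: number_density(r, R0_NM, D_F) for r in (10.0, 100.0, 1000.0)}
```

      In this toy setting each tenfold increase in cluster radius lowers the density by a factor of 10^(dF - 3), which is the kind of size-dependent density that mass and size measurements at different scales could probe.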

      In addition to these overall questions, a number of more specific suggestions follow below.

      Major:

      p. 7, line 120 (Fig. S1B) The proteins do not appear particularly pure based on the presented SDS PAGE analysis. How pure is the protein estimated to be, and is the presence of the other bands expected to affect e.g. the data presented in Fig. 1?

      We have quantified the purity of the constructs by densitometry of the Coomassie stained gels and included it in Figure 1—figure supplement 1A: in the case of WT and R>K, we achieve purity higher than 91%. Importantly, the observed LLPS behavior of the constructs is consistent with the simulation and in agreement with other studies on R>K substitutions (https://doi.org/10.1073/pnas.2000223117; https://doi.org/10.1016/j.molcel.2020.01.025; https://doi.org/10.1073/pnas.2200559119; https://doi.org/10.1016/j.jmb.2019.08.008). In the case of Y>A, we have obtained the least pure protein (~65%), and must note that the precipitates observed in the experiments of Figure 1C are only present at the protein concentrations that are 50-fold higher as compared to WT (45 mM and above). Therefore, at such high total protein concentration, we cannot exclude the possibility that there might be some contamination affecting the behavior of this construct.

      p. 7 & 8, lines 138-159: Has the method and energy function used to calculate the interact potential been validated by comparison to experiments, including studying the effect of varying the solvent? I see the computed error bars are very small, but am more interested in the average error when comparing to experiments. The numbers in water appear different from those e.g. reported by Krainer et al (https://doi.org/10.1038/s41467-021-21181-9), though the latter are also not immediately compared to experiments. Thus, it would be useful to know how much to trust these numbers.

      We thank the Reviewer for raising this important point. To the best of our knowledge, the absolute binding free energies between Y-Y, Y-R or Y-K sidechain analogs or complete amino acids have never been determined experimentally, preventing a direct validation of the computed values and/or an evaluation of the average error when comparing to experiments. On the other hand, we did compare our data against the PMF curves presented by Krainer et al. (https://doi.org/10.1038/s41467-021-21181-9) for R-Y and Y-Y and the general trends are largely similar. In particular, in both analyses the R-Y interaction is stronger than the Y-Y interaction across different conditions, except at zero salt in Krainer et al. where the two are similar. When it comes to exact quantitative differences between the studies, it should first be pointed out that Krainer et al. studied capped amino-acids, while we used amino-acid side-chain analogs. The difference in the observed binding strengths is in part certainly related to the contribution of the capped backbone. Second, the values in Krainer et al. refer to the depth of the free energy minimum in the obtained PMFs and not to the resulting ΔG values, as in our method. The latter includes integration over the PMF and an assumption of a standard-state concentration, which could also lead to significant differences. Finally, the differences could also be due to the intrinsic properties of the interaction potentials used. In particular, the prominent free-energy minima for the R-Y pair in the Krainer et al. study could only be obtained after refitting of the original AMBERff03ws charges on the Y bound to R via semi-empirical quantum-chemical calculations.
On the other hand, the interaction potential used in our study was not adjusted to the system at hand, but rather comes from a published, widely used force field, the OPLS-AA (https://doi.org/10.1021/ja9621760), that was independently tested and validated experimentally in multiple studies. For example, OPLS-AA exhibits a low average error in absolute hydration free energy of ~0.5 kcal/mol, errors of only ~2% for heats of vaporization and densities (https://doi.org/10.1021/ja9621760), and a close agreement with osmotic coefficients (https://doi.org/10.1021/acs.jcim.9b00552) for a large range of organic compounds. This raises our confidence in the accuracy of the derived binding free energies, which directly or indirectly depend on these fundamental thermodynamic properties.
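
      The difference between the depth of a PMF minimum and a standard-state binding free energy can be illustrated with a short numerical sketch. It assumes the textbook expression for two point-like particles, K = 4π ∫ r² exp(-W(r)/RT) dr over the bound region and ΔG° = -RT ln(K / V0), with V0 ≈ 1660.54 Å³ the volume per molecule at 1 M. The Gaussian-well PMF, the cutoff defining the bound state and the function name are hypothetical; the exact integration scheme used in the study is described in its Methods and may differ.

```python
import math

GAS_CONSTANT = 1.987204e-3      # kcal / (mol K)
STANDARD_VOLUME_A3 = 1660.54    # volume per molecule at 1 M, in cubic angstroms

def binding_free_energy(r_values, pmf_values, cutoff, temperature=298.15):
    """Integrate a PMF W(r) up to `cutoff` and return (well depth, standard dG).

    K = 4*pi * integral of r^2 * exp(-W(r)/RT) dr over the bound region;
    dG = -RT * ln(K / V0), where V0 is the standard-state volume.
    """
    rt = GAS_CONSTANT * temperature
    bound = [(r, w) for r, w in zip(r_values, pmf_values) if r <= cutoff]
    integrand = [4.0 * math.pi * r * r * math.exp(-w / rt) for r, w in bound]
    # Trapezoidal rule over the (possibly non-uniform) radial grid.
    k_assoc = sum(
        0.5 * (integrand[i] + integrand[i + 1]) * (bound[i + 1][0] - bound[i][0])
        for i in range(len(bound) - 1)
    )
    well_depth = min(w for _, w in bound)
    delta_g = -rt * math.log(k_assoc / STANDARD_VOLUME_A3)
    return well_depth, delta_g

# Hypothetical PMF: a Gaussian well of depth -2 kcal/mol centered at 4.5 angstroms.
radii = [3.0 + 0.05 * k for k in range(101)]          # 3.0 to 8.0 angstroms
pmf = [-2.0 * math.exp(-((r - 4.5) / 0.5) ** 2) for r in radii]
depth, dg = binding_free_energy(radii, pmf, cutoff=7.0)
```

      For this toy PMF the standard-state free energy comes out considerably less negative than the well depth, which shows why the two quantities should not be compared directly.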

      Regarding the method to evaluate PMF profiles, we have used a classical all-atom Monte Carlo approach originally developed by Jorgensen and coworkers (see, e.g., https://doi.org/10.1021/ar00161a004 and https://doi.org/10.1021/ja00168a022), as implemented in the widely used BOSS program (v. 4.8) (https://doi.org/10.1002/jcc.20297). This approach has been extensively tested against experimental data on ΔΔG values of various compounds in environments of different polarity (e.g., 2). Moreover, we have previously successfully applied this methodology in studies of the free energy of association of amino acid residues (https://doi.org/10.1021/jp803640e) and other biologically important groups (https://doi.org/10.1021/acs.jcim.9b00193). The results obtained have been compared with the available experimental data and demonstrated a good agreement. As for the small error bars in the plots, the fairly good convergence achieved in our PMF calculations is a result of extensive sampling combined with small system size, although obviously this is not always the case – see, for example, PMFs in our recent work (https://doi.org/10.1021/acs.jcim.9b00193).

      The above points have been discussed on pp 7-8 of the revised manuscript.

      p. 8, lines 149-154: Following up on the above, the authors also write "Importantly, only in the latter case are the R-Y interactions slightly more favorable than the K-Y ones (Figure S1C). While this can potentially contribute to increasing of Csat for the R>K mutant as compared to WT, the estimated thermodynamic effect is not too strong, especially if one considers that these interactions take place in an environment with largely water-like polarity. Therefore, the effect of R>K substitution on LLPS should be further explored in the context of protein-protein interactions." In the absence of estimates of the accuracy of the predictions, these sentences are somewhat unclear. Also, it is unclear what the authors mean by that the effect of R>K should be studied; there are already several examples of this (https://doi.org/10.1016/j.cell.2018.06.006 [already cited], https://doi.org/10.1038/s41557-021-00840-w & https://doi.org/10.1073/pnas.2000223117 come to mind, but there are likely more).

      As pointed out above, the free-energy values were obtained using well-established computational techniques and are expected to reflect realistic trends. However, considering that there exist no equivalent experimental results to assess the accuracy of the predicted free energies, they indeed must clearly be understood as predictions. This is now stated on pp. 7-8 of the revised manuscript. Furthermore, it seems that the vague phrasing on our part in the above paragraph resulted in a misunderstanding. Namely, when we talk about “further exploration”, we only meant it in relation to our study, i.e. a connection with the MD part, and not in relation to a wider literature on the topic. In other words, we simply wanted to refer to the fact that our binding free energies for individual residues do not provide sufficient information about interactions between Lge1₁₋₈₀ protein chains. Following the Reviewer’s comment, we have rephrased this part and included additional references on the known role of R and K residues on phase separation.

      p. 8, lines 161-162: The authors perform MD simulations of Lge1 and variants using 24 copies and a box that gives them protein concentrations "in the mM concentration range". I realize that there's a concern about what is computationally feasible, but it would be important with an argument for this choice. Why is 24 expected to be enough to represent a condensate (I expect that there could be substantial finite-size effects)? What is the exact protein concentration in the simulations of the 24 chains [and of the 1-chain simulations]? How does this protein concentration compare to that in the condensates? The authors performed simulations in the NPT ensemble; how stable were the box dimensions?

      The effective protein concentration for different 24-copy systems is 6-7 mM, depending on the system (Figure 2—figure supplement 1A). This concentration range was selected in order to get a reasonable system size for microsecond all-atom MD simulations, while still being approximately one order of magnitude lower than the semi-dilute regime of the protein at hand. As a testament to the internal consistency of our framework, the fractal model predicts the concentration inside WT condensates of the size observed in the experiment to indeed be in the mM range. Moreover, as seen in many other systems, the concentration inside the observed droplets is expected to be significantly higher than Csat (https://doi.org/10.1101/2020.10.25.352823). Here, we should again emphasize that we did not aim to model the process of phase separation in our all-atom MD. We rather use multicopy simulations for the analyses of the organization of the protein crowded phase and specifically, the mode of intermolecular interactions, and then use the fractal scaling to derive a model of the internal organization of condensates at arbitrary scales.
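
      As a quick consistency check on these numbers, the molar concentration of N copies in a periodic box follows from c = N / (N_A V). The sketch below assumes a cubic box, which is an assumption made here for simplicity (the box geometry is not stated in this reply), and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molar_concentration_mM(n_copies, box_edge_nm):
    """Millimolar concentration of n_copies molecules in a cubic box."""
    volume_liters = (box_edge_nm * 1e-8) ** 3  # 1 nm = 1e-8 dm, 1 dm^3 = 1 L
    return n_copies / (AVOGADRO * volume_liters) * 1e3

def box_edge_for_concentration_nm(n_copies, concentration_mM):
    """Cubic box edge (nm) that gives the requested millimolar concentration."""
    volume_liters = n_copies / (AVOGADRO * concentration_mM * 1e-3)
    return volume_liters ** (1.0 / 3.0) * 1e8

edge_6mM = box_edge_for_concentration_nm(24, 6.0)
edge_7mM = box_edge_for_concentration_nm(24, 7.0)
```

      Under the cubic-box assumption, 24 copies at 6-7 mM correspond to a box edge of just under 20 nm.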

      Regarding the experimental determination of the protein concentration in the condensates, we have used different approaches to estimate Csat and CD values: spin-down analyses (https://doi.org/10.1126/science.aaw8653), volumetry analysis (https://doi.org/10.1038/nchem.2803), estimation of concentration by fluorescent intensity of the condensates (https://doi.org/10.1016/j.molcel.2018.12.007; https://doi.org/10.1016/j.cell.2019.08.008), FCS (https://doi.org/10.1038/nchem.2803; https://doi.org/10.1016/j.cell.2019.10.011; https://doi.org/10.1126/science.aaw8653). However, different approaches yield values that vary by several orders of magnitude. That is the reason why we did not report definitive numbers. In general, there are uncertainties in the field about how to reliably measure protein concentrations in a condensate, necessitating the development of new approaches (https://doi.org/10.1101/2020.10.25.352823).

      With regard to the convergence and potential finite-size effects, we agree that this is an important issue and have addressed it in the revised version. In general, the convergence of our observables such as valency or compactness (Figure 2C, D and Figure 4—figure supplement 1C) gives confidence that the simulations are at least in a local equilibrium, especially when it comes to short-range properties such as contact preferences as further elaborated in our reply to the Reviewer’s specific comment about convergence below (please, see also above for our response to Editor’s comment #5). Importantly, in all 24-copy systems, the average separation between protein images lies in the 12-15 nm range, and no instances of self-interaction between images due to PBC were observed (Supplementary File 1). Finally, analysis of fluctuations in box dimensions shows that they are all in the range of picometers and largely negligible when it comes to the analysis at hand (Supplementary File 1).

      In order to highlight the realistic behavior of the simulated systems in the revised version, we now also report a detailed analysis of protein translational diffusion in MD simulations (Figure 4C and Figure 4—figure supplement 1D-F and Supplementary File 2). According to this analysis, single-molecule translational diffusion coefficients of Lge1₁₋₈₀ variants obtained from fitting of MSD curves with applied finite-size PBC correction and rescaling by the solvent viscosity (see Methods for details) are typically in the range of 100 µm²/s (Figure 4C and Supplementary File 2), which corresponds to experimentally measured values for different proteins of similar size. Importantly, the requisite finite-size corrections applied in the case of 24-copy systems are relatively small and amount to about 35-60%, while this is almost an order of magnitude higher (450-530%) for the single-copy simulations (Supplementary File 2). Please see also the reply to the Editor’s statements above for more details.
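
      The workflow named above (MSD fit, finite-size correction, viscosity rescaling) can be sketched as follows. The sketch assumes the Einstein relation MSD = 6Dt, the widely used Yeh-Hummer correction for a cubic periodic box, D0 = D_PBC + ξ k_B T / (6π η L) with ξ ≈ 2.837297, and a simple rescaling by the ratio of model to reference solvent viscosity. Whether the study used exactly these forms is an assumption; the function names and all numerical inputs are hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23   # J / K
XI_CUBIC = 2.837297        # lattice constant for a cubic periodic box

def msd_slope(times_ns, msd_nm2):
    """Least-squares slope of MSD versus time, in nm^2 / ns."""
    n = len(times_ns)
    mean_t = sum(times_ns) / n
    mean_m = sum(msd_nm2) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in zip(times_ns, msd_nm2))
    den = sum((t - mean_t) ** 2 for t in times_ns)
    return num / den

def diffusion_um2_per_s(times_ns, msd_nm2):
    """Einstein relation in 3D, MSD = 6 D t; 1 nm^2/ns equals 1000 um^2/s."""
    return msd_slope(times_ns, msd_nm2) / 6.0 * 1000.0

def finite_size_correction_um2_per_s(box_edge_nm, viscosity_pa_s, temperature=300.0):
    """Yeh-Hummer term xi * kB * T / (6 * pi * eta * L), converted to um^2/s."""
    d_m2_per_s = XI_CUBIC * BOLTZMANN * temperature / (
        6.0 * math.pi * viscosity_pa_s * box_edge_nm * 1e-9
    )
    return d_m2_per_s * 1e12

def rescale_by_viscosity(d_value, eta_model_pa_s, eta_reference_pa_s):
    """Scale D by eta_model / eta_reference to compensate for a too-fluid or too-viscous solvent model."""
    return d_value * eta_model_pa_s / eta_reference_pa_s

# Hypothetical inputs: a perfectly linear MSD and an 18.8 nm box in water-like solvent.
times = [float(t) for t in range(10)]
msd = [0.6 * t for t in times]
d_pbc = diffusion_um2_per_s(times, msd)
correction = finite_size_correction_um2_per_s(18.8, 8.5e-4)
d_corrected = d_pbc + correction
```

      Because the correction scales as 1/L, it is much larger in relative terms for a small single-copy box than for a larger multi-copy box, in line with the trend reported above.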

      Also, did the authors include the Strep- and His-tags in the simulations? If not, why not?

      We did not simulate the constant part of the constructs in order to: 1. expedite computation and 2. more directly expose the effect of different mutations. Since our comparison between simulation and experiment concerned largely qualitative observables, we have primarily focused on the relative differences between the three Lge1₁₋₈₀ variants. Importantly, the effect of mutations on the full-length protein and its different variants was analyzed in vivo in a previous publication (https://doi.org/10.1038/s41586-020-2097-z).

      Throughout: One of my major concerns about this work is the general lack of analysis of convergence of the simulations. The authors must present some solid analysis of which results are robust given the relatively short simulations and potential for bias from the chosen starting structures.

      First, we would like to emphasize that we did not attempt to capture the process of phase separation or characterize two coexisting phases, for which much larger ensembles and/or simulation times would be needed. Rather, our aim was to study the conformational behavior of individual protein chains in the context of a crowded protein mixture, taken as a model for the dense phase, and then use fractal scaling to provide a model of spatial organization of a condensate at an arbitrary length scale. Having said this, it is absolutely important to address how converged the key observables are, given the finite size of the all-atom simulation setup and the limited sampling used. In the revised manuscript, we have included an additional analysis of convergence of our simulations and could show that both key MD-derived parameters required by the fractal model, protein compactness and valency, display convergent behavior over the last third of 0.3 µs MD in the 24-copy systems (Figure 4—figure supplement 1C) and all analyses were performed over this region. In particular, the block averages of compactness and valency exhibit a standard deviation of only 2-4% and 4-8%, respectively, over the last 0.3 µs of MD simulations. Moreover, since we are interested in single-chain features in the context of a crowded mixture, our sampling corresponds effectively to 24 x 0.3 µs = 7.2 µs. Finally, a detailed analysis of convergence in conformational sampling was performed for single-copy simulations using calculations of configurational entropy as evaluated by the MIST formalism (Figure 4—figure supplement 1B). For instance, in the case of the weakly self-interacting Y>A, we do observe a close convergence in terms of the configurational entropy between two independent replicas on 1 µs MD trajectory (Figure 4—figure supplement 1B). 
However, we still recognize the possibility that with longer simulation times and/or more protein copies per simulation, the simulated systems may show a qualitatively different behavior, as discussed on pp. 10, 11, and 13 of the revised manuscript. Finally, we would like to reiterate the point that our derivation of the formalism that links the features of simulated ensembles on the scale of 10s of nanometers with their behavior on the scale of 100s of nanometers and beyond is independent of such limitations. Once longer, larger and more accurate simulations become available, one will be able to apply the formalism without alteration and obtain a model of the spatial organization of the condensate on an arbitrary scale, starting just from the local features of individual proteins. We now discuss these details on pp. 10, 11, and 13 of the revised manuscript.

      As an example, on p. 8 the authors discuss a potential asymmetry between the interactions found in the dilute (single-copy) and dense (24-mer) phases. These observations are somewhat in contrast to other observations in the field, namely that it is the same interactions that drive compaction of monomers as those that drive condensate formation.

      Obviously, both the results in the literature and those presented here could be true. But in order to substantiate the statements made here, the authors should show some substantial statistical analyses to make it clear which differences are robust. The above holds for all parts of the computational/simulation work (e.g. other aspects of Fig. 2)

      Note: this comment by the Reviewer echoes in several respects the comment 7 by the Editor. Because of this, our reply in some parts is identical to that given above to the Editor. We have decided to include it here for the ease of reading and completeness.

      An expectation of the symmetry between intra- and intermolecular modes of interaction emerged from the background of polymer theory, which was primarily aimed to describe the behavior of homopolymers. In the case of heteropolymers such as proteins, the asymmetry in the aforementioned modes is rather intuitive. For instance, if there is only a single Y in a protein, then Y-Y contacts will not be possible in the intramolecular context, but could occur in multichain interactions. However, we agree with the Reviewer that this is an important issue and have deepened the analysis of this phenomenon in the revised manuscript.

      First, our analysis shows that the observed asymmetry between intra- and intermolecular contexts is statistically significant and is likely not a consequence of limited sampling (pp. 10-11, Figure 3—figure supplement 1B-C). Moreover, the observed symmetry breaking is in line with the recent studies by Bremer et al. (https://doi.org/10.1038/s41557-021-00840-w) and Martin et al. (https://doi.org/10.1126/science.aaw8653), which have delineated the key requirements for the symmetry between single-chain and collective phase behavior to hold. Specifically, we have compared in detail the sequence composition of Lge1₁₋₈₀ with that of A1-LCD variants studied by Bremer et al. When it comes to aromatic composition, Lge1 is most similar to the -12F+12Y mutant of A1-LCD, and by this token, i.e. the high frequency of sticker tyrosines, should exhibit a strong coupling between single-chain and phase behavior. However, the net charge per residue (NCPR) in Lge1₁₋₈₀ of 0.075 is greater than that of A1-LCD (0.059) and this could contribute to the extent of decoupling, as suggested by Bremer et al. Moreover, Lge1 is extremely abundant in Arg (13.5 % as compared to 7.4 % in A1-LCD), and is in this sense most similar to the +7R A1-LCD mutant, which showed the greatest degree of decoupling between single-chain and phase behavior in Bremer et al., in agreement with what we see here. While these authors have demonstrated that NCPR is the primary determinant of decoupling in the case of A1-LCD mutants, their analysis showed that the nature of positive and negative residues involved also makes a significant difference. In particular, the significant excess of Arg residues, as context-dependent auxiliary stickers, could create the asymmetry between interactions that determine single-chain dimensions vs. collective phase behavior.
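
      The two composition measures used in this comparison are simple to compute from a sequence. The sketch below assumes the common definition of NCPR as (number of K and R minus number of D and E) divided by sequence length, ignoring histidine and the termini; the 20-residue toy sequence is invented for illustration and is not the Lge1 or A1-LCD sequence.

```python
def net_charge_per_residue(sequence):
    """(count of K and R) minus (count of D and E), divided by sequence length."""
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return (positive - negative) / len(sequence)

def residue_fraction(sequence, residue):
    """Fraction of positions occupied by the given one-letter residue."""
    return sequence.count(residue) / len(sequence)

# Hypothetical toy sequence (20 residues), for illustration only.
toy_sequence = "RGYSRGYDRGYSQGYNKGEG"
toy_ncpr = net_charge_per_residue(toy_sequence)
toy_arg_fraction = residue_fraction(toy_sequence, "R")
toy_tyr_fraction = residue_fraction(toy_sequence, "Y")
```

      Applying the same functions to two real sequences would reproduce the kind of side-by-side comparison of NCPR and Arg content discussed above.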

      Furthermore, Martin et al. (https://doi.org/10.1126/science.aaw8653) have shown that an approximately uniform distribution of stickers along the sequence is required for the correspondence between the driving forces behind coil-to-globule transitions and phase separation to hold. We have analyzed the patterning of Tyr residues along the Lge1₁₋₈₀ sequence using the Ωaro parameter used by Martin et al. (note that Tyr is the only aromatic in the Lge1₁₋₈₀ sequence). Interestingly, Lge1₁₋₈₀ exhibits a highly non-uniform patterning of Tyr residues, with Ωaro of the native Lge1 sequence (0.47) falling in the middle of the distribution for its shuffled variants (p=0.57). This is in contrast to the highly patterned sequences such as that of A1-LCD with p>0.99. Taken together, in addition to the relatively high NCPR, symmetry breaking in the case of Lge1₁₋₈₀ could be a consequence of its complex sequence composition, including both the non-uniform patterning of tyrosines and a high abundance of arginines. Provided that our simulations are long enough to provide an equilibrium picture and are on the length-scale of a single protein not strongly influenced by finite-size effects (these potential artifacts cannot be discounted), they actually can be seen as a demonstration of such symmetry breaking in a heteropolymer.

      Furthermore, analysis of pairwise contacts suggests that intra- and intermolecular interactions rely on a similar pool of contacts by amino-acid type, but differ significantly if one analyzes the specific sequence location of the interacting residues involved (Figure 2—figure supplement 1B and C). For example, one observes a high correlation between the frequencies of different contacts by amino-acid type when comparing intramolecular contacts in single-copy simulations and intermolecular contacts in 24-copy simulations (Figure 3—figure supplement 1B). This correlation is completely lost (Figure 3—figure supplement 1C) if one analyzes position-resolved statistics (2D pairwise contact maps) or statistically defined interaction modes (Figure 3A, and Figure 3—figure supplement 1A). For example, although Tyr-Tyr interactions dominate in both cases, in single-copy simulations of WT Lge1(1-80) the C-terminal Tyr80 barely participates in any intramolecular interactions with other residues (Figure 3—figure supplement 1A), while in 24-copy simulations it is one of the most intermolecularly interactive residues (Figure 3). In other words, while the symmetry between intra- and intermolecular interactions can be observed at the level of pairwise contact types (similar contact types are used in both cases), the distribution of these contacts along the peptide sequence is clearly different in the two cases. Finally, it should be mentioned that the parallel between single-copy and phase behavior in both homopolymers and heteropolymers is observed primarily at the level of thermodynamic variables such as the LLPS critical temperature (Tc), the coil-to-globule transition temperature (Tθ) or the Boyle temperature (TB). It is possible that the noted correspondence extends primarily to such and similar thermodynamic variables, while more structural, topological features of the globule in the single-molecule case and the network in the collective phase case remain uncoupled.
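
      The distinction between type-resolved and position-resolved contact statistics can be illustrated with a toy calculation. Everything below is hypothetical: the eight-residue sequence, the contact lists and the helper names are invented for illustration, and intermolecular pairs are symmetrized by residue index as a simplification.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def by_type(contacts, seq):
    """Count contacts by unordered amino-acid pair, e.g. ('R', 'Y')."""
    return Counter(tuple(sorted((seq[i], seq[j]))) for i, j in contacts)


def by_position(contacts):
    """Count contacts by unordered residue-index pair."""
    return Counter(tuple(sorted(c)) for c in contacts)


def correlate(c1, c2):
    """Pearson correlation over the union of keys of two Counters."""
    keys = sorted(set(c1) | set(c2))
    return pearson([c1.get(k, 0) for k in keys], [c2.get(k, 0) for k in keys])


seq = "YGRYGRYG"  # toy sequence
# Toy contact lists: same contact types, different sequence positions.
intra = [(0, 3)] * 3 + [(2, 3)] * 2 + [(1, 4)]
inter = [(3, 6)] * 3 + [(5, 6)] * 2 + [(4, 7)]

r_type = correlate(by_type(intra, seq), by_type(inter, seq))
r_pos = correlate(by_position(intra), by_position(inter))
print(r_type, r_pos)
```

      In this constructed example the type-level correlation is perfect while the position-level correlation is not, which mirrors the qualitative pattern described above without reproducing any of the reported data.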

      Interestingly, the core of intramolecular interactions observed for a single molecule at infinite dilution and in the crowded context remains approximately the same, as reflected in the high correlation between intramolecular modes obtained in single- and multichain simulations. Namely, proteins keep core self-contacts and establish new ones with neighbors, but do not donate everything to the intermolecular network and lose their “self-identity”, as happens in homopolymer melts. Similar effects have also been observed elsewhere: https://doi.org/10.1073/pnas.2000223117, https://doi.org/10.1073/pnas.1804177115.

      Similarly, how were the errors of the radius of gyration for WT, R>K and Y>A mutants calculated? Is the Rg for WT significantly smaller than the values for the two mutants? And are the differences in Rg between single-copy and multi-copy simulations statistically significant? I am asking since converging the Rg of IDPs of this length in all-atom MD is not easy.

      The errors for Rg values correspond to the standard deviations of the underlying distributions and are reported in Figure 4A and B, together with the corresponding means and an assessment of the statistical significance of the differences. In particular, the character of the distributions (especially for the 24-copy systems) also suggests significant differences. To expand on this point in the revised version, we have added a new supplementary table (Supplementary File 1, “Technical summary”) in which we have included the average values of Rg together with the standard deviations for all modeled systems. Because the distributions are non-Gaussian, we have estimated the significance of the differences in Rg between single-copy and multicopy simulations, as well as between WT and mutants, using the Wilcoxon rank sum test with continuity correction, with resulting p-values < 2.2 × 10^-16 in all cases.
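
      For readers unfamiliar with the test, a minimal sketch of a two-sided Wilcoxon rank sum (Mann-Whitney) test with tie and continuity corrections is given below. It uses the normal approximation and only the Python standard library; the function name `rank_sum_test` and the toy samples are assumptions of this sketch and are not the authors' code or data. The test name quoted above matches the title printed by R's `wilcox.test`, which reports very small p-values as "< 2.2e-16"; this is a remark about that output format only.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation with tie and
    continuity corrections). Returns (U statistic of x, p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = r1 - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    diff = u - mean_u
    # Continuity correction: move |diff| half a unit toward zero.
    z = (abs(diff) - 0.5) / sqrt(var_u) if diff != 0 else 0.0
    p = erfc(z / sqrt(2))                   # two-sided tail probability
    return u, min(1.0, p)


# Toy samples: complete separation versus identical samples.
u_sep, p_sep = rank_sum_test(list(range(30)), list(range(30, 60)))
u_same, p_same = rank_sum_test([1, 2, 3], [1, 2, 3])
print(p_sep, p_same)
```

      The rank-based construction is what makes the test suitable for non-Gaussian Rg distributions: only the ordering of the pooled values enters the statistic.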

      p. 12, line 251: Has the MIST formalism been validated for IDPs; if so please provide a reference.

      In the present work, we have evaluated the configurational entropy using a mutual information expansion approach with maximum-spanning-tree (MIST) approximation in internal-coordinate (bond-angle-torsion) representation. The latter is particularly well-suited for the analysis of IDPs as it allows one to avoid a number of artifacts (e.g., due to fitting of disordered ensembles to the average structure) associated with the more widely-used Cartesian-coordinate-based quasi-harmonic approaches. In particular, the MIST approach was used previously for the analysis of disordered protein ensembles (https://doi.org/10.1021/acs.jctc.8b00100). Here it should also be noted that, since intramolecular couplings are in general lower in IDPs, this makes them even better suited for MIST as compared to globular proteins. We have highlighted these points on p. 13 of the revision.
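
      The idea behind the MIST approximation can be sketched in a few lines: the joint entropy is approximated by the sum of one-dimensional entropies minus the pairwise mutual information terms along a maximum spanning tree of the mutual-information graph. The sketch below is a toy version under stated assumptions: it works on already discretized variables (for example binned torsions) with plug-in estimates, whereas a production analysis of bond-angle-torsion coordinates involves continuous variables and bias control. All names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of discrete states."""
    n = len(samples)
    return -sum(c / n * log(c / n) for c in Counter(samples).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two aligned discrete sequences."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def mist_entropy(columns):
    """Sum of marginal entropies minus the mutual information carried by a
    maximum spanning tree over all variable pairs (Prim's algorithm)."""
    d = len(columns)
    h1 = sum(entropy(c) for c in columns)
    mi = {(i, j): mutual_information(columns[i], columns[j])
          for i in range(d) for j in range(i + 1, d)}
    in_tree = {0}
    tree_mi = 0.0
    while len(in_tree) < d:
        # Strongest edge connecting the current tree to a new variable.
        best = max((e for e in mi if (e[0] in in_tree) != (e[1] in in_tree)),
                   key=mi.get)
        tree_mi += mi[best]
        in_tree.update(best)
    return h1 - tree_mi


# Toy data: x1 duplicates x0, x2 is independent of both.
x0 = [0, 0, 1, 1]
x1 = list(x0)
x2 = [0, 1, 0, 1]
s_mist = mist_entropy([x0, x1, x2])
print(s_mist, log(4))
```

      In this toy case the tree captures the only coupling present, so the approximation recovers the exact joint entropy; weaker couplings, as argued above for IDPs, leave less information outside the tree.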

      p. 5, line 105, p. 16 line 334 and p. 18 line 283: It is not completely clear what the predictions are and what/which experiments they are compared to. On p. 16, exactly what does the analytical model predict? As far as I understand, the results from the MD simulations are input to the model, but I am probably missing something. Which concrete and testable predictions does the model enable?

      A key contribution of the present work is the development of a quantitative model that treats the spatial organization of a biomolecular condensate across scales using two key properties of individual polymer chains in the condensate - their average valency and compactness. The main predictions of the model concern the presence of a particular scaling of condensate mass with its radius, M(R), as captured by the fractal dimension, and the consequences this has on condensate morphology across scales. In the present manuscript, we have taken the first steps in testing these predictions in four different contexts. First, we could show that the MD simulations indeed match the predictions of fractal scaling for the three smallest clusters, which relates to the discussion on p. 16 that the Reviewer refers to. Here, it is important to understand that MD simulations in the first instance just give the average valency and compactness of individual chains in the dense phase. These values are then input into the fractal scaling formalism, which is conceptually fully independent from MD simulations, to obtain the dependence of condensate mass on its radius, M(R), at any desired length scale. The analysis presented in Figure 5—figure supplement 1B and discussed on p. 16 shows that the predictions of fractal scaling for the first three smallest clusters indeed correspond to what is seen in MD. This is a non-trivial correspondence and can be taken as direct evidence that fractal organization is present even at the shortest scale, i.e. at the level of MD simulation boxes.
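
      The scaling relation referred to above, M(R) ∝ R^dF, implies a straight line with slope dF in a log-log plot of cluster mass against cluster radius. A minimal sketch of estimating that slope by least squares is shown below. The function name and the synthetic data are assumptions; the data are generated from an exact power law purely to exercise the fit and are not simulation results.

```python
from math import log


def fit_fractal_dimension(radii, masses):
    """Least-squares slope and intercept of log(M) versus log(R)."""
    xs = [log(r) for r in radii]
    ys = [log(m) for m in masses]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept


# Synthetic, noise-free data following M = 3 * R**1.63 (arbitrary units).
radii = [1.0, 2.0, 4.0, 8.0]
masses = [3.0 * r ** 1.63 for r in radii]
d_f, _ = fit_fractal_dimension(radii, masses)
print(d_f)
```

      With only three cluster sizes available from MD, as in the comparison described above, such a fit is a consistency check rather than an independent measurement of the exponent.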

      Second, the model was used to reconstruct the spatial organization of clusters of arbitrary size at the atomistic level (Figure 5A and B, Videos 4, 5, and 6), enabling a structural understanding of the organization of the condensate interior. One direct practical application of such understanding concerns the nature of cavity sizes and the interpretation of dextran partitioning experiments (p. 20). Moreover, as pointed out above, differences in the morphology of protein clusters propagate across scales and can be qualitatively characterized by the analysis of microscopic images (see also the discussion above). In particular, the model correctly predicts the difference in the behavior of the WT and R>K variants as opposed to the Y>A variant, solely based on the predicted fractal dimension they exhibit. Ultimately, however, static light scattering experiments would give the best possibility to test the model directly and will be the topic of our future work. In particular, the fractal formalism predicts significant regions of linear behavior in such curves in log-log representation, while the fractal dimension, dF, provides a quantitative point of comparison between theoretical predictions and experimental measurements (Figure 5C). These points have been further discussed on p. 21 of the revised manuscript.

      p. 19, lines 408-411: The authors find that when building clusters of Y>A from the simulations they find filamentous structures that they suggest explain the aggregation of the Y>A variant at high concentrations. While that sounds like an intriguing suggestion, it would be useful with a bit more detail about the robustness of this observation. For example, the simulations of Y>A appear similar to that of R>K; are the differences in topology really significantly different?

      The fractal dimension, dF, is the key parameter that defines the self-similar organization of differently sized protein clusters according to the fractal model. Consequently, the difference in morphology between the R>K and Y>A mutants is reflected in different values of dF for the two. In particular, with a dF of 1.63, the Y>A mutant is predicted to form low-dimensional clusters, straddling the range between a linear (1-dimensional) and a planar (2-dimensional) object, unlike the WT and R>K variants, which both exhibit dF values greater than 2. The qualitative behavior of the three variants, whereby WT and R>K result in spherical condensates and Y>A does not, is consistent with this. Notably, we have observed sporadic precipitates at high protein concentration in the Y>A mutant, which may be consistent with the predictions of the fractal model. However, the material properties and the possible influence of sample impurities in the Y>A case at high concentrations remain unclear. Moreover, the sporadic nature of the Y>A precipitates prevents an adequate statistical analysis. Hence, in the revised manuscript we refrain from commenting on these infrequently observed precipitates.

      Regarding MD simulations, the morphological differences between Y>A and R>K proteins can already be seen at the level of individual proteins in multicopy simulations, highlighted by the significantly different distribution of Rg (Figure 4B). This distribution in the case of Y>A has a prominently long tail, which indicates the possibility of adopting significantly more elongated configurations. Due to the self-similarity principle, such differences in morphology may propagate across length scales. Importantly, a recent publication included the experimental study of the possibility of IDRs to form low dimensional fractal systems upon disruption of the LLPS tendency by polyalanine insertion in synthetic elastin-like polypeptides (Roberts et al., Nature Materials, 2018).
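
      To make the notion of a long-tailed Rg distribution concrete, the sketch below computes an unweighted radius of gyration from coordinates and a simple tail measure (the ratio of an upper quantile to the median). Both helpers and all numbers are hypothetical; MD analysis tools typically report a mass-weighted Rg, which this sketch does not do.

```python
from math import sqrt


def radius_of_gyration(coords):
    """Unweighted Rg: root-mean-square distance from the geometric center."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    msd = sum(sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords) / n
    return sqrt(msd)


def tail_ratio(values, q=0.95):
    """Ratio of the q-quantile to the median, using nearest-rank quantiles.
    Values well above 1 indicate a pronounced upper tail."""
    s = sorted(values)

    def quantile(frac):
        return s[int(frac * (len(s) - 1))]

    return quantile(q) / quantile(0.5)


# Two points one unit either side of the origin have Rg = 1.
rg = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])

# Toy Rg samples (arbitrary units): a narrow and a long-tailed distribution.
compact = [1.0] * 20
long_tailed = [1.0] * 15 + [3.0] * 5
print(rg, tail_ratio(compact), tail_ratio(long_tailed))
```

      A tail measure of this kind complements the mean and standard deviation when, as noted above, the difference between variants lies mainly in the occurrence of elongated configurations.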

      Finally, I would suggest that the authors make their code and data available in electronic format.

      All sharable data has been made available as part of the article package. Due to the heterogeneous character of our analysis, we do not have a single master code to be shared, but rather a collection of different scripts in combination with different software packages as indicated in the Methods section of the manuscript (GROMACS, MATLAB, R, FracVAL).

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript confirms previous studies suggesting a great deal of heterogeneity of gene expression at the neural plate border in early vertebrate embryos, as neural, placodal, neural crest, and epidermal lineages gradually segregate. Using scRNA-seq, the study expands previous studies by using far larger numbers of genes as evidence of this heterogeneity. The evidence for this heterogeneity and the change in heterogeneity over time is compelling.

      Many studies have suggested that there is considerable heterogeneity of gene expression in the developing neural plate border as the neural, neural crest, placodal and epidermal lineages segregate. Although the evidence for such heterogeneity was strong, until the advent of scRNA-seq, the extent of this heterogeneity was not appreciated. By using scRNA-seq at different stages of chick development, the authors sought to characterize how this heterogeneity develops and resolves over time.

      The work is technically sound, and the level of analysis of gene expression, clustering, synexpression groups, and dynamic changes in gene modules over time is state-of-the-art. A weakness of the results as they stand now is that the conclusions of the analysis are not tested by the authors and thus, are over-interpreted. Such tests could be performed in future studies either by gain- and loss-of-function experiments or by using lineage tracing to demonstrate that the cell states the authors observe - especially the "unstable progenitors" they characterize - are biologically meaningful. The data will nevertheless be a useful resource to investigators interested in understanding the development of different cell lineages at the neural plate border.

      We thank the reviewer for the positive assessment of our work. We agree that our models will need to be tested experimentally in the future; however, this will require a substantial amount of work. We therefore opted to share our data as a resource to be used by the community.

      Reviewer #2 (Public Review):

      The study of Thiery et al. aims to elucidate how cells undergo fate decisions between neural crest and (pan-) placodal cells at the neural plate border (NPB). While several previous single-cell RNA-Seq studies in vertebrates have included neural plate border cells (e.g. Briggs et al., 2018; Wagner et al., 2018; Williams et al., 2022), these previous studies did not provide conclusive insights on cell fate decisions between neural crest and placodes, due to either the limited number of genes recovered, the limited number of cells sampled or the limited numbers of stages included. The present study overcomes these limitations by analyzing almost 18,000 cells at six stages of development ranging from gastrulation until after neural tube closure (8 somite-stage), with an average depth of almost 4000 genes/cell. Using this extensive and high-quality data set, the study first describes the timing of segregation of neural crest and placodal lineages at the NPB suggesting that at late neural fold stages (somite stage 4) most cells have decided between placodal and neural crest fates. It then identifies gene modules specific for neural crest and placodal lineages and characterizes their temporal and spatial expression. Focusing on an NPB-specific subset of cells, the study then shows that initially most of these cells co-express neural crest and placodal gene modules suggesting that these are undecided cells, which they term "border-located unstable progenitors" (BLUPs). The proportion of BLUPs decreases over time, while cells classified as placodal or neural crest cells increases, with few BLUPs remaining at late neural fold stages (and a few scattered BLUPs even at somite stage 8). 
Based on these findings, the authors propose a new model of cell fate decisions at the NPB (termed the "gradient border model"), according to which the NPB is not defined by a specific transcriptional state but is rather a region of undecided cells, which diminishes in size between gastrulation and neural fold stages due to more and more cells committing to a placodal or neural crest fate based on their mediolateral position (with medial cells becoming specified as neural crest and lateral cells as placodal cells).

      The study of Thiery et al. provides an unprecedentedly detailed, methodologically careful, and well-argued analysis of cell fate decisions at the NPB. It provides novel insights into this process by clearly demonstrating that the NPB is an area of indecision, in which cells initially co-express gene modules for ectodermal fates (neural crest and placodes), which subsequently become segregated into mutually exclusive cell populations. The paper is very well written and largely succeeds in presenting the very complex strategy of data analysis in a clear way. By addressing the earliest cell fate decisions in the ectoderm and one of the earliest cell fate decisions in the developing vertebrate embryo, this study will have a significant impact and be of interest to a wide audience of developmental biologists. There are, however, two conceptual issues raised in the paper that require further discussion.

      We thank the reviewer for the positive comments on our work and its significance; we have addressed the conceptual issues below and in the revised version of the manuscript.

      First, the authors suggest that their data resolve a conflict between two previously proposed models, the "binary competence model" and the "neural plate border model". The authors correctly describe that the binary competence model proposed by Ahrens and Schlosser (2005) and Schlosser (2006) suggests that the ectoderm is first divided into two territories (neural and non-neural), which differ in competence, with the neural territory subsequently giving rise to the neural plate and neural crest and the non-neural territory giving rise to placodes and epidermis (sequence of cell-fate decisions: [neural or neural crest]-[epidermal or placodal]). This model was proposed as an alternative to a "neural plate border state model", which instead suggests that initially the NPB is induced as a territory characterized by a specific transcriptional state, from which then neural crest and placodes are induced by different signals (sequence of cell fate decisions: neural-[placodal or neural crest]-epidermal) (see Schlosser, 2006, 2014). Instead, in this paper, the authors contrast the binary competence model with a model they call the "neural plate border" model, according to which the NPB can give rise to all four ectodermal fates with equal probability. However, I think this misses the main point of contention, since all previously proposed models are in agreement that initially the neural plate border region is unspecified and can give rise to all four fates and that lineage restrictions only appear over time. The "binary competence" and "neural plate border state" models differ, however, in their predictions about the sequence in which these fate restrictions occur.

      We appreciate the reviewer's thoughtful feedback, but respectfully disagree with their comment regarding the sequence of events predicted by the neural plate border (NPB) model. While the NPB model does suggest that the NPB is a transcriptionally distinct state, it does not make specific predictions about the sequence of fate decisions. Although several papers cited in the Schlosser 2006 and 2014 reviews suggest that the NPB gives rise to all four ectodermal fates, none of them (and, to the best of our knowledge, no other primary paper referring to the NPB model) specifically defines the sequence of fate specification from the NPB.

      The key points of the NPB model are that the NPB is defined by overlapping expression of early neural/non-neural markers (which is also observed in Xenopus – see Pieper et al., 2012 supplementary material), contains progenitors for all four ectodermal fates, and that this "state" exists prior to the emergence of definitive neural crest and placodal cells.

      To investigate the heterogeneity in the order of cell fate decisions at the NPB, we carried out additional pairwise co-expression analyses of forebrain, mid-hindbrain, neural crest, and placodal gene modules, which reveal multiple different hierarchies of cell fate choice depending on a cell's axial position, as shown in Figure 6-figure supplement 1.
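
      The general idea of scoring gene-module co-expression per cell can be sketched as follows. This is a simplified illustration and not the analysis used in the study: the module score (mean expression of the module's genes), the fixed threshold, the function names and the placeholder gene identifiers are all assumptions of this sketch.

```python
def module_score(cell, module_genes):
    """Mean expression of a module's genes in one cell (missing genes = 0)."""
    return sum(cell.get(g, 0.0) for g in module_genes) / len(module_genes)


def classify_cell(cell, modules, threshold=1.0):
    """Label a cell by the modules whose score reaches the threshold.
    Cells with two or more active modules are reported as undecided."""
    active = sorted(name for name, genes in modules.items()
                    if module_score(cell, genes) >= threshold)
    if not active:
        return "unassigned"
    if len(active) == 1:
        return active[0]
    return "undecided (" + "+".join(active) + ")"


# Placeholder module definitions; the gene names are not real markers.
modules = {
    "neural_crest": ["NC_A", "NC_B"],
    "placodal": ["PL_A", "PL_B"],
}
committed = {"NC_A": 2.0, "NC_B": 1.5, "PL_A": 0.1}
mixed = {"NC_A": 1.2, "NC_B": 1.4, "PL_A": 1.6, "PL_B": 1.0}

print(classify_cell(committed, modules))
print(classify_cell(mixed, modules))
```

      Applying such a rule to every pair of modules would give, for each cell, the set of lineage programs that are simultaneously active, which is the kind of information a pairwise co-expression analysis summarizes.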

      Considering these findings, we have expanded our discussion of the previously proposed binary competence and neural plate border models to highlight how neither of these models is sufficient to fully characterize the heterogeneity in cell fate decisions observed in our study. We hope this clarification will help address any concerns the reviewer may have had about the NPB model and its implications for our results.

      Second, the authors should be more careful when relating their data to the specification or commitment of cells. Questions of specification and commitment can only be tested by experimental manipulation and cannot be inferred from a transcriptome analysis of normal development. So the conclusion that the activation of placodal, neural and neural crest-specific modules in that sequence suggests a sequence of specification in the same temporal order (lines 706-709) is not justified. Studies from the authors' own lab previously showed that epiblast cells from pre-gastrula stages are specified to express a large number of NPB border markers including neural crest and panplacodal markers, when cultured in vitro (Trevers et al., 2018; see also Basch et al., 2006 for early specification of the neural crest), which is not easily reconciled with this interpretation. I am not aware of any experimental evidence that shows that a panplacodal regulatory state is specified prior to neural crest in the chick (although I may have missed this). In Xenopus, experimental studies have shown instead that neural crest is specified and committed during late gastrulation, while the panplacodal states are specified much later, at neural fold stages (Mancilla and Mayor, 2006; Ahrens and Schlosser, 2005). It may well be the case that the relative timing of neural crest and panplacodal specification is different between species (and such easy dissociability may even be expected from the perspective of the binary competence model).

      We very much agree with the reviewer that definitions and correct terminology are important, and we apologise for the lack of clarity. We have reworded the text carefully.

      The reviewer is correct: specification of neural crest, placodes and neural plate is observed very early in chick, prior to gastrulation. However, in specification experiments tissue is removed from its normal environment to reveal what it does autonomously in the absence of additional signals. In the current study, we assess the activation of gene modules in normal development. We have therefore reworded the text to avoid ‘specification’ in this context.

      Reviewer #3 (Public Review):

      The goal of this work was to better understand how cell fate decisions at the neural plate border (NPB) occur. There are two prevailing models in the field for how neural, neural crest and placode fates emerge: (i) binary competence which suggests initial segregation of ectoderm into neural/neural crest versus placode/epidermis; (ii) neural plate border, where cells have mixed identity and retain the ability to generate all the ectodermal derivatives until after neurulation begins.

      The authors use single-cell sequencing to define the development of the NPB at a transcriptional level and suggest that their cell classification identified increased ectodermal cell diversity over time and that as cells age their fate probabilities become transcriptionally similar to their terminal state. The observation of a placode module emerging before the neural and neural crest modules is somewhat consistent with the binary competence model but the observation of cells with potentially mixed identity at earlier stages is consistent with the neural plate border model.

      Differences in the timing of analyses and techniques used can account for the generation of these two original models, and in essence, the authors have found some evidence for both models, possibly due to the period over which they performed their studies. However, the authors propose recognizing the neural plate border as an anatomical structure, containing transcriptionally unstable progenitors and that a gradient border model defines cell fate choice in concert with spatiotemporal positioning.

      The idea that the neural plate border is an anatomical structure is not new to most embryologists as this has been well-recognized in lineage tracing and transplantation assays in many different species over many decades.

      We appreciate the reviewer's comment and agree that the neural plate border has previously been characterised anatomically. However, many studies have applied the term literally in reference to a transcriptional state which is specified through the expression of ‘neural plate border specifiers’, prior to segregation of the placodes and neural crest. Here we highlight that treating the neural plate border as a definitive transcriptional state which can be identified through the expression of ‘neural plate border specifiers’ is false. Instead, we find that these ‘specifiers’ are upregulated within either neural crest, placodal or neural cell lineages over time. Cells at the neural plate border co-express these alternate lineage markers and are therefore predicted to be undecided.

      The authors don't provide molecular evidence for transcriptional instability in any cells. It's a molecular term and phenomenon inaccurately applied to these cells that are simply bipotential progenitors.

      We thank the reviewer for pointing this out; we have therefore refrained from using the term unstable and instead refer to the cells as ‘undecided’ as suggested by reviewer 2.

      Lastly, there's no evidence of a gradient that fits the proper biochemical or molecular definition. Graded or sequential are more appropriate terms that reflect the lineage determination or segregation events the authors characterize, but there's no data provided to support a true role for a gradient such as that achieved by a concentration or time-dependent morphogen.

      We agree with the reviewer that ‘gradient’ was misleading. We have now replaced ‘gradient’ with ‘graded’ and expanded figure 6 to highlight the graded co-expression of gene modules associated with alternate fates. We have changed the title to reflect this.

      A limitation of the study is that much of it reads like a proof-of-principle because validation comes primarily from known genes, their expression patterns in vivo, and their subsequent in vivo functions. Thus, the authors need to qualify their interpretations and conclusions and provide caveats throughout the manuscript to reflect the fact that no functional testing was performed on any novel genes in the emerging modules classified as placode versus neural or neural crest.

      We agree with the reviewer that we do not provide any functional data to validate our predictions; it is for this reason that we submitted the manuscript as a ‘resource’ to make our data available to the community.

      Lastly, a limitation of gene expression studies is that they provide snapshots of cells in time and, while implying that cells have broad potential or are lineage fated, do not actually test and confirm their ultimate fate. Therefore, in parallel with their studies, the authors really need to consider the wealth of lineage tracing data, especially single-cell lineage tracing, which has been performed using embryos of the same stage as those sequenced in this study, and which has revealed critical data about the potential of cells and about when and where lineage segregation and cell fate determination occur.

      The reviewer rightly points out the significance of the classical experiments in the context of the neural plate border. However, only one of the mentioned studies (Bronner-Fraser and Fraser, 1989) analyses cells at a single-cell level, and it does not assess placodes, while the remaining studies use tissue transplantation or cell population labelling. Although these studies provide valuable insights, they do not examine the fate or potential of single cells, nor do they reveal the transcriptional signature of these progenitors.

      Our findings emphasize the transcriptional heterogeneity at the neural plate border, suggesting that distinct subsets of neural plate border progenitors undergo varying sequences of fate restrictions. The upcoming challenge will be to conduct clonal analysis alongside scRNAseq to determine if neural plate border progenitors with similar transcriptional signatures experience the same fate restrictions or if external factors, such as cell-cell signalling, dictate cell fate choices.

      We have amended the manuscript to clarify that predictions of fate decisions require future validation through lineage tracing. Additionally, we have acknowledged in the introduction that previous studies have demonstrated the intermingling of neural, neural crest, and placodal progenitors at the neural plate border.

    1. Author Response

      Reviewer #1 (Public Review):

      This article describes the application of a computational model, previously published in 2021 in Neuron, to an empirical dataset from monkeys, previously published in 2018 in eLife. The 2021 modeling paper argued that the model can be used to determine whether a particular task depends on the perirhinal cortex as opposed to being soluble using ventral visual stream structures alone. The 2018 empirical paper used a series of visual discrimination tasks in monkeys that were designed to contain high levels of 'feature ambiguity' (in which the stimuli that must be discriminated share a large proportion of overlapping features), and yet animals with rhinal cortex lesions were unimpaired, leading the authors to conclude that perirhinal cortex is not involved in the visual perception of objects. The present article revisits and revises that conclusion: when the 2018 tasks are run through the 2021 computational model, the model suggests that they should not depend on perirhinal cortex function after all, because the model of VVS function achieves the same levels of performance as both controls and PRC-lesioned animals from the 2018 paper. This leads the authors of the present study to conclude that the 2018 data are simply "non-diagnostic" in terms of the involvement of the perirhinal cortex in object perception.

      We appreciate the Reviewer’s careful reading and synthesis of the background and general findings of this manuscript.

      The authors have successfully applied the computational tool from 2021 to empirical data, in exactly the way the tool was designed to be used. To the extent that the model can be accepted as a veridical proxy for primate VVS function, its conclusions can be trusted and this study provides a useful piece of information in the interpretation of often contradictory literature. However, I found the contribution to be rather modest. The results of this computational study pertain to only a single empirical study from the literature on perirhinal function (Eldridge et al, 2018). Thus, it cannot be argued that by reinterpreting this study, the current contribution resolves all controversy or even most of the controversy in the foregoing literature. The Bonnen et al. 2021 paper provided a potentially useful computational tool for evaluating the empirical literature, but using that tool to evaluate (and ultimately rule out as non-diagnostic) a single study does not seem to warrant an entire manuscript: I would expect to see a reevaluation of a much larger sample of data in order to make a significant contribution to the literature, above and beyond the paper already published in 2021. In addition, the manuscript in its current form leaves the motivations for some analyses under-specified and the methods occasionally obscure.

      We believe that our comments outline our rationale for focusing our current analysis on data from Eldridge et al. In brief, these data provide compelling evidence against PRC involvement in perception, and are the only such data with PRC-lesioned/-intact macaques that we were able to secure the stimuli for. As such, data from Eldridge et al. provide a singular opportunity to address discrepancies between human and macaque lesion data. For this reason, we propose the current work as a Research Advance Article type, building off of a manuscript that was previously published in eLife.

      Reviewer #2 (Public Review):

      The goal of this paper is to use a model-based approach, developed by one of the authors and colleagues in 2021, to critically re-evaluate the claims made in a prior paper from 2018, written by the other author of this paper (and colleagues), concerning the role of perirhinal cortex in visual perception. The prior paper compared monkeys with and without lesions to the perirhinal cortex and found that their performance was indistinguishable on a difficult perceptual task (categorizing dog-cat morphs as dogs or cats). Because the performance was the same, the conclusion was that the perirhinal cortex is not needed for this task, and probably not needed for perception in general, since this task was chosen specifically to be a task that the perirhinal cortex might be important for. Well, the current work argues that in fact the task and stimuli were poorly chosen since the task can be accomplished by a model of the ventral visual cortex. More generally, the authors start with the logic that the perirhinal cortex gets input from the ventral visual processing stream and that if a task can be performed by the ventral visual processing stream alone, then the perirhinal cortex will add no benefit to that task. Hence to determine whether the perirhinal cortex plays a role in perception, one needs a task (and stimulus set) that cannot be done by the ventral visual cortex alone (or cannot be done at the level of monkeys or humans).

      There are two important questions the authors then address. First, can their model of the ventral visual cortex perform as well as macaques (with no lesion) on this task? The answer is yes, based on the analysis of this paper. The second question is, are there any tasks that humans or monkeys can perform better than their ventral visual model? If not, then maybe the ventral visual model (and biological ventral visual processing stream) is sufficient for all recognition. The answer here too is yes, there are some tasks humans can perform better than the model. These then would be good tasks to test with a lesion approach to the perirhinal cortex. It is worth noting, though, that none of the analyses showing that humans can outperform the ventral visual model are included in this paper - the papers which showed this are cited but not discussed in detail.

      Major strength:

      The computational and conceptual frameworks are very valuable. The authors make a compelling case that when patients (or animals) with perirhinal lesions perform equally to those without lesions, the interpretation is ambiguous: it could be that the perirhinal cortex doesn't matter for perception in general, or it could be that it doesn't matter for this stimulus set. They now have a way to distinguish these two possibilities, at least insofar as one trusts their ventral visual model (a standard convolutional neural network). While of course, the model cannot be perfectly accurate, it is nonetheless helpful to have a concrete tool to make a first-pass reasonable guess at how to disambiguate results. Here, the authors offer a potential way forward by trying to identify the kinds of stimuli that will vs won't rely on processing beyond the ventral visual stream. The re-interpretation of the 2018 paper is pretty compelling.

      We thank the Reviewer for the careful reading of our manuscript and for providing a fantastic synthesis of the current work.
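
      The interpretive logic summarized by the Reviewer can be sketched in a few lines. This is only an illustration: the function name `interpret_lesion_result`, the tolerance `tol`, and the returned labels are hypothetical and are not taken from the manuscript's analysis code.

```python
def interpret_lesion_result(model_acc, intact_acc, lesioned_acc, tol=0.02):
    """Classify a stimulus set given three accuracies (all in [0, 1]).

    model_acc    : accuracy of the ventral-visual-stream (VVS) model
    intact_acc   : accuracy of PRC-intact subjects
    lesioned_acc : accuracy of PRC-lesioned subjects
    tol          : assumed tolerance for calling two accuracies equal
    """
    # Does the VVS model alone reach the performance of intact subjects?
    model_sufficient = model_acc >= intact_acc - tol
    # Do lesioned subjects perform worse than intact subjects?
    lesion_deficit = (intact_acc - lesioned_acc) > tol

    if model_sufficient and not lesion_deficit:
        # A null lesion effect says nothing about PRC for this stimulus set.
        return "non-diagnostic"
    if not model_sufficient and lesion_deficit:
        return "consistent with PRC involvement"
    if not model_sufficient and not lesion_deficit:
        # Either another region contributes, or the VVS model is inadequate.
        return "ambiguous: no PRC deficit, model below behavior"
    return "unexpected: model suffices yet lesion impairs"


print(interpret_lesion_result(model_acc=0.90, intact_acc=0.90, lesioned_acc=0.89))
```

      The third branch corresponds to the limitation raised below: a model that underperforms subjects does not by itself implicate a region outside the VVS.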

      Major weakness:

      It is not clear that an off-the-shelf convolution neural network really is a great model of the ventral visual stream. Among other things, it lacks eccentricity-dependent scaling. It also lacks recurrence (as far as I could tell).

      We agree with the Reviewer completely on this point: there is little reason to expect that off-the-shelf convolutional neural networks should predict neural responses from the ventral visual stream, for the reasons outlined above (no eccentricity-dependent scaling, no recurrence) as well as others (weight sharing is biologically implausible, as are the data distributions and objective functions used to optimize these models). Perhaps surprisingly, these models do provide quantitatively accurate accounts of information processing throughout the VVS; while this is well established within the literature, we were careless to simply assert this as a given without providing an account of these data. We thank the Reviewer for making this clear, and we have changed the manuscript in several critical ways in order to avoid making unsubstantiated claims in the current version. We hope that these changes also make it easier for the casual reader to appreciate the logic in our analyses. First, in the introduction, we outline some of the prior experimental work that demonstrates how deep learning models are effective proxies for neural responses throughout the VVS. We also demonstrate this model-neural fit in the current paper using electrophysiological recordings, while including comments about the limitations of these models raised by the Reviewer.

      In the introduction we also more clearly demarcate prior contributions from our recent computational work, and highlight how models approximate the performance supported by a linear readout of the VVS, but fail to reach human-level performance.

      Results from these analyses were essential to understanding the logic of the paper but previously (as noted by the Reviewer) this critical evidence was cited but not directly presented. We now describe these data more thoroughly in the introduction and have substantially changed Figure 1 in order to visualize these data (b).

      Moreover, we include an overview of the methods and data used to generate these plots in the results and methods sections.

      While there is little reason to expect that off-the-shelf convolutional neural networks should predict neural responses from the ventral visual stream, we believe that these modifications to the manuscript (to the introduction and figure one, as well as the results and methods sections) make clear that these models are nonetheless useful methods for predicting VVS responses and the behaviors that depend on the VVS.

      To the authors' credit, they show detailed analysis on an image-by-image basis showing that in fine detail the model is not a good approximation of monkey choice behavior. This imposes limits on how much trust one should put in model performance as a predictor of whether the ventral visual cortex is sufficient to do a task or not. For example, suppose the authors had found that their model did more poorly than the monkeys (lesioned or not lesioned). According to their own logic, they would have, it seems, been led to the interpretation that some area outside of the ventral visual cortex (but not the perirhinal cortex) contributes to perception, when in fact it could have simply been that their model missed important aspects of ventral visual processing. That didn't happen in this paper, but it is a possible limitation of the method if one wanted to generalize it. There is work suggesting that recurrence in neural networks is essential for capturing the pattern of human behavior on some difficult perceptual judgments (e.g., Kietzmann et al 2019, PNAS). In other words, if the ventral model does not match human (or macaque) performance on some recognition task, it does not imply that an area outside the ventral stream is needed - it could just be that a better ventral model (eg with recurrence, or some other property not included in the model) is needed. This weakness pertains to the generalizability of the approach, not to the specific claims made in this paper, which appear sound.

      We could not agree more with the Reviewer on these points. It could have been the case that these models' lack of correspondence with known biological properties (e.g. recurrence) led them to lack something important about VVS-supported performance, and that this would derail the entire modeling effort here. Surprisingly, this has not been the case, as is evident in the clear correspondence between model performance and monkey data in Eldridge et al. 2018. Nonetheless, we would expect that other experimental paradigms should be able to reveal these model failings. And future work evaluating PRC involvement in perception must contend with this very problem in order to move forward with this modeling framework. That is, it is of critical importance that these VVS models and the VVS itself exhibit similar failure modes, otherwise it is not possible to use these models to isolate behaviors that may depend on PRC.

      A second issue is that the title of the paper, "Inconsistencies between human and macaque lesion data can be resolved with a stimulus-computable model of the ventral visual stream" does not seem to be supported by the paper. The paper challenges a conclusion about macaque lesion data. What inconsistency is reconciled, and how?

      It appears that this point was lost in the original manuscript; we have tried to clarify this idea in both the abstract and the introduction. In summary, the cumulative evidence from the human lesion data suggests that PRC is involved in visual object perception, while there are still studies in the monkey literature that suggest otherwise (e.g. Eldridge et al. 2018). In this manuscript, we suggest that this apparent inconsistency is, in fact, simply a consequence of reliance on informal interpretations of the monkey lesion data.

      We have made substantive changes to the abstract so this is an obvious, central claim.

      We have also made substantive changes to the introduction to make resolving this cross-species discrepancy a more central aim of the current manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript describes experiments that lead to a potentially impactful result and most of the data seem very nice. The authors conducted a mutant screen to find the gene BbCrpa from a fungus resistant to cyclosporine A (CsA). Microscopy indicates that the mode of action is likely sequestration of the toxin in vacuoles, mediated through the P4-ATPase pathway. They also show that expression of BbCrpa in Verticillium renders that fungus resistant to CsA. The paper then takes a very large jump across kingdoms and toxins and asks if BbCrpa, expressed in plants, will confer resistance to a different toxin (cinnamon acetate) that is produced by Verticillium. They conduct disease assays on Arabidopsis and cotton and show promising results, but these assays are less thoroughly completed. They provide microscopic evidence that the transgenics accumulate CIA in vacuoles, which is consistent with the mode of action of the other systems. Overall, my assessment of the paper is that the authors may have a nice story, but the transition to plants needs to be better described and potentially supported by additional experiments. For example, the authors seem to conclude that this resistance mechanism will be a very broad spectrum. Is there a second toxin-producing pathogen that could be used to assess whether this is true?

      Thanks for your advice! To answer the question "Is there a second toxin-producing pathogen that could be used to assess whether this is true?", we added data for a second toxin-producing pathogen, Fusarium oxysporum, and for another V. dahliae race, L2-1, to support the conclusion that BbCrpa can confer resistance to pathogens in plants. As expected, the expression of BbCRPA in Arabidopsis and cotton also significantly increased resistance to the pathogens we tested. The new data are shown in Figure 5-figure supplement 1G-J.

      Reviewer #2 (Public Review):

      The fungus B. bassiana is one of few fungal species resistant to cyclosporine A and tacrolimus, naturally occurring microbial compounds with antifungal and immunosuppressive properties. The authors studied the mechanism of this resistance and found a novel vesicle-mediated transport pathway that directs the compounds to vacuoles for degradation. This hitherto unknown mode of detoxification is initiated by the activity of a phospholipid flippase of the P4-ATPase type. Interestingly, transgenically expressing the fungal flippase in plant model systems induces a similar detoxification pathway and makes the plants resistant to certain fungal toxins of secondary metabolism.

      Strengths

      The genetic screening, isolation of cyclosporine A (CsA) resistant mutants, and characterization of the causative gene BbCrpa are very solid with two independent alleles, a synthetic knockout strain, and rescue of the mutant phenotype.

      BbCrpa protein function in detoxification is demonstrated convincingly by expression in another CsA-sensitive strain, as is its reliance on sites conveying ATPase activity for proper function. It is also functionally different from a relatively closely related P4-ATPase from yeast.

      Using fluorescently labeled CsA and tacrolimus (FK506), it is nicely demonstrated how the compounds are going through the anterograde pathway all the way to the vacuole.

      The authors demonstrate that vacuolar targeting is key for the detoxifying function of BbCrpa and identify the targeting motif that contains a ubiquitination site.

      A trans-species approach (actually, trans-kingdom) confers that BbCrpa can also enhance vacuolar targeting of small toxic compounds to vacuoles in plants, which is quite astounding, given that plant endomembrane transport has quite a number of differences from that of fungi.

      Weaknesses

      It is not clear at which temporal scale CsA is going through the different endosomal compartments.

      Thanks for your comments! We agree that it would be better to indicate the temporal scale on which CsA enters the different endosomal compartments. In fact, we tried many times to trace the distribution of CsA in cells. Unfortunately, the fluorescence of 5-FAM is weak and decays quickly compared with that of the eGFP or mRFP proteins, which made it difficult for us to capture the transient localization and movement of CsA in the cells. Nevertheless, the trail of BbCrpa, which carries CsA from the vesicles to the early/late endosomes and vacuoles, can reflect the pathway of the cargo (Figure 3K, Supplementary file 1).

      Can it be ruled out that the fluorescently labeled CsA and the GFP-tagged BbCrpa are stripped off their label and we are seeing the free label only?

      Thanks! We accept your comments. In order to rule out interference from the cleaved fluorescent proteins (i.e., eGFP and mRFP) or chemical compound (i.e., 5-FAM), we used eGFP/mRFP and 5-FAM alone as controls. New data on the localization of eGFP/mRFP and 5-FAM in B. bassiana hyphae are provided in the revised manuscript. Our observations indicate that the distribution of the fluorescent labels alone differs from that of the labeled molecules, confirming the bona fide localization of the fusion proteins and the labeled compound. Please see Figure 2-figure supplement 2.

      Reviewer #3 (Public Review):

      In this manuscript, the authors have attempted to determine the molecular mechanisms underlying the resistance of an insect fungal pathogen Bauveria barbicans to cyclosporine A (CsA) and tacrolimus (FK506), known antifungal secondary metabolites that are also used extensively as immunosuppressing agents in medicine. By screening the random insertion mutant library of this pathogen, they identified the gene responsible for conferring resistance to CsA and FK506. The amino acid sequence of the gene identified it to be P4-ATPase, designated BbCrpa, which was hypothesized to be involved in vesicle-mediated transport. The identity of this gene as a CsA resistance gene was confirmed by demonstrating that disruption of this gene in B. barbicans confers susceptibility to CsA and FK506 and the expression of the wild-type gene in the BbCRPA knockout strain restores resistance to these compounds. In addition, expression of this gene in a plant pathogen Verticillium dahliae confers resistance to CsA and FK506.

      The authors hypothesized that CsA/FK506 detoxification in the resistant B. barbiana strain used is through the BbCRPA-mediated vesicle transport process transporting these toxic metabolites to vacuoles through trans-Golgi (TGN)-early endosome (EE)-late endosomes (LE) pathway. To test this hypothesis, they employed a dual labeling system using 5-carboxyfluorescein fluorescently labeled CsA and FK506 and fusions of red fluorescent proteins (RFP) with BbRab5 GTPase (a marker for early endosomes), BbRab7 GTPase (a marker for late endosome) and pleckstrin homology domain of human oxysterol binding protein (PHOSBP) (a marker for trans-Golgi). By looking at the distribution of fluorescein-labeled CsA and FK506 in the wild-type and ΔBbCRPA cells using confocal microscopy, the authors have provided compelling evidence that these metabolites are transported to the vacuole. The co-localization of CsA with endocytic marker proteins also appears to be convincing for the most part. The co-localization of CsA with mRFP:: PHOSBP as shown in Fig. 2D seems less compelling. Also, in the confocal micrographs presented in Fig. 2, the distinction between early and late endosomes seems less convincing. It seems that there is significant heterogeneity in the early endosome and late endosome populations in the fungal cells.

      1) The co-localization of CsA with mRFP::PHOSBP as shown in Fig. 2D seems less compelling.

      Thanks a lot! Following your suggestion, we repeated the observation and replaced the original figure with a new one (Figure 2D). The new figure clearly indicates the co-localization of CsA with mRFP::PHOSBP.

      2) Also, in the confocal micrographs presented in Fig. 2, the distinction between early and late endosomes seems less convincing. It seems that there is significant heterogeneity in the early endosome and late endosome populations in the fungal cells.

      Thanks! We agree with your comments! Rab5 is widely used as a marker for early endosomes, while Rab7 is used as a marker for late endosomes. Nevertheless, early and late endosomes are difficult to distinguish strictly. Following your suggestion, we repeated the observation and replaced Figure 2F and 2L, and Figure 3E, with new ones. Our results indicated that mRFP::BbRab5 appeared largely in the lumen of vacuoles and in some puncta (early endosomes), and mRFP::BbRab7 localized to the vacuolar membrane and late endosomal compartments. These can be seen in our observations in Figure 3D and 3E, which are consistent with the observations in Fusarium graminearum described by Zheng et al. (Zheng et al., 2018, New Phytologist, 219: 654-671, DOI: 10.1111/nph.15178).

      The authors addressed the question of whether BbCrpa acts as a component involved in vesicle trafficking through the trans-Golgi-endosomes to vacuoles. Ten different eGFP-BbCrpa fusion proteins were constructed and shown to provide detoxification of CsA and FK506. The BbCrpa is localized to the apical plasma membrane and spitzenkorper region of the germ tube. The evidence for localization of BbCrpa in trans-Golgi and vacuole is clear. However, the experimental data shown in Fig. 3D-F claiming localization of BbCrpa in EEs and LEs are somewhat difficult for this reviewer to interpret. It is also not clear to this reviewer why the two FM4-64 staining patterns in Fig. 3C and Fig. 3F are strikingly different. The evidence for co-localization of the fluorescein-labeled CsA or FK506 with RFP-labeled BbCrpa in vacuoles (Fig.3 H and J) is convincing. Figs. 3L-M depicting dynamic trafficking of BbCrpa from TGN to vacuoles using timelapse microscopy is interesting. In Fig. 3M, eGFP should be labeled eGFP::Drs2p. The authors have identified the N-terminal vacuole targeting motif in BbCrpa and shown that the C-terminal sequence from aa1326 to aa1359 is important for detoxification of CsA and FK506 in B. barbiana. In particular, the importance of three Tyr residues located in the C-terminal domain of the enzyme for CsA resistance is interesting.

      1) The experimental data shown in Fig. 3D-F claiming localization of BbCrpa in EEs and LEs are somewhat difficult for this reviewer to interpret.

      Thanks for your comments! P4-ATPases are implicated in the initiation of vesicle biogenesis and move along with the vesicle (Panatala et al., 2015, Journal of Cell Science, 128: 2021-2032, DOI: 10.1242/jcs.102715; van der Mark et al., 2013, International Journal of Molecular Sciences, 14: 7897-7922, DOI: 10.3390/ijms14047897). In this study, the crucial issue to be addressed was the journey of CsA to the vacuole, which might be through the BbCrpa-mediated TGN-EE-LE vesicle transport pathway. Therefore, we observed the localization of BbCrpa in the TGN, EEs and LEs. It has been reported that the small GTPase Rab5 is localized to early endosomes (Bucci et al., 1992, Cell, 70: 715-728, DOI: 10.1016/0092-8674(92)90306-w), while Rab7 is localized to the late endosomal compartment (Vitelli et al., 1997, The Journal of Biological Chemistry, 272: 4391-4397, DOI: 10.1074/jbc.272.7.4391). Thus, Rab5 and Rab7 are used as marker proteins to indicate EEs and LEs, respectively. Nevertheless, Rab5 and Rab7 can also be observed in MVBs (multivesicular bodies) or vacuoles (Toshima et al., 2014, Bifurcation of the endocytic pathway into Rab5-dependent and -independent transport to the vacuole, Nature Communications, 5: 3498, DOI: 10.1038/ncomms4498; Zheng et al., 2018, New Phytologist, 219: 654-671, DOI: 10.1111/nph.15178). Following your suggestion, we repeated our observation and replaced Figure 3E with a new one. In Figure 3D, mRFP::BbRab5 appears largely in the lumen of vacuoles and in some puncta (early endosomes), and in Figure 3E mRFP::BbRab7 localizes to late endosomal compartments and the vacuolar membrane, which is consistent with the observations in Fusarium graminearum described by Zheng et al. (Zheng et al., 2018, New Phytologist, 219: 654-671, DOI: 10.1111/nph.15178).

      2) It is also not clear to this reviewer why the two FM4-64 staining patterns in Fig. 3C and Fig. 3F are strikingly different.

      Thanks for your comments! FM4-64 is a styryl dye that binds to the outer lipid leaflet of the plasma membrane and enters cells through endocytosis (Scheuring et al., 2015, Methods in Molecular Biology, 1242: 83-92, DOI: 10.1007/978-1-4939-1902-4_8). Once the dye is internalized, it is observed first in the membranes of vesicles and endosomal compartments, and then in the vacuolar membrane (Jelníková et al., 2010, Plant Journal, 61(5): 883-892, DOI: 10.1111/j.1365-313X.2009.04102.x; Löfke et al., 2013, Journal of Integrative Plant Biology, 55(9): 864-875, DOI: 10.1111/jipb.12097). In Figure 3C, we aimed to show that eGFP::BbCrpa appears in vesicles that are stained by FM4-64, while in Figure 3F, we aimed to show that eGFP::BbCrpa accumulates in mature vacuoles. Hence, the FM4-64 staining patterns in Figure 3C and Figure 3F are somewhat different.

      3) In Fig. 3M, eGFP should be labeled eGFP::Drs2p.

      Thanks for your reminder! We have modified it according to the suggestion. Please see Figure 3M.

      Finally, the authors overexpressed BbCrpa gene in transgenic Arabidopsis and cotton plants to show that transgenic plants expressing this enzyme are protected from the toxic effects of the toxin cinnamyl acetate (CA) produced by the fungal pathogen Verticillium dahlia which causes vascular wilt disease in these plants. The data reported in Fig. 5A show that the transgenic Arabidopsis seed is able to germinate in presence of CA, whereas the nontransgenic control seed is not able to germinate. Evidence is presented that CA accumulates in the vacuole in transgenic Arabidopsis. However, the seedlings emerging from transgenic seeds are only partially protected from CA (Fig. 5A). It is also clear from the data presented in Figs. 5B-G that expression of the BbCrpa gene in transgenic Arabidopsis and cotton affords protection from infection by V. dahlia although no evidence for the expression of this gene at the protein level is presented. However, it seems likely that the transgenic lines only show delayed disease symptoms and are not truly resistant to this pathogen. The authors did not state clearly if Verticillium wilt disease resistance assays were performed on homozygous transgenic plants and their corresponding null segregants as negative controls. They also fail to provide evidence that the transgenic Arabidopsis and cotton challenged with the pathogen are able to grow to maturity and set viable seeds.

      1) However, the seedlings emerging from transgenic seeds are only partially protected from CA (Fig. 5A).

      Thanks for your comments! In this study, at a concentration of 50 μg/ml, the germination of wild-type Arabidopsis seeds was severely inhibited while the transgenic seeds were still able to germinate. However, the growth of the transgenic seedlings was clearly suppressed compared with that of the untreated seedlings (Figure 5A). Following your suggestion, in the revised manuscript we use “tolerance” rather than “resistance” to weaken the statement.

      2) However, it seems likely that the transgenic lines only show delayed disease symptoms and are not truly resistant to this pathogen. The authors did not state clearly if Verticillium wilt disease resistance assays were performed on homozygous transgenic plants and their corresponding null segregants as negative controls. They also fail to provide evidence that the transgenic Arabidopsis and cotton challenged with the pathogen are able to grow to maturity and set viable seeds.

      Thanks for your comments! In our routine procedure for generating transgenic plant lines, we identified non-transgenic plants in the segregating generation (usually the T1 generation) of transformants, and then used these non-transgenic plants (null lines) as controls to rule out somatic variation arising from tissue culture. In the meantime, the homozygous transgenic plants were identified in the segregating generation and propagated by selfing. Relevant descriptions were added to the Methods & Materials section (Lines 783-795).

      We agree with your opinion. The resistance displayed in seedlings does not always match that at the mature stage, because the resistance of the host to the Verticillium pathogen can be affected by environmental conditions, for example, temperature and nutrition. Nevertheless, in the case of our transgenic plants, the resistance to Verticillium disease derives from the detoxifying function of BbCrpa. Theoretically, such transgenic traits will not be significantly affected by environmental conditions if the expression of the transgene is stable. We measured the expression level of BbCRPA during plant growth and in different generations (up to the T5 generation). The expression of the BbCRPA gene was stable and the resistance to the diseases was heritable.

    1. Author Response

      Reviewer #1 (Public Review):

      This is a well-conceived and well-executed investigation of how activation loop autophosphorylation and IN-box autophosphorylation synergistically activate AURKB/INCENP. An elegant chemical ligation strategy allowed construction of the intermediate phospho-forms so that the contributions of each phosphorylation event to structure, dynamics, and activity could be dissected. Autophosphorylation at both sites serves to rigidify both AURKB and the IN-box, and to coordinate opening, twisting, and activation loop movements. Consistent with previous findings, both sites are necessary for enzymatic activity; further, this work finds that activation loop autophosphorylation occurs slowly in cis while IN-box autophosphorylation occurs quickly in trans.

      Due to abundant previous work in the field, many of the conclusions of this paper were expected. However, that does not diminish the quality of the work, and the addition of how kinase dynamics contribute to activation is important for AURKB and many other kinases. The experimental results are clear and interpreted appropriately, with good controls. The computational work is also clearly explained and directly tied to the function of the enzyme, making it highly complementary to the experimental findings and to previously published structures.

      We thank the reviewer for the positive words about our work.

      Some minor limitations of the study:

      1) Of note when interpreting the HDX data, there is no coverage of the peptide containing the activation loop autophosphorylation site T248 (Fig S2A), and as mentioned in the Discussion, the time scale of HDX is not able to capture differences in exchange in very flexible regions like the activation loop.

      The peptides spanning the region containing the phosphorylated Aurora BThr248 are not shown in our coverage map because they do not meet the stringent quality criteria for peptides that we used for HDX analysis (see Material and Methods). However, we have manually compared these peptides in the phosphorylated and unphosphorylated enzyme complexes and have added a paragraph (page 4, paragraph 4):

      “The peptides spanning the region containing the phosphorylated Aurora BThr248 are not shown in our coverage map (Figure 1-figure supplement 2A) because they did not pass the stringent peptide quality filter based on intensity, and redundancy of the peptide. However, upon manual analysis, we did not detect any changes in deuterium uptake between the phosphorylated and unphosphorylated forms in this region. Deuterium exchange in this part of the protein (which is also observed in the peptides immediately upstream of Aurora BThr248, see Supplementary file 5) is very rapid, independently of enzyme phosphorylation, so that complete exchange occurs even at the earliest time points. This is in contrast to the second part of the activation loop, which includes the Aurora B αEF helix (labeled region 3 in Figure 1A; peptide 254-260 in Supplementary file 5), where we clearly see HDX protection upon phosphorylation. It is possible that phosphorylation causes dynamic changes on a very fast scale (seconds or faster) in the part of the protein encompassing Aurora BThr248, but we could not detect them due to the limitation of our approach, which operates on the scale of minutes.”

      Also, we analyzed the peptides covering this region to confirm the extent of phosphorylation in the loop (Figure 3-figure supplement 2A).
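
      The time-scale argument above can be made concrete with a simple first-order exchange calculation. This is a sketch under stated assumptions, not an analysis of our data: the rate constants and the one-minute earliest time point are hypothetical values chosen only to illustrate why very fast exchange cannot be resolved by sampling on the scale of minutes.

```python
import math


def fraction_exchanged(k_per_s, t_s):
    """Fraction of amide hydrogens exchanged after t_s seconds,
    assuming simple first-order kinetics with rate k_per_s (1/s)."""
    return 1.0 - math.exp(-k_per_s * t_s)


# Assumed earliest sampling time: one minute (illustrative only).
earliest_timepoint_s = 60.0

# Hypothetical exchange rates (1/s) for three kinds of region.
examples = [
    ("flexible loop, state A", 1.0),
    ("flexible loop, state B (2x slower)", 0.5),
    ("protected helix", 1e-3),
]

for label, k in examples:
    frac = fraction_exchanged(k, earliest_timepoint_s)
    print(f"{label}: {frac:.4f} exchanged at {earliest_timepoint_s:.0f} s")
```

      Under these assumed rates, a two-fold change in the exchange rate of a very flexible region is invisible at the first time point, because both states are already fully exchanged, whereas a slowly exchanging region remains far from saturation and can show protection.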

      2) Some data lack robust statistical analysis, which would make the findings more compelling.

      We have now included statistical analysis throughout the paper where it was possible.

      3) One point that might be clarified is how the occupancy of T248 was confirmed to be either fully phosphorylated in the [AURKB/IN-box]IN-deltaC or fully dephosphorylated in the IN-box K846N/R827Q mutant. Especially because T248 autophosphorylation is found to occur in cis, it is unclear how incubating the [AURKB/IN-box]IN-deltaC with traces of wild-type [AURKB/IN-box]all-P would ensure that T248 is phosphorylated.

      We confirmed the phosphorylation occupancy of Aurora BThr248 by mass spectrometry in [Aurora B/IN-box]all-P and [Aurora B/IN-box]loop-P but not for [AURKB/IN-box]no-P (Figure 3-figure supplement 2A). To achieve complete phosphorylation in the activation loop of [Aurora B/IN-box]IN-ΔC, we incubated this construct with fully active [Aurora B/IN-box]all-P. This is because, according to previous kinetic analysis, cis-phosphorylation of Aurora BThr248 is only obligatory when the entire enzyme population is in the non-phosphorylated state. Once a fraction of the Aurora B population has been partially or fully activated, phosphorylation of Aurora BThr248 can also occur in trans, by the already activated enzyme. In other words, our model proposes an obligatory initial intramolecular step followed by propagation of activation in trans, as reported by Zaytsev, Segura-Peña et al. (eLife, 2016). We have now clarified this on page 11, paragraph 2, where we explain the results of the autoactivation kinetics.

      “It is noteworthy that phosphorylation of the activation loop in cis is the first necessary step in the autoactivation process, assuming a completely unphosphorylated enzyme pool. However, a partially active or fully active enzyme can phosphorylate the activation loop in trans. This type of activation mechanism, with an initial intramolecular activation step followed by an intermolecular step of activation, has been previously reported for PAK2 (J. Wang et al., 2011) and for Aurora B (Zaytsev et al., 2016).”
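
      The two-step scheme described above (an obligatory intramolecular step followed by propagation in trans) can be illustrated with a minimal kinetic sketch. This is not the fitted model from the manuscript or from Zaytsev et al.; the rate law, the function name `simulate_activation`, and all rate constants are assumptions chosen only for illustration.

```python
def simulate_activation(k_cis, k_trans, a0=0.0, dt=0.01, t_end=100.0):
    """Euler integration of the fraction `a` of activated enzyme.

    Assumed rate law (illustrative only):
        da/dt = k_cis * (1 - a) + k_trans * a * (1 - a)
    The first term is slow intramolecular (cis) phosphorylation; the
    second is intermolecular (trans) phosphorylation by active enzyme.
    """
    a = a0
    trace = [a]
    for _ in range(int(round(t_end / dt))):
        inactive = 1.0 - a
        a += dt * (k_cis * inactive + k_trans * a * inactive)
        a = min(a, 1.0)  # guard against numerical overshoot
        trace.append(a)
    return trace


# No cis step and no active enzyme at the start: activation never begins.
stalled = simulate_activation(k_cis=0.0, k_trans=0.5)

# No cis step, but seeded with a trace of active enzyme: trans suffices.
seeded = simulate_activation(k_cis=0.0, k_trans=0.5, a0=0.05)

# Slow cis step starting from a fully unphosphorylated pool.
autoactivated = simulate_activation(k_cis=0.001, k_trans=0.5)

print(stalled[-1], seeded[-1], autoactivated[-1])
```

      In this sketch, the seeded case mirrors the rationale for incubating a construct with traces of fully active enzyme: once some active kinase is present, the trans term alone can drive the population to full activation.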

      Reviewer #2 (Public Review):

      This study presents a dynamic, multi-step model for the activation of Aurora-B kinase through the interaction with INCENP and autophosphorylation. This interaction is critical to the proper execution of chromosome segregation, and key details of the mechanism are not resolved. The study is an advance on previous studies on Aurora-B and the related kinase Aurora-C, primarily because it clarifies the roles of the different phosphorylation sites. However, major differences in the details of the molecular interactions are presented that are not clearly backed up by the evidence due to limitations in the approach, when compared to previous work based on crystal structures.

      Strengths. The experimental approach to the analysis of the Aurora-B/INCENP interaction is sound and novel, and it is a striking example of preparing proteins in specific phosphorylation states and of using HDX to characterise localised changes in the structural dynamics of a protein complex. The authors have generated two intermediate phosphorylation states of the complex, enabling them to dissect their contributions to the regulation of structural dynamics and activity of the complex.

      Weaknesses. The major weakness of the study is the molecular dynamics simulation. The resulting model of the complex differs from the crystal structure of the Aurora-C/IN-box structure in key details, and these are neither described clearly nor explained. The challenges/limitations of simulation of phosphorylated proteins should be described.

      We thank the reviewer for the positive words about our work and the criticism that helped us to improve our manuscript. We have now extended the MD studies that confirm our original observations regarding the entropic nature of the IN-box and the effect of phosphorylation on the structure and dynamics of [Aurora B/IN-box]. We clarify that the conformation of [Aurora B/IN-box]all-P observed in the simulations is not the final folded state, but a productive intermediate in the activation pathway of [Aurora B/IN-box]. For this reason, the differences from the [Aurora C/IN-box] crystal structure do not indicate flaws in the simulations. On the contrary, we believe that the data from the MD simulations provide crucial insights into the dynamical properties of the system that could not otherwise be assessed.

    1. Author Response

      Reviewer #2 (Public Review):

      1) It has been reported that PHD fingers can bind to DNA in addition to lysine-methylated histone H3. Can the authors address whether or not the enhanced selectivity of PHD-nucleosome interactions over PHD-peptide interactions is due to PHD-DNA binding?

      We apologize for not making this clearer in our initial manuscript. We did test the ability of our PHD readers to bind nucleosomal DNA of various lengths and observed no significant engagement (Figure 1 - figure supplement 1B). This is emphasized in the revised text.

      2) What's the binding affinities of PHD-nucleosome interactions and PHD-peptide interactions, respectively?

      The relative EC50 (EC50rel) for these interactions (Figure 1 - figure supplement 1C-D and Figure 2 - figure supplement 1H) are consistent with others using Alpha/dCypher technologies (e.g. doi.org/10.1101/2022.02.21.481373v1; which also contains a detailed description of EC50rel calculation and the difference between this value and an equilibrium Kd).

      3) Histone H4K5acK8ac is a well-known site-specific histone acetylation mark for gene transcriptional activation, much more so than histone H3 acetylation. Does H4K5K8 acetylation enhance PHD-H4K3me3 binding in nucleosome?

      We thank the reviewer for asking this question. In our studies, we tested H4K5ac and H4K8ac binding individually but do not have a nucleosome bearing the dual H4K5acK8ac mark. Given the limited amount of time and resources we had for making more nucleosomes, we felt our efforts were better spent on developing heterotypic nucleosomes to answer the more striking cis vs. trans question posed by both reviewers.

      4) The authors provided the data showing cis histone H3 tail lysine acetylation effects on PHD-H4K3me3 binding. What about trans histone H3 lysine acetylation effects?

      Thank you for this suggestion. To address this, we expended considerable resources to create new fully PTM-defined heterotypic (to accompany our homotypic) nucleosomes (note nomenclature to minimize confusion with asymmetric/symmetric DNA methylation) to directly test whether MLL1’s activity enhancement in the context of H3 tail acetylation occurs in cis or in trans. As shown in Figure 2D, enhancement of H3K4 methylation only occurs with heterotypic nucleosomes that have an available H3K4 residue with tail acetylation in cis (H3K4me3 • H3K9acK14acK18ac (hereafter H3triac)) and is not seen in H3K4 methylatable nucleosomes with tail acetylation in trans (H3 -> H3K4acK9acK14acK18ac (hereafter H3tetraac)). These exciting new findings greatly strengthen our study and provide more definitive mechanistic details of H3ac → H3K4me regulation.

    1. Author Response

      Reviewer #2 (Public Review):

      This manuscript reports an experiment involving learning and imaging of neural activity in rats. The goal was to test if scopolamine, which is an antagonist of acetylcholine receptors, could cause memory loss (amnesia). Two types of learning were tested: first, rats learned to prefer a short path, compared to a detour path, between two rewarded locations in a linear maze; second, in a subset of the experimental sessions, a shock zone was activated in the middle of the short path, and rats had to learn to avoid it. As a control, some sessions had a clear plexiglass barrier placed in the middle of the short path, which should not have aversive properties. The order of sessions was different for different groups of rats, but shock learning was always followed by a number of 'extinction' sessions without shock. In some groups, shock learning was accompanied by a systemic (intraperitoneal) scopolamine injection, 30 min before the start of the session. This manipulation was performed on most rats once but at a different slot in the sequence of sessions (sometimes before the drug-free shock learning, sometimes after it, sometimes in the absence of preceding barrier sessions, sometimes after them). In what follows, I might use the terms 'control group' and 'test group' to refer to sessions without scopolamine and sessions with it, respectively.

      The main behavioural results are that rats increase their visits to the short path with learning, then visit the short path less once the shock zone is active or when the barrier is there. When re-tested in later sessions, rats trained in the absence of the scopolamine injection still avoid the short path, while most of the rats that were given the scopolamine injection do not avoid it, suggesting a deficit in encoding or recall of the shock zone memory.

      In addition to these behavioural manipulations, the authors image the activity of dorsal CA1 hippocampal neurons using calcium imaging. They detect the existence of place cells, which increase their firing on specific portions of the short path of the maze (the long path data is not analysed). When comparing data before the shock training to during shock training, control place cells were more stable (i.e. had increased between-session correlations) and had more recurrent place fields (i.e. spatially active in one session then still active in another session) with respect to test data (rats injected with scopolamine). When comparing the pre-shock session and the first extinction session, place cell activity was less similar in the control rats (no scopolamine) compared to the scopolamine rats; but note that for scopolamine rats, extinction occurred earlier, so instead of using the first extinction session, the 4th session post-shock training was used to match the control data. Place cell activity has been shown to allow decoding of the animal's position; here, position decoding accuracy was lower around the shock zone in the control group compared to the scopolamine group.

      From these analyses, the manuscript proposes the following findings: 1) scopolamine injections impair avoidance learning, and 2) scopolamine affects the long-term response of place cells to the aversive experience (less "remapping"). These findings are interpreted to support the idea that 3) place cell remapping is involved in avoidance/aversive learning and 4) that scopolamine, as an antagonist of the muscarinic acetylcholine receptors in the hippocampus (or elsewhere?), produces amnesia.

      These findings, if properly supported, would be very interesting to a wide range of researchers interested in the neural bases of memory and learning, specifically aversive memory and spatial learning. The manuscript is well-written, has the advantage of using both male and female rats (which have consistent results), is one of the rare studies to date to perform calcium imaging of the hippocampus in rats, records during the learning of two simple tasks that seem relatively ecological and produce robust learning effects, and uses an experimental manipulation (injection of scopolamine) instead of purely correlational measures. The authors make some analytical effort in equalizing the number of trials across different sessions (even though I do not believe this fixes the existing confounds). I particularly appreciated the nice '3D' trajectory plots that show the unfolding of behaviour along a given session and that each individual's data are generally shown in the figures.

      However, the experiment and its analysis seem to have some major flaws both in the experimental design (which may be difficult to fix) as well as in the analysis (which might be easier to fix), which prevent proper interpretation of the results. Specifically:

      • To demonstrate finding 1 (that scopolamine specifically impairs avoidance learning), a control would be needed to show that the injection procedures do not impair general behaviour (e.g. motivation, attention, level of stress) as well as other forms of learning. Indeed, the control rats - as far as I understood - are not injected with saline, which would have been an appropriate control. One example of non-specific confounding effects of scopolamine is that it could, for example, reduce sensitivity to pain, thus to some extent decreasing the relevance of the shock to the rats, but many other interpretations are possible; most revolve around the idea that instead of scopolamine impairing learning, scopolamine might impair behaviour, which might, in turn, impair learning. Indeed, the scopolamine injection is shown to decrease running speed and the number of trials run, even before exposure to the shock. Related to this, the "short path preference" does not seem to be quantified properly: instead of simply using the number of visits to the short path, a better measure would be to compute a relative preference index quantifying visits to the short path with respect to visits to the long path [e.g. (num short - num long) / num total], to focus on the preference regardless of global changes in behavioural activity levels. In summary: the proposition that scopolamine specifically impairs avoidance learning has not been convincingly demonstrated; the possibility that it even impairs any form of learning is not currently demonstrated either.
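
      The relative preference index suggested in the comment above, (num short - num long) / num total, can be sketched in a few lines. This is only an illustration of the reviewer's formula, not the authors' analysis code; the function name and the visit counts used below are hypothetical, and "num total" is assumed here to mean the sum of short-path and long-path visits.

```python
import math


def preference_index(num_short: int, num_long: int) -> float:
    """Relative short-path preference: (short - long) / (short + long).

    Ranges from -1 (only long-path visits) to +1 (only short-path visits).
    Returns NaN when there were no visits at all, since the index is undefined.
    """
    if num_short < 0 or num_long < 0:
        raise ValueError("visit counts must be non-negative")
    total = num_short + num_long  # assumption: total = short + long visits
    if total == 0:
        return math.nan
    return (num_short - num_long) / total


# Hypothetical sessions: halving overall activity leaves the index unchanged.
active_session = preference_index(30, 10)
slow_session = preference_index(15, 5)
```

      Because the index is normalised by the total number of visits, a session in which the animal runs half as many laps but splits them in the same proportion yields the same value, which is the property the reviewer asks for.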

      We have run an additional saline-only control group (along with additional scopolamine groups) to demonstrate that scopolamine does impair avoidance learning. As shown in the Supplement to Fig. 1, rats receiving saline alone show significantly greater post-training avoidance of the short path than rats receiving scopolamine.

      • For similar reasons, finding 2 (that the place cell response to aversive learning is affected by the scopolamine injection) is subject to the same lack of controls and existence of possible confounds noted above. Specifically, running more slowly and running fewer laps would have affected the overall amount of excitation of place cells during the session, which might affect plasticity, as well as the amount of reactivations/replay at the reward sites, which is likely to have effects in terms of memory consolidation. One way to potentially control for this would be to have the control group run the same amount of trials as the test group (but then session durations would be different); it is unclear how to prevent the difference in running speed. To be able to claim that the effects of scopolamine are specific to aversive learning, a control with either no learning (perhaps the long path data would be useful for this?) or appetitive learning (e.g. of a reward location, which also involves place field reorganization in some cases) would be useful.

      We recognize and understand the referee’s valid concerns about these potential confounds. The revised paper makes more clear that the scopolamine condition is designed as a control for whether an aversive stimulus is remembered or forgotten, and the barrier condition is a control for whether a novel path-blocking stimulus is aversive or neutral. As the referee points out, there are other confounding variables that may covary with these manipulated factors. The revised discussion (ll. 544-559) acknowledges potential confounds arising from learning-induced behavior changes, and argues that even though it is not possible to perfectly control for such changes, our current study does so more effectively than most prior studies because the rat’s behavior during isolated beeline trials is highly stereotyped and thus more similar across experimental conditions than in any prior study that we know of. The revised discussion also acknowledges the difficulty of dissociating acquisition versus extinction effects upon remapping (ll. 633-644). The significance of the barrier manipulation is now acknowledged more clearly in the abstract, introduction, and discussion.

      • Statement 3 implies a causal link between the two first statements, suggesting that place cell remapping would be necessary for the memory of an aversive experience or aversive location. Given the weakness of the arguments supporting the first 2 statements, this is also not convincingly demonstrated. In any case, the current paradigm would not be sufficient to make a causal link, but it might be sufficient to show a correlational link by showing a correlation between the amount of remapping and memory performance, such as presented in supplementary figure 5 (which would still be informative even if results from the scopolamine sessions were removed?).

      We agree with the referee’s point that our study’s evidence is correlational in nature. The revised manuscript prominently acknowledges this in the concluding sentence of the discussion’s opening paragraph (ll. 500-503), which now reads: “While these results do not definitively prove that place cell remapping is causally necessary for storing memories of aversive encounters, they provide correlational evidence that remapping occurs selectively under conditions where a motivationally significant (rather than neutral) stimulus occurs and is subsequently remembered rather than forgotten.”

      • Statement 4 - that scopolamine causes amnesia - is both not fully defined (what form of amnesia?) and not supported by the findings for the reasons mentioned above.

      It is unclear what remedy the referee is recommending for this concern. We do not present the idea that scopolamine is an amnesic drug as a novel conclusion of our study; rather, this is a widely accepted view in the literature that motivated our experimental design decision to use scopolamine as a tool for dissociating whether an aversive event was remembered or forgotten. We recognize that scopolamine’s effects on memory may vary with experimental conditions. In the revised discussion, we extensively compare and contrast our experimental findings with acute and chronic results from numerous prior studies using scopolamine and other cholinergic drugs (ll. 561-678). We hope this is sufficient to address the referee’s concerns.

      • In addition to these concerns, a study cited here (Sun et al 2021) mentions a few references in the discussion regarding how muscarinic receptor agonists might affect the link between spikes and calcium signals. Scopolamine is a muscarinic receptor antagonist and might thus have related/reversed effects. Thus, the technique used (calcium imaging) does not seem the best to address questions related to scopolamine. The current manuscript also mentions some findings that were not replicated (e.g. lack of over-representation of the shock zone) which are probably due to the fact that the finding relied on extra-field isolated spikes, which are less likely to be detected via calcium imaging.

      This is an excellent point which is now addressed in the revised discussion; it is acknowledged (ll. 536-539) that mAChRs can regulate calcium signals and thus that scopolamine may have different effects on spikes detected from calcium imaging versus electrophysiology, which in turn could account for discrepancies between our finding that place fields do not migrate to aversively reinforced locations and Milad et al.’s (2019) findings that they do. The Sun et al. (2021) paper used calcium imaging methods similar to ours, so this factor is less likely to account for the discrepancy between their prior finding that place fields were acutely disrupted by scopolamine and our current finding that they were not.

      • If anything, perhaps targeted injections in a specific brain region (e.g. dorsal CA1), instead of systemic injection, might give a more precise picture of the effects of scopolamine on place cells and spatial memory, but I do not know if this is technically possible.

      Unfortunately, the necessary placement of the GRIN lens above the recording location prevented the direct application of scopolamine there via cannulae. To date only one series of experiments has demonstrated single unit place cell recordings with direct microdialysis (Brazhnik et al. 2003, 2004). To study the effects of aversive learning across many days, we needed to utilize a recording method capable of tracking many cells across long time periods. However, systemic scopolamine has been widely used to study both learning (Anagnostaras et al. 1999; Huang et al. 2011; Svoboda et al. 2017) and its effects on place cells (Douchamps et al. 2013; Newman et al. 2017; Sun et al. 2021); thus, by utilizing this method we can directly compare our findings with previous work. We have added a paragraph to the discussion (ll. 665-678) in which it is explained why the main conclusions of our study (namely, that place cell remapping is related to storage of memories for aversive events) do not depend upon whether or not scopolamine’s pharmacological actions were localized to the hippocampus (almost certainly they were not).

      My conclusion would be that the experiment either needs to be redesigned to address the original question (effect of scopolamine on place cell firing and aversive learning) or that some of the data could be still used to address different questions which have not been addressed with calcium imaging before, e.g. learning of the short path, activity on the short path vs long path, effects on behaviour and place cell activity of learning & extinction of the barrier and shock zone avoidance; perhaps without focusing on the scopolamine manipulations, which seem to introduce many confounds.

    1. Author Response

      Reviewer #2 (Public Review):

      Kim et al. examined the properties of neuronal connections responsible for inhibitory cell activation to show that the characteristics examined were similar in humans and rodents. This is important, as it suggests that the many rodent studies carried out over the past decades are physiologically relevant to humans.

      Strengths

      1) Human brain tissues are difficult to obtain, hence the study provides valuable insights

      2) An impressive multipronged approach was used for cell classifications

      3) Despite the lack of novel findings, the revelation of the similarities between human and rodent synapses is important and has far-reaching implications. This important finding suggests the knowledge generated from rodent research is, at least partly, physiologically relevant to and transferrable to humans.

      Weaknesses

      1) The study is descriptive by design, and hence provides limited conceptual advances, especially with the retrospect that synaptic properties are similar between humans and rodents (although see strength #3). For example, very similar findings and techniques have already recently been reported by a number of the same authors in the Campagnola et al., Science 2022 paper.

      We agree that the stimulus protocols for the connectivity assays with multiple patch-clamp recordings in this study were adapted from the recent publication (Campagnola et al., Science 2022). In that study, especially for the human synaptic connectivity data, the main cell type categorization was at the level of excitatory and inhibitory neurons, which were identified based on morphological features and observed PSP characteristics (e.g., the direction of membrane potential changes) when they were connected to each other. However, we went further and identified interneuron subclasses in the connectivity assays using virally labeled slice cultures and post-hoc HCR staining in addition to an intrinsic classifier, which was not investigated in the recent publication (Campagnola et al., 2022). Therefore, the resulting scientific findings and their implications are not the same as those shown in the previous study, and we think this study provides a significant advance in our understanding of human cortical circuit organization.

      2) Despite the fact that normal physiology was reported, the use of pathological human brain tissue could affect the results.

      We agree that the use of pathological human brain tissue to investigate normal physiology is not ideal; however, as mentioned in the METHODS (section “Acute slice preparation”), our surgically resected neocortical tissues show minimal pathology, and we believe these tissue preparations can be used to address normal physiological properties of human neurons. Importantly, we saw no effect of disease state (epilepsy vs. tumor) on the intrinsic or synaptic properties that we measured. Our METHODS state that “Surgically resected neocortical tissue was distal to the pathological core (i.e., tumor tissue or mesial temporal structures). Detailed histological assessment and using a curated panel of cellular marker antibodies indicated a lack of overt pathology in surgically resected cortical slices (Berg et al., 2021).” We also state in the RESULTS that “These tissues were distal to the epileptic focus or tumor, and have shown minimal pathology when examined (Berg et al., 2021). Brain pathology was evaluated using six histological markers that were independently scored by three pathologists. Surgically resected tissues have been used extensively to characterize human cortical physiology and anatomy (Berg et al., 2021).” Lastly, this is the best possible human tissue available for us to conduct physiological experiments. It is an unavoidable caveat of this work that our healthy brain tissue was derived from a donor brain exhibiting a serious disease.

      3) The manuscript may not be easy to understand for the uninitiated, because many concepts and abbreviations were not properly introduced.

      Thank you for pointing out this oversight. We have updated our manuscript and made sure that we fully describe all abbreviations. We have now changed the abbreviation MPC back to multiple patch-clamp recording, and some other abbreviations such as LAMP5, SLC17A7, and DLX are now better explained. We have also changed the order of multiple figures (i.e., Figure 5 – Figure supplements to Figure 3 – Figure supplements) and removed some complicated figures (e.g., Figure 1 – Figure supplement 1) to present the data in a fashion that can be understood by a more general reader.

      4) The statistical treatment is not ideal, so some conclusions may not be valid.

      We performed additional statistical analyses as suggested and incorporated them into the text of the RESULTS.

      Furthermore, we also made additional Figure supplements (Figure 4 – Figure supplement 3, Figure 4 – Figure supplement 4, Figure 6 – Figure supplement 2, and Figure 6 – Figure supplement 3) to support our conclusions.

      5) The mixed usage of acute and cultured slices is not ideal and likely affects the outcome.

      We agree that the mixed usage of acute and cultured slices is not ideal, and that it could affect the interpretation of the outcome. Therefore, we performed additional analyses to see whether there is any correlated change in a synaptic property (i.e., paired pulse ratio) with the number of days after slice culture (now implemented in Figure 4 – Figure supplement 4 and Figure 6 – Figure supplement 3), and we did not find any significant correlation. However, we noticed that the short-term synaptic dynamics differ between the acute and slice culture conditions, as shown in Figure 4 – Figure supplement 1d. We think this is due to sampling bias rather than a difference in tissue preparation, and these points are now more carefully described in the DISCUSSION as “This difference we observed in this study, i.e., more facilitating synapses were detected in slice cultures than in acute slices, could reflect an acute vs. slice culture difference. However, we believe it is more likely to reflect a selection bias for PVALB neurons when patching in unlabeled acute slices, and that the AAV-based strategy with a pan-GABAergic enhancer allows a more unbiased sampling of interneuron subclasses whose properties are preserved in culture. In support of this, PPR analysis as a function of days after slice culture shows no relationship to acute versus slice culture preparation (Figure 4 – Figure supplement 4, Figure 6 – Figure supplement 3). Furthermore, we have observed that viral targeting of GABAergic interneurons greatly facilitates sampling of the SST subclass in the human cortex compared to unbiased patch-seq experiments (Lee et al., 2022), and this selection bias likely explains synapse type sampling differences in cultured slices compared to acute preparations.”
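
      As a rough illustration of the kind of check described in this response (paired pulse ratio examined as a function of days in culture), the sketch below computes a paired pulse ratio and a Pearson correlation coefficient using only the Python standard library. It is not the authors' analysis pipeline: the function names, the convention that PPR is the second PSP amplitude divided by the first, and all numbers are assumptions made for the example, and a significance test would still be needed on real data.

```python
import math
from typing import Sequence


def paired_pulse_ratio(first_amp: float, second_amp: float) -> float:
    """PPR, taken here as the second PSP amplitude divided by the first."""
    if first_amp == 0:
        raise ValueError("first PSP amplitude must be non-zero")
    return second_amp / first_amp


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical connections: (days in culture, first PSP mV, second PSP mV).
connections = [(2, 1.0, 1.2), (4, 0.8, 0.7), (6, 1.5, 1.6), (9, 0.6, 0.7)]
days = [d for d, _, _ in connections]
pprs = [paired_pulse_ratio(a1, a2) for _, a1, a2 in connections]
r = pearson_r(days, pprs)  # a value near 0 would suggest no linear trend
```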

    1. Author Response

      We would like to extend our thanks to the reviewers who took the time to carefully read our paper and provide thoughtful insights and suggestions on how to strengthen our conclusions. All reviewers agreed that our study presented strong data supporting a role for triglyceride lipase brummer (bmm) in regulating testis lipid droplets and spermatogenesis in Drosophila, and that our findings advance our understanding of lipid biology during sperm development. Reviewers also made several helpful suggestions on how to strengthen our manuscript even further. Below, we provide a brief outline of our plans to revise this manuscript in response to reviewer comments.

      The majority of reviewer comments will be addressed by text changes, rearranging figures to add images, and making a model to visually represent our findings. Together, these changes will ensure we clearly communicate our data and conclusions with readers, and properly contextualize our findings. See below for details on our planned revisions.

      Reviewer #1 (Public Review):

      In this study, the authors investigate the role of triglycerides in spermatogenesis. This work is based on their previous study (PMID: 31961851) on triglyceride sex differences in which they showed that somatic testicular cells play a role in whole body triglyceride homeostasis. In the current study, they show that lipid droplets (LDs) are significantly higher in the stem and progenitor cell (pre-meiotic) zone of the adult testis than in the meiotic spermatocyte stages. The distribution of LDs anti-correlates with the expression of the triglyceride lipase Brummer (Bmm), which has higher expression in spermatocytes than early germline stages. Analysis of a bmm mutant (bmm[1]) - a P-element insertion that is likely hypomorphic - and its revertant (bmm[rev]) as a control shows that bmm acts autonomously in the germline to regulate LDs. In particular, the number of LDs is significantly higher in spermatocytes from bmm[1] mutants than from bmm[rev] controls. Testes from males with global loss of bmm (bmm[1]) are shorter than controls and have fewer differentiated spermatids. The zone of bam expression, typically close to the niche/hub in WT, is now many cell diameters away from the hub in bmm[1] mutants. There is an increase in the number of GSCs in bmm[1] homozygotes, but this phenotype is probably due to the enlarged hub. However, clonal analyses of GSCs lacking bmm indicate that a greater percentage of the GSC pool is composed of bmm[1]-mutant clones than of bmm[rev]-clones. This suggests that loss of bmm could impart a competitive advantage to GSCs, but this is not explored in greater detail. Despite the increase in number of GSCs that are bmm[1]-mutant clones, there is a significant reduction in the number of bmm[1]-mutant spermatocyte and post-meiotic clones. This suggests that fewer bmm[1]-mutant germ cells differentiate than controls. To gain insights into triglyceride homeostasis in the absence of bmm, they perform mass spec-based lipidomic profiling. Analyses of these data indicate that triglycerides are the class of lipid most affected by loss of bmm, supporting their model that excess triglycerides are the cause of spermatogenetic defects in bmm[1]. Consistent with their model, a double mutant of bmm[1] and a diacylglycerol O-acyltransferase 1 called midway (mdy) reverts the bmm-mutant germline phenotypes.

      There are numerous strengths of this paper. First, the authors report rigorous measurements and statistical analyses throughout the study. Second, the authors utilize robust genetic analyses with loss-of-function mutants and lineage-specific knockdown. Third, they demonstrate the appropriate use of controls and markers. Fourth, they show rigorous lipidomic profiling. Lastly, their conclusions are appropriate for the results. In other words, they don't overstate the results.

      We thank the Reviewer for their positive assessment of our paper.

      There are a few weaknesses. Although the results support the germline autonomous role of bmm in spermatogenesis, one potential caveat is that the mdy rescue was global, i.e., in both somatic and germline lineages. The authors did not recover somatic bmm clones, suggesting that bmm may be required for somatic stem cell self-renewal and/or niche residency. While this is beyond the scope of this paper, it is possible that somatic bmm does impact germline differentiation in a global bmm mutant.

      In the revised manuscript, we will more clearly delineate when we used global versus germline-only loss of mdy to rescue bmm mutant phenotypes in the testis. We will also acknowledge the possibility that somatic bmm may play a role in germline differentiation in a global bmm mutant.

      Regarding data presentation, I have a minor point about Fig. 3L: why aren't all data shown as box plots (only Day 14 bmm[rev] is)? Finally, the authors provide a detailed pseudotime analysis of snRNA-seq of the testis in Fig. S2A-D, but this analysis is not sufficiently discussed in the text.

      We will make text and presentation changes in the revised manuscript to describe our data more clearly, and will add text to describe our pseudotime analysis of single-cell RNA seq data in more detail.

      Overall, the many strengths of this paper outweigh the relatively minor weaknesses. The rigorously quantified results support the major aim that appropriate regulation of triglycerides is needed in a germline cell-autonomous manner for spermatogenesis.

      This paper should have a positive impact on the field. First and foremost, there is limited knowledge about the role of lipid metabolism in spermatogenesis. The lipidomic data will be useful to researchers in the field who study various lipid species. Going forward, it will be very interesting to determine what triglycerides regulate in germline biology. In other words, what functions/pathways/processes in germ cells are negatively impacted by elevated triglycerides. And as the authors point out in the discussion, it will be important to determine what regulates bmm expression such that bmm is higher in later stages of germline differentiation.

      We agree with the reviewer about the many interesting future directions for this project. We will therefore add a model figure in the revised manuscript to visualize our findings and highlight remaining questions about how bmm and triglycerides support normal spermatogenesis in Drosophila.

      Reviewer #2 (Public Review):

      Summary:

      Here, the authors show that neutral lipids play a role in spermatogenesis. Neutral lipids are components of lipid droplets, which are known to maintain lipid homeostasis, and to be involved in non-gonadal differentiation, survival, and energy. Lipid droplets are present in the testis in mice and Drosophila, but not much is known about the role of lipid droplets during spermatogenesis. The authors show that lipid droplets are present in early differentiating germ cells, and absent in spermatocytes. They further show a cell autonomous role for the lipase brummer in regulating lipid droplets and, in turn, spermatogenesis in the Drosophila testis. The data presented show that a relationship between lipid metabolism and spermatogenesis is congruous in mammals and flies, supporting Drosophila spermatogenesis as an effective model to uncover the role lipid droplets play in the testis.

      We thank the Reviewer for their positive assessment of our paper.

      Strengths and weaknesses:

      The authors do a commendably thorough characterization of where lipid droplets are detected in normal testes: located in young somatic cells, and early differentiating germ cells. They use multiple control backgrounds in their analysis, including w[1118], Canton S, and Oregon R, which adds rigor to their interpretations. The authors employ markers that identify which lipid droplets are in somatic cells, and which are in germ cells. The authors use these markers to present measured distances of somatic and germ cell-derived lipid droplets from the hub. Because they can also measure the distance of somatic and germ cells with age-specific markers from the hub, these results allow the authors to correlate position of lipid droplets with the age of cells in which they are present. This analysis is clearly shown and well quantified.

      The quantification of lipid droplet distance from the hub is applied well in comparing brummer mutant testes to wild type controls. The authors measure the number of lipid droplets of specific diameters, and the spatial distribution of lipid droplets as a function of distance from the hub. These measurements quantitatively support their findings that lipid droplets are present in an expanded population of cells further from the hub in brummer mutants. The authors further quantify lipid droplets in germline clones of specified ages; the quantitative analysis here is displayed clearly, and supports a cell autonomous role for brummer in regulating lipid droplets in spermatocytes.

      Data examining testis size and number of spermatids in brummer mutants clearly indicates the importance of regulating lipid droplets to spermatogenesis. The authors show beautiful images supported by rigorous quantification supporting their findings that brummer mutants have both smaller testes with fewer spermatids at both 29°C and 25°C. There is also significant data supporting defects in testis size for 14-day-old brummer mutant animals compared to controls. The comparison of number of spermatids at this age is not significant, which does not detract from the story but does not support sperm development defects specifically caused by brummer loss at 14 days. Their analysis clearly shows an expanded region beyond the testis apex that includes younger germ cells, supporting a role for lipid droplets influencing germ cell differentiation during spermatogenesis.

      We thank the reviewer for pointing out this inaccuracy in our manuscript. In the revised manuscript we will choose more precise language to describe defects in sperm development in 14-day-old bmm mutants.

      The authors present a series of data exploring a cell autonomous role for brummer in the germline, including clonal analysis and tissue specific manipulations. The clonal data indicating increased lipid droplets in spermatocyte clones, and a higher proportion of brummer mutant GSCs at the hub are convincing and supported by quantitation. The authors also show a tissue specific rescue of the brummer testis size phenotype by knocking down mdy specifically in germ cells, which is also supported by statistically significant quantitation. The authors present data examining the number of spermatocyte and post-meiotic clones 14 days after clonal induction. While data they present is significant with a 95% confidence interval and a p value of 0.0496, its significance is not as robust as other values reported in the study, and it is unclear how much information can be gained from that specific result.

      We thank the reviewer for raising this point. In the revised manuscript we will display the p-value clearly to ensure our statistical output is clear for readers to evaluate our conclusions regarding bmm mutant clones 14 days after clone induction.

      The authors do a beautiful job of validating where they detect brummer-GFP by presenting their own pseudotime analysis of publicly available single cell RNA sequencing data. Their data is presented very clearly, and supports expression of brummer in older somatic and germline cells of the age when lipid droplets are normally not detected. The authors also present a thorough lipidomic analysis of animals lacking brummer to identify triglycerides as an important lipid droplet component regulating spermatogenesis.

      Impact:

      The authors present data supporting the broad significance of their findings across phyla. This data represents a key strength of this manuscript. The authors show that loss of a conserved triglyceride lipase impacts testis development and spermatogenesis, and that these impacts can be rescued by supplementing diet with medium-chain triglycerides. The authors point out that these findings represent a biological similarity between Drosophila and mice, supporting the relevance of the Drosophila testis as a model for understanding the role of lipid droplets in spermatogenesis. The connection buttresses the relevance of these findings and this model to a broad scientific community.

      Reviewer #3 (Public Review):

      In this manuscript, Chao et al seek to understand the role of brummer, a triglyceride lipase, in the Drosophila testis. They show that Brummer regulates lipid droplet degradation during differentiation of germ and somatic cells, and that this process is essential for normal development to progress. These findings are interesting and novel, and contribute to a growing realisation that lipid biology is important for differentiation.

      We thank the Reviewer for their positive assessment of our manuscript.

      Major comments:

      1) The data in Figs 1 and 2, while helpful in setting the scene, do not add much to what was previously shown by the same group, namely that lipid droplets are present in both early germ cells and early somatic cells in the testis, and that Bmm regulates their degradation (PMID: 31961851). Measuring the distance of lipid droplets from the hub, while helpful in quantifying what is apparent, that only stem and early differentiated stages have lipid droplets, is not as informative as the way data are presented later (Fig. 2I), where droplets in specific stages are measured. Much of this could be condensed without much overall loss to the manuscript.

      We thank the reviewer for this comment and will condense the first part of the paper in our revised manuscript.

      2) It would be important to show images of the clones from which the data in Fig. 2I are generated. The main argument is that Bmm regulates lipid droplets in a cell autonomous manner; these data are the strongest argument in support of this and should be emphasised at the expense of full animal mutants (which could be moved to supplementary data).

      We thank the reviewer for this comment, and will add an image in our revised manuscript showing lipid droplets in bmm mutant spermatocyte clones.

      Similarly, the title of Fig. S2 ("brummer regulates lipid droplets in a cell autonomous manner") should be changed as the figure has no experiments with cell (or cell-type)-specific knockdowns/mutants. This figure does show changes in lipid droplets in both lineages in bmm mutants, so an appropriate title could be "brummer regulates lipid droplets in both germ and soma".

      We thank the reviewer for this comment, and will adjust the S2 figure legend title in the revised manuscript.

      3) Interestingly, the clonal data show that bmm is dispensable in germ cells until spermatocyte stages, as no increase in lipid droplet number is seen until then. This should be more clearly stated, as it indicates that the important function of Bmm is to degrade lipid droplets at the transition from spermatogonial to spermatocyte stages. This is consistent with the phenotypes observed in which late stage germ cells are reduced or missing. However, the effect on niche retention of the mutant GSCs at the expense of neighbouring wildtype GSCs is hard to explain. Are lipid droplets in mutant GSCs larger than in control? Is there any discernible effect of bmm mutation on lipids in GSCs? Additionally, bam expression is delayed, suggesting that bmm may have roles on cell fate in earlier stages than its roles that can be detected on lipid droplets.

      We thank the reviewer for this comment. We will include more text in the revised manuscript to clarify the key role bmm plays in regulating lipid droplets at the spermatogonia-spermatocyte transition. We will also add more detail and potentially data to our description of how bmm affects lipid droplets in cells at the earliest stages of germline development.

      4) The bmm loss-of-function phenotype could be better described. Some of the data is glossed over with little description in the text (see for example the reference to Fig. 3A-C). For instance, in the discussion, the text states "loss of bmm delays germline differentiation leading to an accumulation of early-stage germ cells" (p13, l.259-60). However, this accumulation has not been clearly shown, or at least described in the manuscript. Most of the data show a reduction (or almost complete absence) of differentiated cell types. This could indeed be due to delayed differentiation, or alternatively to a block in differentiation or to death of the differentiated cells. The clonal data presented show a decrease in the number of cells recovered, but do not allow inferences as to the timing of differentiation, making it hard to distinguish between the various possibilities for the lack of differentiated spermatids. Apart from data showing that GSCs are more likely to remain at the niche, no further data are shown to support the fact that mutant germ cells accumulate in early stages. While additional experiments could help resolve some of these issues, much of this could also be resolved by tempering the conclusions drawn in the text.

      We thank the reviewer for these comments. In the revised manuscript we will temper our conclusions regarding bmm’s precise role in spermatogenesis by discussing different mechanisms (e.g. differentiation or death) that could lead to the phenotypes we observe.

      5) In the discussion (p.14, l-273 onwards), the authors suggest that products of triglyceride breakdown are important for spermatogenesis. However, an alternative interpretation of the results presented here (especially those using the midway mutant) could be that triglycerides impede normal differentiation directly. Indeed, preventing the cells' ability to produce triglycerides in the first place can rescue many of the defects observed. A better discussion of these results with a model for the function of triglycerides and their by-products would be a great improvement to this manuscript.

      We thank the reviewer for this comment. To ensure our data is clearly communicated with readers, we will add a model to the paper suggesting how triglyceride and its by-products influence spermatogenesis.

      Together, these changes will strengthen our overall finding that bmm-mediated regulation of testis triglyceride is important for normal sperm development. Because our findings in flies align with and extend data from rodent models, the developmental mechanisms we uncovered about how triglyceride lipase bmm regulates testis lipid droplets and sperm development will likely operate in other species.

    1. Author Response

      Reviewer #1 (Public Review): 

      “I recommend that the authors revisit their calculation methods to provide a more convincing conclusion on the presence of positive epistasis for fitness in their dataset.” 

      The reviewer is right that the present description of the fitness calculation may be found insufficient. Below we provide the relevant derivation. It will be included in our planned revision, probably as a supplementary text:

      Expected fitness effect of multiple mutations

      Fitness is the number of offspring divided by the number of progenitors, $w = N_o/N_p$. This can be the number of cells left by one cell (including itself in the case of budding cells) over a unit of time.

      Assume that an organism carries multiple mutations ($\alpha, \beta, \ldots, \omega$), which are in heterozygous loci; their wild-type counterparts are marked universally with $+$. The fitness effect of a single mutation is $w_{\alpha/+}$, and so on. Fitness can be converted to relative fitness, i.e., expressed as a quotient of the wild-type fitness, $w_{\alpha/+}/w_{+/+}$, and so on. Under the multiplicative model of mutation accumulation, an expected joint effect of multiple mutations on relative fitness is a product of individual quotients:

      $w_{\mathrm{exp}}/w_{+/+} = (w_{\alpha/+}/w_{+/+})\,(w_{\beta/+}/w_{+/+}) \cdots (w_{\omega/+}/w_{+/+})$.

      When a population is continuously growing, log-transformation of fitness is typically applied as it equates to the rate of growth. In particular, it could be the number of doublings completed over a unit of time:

      $\log_2(w_{\mathrm{exp}}/w_{+/+}) = \log_2[(w_{\alpha/+}/w_{+/+})\,(w_{\beta/+}/w_{+/+}) \cdots (w_{\omega/+}/w_{+/+})]$.

      After replacing the above log multiplicative formula with its log additive equivalent, all its terms can be normalized by dividing by $\log_2$ fitness of the wild type, which turns them into relative doubling rates, for example, $\log_2(w_{\alpha/+})/\log_2(w_{+/+}) = \mathrm{rDR}_{\alpha/+}$. The joint effect of multiple mutations is then equal to

      $\mathrm{rDR}_{\mathrm{exp}} = 1 + (\mathrm{rDR}_{\alpha/+} - 1) + (\mathrm{rDR}_{\beta/+} - 1) + \cdots + (\mathrm{rDR}_{\omega/+} - 1)$

      or

      $\mathrm{rDR}_{\mathrm{exp}} = 1 + \sum d$

      where $d = \mathrm{rDR} - 1$ (see Fig. 2B in the main text).
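
      As a numerical check of the derivation above, the expected relative doubling rate can be computed both from the additive formula (1 plus the sum of d) and directly from the multiplicative model; the two should agree. The sketch below is illustrative only: the function names and the fitness values are hypothetical and are not taken from the manuscript, and the sign convention for epistasis (observed minus expected) is an assumption.

```python
import math

def relative_doubling_rate(w_mutant, w_wildtype):
    """rDR = log2(w_mutant) / log2(w_wildtype)."""
    return math.log2(w_mutant) / math.log2(w_wildtype)

def expected_rdr(single_rdrs):
    """Additive form: rDR_exp = 1 + sum(d), with d = rDR - 1."""
    return 1.0 + sum(r - 1.0 for r in single_rdrs)

def expected_rdr_from_fitness(single_fitnesses, w_wildtype):
    """Same expectation, computed from the multiplicative model of fitness."""
    # w_exp / w_wt is the product of the individual relative fitnesses.
    relative_product = math.prod(w / w_wildtype for w in single_fitnesses)
    w_expected = relative_product * w_wildtype
    return math.log2(w_expected) / math.log2(w_wildtype)

def epistasis(rdr_observed, rdr_expected):
    """Assumed convention: positive when the multi-mutant grows better than expected."""
    return rdr_observed - rdr_expected

# Hypothetical values: the wild type completes 2 doublings per unit time,
# the two single mutants complete 1.5 and 1 doublings.
w_wt = 4.0
w_singles = [2.0 ** 1.5, 2.0]

rdrs = [relative_doubling_rate(w, w_wt) for w in w_singles]
rdr_exp = expected_rdr(rdrs)
print(rdrs, rdr_exp)
```

      With these made-up inputs the single-mutant rDR values are 0.75 and 0.5, so the expected joint value is 1 - 0.25 - 0.5 = 0.25 by either route.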

      Reviewer #2 (Public Review):

      “The initiation and interpretation of the results were apparently performed in a vacuum of a century of work on genomic balance.” 

      Indeed, we neither introduce nor discuss results obtained with organisms other than the budding yeast. We accept that researchers working with other beasts may expect to see such considerations. We will introduce them into a revised submission, to the extent allowable for non-review articles. The suggestions provided by the reviewer will be followed. 

      “If there is an increase in the general transcriptome size, then there might not be much reduction of the proteasome subunits as claimed and the increases might be somewhat less than indicated.” 

      - together with – 

      “A second experiment that would clarify the results would be to perform estimates of the general transcriptome size. If the general transcriptome size is actually increased, the claims of reduced expression of the proteasome might need to be revised.” 

      Multicellular organisms may well be heterogeneous across their bodies, in terms of cell number and (transcriptome) composition. We believe that any proper sampling of their mRNA requires careful absolute, and not only relative, quantification. We worked with clones prepared to be mostly homogeneous (about two to three divisions under conditions promoting strong growth). We maintain that we are allowed to rely on chromosomal averages expressed as proportions of the total mRNA. Our monosomic counts were very strictly around 50% of those predicted for euploids, so we do not see any danger of erroneous sampling or calculation in our wonderfully simple case. Regarding the problem of a possible general increase in the transcriptome size, which would help to compensate for the decrease in the proteasomal mRNAs, we adopted a strictly linear interpretation. That is, in a doubled transcriptome, not only the mRNAs for the proteasome but also those for all other proteins (prey species for the proteasome) would be doubled, and thus no relief in proteolysis would occur. We follow here previous findings that in yeast the fractions of individual mRNAs are reflected in the fractions of ribosome-bound mRNA fragments and (at least roughly) mature proteins (e.g., Larrimore et al. 2020, cited in our manuscript).

      “The claim of Torres et al that there are no global modulations in trans is counter to the knowledge that transcription factors are typically dosage sensitive and have multiple targets across the genome … Taken as a whole it would seem to suggest that there are many inverse relationships of global gene expression with chromosomal dosage in both yeast disomies and monosomies.” 

      Well, the debate about mRNA compensation in relation to aneuploidy in yeast has been intense and sometimes heated. Disparate claims can be found but we are left with an overwhelming impression that the relation between the total amount of mRNA and the number of chromosome copies is pretty much (perhaps not ideally) linear. We will consider our wording again. We will admit that such a rigid relation is somewhat unusual compared to other eukaryotes. But again, we see strict halves of mRNA for the monosomic chromosomes.  

      “To clarify the claims of this study, it would be informative to produce distributions of the various ratios of individual gene expression in monosomy versus diploid as performed by Hou et al. 2018.” 

      Such distributions are already prepared and will likely be presented in the revised manuscript.

      “The authors claim there are no genes that are compensated on the varied chromosome but considering how many genes are upregulated across the genome, it would seem that a subset are probably upregulated on the cis chromosome as well and approach the diploid level, i.e. are dosage compensated.” 

      Perhaps we misstated our conclusions somewhere, but it was obvious to us that some genes were upregulated on the cis chromosome (monosomic) and others were downregulated; the net result was an average of 50% (Fig. 3).

      Reviewer #3 (Public Review): 

      “1) In Figure 3b (and line 179) …  What the data really show is that the level of overexpression is not correlated with the fitness effect of the deletion (since all the p values are not significant). The authors need to correct their conclusions.” 

      That’s right! We have already corrected this while preparing the revision.

      “2) Why are some monosomic strains removed from the transcriptomics analysis, especially when the chromosome IV and XV strains show very strong positive epistasis? The authors need to provide an explanation here.” 

      We ran out of money. In more scientific terms, we believe our sample of eight strains is unbiased and sufficient. It was truly randomly chosen. We were glad to see that it covers both slightly and most strongly affected (with profound epistasis) monosomics. All of them displayed parallel shifts in the transcriptome (RP up, proteasome down). We judged we could stop here, as the additional five strains would be unlikely to change our main conclusions.

      “3) The authors stated that diploidy observed in chromosome VII and XIII strains were due to endoreplication after losing the marked chromosomes (lines 97 and 117). Isn't chromosome missegregation an equally possible explanation? Since monosomic cells are generated by chromosome missegregation during mitosis, another chromosome missegregation event may occur to rescue the fitness (or viability) of monosomic cells in these strains.” 

      We believe that it happened as the reviewer suggests, at least in most cases. By “endoreduplication”, we mean any event making two chromosomes of one, not necessarily additional DNA replication. We will check our text to make this clear.

    1. Author Response

      Reviewer #2 (Public Review):

      The authors dissected the effects of mycolactone on endothelial cell biology and vessel integrity. The study follows up on previous work by the same group, which highlighted alterations in vascular permeability and coagulation in patients with Buruli ulcer. It provides a mechanistic explanation for these clinical observations, and suggests that blockade of Sec61 in endothelial cells contributes to tissue necrosis and slow wound healing. Overall, the generated data support their conclusions and I only have two major criticisms:

      • Replicating the effects of mycolactone on endothelial parameters with Ipomoeassin F (or its derivative ZIF-80) does not demonstrate that these effects are due to Sec61 blockade. This would require genetic proof, using for example endothelial cells expressing Sec61A mutants that confer resistance to mycolactone blockade. The authors claimed in the Discussion that they could not express such mutants in primary endothelial cells, but did they try expressing mutants in HUVEC cell lines? Without such genetic evidence all statements claiming a causative link between the observed effects on endothelial parameters and Sec61 blockade should be removed or rephrased. The same applies to speculations on the role of Sec61 in epithelial migration defects in discussion. Data corresponding to Ipomoeassin F and ZIF-80 do not add important information, and may be removed or shown as supplemental information.

      • While statistical analysis is done and P values are provided, no information is given on the statistical tests used, neither in methods nor results. This must be corrected, to evaluate the repeatability and reproducibility of their data.

      We respectfully but fundamentally disagree with the comments regarding the Sec61 dependence of the effects that we observed. We showed that loss of glycocalyx and basement membrane components underpinned the phenotypic changes in endothelial cells (morphological changes, loss of adhesion, increased permeability, and reduced ability to repair scratch wounds). We demonstrated that we could phenocopy permeability increases and elongation phenotype by knocking down the type II membrane protein B3Galt6, and reverse the adhesion defect by exogenous provision of the secreted laminin-511 heterotrimer.

      Our conclusion that mycolactone mediates these effects via Sec61 inhibition is not based solely on the use of alternative inhibitors but is built on several pillars of evidence:

      First, the proteomics data conform entirely to predictions based on the topology of affected vs. non-affected proteins, and agree with independently published proteomic datasets from T lymphocytes, dendritic cells and sensory neurons (ref.12), as well as biochemical studies performed using in vitro translocation assays (ref.11,34). Furthermore, the pattern of membrane protein down regulation observed in our experiments fits perfectly with established models of protein translocation mechanisms, particularly with respect to the lack of effect on specific topologies of multipass membrane proteins, tail anchored- and type III membrane proteins (ref.34-36).

      Second, since Sec61 is very highly conserved amongst mammals and is found in all nucleated cells, it is hard to conceptualise a framework in which mycolactone targets Sec61 in some cells and not others, as this reviewer suggests might be the case for epithelial cells [noting that the work being referred to (ref.29) predates our 2014 work showing that mycolactone is a Sec61 inhibitor (ref.7)]. Indeed, mycolactone has been shown to target Sec61 in multiple independent approaches including forward genetic screens involving random mutagenesis and CRISPR/Cas9 (ref.10, PMID: 35939511). Genetic evidence has previously been provided for the Sec61 dependence of mycolactone effects in epithelial cells (ref.10,17). We have unpublished genetic evidence that the rounding and detachment of epithelial cells due to mycolactone is reduced when resistance mutations are over expressed, and will consider including this in the next version of the manuscript.

      Third, given this weight of evidence, one would be hard-pressed to provide an alternative explanation for the specific down-regulation of glycosaminoglycan-synthesising enzymes and adhesion/basement membrane molecules while most cytosolic and non-Sec61 dependent membrane proteins are unchanged or upregulated. However, seeking to be as rigorous as possible we have here shown that a completely independent Sec61 inhibitor produces the same phenotype at the gross and molecular level. Ipomoeassin F (Ipom-F) is a glycolipid, not a polyketide lactone, yet they both compete for binding with cotransin in Sec61α (ref.6). There is significant overlap in the cellular responses to mycolactone and Ipom-F, including the induction of the integrated stress response (ref.17, PMID: 34079010), which we observed again in the current data, providing further evidence that this approach is useful when genetic approaches are technically unattainable.

      Therefore, we are confident the effects seen on endothelial cells are Sec61-dependent. We are happy to provide more detail on our lengthy attempts at over-expressing mycolactone resistant SEC61A1 genes in HUVECs, primary endothelial cells derived from the umbilical vein. We are highly experienced in this area, and have previously stably expressed these proteins in epithelial cell lines, reproducing the resistance profile (ref.10,17). Notably though, these cells do not have normal ‘fitness’ in the absence of challenge. Since endothelial cells (and endothelial cell lines; PMID: 12560236) are extremely hard to transfect with plasmids, with efficiency routinely 5-10% (including in our hands), we developed a lentivirus system. We were eventually (after multiple attempts using different protocols) able to transduce primary HUVECs with constructs expressing GFP (at an efficiency of about 10-20%) and select/expand these under puromycin selection. Nevertheless, we never recovered any cells that expressed the flag-tagged SEC61A1 wild type or SEC61A1 carrying the resistance mutant D60G. We also attempted to select D60G-transduced cells with mycolactone epimers, an approach that can help the cells compete against non-transduced cells in culture flasks (ref.10). We concluded that primary endothelial cells are unable to tolerate the expression of additional Sec61α, and this was incompatible with survival.

      It’s also important to note that most endothelial cell specialists would agree that endothelial cell lines are not good models of endothelial behaviour. We tested the HMEC-1 cell line, but found it did not express prototypical endothelial marker vWF in the expected way. Therefore we focussed our efforts on primary endothelial cells. Should we be able to overcome the dual challenge of the necessity to work in primary cells, and the difficulty of over-expressing Sec61, we will update this paper at a later date with this data, and will also expand the above arguments.

      We apologise for the embarrassing oversight of not including information about the statistical analyses we used, which of course we will correct in full in the revised version. However, we would like to provide this information to readers of the current version of the manuscript. All data were analysed using GraphPad Prism Version 9.4.1:

      Figure 1: one-way ANOVA with Dunnett’s (panel A) or Tukey’s (panel B) correction for multiple comparisons

      Figure 2 supplement: one-way ANOVA with Tukey’s correction for multiple comparisons (analysed panel)

      Figure 3: one-way ANOVA with Tukey’s (panel B) or Dunnett’s (panel E&F) correction for multiple comparisons

      Figure 4: one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 5 and supplement: one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 6: one-way ANOVA with Dunnett’s correction for multiple comparisons (analysed panel)

      Figure 6 supplement: one-way ANOVA with Dunnett’s correction for multiple comparisons (all analysed panels)

      Figure 7: two-way ANOVA with Tukey’s correction for multiple comparisons (all analysed panels; panels B&C also included the Geisser Greenhouse correction for sphericity)

      Figure 7 supplement: Panels A&D used a repeated measures one-way ANOVA with Dunnett’s correction for multiple comparisons (panel D also included the Geisser Greenhouse correction for sphericity). Panels B,C&E used a two-way ANOVA with Tukey’s correction for multiple comparisons (panels B&C also included the Geisser Greenhouse correction for sphericity)

    1. Author Response:

      We would like to thank the reviewers and editor for their insightful comments and suggestions. We will update the manuscript accordingly. We are particularly glad to read that our software package constitutes a set of “well-written analysis routines” which have “the potential to become very valuable and foundational tools for the analysis of neurophysiological data”. Both reviewers have identified a number of weaknesses in the manuscript, and we would like to take this opportunity to provide a response to some of the remarks and clarify the objectives of our work. We would like to stress that this kind of toolkit is in continual development, and the manuscript offered a snapshot of the package at one point during this process. Since the initial submission several months ago, several improvements have been implemented and further improvements are in development by our group and a growing community of contributors. The manuscript will be updated to reflect these more recent changes, some which will directly address the reviewers’ remarks.

      It was first suggested that the manuscript should better showcase the value of the analysis pipeline. As noted by the first reviewer, the online repository (i.e. GitHub page) conveys a better sense of how the toolbox can be used than the present manuscript. Our original intention was to illustrate some examples of data analysis in Figure 4 by adding the corresponding Pynapple command above each processing step. Each step takes a single line of code, meaning that, for example, one only needs to write three lines of code to decode a feature from population activity using a Bayesian decoder (Fig. 4a), or to compute a cross-correlogram of two neurons during specific stimulus presentation (Fig. 4b), or to compute the average firing rate of two neurons around a specific time of the experimental task (Fig. 4c). In our revision, we will include code snippets which will clearly show the required steps for each of these analyses. In addition, we will more clearly point the reader to the online tools (e.g. Jupyter notebooks), which offer an easier and clearer way to demonstrate the use of the toolbox.
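
      For readers unfamiliar with the analyses named above, the computation behind a cross-correlogram (as in Fig. 4b) can be sketched in plain Python. This is not Pynapple code and does not use or reproduce its API; the function name `cross_correlogram`, its parameters, and the spike times are hypothetical and serve only to show what is being counted.

```python
def cross_correlogram(ref_times, target_times, bin_size, window):
    """Histogram of target-spike lags around each reference spike.

    Lags in [-window, +window) are split into bins of width bin_size.
    Returns (bin_centers, counts). Dividing counts by
    len(ref_times) * bin_size would give a rate per reference spike.
    """
    n_bins = int(round(2 * window / bin_size))
    counts = [0] * n_bins
    for r in ref_times:
        for t in target_times:
            lag = t - r
            if -window <= lag < window:
                # Guard against floating-point edge cases at the last bin.
                index = min(int((lag + window) // bin_size), n_bins - 1)
                counts[index] += 1
    centers = [-window + (i + 0.5) * bin_size for i in range(n_bins)]
    return centers, counts

# Hypothetical spike times in milliseconds for two neurons.
ref_spikes = [10, 50]
target_spikes = [12, 45, 52]
centers, counts = cross_correlogram(ref_spikes, target_spikes, bin_size=5, window=10)
print(centers, counts)
```

      The nested loop is quadratic in the number of spikes and is written for clarity rather than speed; a real analysis would also restrict both spike trains to the epochs of interest (e.g. stimulus presentation) before counting.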

      Another remark concerns our claim that the package does not have dependencies. We agree that this claim was not well-worded. Our intention was to say that the package excludes dependencies such as scikit-learn, TensorFlow or PyTorch, which are often used in signal processing and which can be tedious to install. Pynapple still depends on a few packages, including the most common ones: NumPy, SciPy, and pandas. We will rephrase this statement in the manuscript and emphasize the importance of minimal dependencies for long-term backwards-compatibility in scientific computing.

      We will complete the bibliography to make sure we properly reference all the packages designed for a similar purpose. Note that some are not citable per se (i.e. no associated paper) but will be discussed.

      It was suggested that the manuscript should better describe the integration of Pynapple into a full experimental data pipeline. This is an interesting point, which was briefly mentioned in the third paragraph of the discussion. Pynapple was not originally designed to pre-process data. However, it can load any type of data stream after the necessary pre-processing steps. Overall, this modularity is a key aspect of the Pynapple framework, and this is also the case for the integration with data pre-processing pipelines, for example spike sorting in electrophysiology and detection of regions of interest in calcium imaging. We do not think there should be a single integrated solution to this problem; instead, our aim is to make it possible to apply any piece of code to the data, irrespective of how the dataset was acquired. This is why we focused on making data loading straightforward and easy to adapt to any situation. This feature enables any user with any data modality and any long-established (often in-house) pre-processing scripts/software to utilize Pynapple in the analysis phase of their pipeline. Overall, not imposing a particular format from the data acquisition phase onwards is a strength for any analysis package.

      Finally, the reviews raised the issue of data and intermediate result storage. We agree that this is a critical issue. In the long term, we do not believe that the current implementation of NWB is the right answer for data involved in active analysis, as it is not possible to overwrite an NWB file. This would require the creation of a new NWB file each time an intermediate result is saved, which would be computationally intensive and time consuming, further increasing the odds of write errors. Theoretically, users who need to store intermediate results in a flexible way could use any method they prefer, writing their own data files and wrappers to reload these data into Pynapple objects. However, it is desirable for the Pynapple ecosystem to have a standardized format for storing data. We are currently improving this feature by developing save and load methods for each Pynapple core object. We aim to provide an output format that is very simple to read in future Pynapple releases. This feature will be available in the coming weeks and will be described in the revised manuscript.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We would like to thank all Reviewers for their careful evaluation of our work. Below please find our responses and comments.

      Reviewer #1 (Recommendations For The Authors):

      1) The detection of cell-released GLP-1 is addressed in an indirect, averaged way in Fig. 2 - Supplement 1. This question seems like a good opportunity for an antagonist experiment (Exendin-9), which presumably would require much lower concentrations than those used to antagonize a saturating dose of GLP-1. It would also be much more convincing if GLPLight1 could be used to detect stimulated release of GLP-1 from the GLUTag cells.

      We tried multiple times to acutely stimulate GLUTag cells using Forskolin and IBMX, but unfortunately we did not observe any robust fluorescence increase of GLPLight1. The only observation that was consistent was the higher baseline fluorescence of GLPLight1, and the reduced maximal response to saturating GLP-1 when GLPLight1 expressing HEK cells were cultured overnight with GLUTag cells. We considered this assay to be at best qualitative and — despite the aforementioned attempts — could not determine quantitative values.

      2) The excitation-ratiometric response of the sensor, shown in Fig. 1D, is usually accompanied by strong pH-dependence of sensor function. It would be valuable to characterize this pH-dependence, using permeabilized cells in which the pH is changed; the ability of small (0.2-0.5 unit) pH changes to produce changes in fluorescence, as well as to affect the dynamic range of the sensor, should be characterized. This will prevent the misidentification of agents that affect cellular pH as having (for instance) an inhibitory effect on the binding of GLP-1 to GLPLight.

      The pH sensitivity of cpGFP-based sensors is a valid concern. However, considering that the cpGFP module of GLPLight1 is intracellular (and thus largely protected from potential extracellular pH changes), we assume that the GLPLight1 signal should be robust in most in vivo or cell-based assays. In fact, we have previously characterized this for a similarly built neuropeptide sensor (PMID: 35145320) and believe that this will also be the case for GLPLight1.

      3) The reported Kd for Exendin-9 is in the low nM range. Please explain the partial response at 1000x the concentration (including a discussion of the Kd of GLP-1 itself, as well as its off kinetics, and a comparison of this assay to the assays used previously).

      The partial response is due to the presence of 1 uM GLP-1 in the imaging buffer, which is in constant competition with Exendin-9 for binding to GLPLight1. Because GLP-1 has an affinity similar to that of Exendin-9 (see for example PMIDs: 34351033 and 21210113) and both are present at saturating concentrations, we did expect to observe a partial response from GLPLight1. In this study, we did not determine the exact on and off kinetics of GLP-1 and Exendin-9 on the GLPLight1 sensor due to technical challenges: to perform these experiments, we would need to set up a perfusion system where we could remove the unbound ligand and either wash off the bound ligand with buffer or compete it out with an antagonist. Unfortunately, we currently do not have access to such a setup.
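
      The competition argument can be illustrated with the standard equilibrium expression for two ligands competing for one site (the Gaddum equation). The sketch below is only an illustration under stated assumptions: single-site, fully competitive binding at equilibrium, sensor signal proportional to agonist occupancy, and a placeholder Kd of 5 nM for both ligands that is not a measured value from this study.

```python
def agonist_occupancy(agonist_conc, agonist_kd, antagonist_conc, antagonist_kd):
    """Fraction of receptors occupied by the agonist at equilibrium
    in the presence of a competitive antagonist (Gaddum equation).
    All concentrations and Kd values must share the same unit."""
    a = agonist_conc / agonist_kd
    b = antagonist_conc / antagonist_kd
    return a / (1.0 + a + b)

KD_NM = 5.0        # hypothetical Kd, assumed equal for GLP-1 and Exendin-9
GLP1_NM = 1000.0   # 1 uM GLP-1 in the imaging buffer

# Occupancy by GLP-1 alone, then with equimolar and 10x excess antagonist.
alone = agonist_occupancy(GLP1_NM, KD_NM, 0.0, KD_NM)
equimolar = agonist_occupancy(GLP1_NM, KD_NM, 1000.0, KD_NM)
tenfold = agonist_occupancy(GLP1_NM, KD_NM, 10000.0, KD_NM)
```

      Under these assumptions, an equimolar antagonist removes only about half of the agonist occupancy, so a partial rather than complete reversal is what the model predicts when both ligands are saturating.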

      4) Are the turn-on kinetics in Fig. 2C limited by drug application or by association? Are the on-rates much slower for the lower concentrations used for Fig. 2C? This is important for knowing how fast responses are likely to be at the lower concentrations likely to be achieved by endogenous release.

      Considering Fig. 2B and 2C, we assume the on-kinetics to be mostly driven by association, since the ligand is expected to be homogeneously distributed.

      The on-rate kinetics are indeed slower when lower concentrations of GLP-1 are used: we observe a TauOn of 4.7 s with 10 uM GLP-1 (Figure 2b) and much slower kinetics when GLP-1 is applied at 1 uM, for example (Figure 3d). As a result, we chose to incubate the ligand with GLPLight1-expressing cells for at least 30 minutes before measuring the dose-response, so as to be close to equilibrium.
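
      The concentration dependence described here is what a simple association-limited model predicts, where the observed rate is k_obs = k_on·[L] + k_off and TauOn = 1/k_obs. The sketch below is a rough illustration only: it assumes one-step bimolecular binding, treats k_off as negligible at these concentrations (an assumption, since off kinetics were not measured), and back-calculates a hypothetical k_on from the 4.7 s TauOn at 10 uM.

```python
def tau_on(ligand_um, k_on_per_um_s, k_off_per_s=0.0):
    """Time constant (s) of sensor activation for pseudo-first-order binding:
    tau = 1 / (k_on * [L] + k_off)."""
    return 1.0 / (k_on_per_um_s * ligand_um + k_off_per_s)

# Hypothetical k_on inferred from TauOn = 4.7 s at 10 uM, assuming k_off ~ 0.
k_on = 1.0 / (4.7 * 10.0)  # in uM^-1 s^-1

tau_10um = tau_on(10.0, k_on)  # recovers roughly 4.7 s by construction
tau_1um = tau_on(1.0, k_on)    # roughly ten times slower at 1 uM
tau_10nm = tau_on(0.01, k_on)  # far slower in the low-nM range
```

      With these placeholder values the time constant at 10 nM is on the order of an hour, which is in line with the choice of a long pre-incubation before reading the dose-response.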

      5) The parameters for the fitted dose-response curves in Fig 2C should be listed. The ~4x discrepancy between the dose-response in HEK-293 cells and neurons should be discussed. Are there known auxiliary subunits, dimerization, or lipid dependence that might account for this? It seems important to understand this if the sensors are to be used in an assay that may compare different systems.

      We added the EC50 values to Fig 2C as requested. We did not consider a 4x discrepancy to be significant, because the measurement error in the EC50 region is relatively high and this difference seemed to be within the error range. In fact, the 95% confidence intervals are 7.8 to 11.1 nM in neurons and 23.8 to 32.1 nM in HEK cells; if we consider the upper and lower boundaries of each, the difference drops to around 1-fold. We also performed a statistical test to compare the two fits (extra sum-of-squares F-test), which confirmed that the two fits were not significantly different (P value = 0.3736). Of course, the interaction partners and membrane composition are different in HEK cells and neurons and probably have an influence on the EC50 of GLPLight1, but their exact influence is unclear.

      6) It seems surprising that removal of the endogenous N-terminal secretory sequence is actually helpful for membrane expression. Do the authors have any suggested explanation for this?

      GLPLight1 contains an N-terminal hemagglutinin (HA) secretory motif. The hmGLP1R sequence that we chose also contained an endogenous secretory sequence that most likely interfered with the membrane transport mechanism, resulting in lower sensor expression when both secretory sequences were present. We thus decided to keep the HA motif instead of the endogenous sequence, to remain consistent with other sensors created in-house.

      7) In Fig. 1, supplement 3, are the transient responses real? Do they occur with the control construct?

      While we have not measured the G-protein recruitment on GLPLight-ctr, we have often observed this phenomenon for various receptors and ligands. The transient responses are thus most likely an artifact after manual addition of the ligand possibly due to:

      -       Temperature difference

      -       Exposure of the plate to ambient light before resuming measurement (phosphorescence)

      -       Re-suspension of the cells affecting the proximity to the detector

      -       Other unknown variables

      If these responses were real, we would also expect them to be more sustained over time.

      8) Please include a sentence or two explaining the luminescence complementation assay, and a reference.

      We updated the results section of the manuscript with a section describing the luminescence complementation assay along with a reference:

      “Next, we compared the coupling of GLPLight1 and its parent receptor (WT GLP1R) to downstream signaling. We first measured the agonist-induced membrane recruitment of cytosolic mini-G proteins and β-arrestin-2 using a split nanoluciferase complementation assay (Dixon et al., 2016). In this assay both the sensor/receptor and the mini-G proteins contain part of a functional luciferase (smBit on the sensor/receptor and LgBit for Mini-G proteins) that becomes active only when these two partners are in close proximity (Wan et al., 2018).”

      Bravo to the authors for already making the sensor plasmids available at addgene.com. It would be helpful to include the plasmid IDs and/or a URL in the manuscript.

      We would like to thank Reviewer #1 for noticing this. We have updated the data availability section of the manuscript and added the AddGene plasmid numbers of the constructs generated in this study.

      Reviewer #2 (Recommendations For The Authors):

      1) There are some parts of the introduction that need clarification. For example, GLP1 is quoted as an anorexigenic peptide, however, that is probably only true for centrally-derived GLP1. There is no evidence that enteroendocrine-derived GLP1 (the major pool) is anorexigenic; it is likely to be substantially degraded by DPPIV before reaching the brain. In any case, the discovery of GLP1 was always one of glucose-dependent insulin secretion, with the brain system being described decades later. Overall, the intro needs to be slightly reframed. While the tools presented here are more useful for assessment of central GLP1-releasing circuitry, they are ultimately based upon GLP1R signaling that is much better validated in the periphery.

      We have slightly reframed the introduction accordingly.

      2) "The human GLP1R (hmGLP1R) is a prime target for drug screening and drug development efforts, since GLP-1 receptor agonists (GLP1RAs) are among the most effective and widely-used weight-loss drugs available to date (Shah and Vella, 2014)." GLP1R was for two decades the breakthrough drug for treatment of type 2 diabetes mellitus and correction of glucose tolerance as assessed through HbA1c. It is only through reporting on millions of patients receiving GLP1RA that the weight loss effects were noted, leading to Phase1-3 trials and eventual approval for obesity indication. Again, some slight reframing of the introduction is required here.

      For this point as well, we have slightly reframed the introduction accordingly.

      3) GLP1 was applied at a maximal dose of 10 uM, which is 10-fold higher than maximal. Can the authors confirm absence of cytotoxic effects of exposing to peptide at such concentration? Ex4 (9-39) at such concentrations is usually cytotoxic at least in primary tissue.

      We did not observe any obvious cytotoxic effect of GLP-1 at this concentration in HEK293T cells or Neurons.

      4) "As expected, GLPLight1 responded to both GLP1RAs with almost maximal activation, on par with GLP1 (Figure 2a)." Such a claim is difficult to interpret without concentration-response curves, since the maximal concentration of liraglutide and semaglutide might not have been achieved in these experiments.

      We agree that this statement is difficult to interpret without further clarification. We know from the literature that GLP-1, liraglutide and semaglutide all have very high affinity for the hmGLP1R (PMID: 31031702). We also showed that the GLPLight1 signal saturates at concentrations above 1 uM of GLP-1 (Figure 2C); we thus applied a 10x excess of all ligands and considered this signal as maximal.

      5) "These results indicate that GLPLight1 can serve as a direct readout of pharmacological drug action on the hmGLP1R with higher temporal resolution than previously available approaches, such as downstream signaling assays (Zhang et al., 2020)." Many investigators use cAMP imaging to investigate GLP1R signaling, which is arguably of similar spatiotemporal resolution, also with the advantage of FRET quantification in some cases (e.g. EpacVV). Direct GLP1R signaling can also be inferred using cell lines heterologously-expressing GLP1R. Thus, the advantage of the current probes is that they can be used to readout direct GLP1R activation in native cells/tissues where promiscuous class B binding might limit signaling measures or where endogenous GLP1 release needs to be investigated.

      We have edited the manuscript text accordingly.

      6) "State-of-the-art techniques for detecting endogenous GLP-1 or glucagon release in vitro from cultured cells or tissues consist of costly and time-consuming antibody-based assays (Kuhre et al., 2016) or analytical chemistry procedures (Amao et al., 2015)." Agreed, but non-specificity/cross-reactivity of such assays is more prohibitive/problematic (e.g. against glicentin).

      We have edited the introduction accordingly.

      7) The studies using co-culture of GLUTag and GLP1Light1-HEK293 cells, whilst interesting, are not entirely convincing in their current form. Firstly, co-culture could influence GLP1Light expression levels (can the authors label FLAG?). Secondly, specificity of the response is not tested e.g. by adding Ex4 (9-39). Thirdly, titration with GLUTag conditioned media is not performed.

      We partially addressed this issue in the answer to comment #1 from Reviewer #1. We previously performed a FLAG staining of GLPLight1 in the presence or absence of GLUTag cells and we did not notice any obvious difference. This is in line with the fact that GLPLight1 is signaling inert, and the presence of GLP1 should not interfere with the surface expression of the sensor. We also checked that HEK293T cells did not express high levels of GLP1R according to the BioGPS Cell Line Gene Expression Profiles dataset (https://maayanlab.cloud/Harmonizome/gene_set/HEK293/BioGPS+Cell+Line+Gene+Expression+Profiles).

      We also tried to add GLUTag media after stimulation in bolus to GLPLight1 expressing cells and observed no response. This indicated that the “sniffer” cells must be present in close proximity to GLUTag cells for an extended period of time to observe any substantial difference in response, justifying our choice of experimental setup.

      8) "Given that our photocage was placed at the very N-terminus of photo-GLP1, our results show that this caging approach prevents the peptide's ability to activate GLP1R but, at the same time, preserves its ability to interact with the ECD." An alternative hypothesis is that PhotoGLP1 does activate GLP1R, but this is undetectable with the sensitivity of GLP1Light. PhotoGLP1 cAMP concentration-response assays are needed (uncaged versus cage) to properly characterize and validate the compound (as would be standard for any newly-described GLP1R peptide ligand).

      While we agree that there is a chance that Photo-GLP1 could activate GLP1R at high concentrations, we think that the characterization of Photo-GLP1 has to be determined by the end user directly with the technique of choice (GLPLight1 in our case) in order to get a reliable comparison of potency and efficacy. We modified the text accordingly to more accurately reflect the direct conclusions from our data, as follows:

      “our results show that this caging approach prevents the peptide's ability to activate GLPLight1”.

      9) "Surprisingly, GLPLight1 shows a fluorescent response in all three uncaged areas, while its fluorescence remained unaltered throughout the rest of the FOV, indicating high spatial localization of the response to GLP-1 (Figure 3f)." Why is this surprising?

      We agree that this result is, indeed, not surprising and would like to thank Reviewer #2 for spotting this mistake, which has now been corrected in the manuscript.

      10) The localized PhotoGLP1 experiments are interesting and show the utility of the ligand. There is however activation outside of the region of uncaging, which would argue against a pre-bound ECD mode of action. Possibly some PhotoGLP1 is pre-bound to the ECD, and some is freely diffusing? Alternatively, the scan area might be below the diffraction limit/accuracy of the microscope?

      We would like to thank Reviewer #2 for this comment and agree with their observation. There could be some free Photo-GLP1 that gets photo-activated and binds regions around the uncaging area (similar to what has been observed for Photo-OXB; PMID: 36481097). The activation around the uncaging area could also be due to lateral diffusion of the activated receptor on the membrane. There is also most likely some light diffraction at the uncaging area that could account for this phenomenon. To increase the spatial resolution, future studies could involve uncaging during sensor imaging via two-photon microscopy.

      11) What was the rationale for caging native GLP1, which is then susceptible to DPPIV-mediated degradation? Would the N-terminal cage and first 2 amino acids also not be cleaved by DPPIV, thus rendering the tool of limited in vivo application? Conversely, PhotoGLP1 provides a template for similar light-activated (stabilized) GLP1R agonists such as Ex4 or liraglutide.

      Thank you for making us aware of this (in vivo) limitation. We designed photoGLP1 as a tool for neurobiological experiments in the brain, where DPPIV expression would be low compared to peripheral organs (https://www.proteinatlas.org/ENSG00000197635-DPP4/tissue). We also envisage that the presence of the photocage would be enough to hinder binding to DPP4, which cleaves the first 2 amino acids. This hypothesis, however, was never tested experimentally, and we therefore acknowledge the limitation in the manuscript. We would furthermore like to thank the reviewer for their comment on additional photo-caged GLP1 agonists, which could be developed in future studies.

      12) It wasn't clear how GLP1Light could be used as a HTS screen for drug discovery? Surely, conventional systems (e.g. GLP1R + BAR/Ca2+/cAMP reporting) allow signal bias, an important component of GLP1RA action, to be assessed. Or could GLP1Light1 be used as a pre-screen to exclude any ligands that do not orthosterically bind GLP1R?

      We would like to thank Reviewer #2 for this comment and would like to offer some clarification. We indeed thought that GLPLight1 could be used as a first line of screening to exclude ligands that do not bind in the orthosteric pocket. It is also a rather flexible method as the fluorescence increase of those sensors can be monitored using various techniques/devices that are available in most labs (e.g. microscopy, plate reader, flow cytometry).

      13) Limitations of GLP1Light1 and PhotoGLP1 are not acknowledged in the discussion.

      We would like to thank Reviewer #2 for pointing out the lack of description of the limitations of these tools, which have now been added to the Discussion.

      14) Full characterization of PhotoGLP1 is missing, to include UV/Vis, Tr and HRMS.

      PhotoGLP1 was fully characterized by UV/Vis and HRMS, and all experimental and analytical data was uploaded as supplementary data when the manuscript was initially submitted for publication in eLife.

      Reviewer #3 (Recommendations For The Authors):

      1) The ~1000 fold lower EC50 for GLP1 of GLPLight1 compared with native GLP1R needs to be openly acknowledged as a major limitation of the sensor, as this will substantially reduce the types of experiment for which it will be useful. Because it needs 1000 times higher GLP1 levels than wild type GLP1R to be activated, it is unlikely, for example, to be useful for monitoring the dynamics of activation of native GLP1R in vivo. The claim that the sensor could be used for in vivo imaging for fibre photometry is therefore an exaggeration.

      We would like to first thank Reviewer #3 for this comment and to further provide some clarification. We recognize that the data presented in this manuscript might have been confusing when comparing the affinity of GLP1R (using cAMP) and GLPLight1 (using the fluorescence increase, because there is no coupling to cAMP). We believe that the low EC50 measured in the cAMP assay cannot accurately be compared to the GLPLight1 response because it is an enzymatically amplified process. In order to support this claim, we included another set of experiments where we titrated agonist-induced recruitment of miniGs protein to the GLP1R receptor and found an EC50 of 3.8 nM for native GLP-1 using this assay (added as panel l in Figure 1 Supplement 3). We thus confirmed that the nature of the assay itself has a drastic influence on the measured EC50, and it is not unusual to observe a 100-fold difference in EC50 for the same receptor-ligand pair.

      We believe that the miniGs protein recruitment is a better comparison to GLPLight1 because it is not enzymatically amplified. This assay reveals that GLPLight1 has around 8-fold lower affinity for GLP1 compared to its parent receptor, which is in line with the EC50 loss observed previously for other GPCR-based sensors of this class. We are thus confident that GLPLight1 has the potential to be used in vivo under specific circumstances, specifically in brain tissue. We elaborated on this point in the Discussion part of the manuscript.

      2) Fig2 suppl 1 is described as demonstrating a reduced response of GLPLight1 to GLP-1 when HEK cells were cultured with GLUTag cells. However, it is speculation to conclude that this is because GLP1Light1 was partially pre-activated by endogenous GLP-1, without demonstrating the response of GLPLight1 before and after GLUTag cell stimulation. Unless additional data are generated, the presented data do not convincingly demonstrate that GLP1Light1 can detect GLP1 released from GLUTag cells.

      We would like to thank Reviewer #3 for this comment which has been addressed already in the replies to Comment#1 from Reviewer #1 and Reviewer #2.

      3) The authors should openly acknowledge that photo-uncaging the GLP1 probe might not be very helpful for monitoring the temporal dynamics of the GLP1-GLP1R interaction, because unless all the photocaged GLP1 is released by the light stimulus, the activation of photo-released GLP1 will be slowed by the remaining caged GLP1, and the dynamics will be slower than for native GLP1. This makes it unsuitable for many temporal questions, although it might be useful to deliver GLP1 in a spatially restricted manner.

      We do agree that the biggest advantage of Photo-GLP1 is its ability to be activated in a very localized manner. We also agree that the presence of caged Photo-GLP1 will influence the binding of the uncaged GLP-1. Nevertheless, there is still an advantage of using Photo-GLP1 in some assays such as pharmacological activation on brain slices. In fact, we have shown for our Photo-OXB molecule that the perfusion of OXB was much slower at eliciting neuronal depolarization compared to uncaging of Photo-OXB (see PMID: 36481097). We think that this was mainly due to the slow diffusion kinetics of the peptide into the brain tissue. We also think that uncaging can provide a more controlled activation with varying laser power and uncaging duration.

      4) To claim (as currently in the discussion) that GLPLight1 has potential to be used for investigating the dynamics of endogenous GLP1, the authors would need to compare the dynamics of the GLP1Light sensor with wild type GLP1R. We do not know that its activation dynamics will reproduce those of native GLP1R.

      We would like to thank Reviewer #3 for this comment and would like to offer some clarification. Since GLPLight1 does not couple to intracellular signaling, it was impossible to compare its activation kinetics to GLP1R WT using the same assay. However, we can offer a relative comparison since we know that GLPLight1 takes around 50 seconds to be activated using 1 µM GLP-1 (figure 2B) and that it takes a similar time for GLP1R to be activated in the miniG protein recruitment assay (Fig 1 Supplement 3) using 100 nM GLP-1. Considering that GLPLight1 has a lower affinity than the GLP1R (8-10x lower), we think that the activation kinetics of both the sensor and GLP1R are comparable.

      Additional comments:

      1) In fig 2A,B, it is not clear whether the trace shows a partial reversal of GLP1-triggered activation by Ex9, or Ex9-independent receptor desensitization. A control trace is required to show the kinetics of GLP1-triggered activation without the addition of Ex9.

      We would like to thank Reviewer #3 for this comment. We can exclude the possibility of Ex9-independent desensitization because GLPLight1 has been shown to be signaling inert with respect to all G-proteins, β-arrestin-2 and cAMP. Moreover, we have observed that the fluorescence signal was stable for more than 30 minutes in the GLP-1 titrations, even at high concentrations of ligand.

      2) It would be helpful if the pEC50 for WT GLP1 were also shown in table 1, for comparison with the GLP1 mutants.

      We would like to thank Reviewer #3 for this comment, and we have now added the respective pEC50 for WT GLP1 to Table 1.

      3) Fig2 suppl 1. The methods and analysis for this figure are inadequately explained. To show that the HEK-GLPLight1 cells are responding to GLP1 released from GLUTag cells, the GLPLight1 response needs to be shown before and after GLUTag cell stimulation with an agent that should trigger GLP-1 release.

      We would like to thank Reviewer #3 for this comment which has been partially addressed already in the replies to Comment#1 from Reviewer #1 and Reviewer #2.

      Since we did not observe any response to acute stimulation of GLUTag cells we considered the high glucose concentration present in the culture media being a stimulation agent for GLUTag cells, which has been previously reported (PMID: 17643200).

      4) Fig 3g and others: The end of the photo activation period needs to be represented correctly on the timeline. In 3g, the bar that should indicate when photoactivation was applied does not end at the zero time point (which is labelled as the time relative to photoactivation).

      We would like to thank Reviewer #3 for pointing this out. The shaded area representing the photo-activation has been matched accordingly.

      5) Discussion para 1: the authors claim their data show that ligand induced activation of human GLP1R occurs more slowly than other similar GPCR sensors - they should give actual data to substantiate this claim, since the time course of GLP1R activation has not been analysed and compared with other sensors in the manuscript.

      We added data to support this claim to the discussion: “As a reference, other previously-characterized class-A GPCR-based neuropeptide biosensors showed sub-second activation kinetics (Duffet et al., 2022a; Ino et al., 2022).”

      6) Methods: what wavelength was used for recording emission from GLP1Light1? The excitation wavelength is given, but I can't see the emission wavelength(s). In fig 1d, the excitation and emission spectra should be depicted in different colours/line properties, otherwise this figure is very confusing.

      We updated Figure 1d and changed the colors to improve data visualization. Regarding the missing wavelength, we would like to clarify that both wavelengths were already described in the methods section as: “The excitation and emission spectra were measured at λem = 560 nm and λex = 470 nm, respectively, on a TECAN M200 Pro plate reader at 37 °C.” We would be happy to rewrite this paragraph, if necessary, should it remain unclear to the reader.

    1. Author Response:

      Reviewer #1 (Public Review):

      This manuscript features a key technical advance in single-molecule force spectroscopy. The critical advance is to employ a click chemistry (DBCO-cycloaddition) for making a stable covalent connection between a target biomacromolecule and solid support in place of conventional antigen-antibody binding. This tweak dramatically improves the mechanical stability of the pulling system such that the pulling/relaxation can be repeated up to a thousand times (the previous limit was a few hundred cycles at best). This improvement is broadly applicable to various molecular interactions and other types of single-molecule force spectroscopy allowing for more statistically reliable force measurements. Another strength of this method is that all conjugation steps are chemically orthogonal (except for Spy-catcher conjugation to the termini of a target molecule) such that the probability of side reactions could be reduced.

      The reliability of kinetic and thermodynamic parameters obtained from single-molecule force spectroscopy depends on statistics, that is, the number of pulling measurements and their distribution. By extending the number of measurements, this robust method enables fundamental/critical statistical assessment of those parameters. That is, it is an important and interesting lesson from this study that ~200 repeats can yield statistically reasonable parameters.

      The authors carried out carefully designed optimization steps and inform readers of the critical aspects of each. The merit, quality, and rigor as a method-oriented manuscript are impressive. Overall, this is an excellent study.

      We appreciate the positive evaluation of our work. Additionally, the minor suggestions were helpful in improving our manuscript. Thank you!

      Reviewer #2 (Public Review):

      In this study, the authors have developed methods that allow for repeatedly unfolding and refolding a membrane protein using a magnetic tweezers setup. The goal is to extend the lifespan of the single-molecule construct and gather more data from the same tether under force. This is achieved through the use of a metal-free DBCO-azide click reaction that covalently attaches a DNA handle to a superparamagnetic bead, a traptavdin-dual biotin linkage that provides a strong connection between another DNA handle and the coverslip surface, and SpyTag-SpyCatcher association for covalent connection of the membrane protein to the two DNA handles.

      The method may offer a long lifetime for single-molecule linkage; however, it does not represent a significant technological advancement. These reactions are commonly used in the field of single-molecule manipulation studies. The use of multiple tags including biotin and digoxigenin to enhance the connection's mechanical stability has already been explored in previous DNA mechanics studies by multiple research labs. Additionally, conducting single-molecule manipulation experiments on a single DNA or protein tether for an extended period of time (hours or even days) has been documented by several research groups.

      One of the unique features of our work is the development of a robust single-molecule tweezer method that is applicable to membrane proteins, rather than simply another stable system. As rewritten in the Introduction, this is not straightforward because membrane reconstitution has to be considered. We expect our work to help overcome the bottleneck that arises in membrane protein studies when single-molecule tweezer methods are used.

      To improve the delivery of the contextual information, we revised the Introduction, Results, and Discussion. The first four paragraphs of the Introduction briefly review previous tweezer methods with improved stability and delineate where our work is placed. In the first paragraph of the Results, we also briefly discuss how and why our DBCO tethering strategy differs from previous DBCO methods. In the first paragraph of the Discussion, we compare the previous methods with regard to stability improvement.

      Additionally, the revised manuscript now includes new findings – the full dissection of structural transitions of a helical membrane protein, the observation of hidden helix-coil transitions at a constant force, and the estimation of kinetic pre-exponential factors. We believe that the new findings provide important insights into membrane protein folding, in addition to the usefulness of our method itself for membrane protein studies. We extensively edited the main text and Methods accordingly. Relevant figures are Figures 6 and 7, Figure 6–figure supplements 1–3, and Figure 7–source data 1.

      Reviewer #3 (Public Review):

      The authors describe a method to tether proteins via DNA linkers in magnetic tweezers and apply it to a model membrane protein. The main novelty appears to be the use of DBCO click chemistry to covalently couple to the magnetic bead, which creates stable tethers for which the authors report up to >1000 force-extension cycles. Novel and stable attachment strategies are indeed important for force spectroscopy measurements, in particular for membrane proteins that are harder and therefore less studied in this regard than soluble proteins, and recording >1000 stretch and release cycles is an impressive achievement. Unfortunately, I feel that the current work falls short in some regards to exploring the full potential of the method, or at least does not provide sufficient information to fully assess the performance of the new method. Specific questions and points of attention are included below.

      We appreciate the positive evaluation. We were able to substantially improve our manuscript while preparing our responses to the comments. Thank you!

      - The main improvement appears to be the more stable and robust tethering approach, compared to previous methods. However, the stability is hard to evaluate from the data provided. The much more common way to test stability in the tweezers is to report lifetimes at constant force(s). Also, there are actually previous methods that report on covalent attachment, even working using DBCO. These papers should be compared.

      As shown in Figure 4E, we evaluated the robustness of our method in the way you suggested: a lifetime measurement at a constant force, specifically ~12 hours at 50 pN. The tweezer approach established here is definitely the most robust method for membrane protein studies. Please refer to the section “Assessing robustness of our single-molecule tweezers” on page 7, line 31.

      We discussed the previous covalent methods for which quantitative data are presented in light of the system stability. Please refer to the first paragraph of Discussion. We also briefly discussed how and why our DBCO tethering strategy differs from previous DBCO methods, in the first paragraph of Results.

      - The authors use the attachment to the surface via two biotin-traptavidin linkages. How does the stability of this (double) bond compare to using a single biotin? Engineered streptavidin versions have been studied previously in the magnetic tweezers, again reporting lifetimes under constant force, which appears to be a relevant point of comparison.

      The papers in this comment showed that the tethering lifetimes of biotin-streptavidin variants were affected by the asymmetric bead anchoring point. However, the situation does not apply to our work as we do not anchor traptavidin to beads. Besides, the stability comparison between the single- and double-biotin systems is not the main point of our work, so we do not have the answer to the question. However, we cited the reference in the first paragraph of Discussion where we discuss the system stability.

      - Very long measurements of protein unfolding and refolding have been reported previously. Here, too, a comparison would be relevant.

      We briefly discussed the relevant previous works in the first paragraph of Discussion.

      In light of this previous work, the statement in the abstract "However, the weak molecular tethers used in the tweezers limit a long time, repetitive mechanical manipulation because of their force-induced bond breakage" seems a little dubious. I do not doubt that there is a need for new and better attachment chemistries, but I think it is important to be clear about what has been done already.

      The sentence is in the Abstract, so we also had to consider conciseness. By simply adding the phrase “used for the membrane protein studies”, we can place our work in a more proper context.

      On page 2, line 3: “…However, the weak molecular tethers used for the membrane protein studies have limited long-time, repetitive molecular transitions due to force-induced bond breakage…”

      - Page 5, line 99: If the PEG layer prevents any sticking of beads, how do the authors attach reference beads, which are typically used in magnetic tweezers to subtract drift?

      The PEG layer consists of biotin-PEG and methyl-PEG at a 1:27.5 molar ratio. As the reference beads are coated with streptavidin, they are attached to the PEG layer by the regular biotin-streptavidin interaction. On page 19, line 7, you can refer to “…The polystyrene beads are attached to the PEG surface via biotin-streptavidin interaction. The beads are used as reference beads for the correction of microscope stage drifts…”

      - Figure 3 left me somewhat puzzled. It appears to suggest that the "no detergent/lipid" condition actually works best, since it provides functional "single-molecule conjugation" for two different DBCO concentrations and two different DNA handles, unlike any other condition. But how can you have a membrane protein without any detergent or lipid? This seems hard to believe.

      We explained the raised point on page 6, line 18:

      “…Indeed, the best condition was in the absence of any detergents or lipids (Figure 3; no detergents/lipids only during the conjugation step). This situation is possible because membrane proteins are sparsely tethered to the chamber surface, which kept them from aggregating. However, not using detergents or lipids means that the membrane proteins are definitely deformed from their native folds. Therefore, we sought an optimal solubilization condition for membrane proteins during the DBCO-azide conjugation step...”

      Figure 3 also seems to imply that the bicelle conditions never work. The schematic in Figure 1 is then fairly misleading since it implies that bicelles also work.

      The buffer conditions shown in Figure 3 are those ONLY during the DBCO-azide conjugation step. In this step, the bicelle conditions did not work. Therefore, after the conjugation in 0.5% DDM, the buffer was exchanged with a bicelle solution. This process is shown in Figure 2 and the finally assembled system is depicted in Figure 1.

      To clarify this point, we put a note “Buffer conditions only during the DBCO-azide conjugation step” just above the buffer conditions in Figure 3. You can also find the relevant exchange step on page 6, line 31: “…Following a 1 h incubation of the beads in the single-molecule chamber at 25°C, unconjugated beads were washed, and the detergent micelles were exchanged with bicelles to reconstitute the lipid bilayer environment for membrane proteins…”

      - When it comes to investigating the unfolding and refolding of scTMHC2, it would be nice to see some traces also at a constant force. As the authors state themselves: magnetic tweezers have the advantage that they "enable constant low-force measurements" (page 8, line 189). Why not use this advantage? In particular, I would be curious to see constant force traces in the "helix coil transition zone". Can steps in the unfolding landscape be identified? Are there intermediates?

      Yes, please refer to Figure 6. We were able to dissect three distinct transitions from the fully unstructured state to the native state, including the helix-coil transitions. We also reconstructed the folding energy landscape using a deconvolution method.

      Please refer to the pertinent sections in the main text, which are titled “Structural transitions and folding energy landscape over extended time scales” and “Mechanistic dissection of folding transitions”.

      - Speaking of loading rates and forces: How were the forces calibrated? This seems to not be discussed.

      We wrote an additional section in Methods titled “Instrumentation of single-molecule magnetic tweezers”, where we discuss the force calibration. For the actual force calibration data, please see Figure 4–figure supplement 1A.

      On page 20, line 10: “…The mechanical force applied to a bead-tethered molecule was calibrated as a function of the magnet position using the formula F = k_B T∙L/δx² derived from the inverted pendulum model (ref. 96), where F is the applied force, k_B is the Boltzmann constant, T is the absolute temperature, L is the extension, and δx² is the magnitude of lateral fluctuations…”
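The calibration formula quoted above can be illustrated with a minimal sketch. It assumes that the "magnitude of lateral fluctuations" is the variance of the bead's lateral position, and it uses synthetic Gaussian positions rather than any data from the manuscript. The function name `calibrate_force` and all numerical values are hypothetical choices for illustration only.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def calibrate_force(lateral_positions_m, extension_m, temperature_k=298.15):
    """Estimate the applied force (N) as F = k_B * T * L / var(x).

    Assumption: the lateral fluctuation term is taken to be the
    variance of the bead's lateral position around its mean.
    """
    n = len(lateral_positions_m)
    mean_x = sum(lateral_positions_m) / n
    variance = sum((x - mean_x) ** 2 for x in lateral_positions_m) / n
    return K_B * temperature_k * extension_m / variance


# Synthetic self-check with hypothetical values (not from the manuscript):
# draw positions whose variance corresponds to a known force, then recover it.
rng = random.Random(0)
true_force_n = 10e-12   # 10 pN
extension_m = 300e-9    # 300 nm tether extension
temperature_k = 298.15
sigma_m = math.sqrt(K_B * temperature_k * extension_m / true_force_n)
positions_m = [rng.gauss(0.0, sigma_m) for _ in range(50_000)]

estimated_force_n = calibrate_force(positions_m, extension_m, temperature_k)
print(f"estimated force: {estimated_force_n * 1e12:.2f} pN")
```

In this sketch a larger force gives smaller lateral fluctuations for the same extension, which is why the variance appears in the denominator.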

      And how were constant loading rates achieved? In Figure 4 it is stated that experiments are performed at "different pulling speeds". How is this possible? In AFM (and OT) one controls position and measures force. In MT, however, you set the force and the bead position is not directly controlled, so how is a given pulling speed ensured? It appears to me that the numbers indicated in Figures 4A and B are actually the speeds at which the magnets are moved. This is not "pulling speed" as it is usually defined in the AFM and OT literature. Even more confusing, moving the magnets at a constant speed, would NOT correspond to a constant loading rate (which seems to be suggested in Figure 4A), given that the relationship between magnet positions and force is non-linear (in fact, it is approximately exponential in the configuration shown schematically in Figure 1).

      You are correct, so we simply modified the “pulling speed” to “magnet speed” in the figure caption. The loading rates provided in the figure (with the notation <>) were average loading rates in 1–50 pN to provide rough estimates. We actually specified it in the caption as “average force-loading rate”. However, this can be misleading at a glance, so we just deleted all the loading-rate values in the figure and caption.
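The distinction between magnet speed and loading rate can be made concrete with a small sketch. It assumes, following the reviewer's remark, a single-exponential force versus magnet-distance relation F(z) = F0·exp(−z/z0). The parameters `f0_pn` and `z0_mm` are hypothetical and are not the calibration of the instrument described in the manuscript.

```python
import math


def force_pn(z_mm, f0_pn=80.0, z0_mm=1.5):
    """Force (pN) at magnet distance z (mm), assuming F = F0 * exp(-z / z0)."""
    return f0_pn * math.exp(-z_mm / z0_mm)


def loading_rate_pn_per_s(z_mm, magnet_speed_mm_per_s, f0_pn=80.0, z0_mm=1.5):
    """Instantaneous loading rate dF/dt (pN/s) while the magnets approach.

    With dz/dt = -speed, the chain rule gives dF/dt = F(z) * speed / z0,
    so the loading rate scales with the current force.
    """
    return force_pn(z_mm, f0_pn, z0_mm) * magnet_speed_mm_per_s / z0_mm


speed_mm_per_s = 0.1
rate_far = loading_rate_pn_per_s(3.0, speed_mm_per_s)   # magnets far from the sample
rate_near = loading_rate_pn_per_s(1.0, speed_mm_per_s)  # magnets close to the sample
print(f"loading rate at z = 3.0 mm: {rate_far:.3f} pN/s")
print(f"loading rate at z = 1.0 mm: {rate_near:.3f} pN/s")
```

Under this assumed relation, the same magnet speed produces a higher loading rate at high force than at low force, which is consistent with reporting the magnet speed rather than a single loading-rate value.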

      - Finally, when it comes to the analysis of errors, I am again puzzled. For the M270 beads used in this work, the bead-to-bead variation in force is about 10%. However, it will be constant for a given bead throughout the experiment. I would expect the apparent unfolding force to exhibit fluctuations from cycle to cycle for a given bead (due to its intrinsically stochastic nature), but also some systematic trends in a bead-to-bead comparison since the actual force will be different (by 10% standard deviation) for different beads. Unfortunately, the authors average this effect away, by averaging over beads for each cycle (Figure 4). To me, it makes much more sense to average over the 1000 cycles for each bead and then compare. Not surprisingly, they find a larger error "with bead size error" than without it (Figure 5A). However, this information could likely be used (and the error corrected), if they would only first analyze the beads separately.

      We might be wrong, but there seems to be a misunderstanding. First, we added Figure 5–figure supplement 1 where you can see individual traces. As expected, the levels of unfolding forces/sizes appear consistent during the progress of pulling cycles. Second, the advantage of averaging for different beads is that you can effectively remove the bead size effect. This “averaging-out” is the key strategy in our kinetic analysis. Based on the error estimation, if you average the values of kinetic parameters obtained from different beads, you can then estimate them with reasonably small errors despite the bead size variations. This becomes more evident after the initial hundreds of pulling cycles. The errors for 200 and 1000 cycles differ by only ~1%, indicating that you do not need to blindly run the pulling cycles. These results are based on the “averaging-out” strategy, which is the merit of our analysis. For more details, please see the section in the main text titled “Assessing statistical reliability of pulling-cycle experiments”, where the relevant figures, figure supplements, and Methods sections are referenced.

      What is the physical explanation of the first fast and then slow decay of the error (Figure 5B)? I would have expected the error for a given bead after N pulling cycles to decrease as 1/sqrt(N) since each cycle gives an independent measurement. Has this been tested?

      If the sampling was from one population (here, unfolding probability profile), the error would follow a 1/√n decay as expected for the standard error. In our analysis, however, we estimated the expected “mean” errors, regardless of detailed shapes of the unfolding probability profiles. To this end, we sampled the data from different possible profiles (shown in Figure 5–figure supplement 5). We then averaged all the error plots to obtain the plot of the mean errors during progress of pulling cycles (black curve in Figure 5D). In this case, the plot does not have to follow the standard error curve represented by the factor 1/√n.

      We tested this by fitting with the model function y = A/√n, using various lower limits of N (10, 30, 50, 100, 300, and 500) in the regression analysis (Figure 5–figure supplement 6). The reduced chi-square (χ²) values used for the goodness-of-fit test (χ² = 1 for the best fit) indicate that the two-term exponential model (χ² = 1.60) fits better than the reciprocal square root model (χ² = 2.30–6.01). The regression model adopted in our analysis is a phenomenological model that more properly describes the error decay curve. The trend of a first fast and then slow decay is not unusual because it is also expected for the reciprocal square root model: the plot of 1/√n also decays fast and then slowly (Figure 5–figure supplement 6).
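As a rough illustration of the goodness-of-fit comparison described above, the following sketch fits the reciprocal square root model y = A/√n by weighted least squares and computes its reduced chi-square. The data are synthetic and the uncertainties are hypothetical; this is not the authors' regression code, and the two-term exponential fit is not reproduced here.

```python
import math


def fit_inverse_sqrt(n_values, errors, sigmas):
    """Closed-form weighted least-squares amplitude A for y = A / sqrt(n)."""
    numerator = sum(y / (math.sqrt(n) * s ** 2)
                    for n, y, s in zip(n_values, errors, sigmas))
    denominator = sum(1.0 / (n * s ** 2) for n, s in zip(n_values, sigmas))
    return numerator / denominator


def reduced_chi_square(n_values, errors, sigmas, amplitude, n_params=1):
    """Chi-square of the A / sqrt(n) model divided by the degrees of freedom."""
    chi2 = sum(((y - amplitude / math.sqrt(n)) / s) ** 2
               for n, y, s in zip(n_values, errors, sigmas))
    return chi2 / (len(n_values) - n_params)


# Synthetic example 1: errors that follow A / sqrt(n) exactly.
cycles = [10, 30, 50, 100, 300, 500, 1000]
sigmas = [0.01] * len(cycles)          # hypothetical measurement uncertainties
pure = [2.0 / math.sqrt(n) for n in cycles]
a_pure = fit_inverse_sqrt(cycles, pure, sigmas)
chi2_pure = reduced_chi_square(cycles, pure, sigmas, a_pure)

# Synthetic example 2: the same decay plus a constant floor, which a pure
# A / sqrt(n) model cannot capture.
with_floor = [0.5 / math.sqrt(n) + 0.02 for n in cycles]
a_floor = fit_inverse_sqrt(cycles, with_floor, sigmas)
chi2_floor = reduced_chi_square(cycles, with_floor, sigmas, a_floor)

print(f"pure decay: A = {a_pure:.3f}, reduced chi-square = {chi2_pure:.3g}")
print(f"with floor: A = {a_floor:.3f}, reduced chi-square = {chi2_floor:.3g}")
```

The second example shows how a curve that flattens out more slowly than 1/√n raises the reduced chi-square of the reciprocal square root model, which is the kind of deviation a phenomenological model with more terms can absorb.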

    1. Author Response:

      eLife assessment

      The authors present an exciting idea about how to integrate morphogens into a gene regulatory network with the dynamics of morphogenesis and cell movement. It represents a novel methodology, but in its current form the hypotheses, data and relationships described do not provide a sufficiently compelling model to disentangle cause and effect or elucidate the impact of cell movements on differentiation dynamics in the zebrafish mesoderm.

      Our aim in this work was not to disentangle causal relationships between signalling, cell movements and gene-regulatory interactions. As discussed in the specific responses below, and in the discussion of the pre-print, this would require precise experimental manipulations within the context of a modelling framework that enables multi-scalar integration of each of these three dynamic components. What we do present here is a) computational methodology to reverse-engineer GRNs in the context of tissue morphogenesis (Spiess et al.) and b) experiments to narrow down a candidate GRN capable of recapitulating gene expression dynamics in vitro and in vivo (Fulton et al.). We see this as the first step in tackling the causal relationships of cell movements, signalling and cell fate decision making and propose a working model for future studies to build on.

      Reviewer #1 (Public Review):

      In the manuscript "Cell Rearrangement Generates Pattern Emergence as a Function of Temporal Morphogen Exposure" by Fulton et al., the authors set out to link cell dynamics and single-cell gene expression states, in order to understand the dynamics of cell differentiation. This important challenge is tackled by studying somitogenesis in the zebrafish embryo and combining reverse-engineering gene regulatory networks (GRNs) with cell tracking data. The differentiation of the presomitic cells is evaluated by the differential tbx marker expression through in situ HCR and antibody staining, and live imaging of reporters. Through mathematical modelling taking into consideration the HCR tbx data, live reporter data of the morphogen activity, and the 3D tracking data at different stages, the authors find a candidate model of a gene regulatory network that recapitulates both in vivo and in vitro patterns of the dynamics of cell differentiation. Using this live-modelling approach, the authors move on to question the impact of cell movement on gene expression and conclude that pattern emerges as a function of cell rearrangements tuning the temporal exposure of the cells to the morphogen gradients.

      The major strength of the manuscript is the development of a unique method for addressing cell differentiation dynamics by combining static gene expression data with live cell dynamics. Bridging spatiotemporal information is key to understanding tissue and embryo development and this work provides a great basis for it. A potential weakness is how one selects which of the GRNs predicted from the live-modelling is physiologically relevant to the system of interest, since it requires fitting techniques.

      The major goal of the paper is mostly achieved. This is evident by the proposed model predicting well the dynamics of differentiation both in vivo and in vitro. To fully support the conclusion that cell rearrangements are necessary for patterning, the addition of functional experiments targeted in this direction might be beneficial.

      We agree with the reviewer that functional evidence for a role of cell rearrangement in pattern formation is lacking from the pre-print. We will adjust our title and conclusions to reflect this in a revised version.

      Reviewer #2 (Public Review):

      Fulton et al. seek to understand the interplay between "morphogen exposure, intrinsic timers of differentiation, and cell rearrangement" that together regulate the differentiation process within the presomitic mesoderm tissue (PSM) in developing Zebrafish embryos. A combination of live-cell microscopy to measure cell movements, static measurements of gene expression, and computational and mathematical methods was used to develop a model that captures the observed differentiation profile in the PSM as a function of cell rearrangements and morphogen signaling.

      The authors motivate their investigation into the link between cell rearrangements and differentiation by first comparing differentiation timing in vitro and in vivo. The authors report that a subset of cells differentiating in vitro do so synchronously while cells differentiating in vivo do so with a wide range of differentiation trajectories. By following a small group of photo-labeled cells, it is suggested that the variation of differentiation timing in vivo is related to variation in cell movements in the tissue. To explain these observations in terms of gene expression within single cells, a novel method to combine cell tracks with fixed measurements of gene expression is first used to estimate gene expression dynamics (AGET) in live cells within a tissue. A final ODE-based gene regulatory network (GRN) model is selected based on a combination of data fitting to AGETs and tissue level measurements, further in vitro experiments, and literature criteria. Importantly this model incorporates information from diverse experimental sources to generate a single unified model that can be potentially used in other contexts such as predicting how differentiation is perturbed by genetic mutations affecting cell rearrangement. The authors then use this GRN model to explain how cells starting from the same position in the PSM can have different fates due to differential movement along the A-P axis. Lastly, the model predicts, and the authors experimentally validate, that differentiation markers can be heterogeneously expressed between neighboring PSM cells.

      The presented research addresses the important topic of patterning regulation while accounting for individual cell motion. By examining how individual cell motion contributes to larger tissue patterns, this work may directly contribute to our understanding of regulation across biological scales. Additionally, the methodology to estimate AGET is especially intriguing because of its potential applicability to a wide variety of developmental processes.

      However, several issues weigh down the strengths of this paper. First, some conclusions and interpretations in the paper do not obviously follow the data and require further clarification. Second, the authors should consider alternative explanations and models and include some discussion about instances where the final GRN model may not fit as well. Finally, the current manuscript lacks clarity in its presentation and this makes it difficult to follow and understand.

      Major concerns:

      1. A key conclusion made in this paper is that differentiation times show a high variability even when neighboring PSM cells are compared. This is based on the photoconversion experiment shown in Figure 2A-C, where a group of cells is labeled and over time, a trail of labeled cells is visible. It is crucial to understand which compartment is labeled, i.e. progenitor vs. maturation zone vs. PSM. If cells in the progenitor/marginal zone are labeled, the underlying reason for the trailing effect is not a difference in differentiation time, but rather, a difference in the timing of when cells exit the progenitor zone. This needs to be distinguished in my view. In other words, while the timing of progenitor zone exit varies (needs to), once cells are within the PSM, do they still show a difference in differentiation timing? From previous experimental evidence I would expect that in fact, PSM cells differ only very little in differentiation timing. My statement is based on previously published labeling experiments done in posterior PSM cells, not tail bud cells (in chick embryos), which showed that labeled neighboring PSM cells were incorporated into the same adjacent somites, without evidence of a 'trail' (see figure 4H in Dubrulle et al. 2001). In the case of single cell labeling, it was found that these are actually incorporated into the same somite (or adjacent one), even if labeled in the posterior PSM (Stern et al. 1988). The situation in zebrafish appears similar (see Griffin & Kimelman 2002 and Müller et al. 1996). Additionally, the scheme in Figure 2K suggests that the trailing effect reflects a sequential exit from the progenitor zone that is controlled and timed.

      We place the labels in a region of the tailbud containing tbxta and tbx16 positive mesodermal progenitors and not in the PSM. Therefore, we are examining the timing of exit, and show this is correlated with the onset of tbx6 expression. Taken together with previous work (Thomson et al., 2021; 10.1016/j.cdev.2021.203748), it demonstrates that in zebrafish embryos, non-directional cell movements generate a progressive exit of cells from the progenitor region in the tailbud towards the PSM. We will make these points clear in a revised version of the manuscript.

      2. The data on cell movement needs to be presented more clearly. Currently, this data is mainly presented in Figure 3D, which does not provide a good description of the cell movements. Visualization of the single cell tracks and the different patterns that are in the tissue along with the characterization of the movement/timescales is needed to better communicate the data and to tie it to the main conclusions.

      A thorough analysis of the tracking data and cell movements in the tailbud is presented in a previous paper (Thomson et al., 2021; 10.1016/j.cdev.2021.203748), which is cited in the pre-print.

      3. The conclusion "As a result of their different patterns of movement, and therefore different Wnt and FGF dynamics, the simulated T-box gene expression dynamics differ in both cells." (Line 249) is not convincing: what part of the data shows that it is not the other way around, i.e. the signaling activities control the movement? The way I understand the rationale of this analysis: the authors take the cell movement tracks as a given input into the problem, and then ask, what signaling environment is the cell exposed to? The challenge with this view is two-fold: first, the authors seem to assume that a cell moves into a new environment and is hence exposed to a different level of signal, while in reality, these signaling gradients act short-range and maybe even at a cellular scale and hence a moving cell would carry Wnt-ligands with it, essentially contributing to the signaling environment. This aspect of 'niche construction' seems to be missing. Second, it has been shown (in chick embryos) that cell movement is, in turn, controlled by signaling levels, how would this factor into this model?

      See response to reviewer 1, we have revised our conclusions to make it clear that we are not demonstrating a causal role of cell movements in this process. We instead provide a modelling framework to interrogate these complex multi-scale interactions.

      4. On the comparison with the in vitro model:

      A. The interpretation of cells differentiating synchronously or coherently in vitro seems inconsistent with the data presented in figure 1. To me figure 1F/G does not seem compatible with the previous figure 1D/E since 1F seems to describe cells that upregulate tbx6 over a range of times, in a manner analogous to what is reported in vivo, i.e. figure 2.

      We agree that once initiated, tbx6 expression is variable between individual cells as shown in Figure 1. Our conclusion is that, whatever the rate of increase in expression, cells initiate their increase at the same time (200 mins). We will make this clear in a revised version.

      B. The authors conclude that in vitro, single PSM cells differentiate 'synchronously' and hence differently to what is seen in vivo, where the authors conclude that there is a "range of time scales". As noted above, the situation in vivo can be explained by a timed exit from the progenitor zone, while PSM differentiation is proceeding similarly in all PSM cells. In this view, what is seen in vitro is that all those cells that undergo PSM differentiation, initiate this process in culture more synchronously but it is the exit from the progenitor state, not the dynamics of differentiation, that might be regulated differently in vivo vs. in vitro.

      We agree with this statement: the process we are examining is the timing of tbx6 onset, a proxy for the timing of switching from a progenitor to a PSM cell state. However, we don’t see how this is different from the ‘dynamics of differentiation’ as these processes are directly related.

      C. Another important point to clarify is that the overall timing of differentiation is entirely different in the in vitro experiment: as has been shown previously (Rohde et al. 2021, Figure S12) both the period of the clock and the overall time it takes to differentiate is very substantially increased, in fact, more than doubled. This aspect needs to be taken into account and hence the conclusion: "Our analysis revealed that cells undergo a range of temporal trajectories in gene expression, with the fastest cells transiting through to a newly formed somite in 3 hours; half the time taken for cells to fully upregulate tbx6 in vitro (Figure 2K-L)." (line 142) appears misleading, as it seems to emphasize how fast some cells in vivo differentiate. However, given the overall slowing down seen in vitro, which more than doubles the time it takes for differentiation (see Rohde et al. 2021, Figure S12), this statement needs to be refined.

      This is indeed an interesting observation and will be discussed in a revised version.

      5. The GRN proposed in this work includes inhibition of ntl/brachyury by Fgf (Figure 3f). However, it has been shown that Fgf signaling activates, not inhibits, ntl (see for instance dnFgfr1 experiments in Griffin et al., 1995). This does not seem compatible with the presented GRN, can the authors clarify?

      Experiments in which signalling and/or transcription function are disrupted in vivo are very different to interpret from analysing the impact of gene expression alone. As discussed, and highlighted by the reviewers, there exists a complex interplay where signals can impact cell movements and vice versa. What we propose in this work is a working model of this process through which this interplay can be explored.

      6. The authors use static mRNA in situ hybridization and antibody stainings to characterize Wnt and Fgf signaling activities. First, it should be clarified in Figure 3A that this is not based on any dynamic measurement (it now states Tcf::GFP, as if GFP is the readout, so the label should be GFP mRNA). Second, and more importantly, it is not clear how this quantification has been done. Figure 3C shows a single line, while the legend says n=6 and "all data plotted". Can this be clarified? Without seeing the data it is not possible to judge if the profiles shown (the mean) are convincing. As this experimental result is used to inform the model and the remainder of the paper, it is of critical importance to provide convincing evidence, in this case, based on static snapshots.

      This will be clarified in a revised version of the paper.

      7. Although the AGET analysis and this specific GRN model development are of interest and warrant the explanation the authors have provided, I would be careful not to overstate the findings. In particular, I believe the word "predicted" is used too loosely throughout the manuscript to describe the agreement between model and experiments. For example, my understanding of Figure 4, and what is described in the supplemental diagram, is that the in vitro experiments are used to further refine the model selection process. Therefore, it should not be stated as a prediction of the selected model. This is not to say the final model is not predictive, but it's difficult to assess the predictive power of this model since it hasn't been tested in independent experimental conditions (e.g. by perturbing cell movement and using the model to predict the expected differentiation boundary).

      We will take care with the use of the term ‘predicted’ in a revised version of the paper. The reviewer is correct that this result was used to select from an existing set of GRNs.

      Reviewer #3 (Public Review):

      Fulton et al. look to apply approaches for tackling the readout of gene regulatory networks (GRNs) to a system where cell position itself is continually changing. The objective is highly laudable. GRN analysis has proven to be a powerful approach for understanding how cell fates are determined by morphogenetic inputs, but it has thus far been applied in a limited number of systems. Here, the authors look to substantially extend the application of GRNs to more dynamic systems. The theoretical and experimental approaches are integrated to achieve the analysis of the GRN. In principle, this has wide potential impact and applicability to other systems.

      Unfortunately, in its current form, the manuscript does not do justice to the central aims of the authors. The manuscript is unclear in nearly all sections, and figures and analysis can be substantially improved. The quantifications are not shown in a fitting manner. The modelling itself stands as the strongest part of the manuscript, but improvements are needed. Currently, the main claims of the authors cannot be evaluated based on the quality of the presented data.

      This reviewer has provided a list of minor corrections that will greatly improve a revised version of the manuscript for our next submission.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Soto-Feliciano et al. investigate the tumor suppressive role of MLL3 in hepatocellular carcinoma (HCC). The authors used a variety of techniques including hydrodynamic tail vein injection (HTVI), CRISPR deletion, and shRNA to disrupt MLL3 expression in mouse models. They clearly show that MLL3 acts as a tumor suppressor in the context of MYC-induced HCC. They show that MLL3 acts by activating the Cdkn2a locus. Genomic analysis showed that MLL3 binds to enhancers and promoters, and specifically interacts with the Cdkn2a promoter. When MLL3 was downregulated, Cdkn2a levels fell and this corresponded to changes in relevant histone marks targeted by MLL3. The authors were also able to show that reintroduced MLL3 expression in a dox inducible system could rescue CDKN2A locus expression, which in turn reduced colony formation and induced apoptosis. Human genomic correlation showed that MLL3 and Cdkn2a mutations are generally mutually exclusive. Overall, the conclusions of the manuscript are well supported by a logical series of experiments with good controls and orthogonal approaches. While it would be useful to examine another HCC model such as a CTNNB1-driven model, the current paper is convincing in its conclusions.

      We thank the reviewer for their positive and constructive comments and suggestions. Our study primarily used MYC as the driving oncogene for two reasons: first, in an initial in vivo screen of 12 candidate tumor suppressors, MLL3 was the strongest hit, in that its loss cooperated with the Myc oncogene to drive HCC (Figure 1—figure supplement 1); second, in human HCCs, KMT2C (gene encoding MLL3) mutations and deletions co-occur with MYC gains and amplification.

      Based on the reviewer’s suggestion, we examined MLL3 loss in conjunction with CTNNB1 activation, using HTVI of a transposon containing the constitutively active Ctnnb1. However, we did not observe oncogenic cooperation between Ctnnb1 activation and Kmt2c loss; no mice developed liver tumors by the experimental endpoint (5 months post HTVI, Figure 1—figure supplement 3). Additionally, analysis of genomic data from human HCCs showed no significant co-occurrence between CTNNB1 and KMT2C alterations (Figure 1A). These results suggest that, similar to other epigenetic regulators, the tumor suppressive function of MLL3 is likely oncogene-specific. Our in vivo screen results that nominated MLL3 as a tumor suppressor also reinforce this functional interaction with MYC oncogene. We have updated the text to reflect the context specificity of MLL3 as a tumor suppressor in our study.

      Reviewer #2 (Public Review):

      Soto-Feliciano et al. have characterized the function of MLL3 in hepatocellular carcinoma (HCC) suppression. MLL3 is recurrently mutated in human HCC. The authors show that Mll3 mutations cooperate with Myc overexpression to drive HCC cancer in mice. They identify Cdkn2a as a critical direct target of MLL3. Overall, the manuscript makes a compelling case that MLL3 is a bona fide HCC tumor suppressor, that it directly binds and activates the Cdkn2a locus, and that Cdkn2a acts downstream of MLL3 to suppress HCC initiation.

      The strengths of the paper include mouse modeling techniques that clearly demonstrate a role for MLL3 in suppressing Myc-driven HCC, a detailed characterization of MLL3 binding sites and target gene expression, and the combined weight of several functional studies showing that MLL3 induces apoptosis in hepatocytes/HCC by inducing p16 and ARF. The major conclusions appear well-supported by the data.

      The paper does have some weaknesses. Some of the genomic data require clarification. Furthermore, the authors draw broad conclusions about an epistatic relationship between MLL3 and CDKN2A based on mutually exclusive mutation patterns in human cancers. Those conclusions are not as well-supported as the mechanistic conclusions. The incidences of MLL3 and CDKN2A mutations in HCC are both relatively low (1% and 5% respectively), so it seems difficult to draw any conclusions from mutually exclusive profiles.
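
      To make the concern about low incidences concrete, the following minimal sketch computes how many co-mutated tumors one would expect if the two alterations occurred independently, and how likely it would be to observe none at all. The 1% and 5% frequencies are taken from the comment above; the cohort size of 400 and the function names are illustrative assumptions, not values from the study.

```python
def expected_cooccurrence(freq_a, freq_b, n_tumors):
    """Expected number of tumors carrying both alterations under independence."""
    return freq_a * freq_b * n_tumors


def prob_no_cooccurrence(freq_a, freq_b, n_tumors):
    """Probability of observing zero co-mutated tumors under independence."""
    return (1.0 - freq_a * freq_b) ** n_tumors


if __name__ == "__main__":
    n = 400  # hypothetical cohort size, chosen only for illustration
    print(expected_cooccurrence(0.01, 0.05, n))
    print(prob_no_cooccurrence(0.01, 0.05, n))
```

      Under these assumptions fewer than one co-mutated tumor is expected, so observing none would be likely even without any epistatic relationship.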

      One additional criticism is that the paper is a bit reductive. The link to CDKN2A offers a satisfying explanation for how MLL3 suppresses HCC, but the model may oversimplify the functions of MLL3.

      We thank the reviewer for their constructive comments and suggestions, which we addressed as follows with point-by-point response provided below. We agree with the concerns regarding the mutational analyses of KMT2C and CDKN2A in human cancers and the working model of the manuscript. We have removed the majority of the mutational analyses from the Results section. Importantly, our latest integrative analyses of RNA-seq and MLL3 ChIP-seq revealed other potential downstream effectors of MLL3 tumor suppressive functions (Figure 3A and Figure 3—figure supplement 1B). We have modified the Results and the Discussion to reflect this more nuanced view of MLL3 function in cancer. Nonetheless, we believe that other data continue to support our conclusion that CDKN2A is a dominant effector of MLL3 tumor suppressive functions in our model.

      Reviewer #3 (Public Review):

      The enhancer chromatin-modifying enzyme MLL3 functions as a tumor suppressor in multiple human cancers, however, the mechanisms underlying its tumor suppressive function remain unclear. The manuscript of Soto-Feliciano et al. focused on Myc-driven liver cancer and aimed to address and fill the gap. The authors used an elegant genetic design and approach to manipulate the overexpression of the Myc oncogene and knockout of the Mll3 tumor suppressor gene in mouse liver cancer models. Their genetic mouse models showed that loss of Mll3 constrains Myc-driven liver tumorigenesis, with tumors having a slightly later onset compared to mice with Myc overexpression in conjunction with p53 inactivation. Because MLL3 is a major histone-modifying enzyme for enhancer-associated H3K4 monomethylation and is responsible for enhancer activation and the following target gene transcription, they performed ChIP-seq analysis to study the roles of Mll3 in Myc-driven mouse liver cancer. Interestingly, their ChIP-seq studies revealed that loss of Mll3 preferentially limits Mll3 enrichments at promoters and thereby attenuates promoter-associated H3K4 trimethylation and target gene transcription, whereas the unchanged Mll3 genomic binding between the two genotypes (Myc;sgTrp53 and Myc;sgKmt2c) is largely located within enhancer (intergenic) regions. They further demonstrated that the cdkn2a locus is a genomic and transcriptional target of Mll3 in Myc-driven mouse liver cancer. Supporting their findings, genomic inactivations of MLL3 and CDKN2A display mutual exclusivity in human liver cancer and many other cancer types. Furthermore, they described a possible mechanism for MLL3's role in MYC-driven liver cancer, namely that MLL3 mediates MYC-induced apoptosis in a CDKN2A-dependent manner, by manipulating Myc overexpression, Mll3 function, and Cdkn2a regulation in their genetic mouse models.
      This manuscript describes a potential function of MLL3 in the control of tumor suppressor gene expression via modulating their promoter chromatin landscapes. More importantly, loss of normal function of MLL3 or the downstream effector CDKN2A may impair MYC-induced apoptosis, and in turn, lead to MYC-induced tumorigenesis.

      Overall, the manuscript is well written, organized, and focused on an interesting topic, and the data presented support the authors' claims.

      We thank the reviewer for their positive and constructive comments and suggestions.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript represents a substantial and well-executed body of work that contributes new data on 32 hymenopteran genomes, systematically identifies viral endogenization and domestication events, and tests whether this phenomenon is more common in hymenopteran species with specific lifestyles, e.g., endoparasitism. The authors developed a pipeline to identify endogenization that improves upon previously described pipelines and is more comprehensive for the identification of endogenization events from a variety of virus types. Significant findings include the identification of previously undocumented cases of viral endogenization in several hymenopteran species and also moderate statistical support for a higher rate of dsDNA virus endogenization and domestication in endoparasitoids.

      1) The authors have tested whether the lifestyle of hymenopteran species (endoparasitism, ectoparasitism, or free-living) is related to the incidence of virus endogenization and domestication. Addressing this kind of question has only become possible with the availability of genome sequences from many taxa so that any results can be statistically supported by appropriate sample sizes. It appears that the authors have not included new genomic data from hymenopteran genomes that have been published since 2019, which are of similar or better quality than the data used in this manuscript. A number of taxa with endogenous viruses (and also without) have become available since then. The best solution would be for the authors to use their pipeline to incorporate the new data, which may have an impact on their findings and could even strengthen their conclusions about virus domestication being more common in endoparasitoids. If this is not possible, the authors should at least justify their decision not to include the most recent data and discuss how it could affect their results.

      The first step of our pipeline is to extract all candidate loci from each genome. Then all these loci are clustered and further analyzed to infer endogenization and domestication (sequence alignments, phylogeny, dN/dS, genomic context, mapping…). Thus, adding new genomes requires re-running the whole analysis from the very beginning, which represents a huge amount of work and computational resources together with their associated carbon costs. Additionally, this work was part of Benjamin Guinet’s PhD project, which was defended on 21 March 2023. In conclusion, we will not be able to rerun the whole pipeline on all the genomes published since 2019.

      2) Please summarize in the main manuscript (results or discussion) what the limitations of the pipeline to detect EVEs and dEVEs are - what are important factors to consider, including the availability of closely related "free-living" viruses, and of closely related wasp species for dN/dS analyses.

      We added a paragraph in the discussion section to discuss the limitations of our pipeline as follows:

      “Because the identification of EVEs necessitates the availability of related viruses in the database, we should see these numbers as an underestimation of the real number of EVEs. In addition, our pipeline necessitates the availability of either related species sharing the same EVEs (or at least the presence of paralogs within a single species) or the availability of RNAseq data to infer domestication. Because these last conditions were only met for 701 out of 1261 EVEs, the results we obtained here regarding domestication should be seen as an underestimation of the prevalence of the phenomenon.”

      3) In this manuscript, a description of the methods that precede the results would make it much easier to appreciate the results shown. It appears that this is allowed in cases where it makes sense, according to the author's instructions.

      The first paragraph of the result section was intended to give this overview on the methods used. However, we tried to give more details on the methods in this paragraph. We hope that this will increase readability of the paper.

      4) The sensitivity and specificity of methods analysis are commendable, as is the availability of substantial supplementary data and scripts on GitHub. However, more effort could be made to align numbers reported in the text and in figures so that readers can verify support for the conclusions described.

      To align numbers reported in the text and the figures, we added a new Excel sheet named “Figure_data” to supplementary file 6, in which we report the data used to build Figures 2A, 3A and 3B.

      Reviewer #2 (Public Review):

      Guinet et al address the question of whether the divergent lifestyles in hymenopteran insects determine the rates of acquisition and domestication of viral genetic elements. As endoparasitoids are intimately associated with their hosts and often develop as broods herein, they predicted that the acquisition rate is higher compared to free-living and ectoparasitoid hymenopterans. Following viral domestication in the new recipient wasp genome, these viral elements have been shown to contribute to endoparasitism by promoting the delivery of secreted compounds in insect hosts (where immature wasps develop). Because of this functional importance, the authors predicted that the rate of domestication is also higher in endoparasitoid wasps. I was impressed with the solid and rigorous approach that was followed to test these two hypotheses. The authors carefully ruled out confounding factors, including contamination of genome assemblies. Previously characterized hymenopteran genomes were included as positive controls to assess the developed pipelines. There was also great merit in using a Bayesian model to study endogenization within the phylogenetic framework. To summarize, this multi-pronged strategy to mine animal genomes for viral genetic elements has the potential of becoming a new benchmark for future studies.

      Although the authors do partially achieve their aim of coupling endogenization with an endoparasitoid lifestyle, I am afraid some of the assumptions and generalizations hinder a more solid conclusion. I feel that categorizing hymenopterans either as free-living, endoparasitoids, or ectoparasitoids is an oversimplification. Many of the authors' arguments to associate endogenization with endoparasitoids also apply to free-living eusocial hymenopterans. Both endoparasitoid and eusocial insects can be relatively more exposed to viruses because of intimate conspecific interactions within confined spaces. As endoparasitoids intimately interact with their host, so do eusocial insects with their social guests (melittophiles, myrmecophiles, and termitophiles). Perhaps, you could even argue that some gregarious insects also fit the bill. I would be interested to see whether the conclusions hold when "free-living" is further subdivided and "eusocial" is a separate category.

      To answer this question, we reran the study by separating the free-living category into "eusocial" and "free-living" subcategories. All of the new eusocial assignations and their accompanying bibliographies have been added to the Supplemental file 1 under the columns "lifestyle2" and "ref-lifestyle2". All of the new GLM results have been added to the Supplemental file 6. We also made a new violin plot figure named “Figure 4-figure supplement 3”, which contains the GLM coefficients distribution of the model run on A only and A to D scaffolds.

      We also added a few lines in the M&M to explain this analysis: “The same analysis was carried out by splitting the free-living category into two sub-categories, namely eusocial and free-living. A new glm model was then built (GLM(Number EVEs ~ free-living + eusocial + endoparasitoid + ectoparasitoid * Branch_length, family = zero inflated neg binomial)).” (Lines 754-756).
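
      For readers unfamiliar with the distribution family named in the model above, the following minimal sketch writes out a zero-inflated negative binomial probability mass function. It is only an illustration of the family: the response does not name the fitting software, and the parameterization used here (mean mu, dispersion theta, extra-zero probability pi_zero) and the function names are assumptions.

```python
import math


def neg_binomial_pmf(k, mu, theta):
    """Negative binomial pmf with mean mu > 0 and dispersion (size) theta > 0."""
    log_p = (
        math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k, mu, theta, pi_zero):
    """Zero-inflated negative binomial pmf.

    With probability pi_zero the count is a structural zero; otherwise it is
    drawn from the negative binomial component.
    """
    base = (1.0 - pi_zero) * neg_binomial_pmf(k, mu, theta)
    return pi_zero + base if k == 0 else base
```

      The zero-inflation term lets the model accommodate a large share of species with no EVEs of a given type without distorting the estimated mean of the count component.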

      Overall, the new models that included free-living eusocial hymenopterans revealed the exact same patterns as found in the main analysis. We added a new section entitled “Conclusions hold when eusociality is taken into account” to report the results.

      In conclusion, when "free-living" is further subdivided, the main findings still hold.

      Second, I wonder why the authors did not include Wolbachia infection as an explanatory variable to explain the endogenization rate. Wolbachia bacteria infect the insect germline and are often associated with phages. These phages could thus be a major source of viral genetic elements. Having said that, I do not see any Symbioviridae, the phylogenetic clade in which these phages reside (https://doi.org/10.1371/journal.pgen.1010227), in Figure 2B - so perhaps this is a minor point.

      In this study we chose to concentrate our attention on eukaryotic viruses, since we reasoned that they have better opportunities to integrate into the insect genomes due to their intimate relationship. This is the reason why we eliminated from our database all phage proteins (as specified in line 511).

      Finally, in addition to the dsDNA virus - endoparasitoids relationship, the authors also detect a link between ssRNA viruses and free-living hymenopterans. (Maybe eusociality is biasing these results?)

      Thanks to the reviewer’s comment, we realized that the sentence referring to this point was misleading. In our initial analysis, ectoparasitoid species in fact showed fewer domestication events involving ssRNA viruses compared to all other lifestyles (Figure 4-figure supplement 1-L). We clarified this sentence in the main text as follows: “except for a lower rate of domestication of ssRNA viruses in ectoparasitoids compared to other lifestyles (Figure 4-figure supplement 1-L)” (lines 195-196).

      The same effect was observed when including eusociality (see Figure 4-figure supplement 3-L).

    1. Author Response

      Reviewer #1 (Public Review):

      Estimating the effects of mutations on the thermal stability of proteins is fundamentally important and also has practical importance, e.g., for engineering of stable proteins. Changes can be measured using calorimetric methods and values are reported as differences in free energy (dG) of the mutant compared to wt proteins, i.e., ddG. Values typically range between -1 kcal/mol through +7 kcal/mol. However, measurements are highly demanding. The manuscript introduces a novel deep learning approach to this end, which is similar in accuracy to ROSETTA-based estimates, but much faster, enabling proteome-wide studies. To demonstrate this the authors apply it to over 1000 human proteins.

      The main strength here is the novelty of the approach and the high speed of the computation. The main weakness is that the results are not compared to existing machine learning alternatives.

      We thank Prof. Ben-Tal for taking the time to assess our work, and for his comments and suggestions below.

      Reviewer #2 (Public Review):

      Summary:

      This work presents a new machine-learning method, RaSP, to predict changes in protein stability due to point mutations, measured by the change in folding free energy ΔΔG. The model consists of two coupled neural networks, a 3D self-supervised convolutional neural network that produces a reduced-dimensionality representation of the structural environment of a given residue, and a downstream supervised fully-connected neural network that, using the former network's structural representation as input, predicts the ΔΔG of any given amino-acid mutation. The first network is trained on a large dataset of protein structures, and the second network is trained using a dataset of the ΔΔG values of all mutants of 35 proteins, predicted by the biophysics-based method Rosetta.

      The paper shows that RaSP gives good approximations of Rosetta ΔΔG predictions while being several orders of magnitude faster. As compared to experimental data, judging by a comparison made for a few proteins, RaSP and Rosetta predictions perform similarly. In addition, it is shown that both RaSP and Rosetta are robust to variations of input structure, so good predictions are obtained using either structures predicted by homology or structures predicted using AlphaFold2. Finally, the usefulness of a rapid approach such as RaSP is clearly demonstrated by applying it to calculate ΔΔG values for all mutations of a large dataset of human proteins, for which this method is shown to reproduce previous findings of the overall ΔΔG distribution and the relationship between ΔΔG and the pathological consequences of mutations. The RaSP tool and the dataset of mutations of human proteins are shared.

      Strengths:

      The single main strength of this work is that the model developed, RaSP, is much faster than Rosetta (5 to 6 orders of magnitude), and still produces ΔΔG predictions of comparable accuracy (as compared with Rosetta, and with the experiment). The usefulness of such a rapid approach is convincingly demonstrated by its application to predicting the ΔΔG of all single-point mutations of a large dataset of human proteins, for which using this new method they reproduce previous findings on the relationship between stability and disease. Such a large-scale calculation would be prohibitive with Rosetta. Importantly, other researchers will be able to take advantage of the method because the code and data are shared, and a Google Colab site where RaSP can be easily run has been set up. An additional bonus is that the dataset of human proteins and their RaSP ΔΔG predictions, annotated as beneficial/pathological (according to the ClinVar database) and/or by their allele frequency (from the gnomAD database) are also made available, which may be very useful for further studies.

      Weaknesses:

      The paper presents a solid case in support of the speed, accuracy, and usefulness of RaSP. However, it does suffer from a few weaknesses.

      The main weakness is, in my opinion, that it is not clear where RaSP is positioned in the accuracy-vs-speed landscape of current ΔΔG prediction methods. The paper does show that RaSP is much faster than Rosetta, and provides evidence that supports that its accuracy is comparable with that of Rosetta, but RaSP is not compared to any other method. For instance, FoldX has been used in large-scale studies of similar size to the one used here to exemplify RaSP. How does RaSP compare with FoldX? Is it more accurate? Is it faster? Also, as the paper mentions in the introduction, several ML methods have been developed recently; how does RaSP compare with them regarding accuracy and CPU time? How RaSP fares in comparison with other fast approaches such as FoldX and/or ML methods will strongly affect the potential usefulness and impact of the present work.

      Second, this work being about presenting a new model, a notable weakness is that the model is not sufficiently described. I had to read a previous paper of 2017 on which this work builds to understand the self-supervised CNN used to model the structure, and even so, I still don't know which of 3 different 3D grids used in that original paper is used in the present work.

      A third weakness is, I think, that a stronger case needs to be made for fitting RaSP to Rosetta ΔΔG predictions rather than experimental ΔΔGs. The justification put forward by the authors is that the dataset of Rosetta predictions is large and unbiased while the dataset of experimental data is smaller and biased, which may result in overfitting. While I understand that this may be a problem and that, in general, it is better to have a large unbiased dataset in place of a small biased one, it is not so obvious to me from reading the paper how much of a problem this is, and whether trying to fix it by fitting the model to the predictions of another model rather than to empirical data does not introduce other issues.

      Finally, the method is claimed to be "accurate", but it is not clear to me what this means. Accuracy is quantified by the correlation coefficient between Rosetta and RaSP predictions, R = 0.82, and by the Mean Absolute Error, MAE = 0.73 kcal/mol. Also, both RaSP and Rosetta have R ~ 0.7 with experiment for the few cases where they were tested on experimental data. This seems to be a rather modest accuracy; I wouldn't claim that a method that produces this sort of fit is "accurate". I suppose the case is that this may be as accurate as one can hope it to be, given the limitations of current experimental data, Rosetta, RaSP, and other current methods, but if this is the case, it is not clearly discussed in the paper.

      We thank the reviewer for their detailed comments and suggestions.

      As discussed in our general comments above and also below, we have now added additional benchmarking, making it easier to compare the accuracy of RaSP with other methods. Regarding the model description, we have now added a more detailed description of the 3D CNN as well.

      Regarding whether to fit the model to experiments or computational data, we agree that it is not clear-cut that the former would not also work. Indeed, a main problem is that in both cases it is hard to answer which approach is better because of the scarcity of experimental data. One major problem with the larger sets of experimental data is, as we mention, the bias and variability; another is the provenance. While some databases exist, they are rarely exactly raw data, and for example may contain ∆∆G values estimated from ∆Tm values. In the revised manuscript we now explain better why we chose to target Rosetta, but also acknowledge that one might also have used experiments.

      As to the question of accuracy, we agree completely that the methods could be better. One problem, however, is that it is very difficult to answer how much better because of problems with experiments. As mentioned also by reviewer 1, variation across different experiments suggests that even a “perfect” predictor would only achieve Pearson correlation coefficients in the range 0.7–0.8 (https://doi.org/10.1093/bioinformatics/bty880). Clearly, this is an issue with imperfect data curation (it is possible to measure ∆∆G quite accurately), but in the absence of larger and better curated experiments, one will not expect much better accuracy than what we report here. This is now discussed in the revised manuscript.
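
      The idea of a noise-imposed ceiling on the correlation can be illustrated with the standard attenuation argument: if measured values equal the true values plus independent noise, a predictor that reproduces the true values exactly still correlates imperfectly with the measurements. The sketch below is a minimal illustration of that argument, not necessarily the calculation used in the cited study; the function name and the standard deviations passed to it are hypothetical.

```python
import math


def max_pearson_with_noisy_experiment(true_sd, noise_sd):
    """Pearson correlation between true values and noisy measurements.

    Assumes measurement = truth + independent additive noise, so that
    corr(truth, measurement) = true_sd / sqrt(true_sd**2 + noise_sd**2).
    """
    return true_sd / math.sqrt(true_sd ** 2 + noise_sd ** 2)


if __name__ == "__main__":
    # Hypothetical spreads in kcal/mol, chosen only for illustration.
    print(max_pearson_with_noisy_experiment(true_sd=1.5, noise_sd=1.0))
```

      The ceiling approaches 1 only when the measurement noise is small relative to the spread of the true ∆∆G values in the dataset.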

      Reviewer #3 (Public Review):

      The authors present a machine learning method for predicting the effects of mutations on the free energy of protein stability. The method performs similarly to existing methods, but has the advantage that it is faster to run. Overall this is reasonable and a faster method will likely have some potential uses. However, not improving performance beyond the reasonable but not great performance of existing methods of course makes this a less useful advance. The authors provide predictions for a set of human proteins, but the impact of their method would be much greater if they provided predictions for all substitutions in all human proteins, for example. In places the text somewhat overstates the performance of computational methods for predicting free energy changes and is potentially misleading about when ddGs are predicted vs. experimentally measured. In addition, the comparison to existing methods is rather slim and there isn't a formal evaluation of how well RaSP discriminates pathological from benign variants.

      We thank the reviewer for taking time to read our work and for their various suggestions.

    1. Author Response

      Reviewer #1 (Public Review):

      Alignment between high dimensional data which express their dynamics in a subspace is a challenge which has recently been addressed both with analytic-based solutions like the Procrustes transformation, and, most interestingly, via deep learning approaches based on adversarial networks. The authors have previously proposed an adversarial network approach for alignment which relied on first dimensionally-reducing the binned neural spikes using an autoencoder. Here, they use an alternative approach to align data without use of an initial dimensional-reduction step.

      The results are fairly clear - the Cycle-GAN approach works better than their previous ADAN approach and one based on dimensionality reduction followed by the Procrustes transform. In general, a criticism of this entire field is to understand what alignment teaches us about the brain or how it specifically will be used in a BCI context.

      There are a few issues with the paper.

      1) To increase the impact of their work, the investigators have now used it to align data in multiple types of tasks. There was an unanswered question about this related to neuroscience - does alignment in one task predict alignment for another?

      This is a great question! We anticipate that it will be challenging for an alignment learned on one task to be used on another task, because we know that M1 decoders trained on data from one behavior often do not generalize when tested using a different behavior (Naufel et al., 2019)*. The same nonlinearities that prevent zero-shot decoding across tasks are also likely to impair the ability of an aligner trained on data from one task to successfully align data from another task. Furthermore, the results of Naufel et al. indicate that even if neural alignment is successful, we would need a decoder already trained on the new task to produce reliable predictions-- in which case the data needed to train that decoder could simply be used for alignment. A systematic study of the relation between the ability to align and decode from data is well warranted, but beyond the scope of our current work.

      *Naufel, S., Glaser, J. I., Kording, K. P., Perreault, E. J., & Miller, L. E. (2019). A muscle-activity-dependent gain between motor cortex and EMG. Journal of neurophysiology, 121(1), 61-73.

      Action in the text: none.

      2) Investigators use decoding as a way of comparing alignment performance. The description of the cycle GAN was not super detailed, and it wasn't clear whether there was any dynamic information stored in the network that might create questions of causality in actual use. It seems that input is simply the neural activity at a current time point rather than neural activity across the trial, which would alleviate this concern. However, they mention temporal alignment but never describe in detail whether all periods of spikes are properly modeled by the system or if only subsets of data (specific portions of task or non-task time) will work. Perhaps this is more a question of the Wiener filter, for which precise details are missing.

      As intuited by the reviewer, we did only use the neural activity at a current time point as the inputs for Cycle-GAN training, so the system is causal and can be used in real time. We have modified the text to clarify this.

      We apologize for any confusion caused by our use of the term "temporal alignment", which was for the sake of consistency with earlier-published, CCA-based alignment methods (e.g., in Gallego et al., 2020), but is indeed confusing. In the revised manuscript, we have switched to the term ‘trial alignment’ which we believe better reflects this pre-processing step, and we have included additional explanations in the introduction.

      Importantly, while CCA-style trial alignment is not required by our methods, we do still preprocess our data to exclude behaviors not related to the investigated task. Since monkeys were resting or performing task-irrelevant movements during the inter-trial period, we chose to use data only from trial start to trial end, but without any explicit trial matching or alignment (see Appendix 1 - Behavior tasks). In the revised manuscript, we now show that our methods still work well when applied even to the continuous recordings, with Cycle-GAN significantly outperforming both ADAN and PAF.

      Action in the text (page 2, lines 72-74): clarifying CCA description and replacing “temporal alignment” with “trial alignment”.

      Action in the text (page 5, lines 191-192): stating that ADAN and Cycle-GAN have no knowledge of dynamics.

      Action in the text (page 6, lines 258-272): documenting performance on full-day recordings without trial matching.

      Action in the text (page 13, lines 647-649): again, stating that Cycle-GAN has no knowledge of dynamics.

      3) In general, precise details of the algorithms should have been provided.

      We appreciate the reviewer noting this. In the submitted manuscript, the full descriptions of Cycle-GAN and ADAN were included as supplementary methods in Appendix 4, but we did not extensively reference this and it may have been missed. In the revised manuscript, we added more references to Appendix 4, including in the Methods section of the main text. We provided further details on the choice of hyperparameters for each method (including PAF) in Appendix 4 itself.

      Action in the text (page 13, lines 643-644): added “For a full description of the ADAN architecture and its training strategy, please refer to “ADAN based aligner” in Appendix 4 and (Farshchian et al., 2018).”

      Action in the text (page 14, line 669): added “Further details about the Cycle-GAN based aligner are provided in “Cycle-GAN based aligner”, Appendix 4.”

      Action in the text (Appendix 4 Tables 1-2): We have added a summary table of hyperparameters for each method in Appendix 4 (ADAN: Appendix 4 Table 1; Cycle-GAN: Appendix 4 Table 2).

      4) Cross validation for day-0 alignment is not explained.

      As mentioned above, the training and validation details of day-0 models were included in Appendix 4, which was not extensively referenced in the manuscript and may have been missed. We have now added more references to the Appendix in the revised manuscript.

      Action in the text (page 13, lines 627-629): added “(Note that this LSTM based decoder is only used for latent space discovery, not the later decoding stage that is used for performance evaluation (see “ADAN day-0 training” in Appendix 4 for full details)).”

      5) Details of statistical tests are not provided.

      We apologize for this omission. In the revised manuscript, we have added a section in the methods summarizing all the statistical tests. In addition, we added the sample sizes for each statistic reported in the results section.

      Action in the text (page 15, lines 754-768): new Methods section added.

      6) (minor) The idea that for neurons that have disappeared that the CycleGAN can "infer their response properties", seems an incorrect description. A proper description should be that it "hallucinates" their response properties?

      We prefer to avoid the term “hallucinate”, due to its recent increased (appropriate) use in the context of large language models describing content generation that is “nonsensical or unfaithful to the provided source content” (as per the Wikipedia article on hallucination in AI). The synthesized “responses” of vanished neurons are not nonsensical, but are indeed inferred: they are the model’s best estimate of how these neurons would have responded, had they been observed. While not explored further here, this prediction could be of potential scientific use: a strong discrepancy between predicted and observed activity might be a clue to look for further evidence of learning or remodeling of neural representations of behavior.

      Action in the text: none.

      Reviewer #2 (Public Review):

      In this manuscript, the authors use generative adversarial networks (GANs) to manipulate neural data recorded from intracortical arrays in the context of intracortical BCIs so that these decoders are robust. Specifically, the authors deal with the hard problem where signals from an intracortical array change over time and decoders that are trained on day 0 do not work on day K. Either the decoder or the neural data needs to be updated to achieve the same performance as initially. GANs try to alter the neural data from day K to make it indistinguishable from day 0 and thus in principle the decoder should perform better. The authors compare their GAN approach to an older GAN approach (by an overlapping group of authors) and suggest that this new GAN approach is somewhat better. The major strengths are the multiple datasets from behaving monkeys performing various tasks that involve motor function, and the comparison between two different GAN approaches and a classical approach that uses factor analysis. The weakness is insufficient comparison to another state-of-the-art approach that has been applied on the same dataset (NoMAD, Karpowicz et al., bioRxiv).

      The results are very reasonable and they show their approach, Cycle GANs, does slightly better than the traditional GAN approach. However, the Cycle GANs have many more modules and, as I understand it, perform a forward-backward mapping between day 0 and day K, and are thus theoretically better. But it seems quite slow.

      We are concerned that the reviewer may have mistaken the Cycle-GAN training time (the time it takes to find an alignment, Figure 4B) with its inference time (the time it takes to transform data once an alignment has been found). Whereas inference time is critical for practical deployment of a model, we argue that Cycle-GAN's somewhat longer training time is not a substantial barrier to use: it is still reasonably fast (a few minutes) and training will only need to be performed on the order of once per day. We have modified the y-axis label of Figure 4B to make this distinction clearer.

      We have also now added information on the inference speed of trained models to the paper: we find that both Cycle-GAN and ADAN perform the inference step in under 1 ms per 50 ms sample of data – this is because the forward map in both models consists of a fully connected network with only two hidden layers. We also note that while forward-backward mapping between days does occur during Cycle-GAN training, only the forward mapping is performed during inference.
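
      To illustrate why inference of this kind is both fast and causal, here is a minimal sketch of a forward map built as a fully connected network with two hidden layers, applied independently to each time bin. It is not the trained Cycle-GAN or ADAN generator: the layer sizes, ReLU activation, random weights, and the names `init_forward_map` and `forward_map` are assumptions made purely for illustration.

```python
import numpy as np

def init_forward_map(n_neurons, n_hidden, rng):
    """Random (untrained) weights for input -> hidden -> hidden -> output layers."""
    sizes = [n_neurons, n_hidden, n_hidden, n_neurons]
    return [(rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward_map(x, params):
    """Map day-K activity to day-0-like activity, one time bin (row) at a time."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU is an assumed activation
    w, b = params[-1]
    return h @ w + b

rng = np.random.default_rng(0)
params = init_forward_map(n_neurons=8, n_hidden=16, rng=rng)

# Hypothetical day-K data: 20 time bins x 8 neurons.
day_k = rng.normal(size=(20, 8))
aligned = forward_map(day_k, params)

# Changing only later bins leaves earlier outputs untouched, because each
# output row depends solely on the input row at the same time point.
perturbed = day_k.copy()
perturbed[10:] += 1.0
aligned_perturbed = forward_map(perturbed, params)
```

      Each output bin requires only three small matrix multiplications, and no state is carried between bins, which is consistent with a map that can run sample by sample in real time.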

      Action in the text (page 7, lines 303-306): added inference time for Cycle-GAN and ADAN.

      I think the results are interesting but as such, I am not sure this is such a fundamental advance compared to the Farshchian et al. paper, which introduced GANs to improve decoding in the face of changing neural data. There are other approaches that also use GANs and I think they all need to be compared against each other. Finally, these are all offline results and what happens online is anyone's real guess. Of course, this is not just a weakness of this study but many such studies of its ilk.

    1. Author Response

      Reviewer #1 (Public Review):

      The study by Yang et al. reports a new mechanistic role of vinculin in inhibiting Mef2c nuclear translocation and sclerostin expression in osteocytes and promoting bone formation. The authors showed the reduction of vinculin in aged human bone samples. A 10kb DMP-1-Cre mouse model was generated that deleted vinculin in osteocytes. They found that vinculin deletion caused bone loss and decreased bone formation associated with increased sclerostin expression. This increase does not affect the protein level of the transcription factor Mef2c but interestingly enhances its nuclear translocation. Vinculin interacts with Mef2c and appears to retain Mef2c in the cytosol. As expected, as a component of the mechanosensory focal adhesion complex, bone formation via tibial loading was decreased in vinculin deletion. Intriguingly, the bone loss associated with estrogen deficiency through ovariectomy was attenuated. Overall, the study unveiled an important role concerning a key player of focal adhesion and the study was well designed and executed. The paper would be strengthened by including a more thorough discussion including variables such as male vs. female, and cortical vs. trabecular bone as the vinculin deletion appeared to primarily affect trabecular bone while mechanical loading exerts anabolic effects on both bone types. The effect of estrogen deficiency is interesting and is worth some discussion.

      Strengths:

      The paper shows a novel mechanism in which vinculin retains Mef2c in the cytosol via protein interaction to prevent it from migrating to the nucleus and increasing transcription of sclerostin, an inhibitory factor for Wnt/β-catenin signaling, a critical pathway for osteoblast activity and bone formation. They employed various in vivo and in vitro models as well as human tissue samples including generating conditional knockout of vinculin in osteocytes in vivo and vinculin gene knockdown in MLO-Y4 cells. They also used physiologically/pathologically relevant models, tibial loading, and ovariectomy to study the role of vinculin under mechanical loading and estrogen deficiency. The standard techniques adopted to study bone properties include microCT, bone formation, bone histomorphometry, histochemistry as well as biochemical assays such as immunoprecipitation, ChIP assays, etc.

      The study is comprehensive and thorough and the noticeable uniqueness is that after observing the phenotypes from in vivo data, they further explored the underlying mechanisms using cell models. The experiments in general are well-designed and presented with adequate repeats and statistical analysis. The paper is also logically written and the figures were clearly labeled.

      We highly thank the reviewer for his/her positive comments and helpful suggestions.

      Minor weaknesses:

      More discussion is necessary concerning the potential difference in responses between male and female. Most of the studies were conducted in male mice except ovariectomy mice.

      During the revision, we have added new results from µCT analysis. Our new results showed that vinculin loss significantly reduced bone mass in 6-month-old female mice (Figure 2-figure supplement 1. a-d).

      It is interesting that the cKO of vinculin in osteocytes primarily affects trabecular bones with limited effect on cortical bones. However, sclerostin is increased in cortical bones. The promotion of bone formation by mechanical loading appears to affect both cortical and trabecular bones. If focal adhesion is a key mechanosensory complex, how to reconcile the different responses in the cKO model?

      We thank the reviewer for raising this good point. In fact, we do not know why there was no marked cortical bone loss in cKO mice. During the revision, we performed three-point bending analysis to determine whether vinculin loss impaired the mechanical properties of the long bone and found that the ultimate force and total energy absorption before fracture were decreased in the femur of 3-month-old male cKO compared to those in control mice (Figure 3-figure supplement 1. e, f). Furthermore, our new results from the calcein double labeling experiments showed that both MAR and BFR of femur cortical bones were slightly but significantly reduced in cKO mice relative to those in control mice (Figure 3-figure supplement 1. a-c).

      The OVX response is interesting and it is worthwhile to elaborate more regarding the potential underlying mechanism and what's the relationship between estrogen and mechanical loading and if the action of estrogen on vinculin shares any similar mechanisms with mechanical loading, etc.

      We feel that the relationship between estrogen and mechanical loading could be quite complex, which deserves further investigation in the future. Thank you for this good point.

      Reviewer #3 (Public Review):

      This study by Wang et al. investigates the role of the focal adhesion protein vinculin in osteocytes and its effect on bone mass. First, they showed decreased levels of vinculin in osteocytes in trabecular bone from aged individuals compared to young, suggesting a potential role for vinculin in regulating bone mass with aging. Next, they deleted vinculin in late osteoblasts and osteocytes in young and older mice and found decreased bone mineral density and trabecular bone mass. This was due to impaired bone formation, which the authors attributed to increased sclerostin levels. Further in vitro experiments showed that vinculin regulates sclerostin via the transcription factor Mef2c. Conditional knockout of vinculin in late osteoblasts and osteocytes had no effect on the bone of mice lacking Sost, further implicating an essential role for sclerostin in mediating the effects of vinculin in osteocytes. Interestingly, the vinculin conditional knockout mice had an impaired response to mechanical loading, suggesting an important role for vinculin in the osteocyte mechanoresponse. Finally, the authors showed that while ovariectomy increased osteoclast formation and bone resorption in control mice, it had no effect on the bone of the vinculin conditional knockout mice.

      Overall, the authors show convincing data for the important role of vinculin in osteocytes in regulating the anabolic effects of bone formation under physiological conditions. They also show that osteocyte vinculin may be a regulator of bone resorption under conditions mimicking postmenopausal osteoporosis. However, not all of the conclusions are fully supported by the data.

      Strengths:

      The use of both in vivo and in vitro approaches to determine the role of vinculin in osteocytes provides compelling evidence for its importance under basal conditions and in regulating the anabolic effects of mechanical loading. The in vitro assays nicely demonstrate a potential mechanism through Mef2c/ECR5.

      The creation of the vinculin and Sost double conditional knockout mouse model provides further convincing evidence for the causative role of sclerostin in the effects of vinculin knockout in osteocytes.

      The use of both young and older male mice links nicely with the human samples where vinculin expression appears to be reduced in osteocytes with aging. The authors need to be careful in describing 14-month-old mice as aged though, as these mice would not be typically thought of as old.

      Weaknesses:

      The methods section is lacking in basic details (e.g., there is no information on the CRISPR deletion of Vcl in the MLO-Y4 cells). While referencing their previous papers is fine, a brief description of the methods should be included in this paper.

      During the revision, we have added the method of CRISPR deletion of Vcl in the MLO-Y4 cells (page 22, line 16-20).

      While much of the data linking vinculin to sclerostin is convincing, it is surprising that the authors show decreased trabecular bone volume in the vinculin cKO mice, yet show increased sclerostin levels in the cortical bone. If increased sclerostin is responsible for impaired bone formation in the vinculin cKO mice, why is there no cortical bone phenotype?

      Please see our response to above Point 3.

      It would be important for the authors to also show the sclerostin immunostaining in the trabecular bone of these animals.

      During the revision, we utilized IHC to demonstrate that cKO mice also displayed increased level of the sclerostin protein in cortical bone compared to the control group (Figure 4-figure supplement 1. a, b).

      The authors do not provide any potential explanation for the effects of vinculin cKO in the ovariectomized mice. Under physiological conditions, osteocyte vinculin has no effect on osteoclast number or bone resorption. How is osteocyte vinculin affecting osteoclasts after ovariectomy? Are there differences in the osteocyte expression of Rankl or Opg in response to the loss of estrogen in the vinculin cKO and control mice?

      During the revision, we used IHC staining to measure the expression of Rankl and Opg in the cortical bone of control and cKO mice treated with or without OVX surgery. The results showed that in cKO mice, the increase in Rankl induced by OVX was lower than that in the control group, while the increase in Opg induced by OVX was higher than that in the control group. The Rankl/Opg ratio was decreased in cKO mice induced by OVX (Figure 7-figure supplement 1. a-d).

      From their in vitro experiments, the authors deduce that loss of vinculin affects osteocyte attachment. However, their images would suggest that it is the formation of dendrites that is strongly inhibited in the cells lacking vinculin. It is surprising that no investigation of osteocyte dendrite number or connectivity was performed in the vinculin cKO mice. This is particularly important as a decrease in osteocyte dendrites and connectivity has been observed in the bones of aged mice (see Tiede-Lewis et al., Aging. 2017) and osteocyte dendrites are important for mechanosensation.

      Please see our response to above Point 4.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript investigates the mechanisms of 'summiting disease' using a previously characterised Drosophila model. The authors also show that E. muscae infiltrates the brain, likely through a defective blood-brain barrier, and populates regions of the brain in the medial protocerebrum. It likely releases metabolites into the haemolymph of summiting flies that have the ability to induce summiting in uninfected flies. They also show that a burst of locomotor activity precedes death. To understand the circuit basis of this, they perform a screen of more than a hundred neuronal lines and genes to identify an active DN1p > pars intercerebralis neurons > corpora allata > JH axis as being involved in the summiting behaviour while not affecting death.

      Thank you for your succinct summary of our paper.

      Reviewer #2 (Public Review):

      In this study, the authors aim to uncover the neuroanatomical and metabolite underpinnings of an intriguing phenomenon observed in some insects due to the infection of fungal pathogens. They very cleverly develop a high-throughput assay to examine and quantify this behaviour in a tractable model organism - Drosophila melanogaster which the authors have previously shown to also exhibit this phenomenon. They characterize the details of this behaviour and clearly show the temporal gating of this summiting-followed-by-death behavior to occur shortly before the dusk transition. They go on to examine using a candidate (over 200) screen approach potential neuronal circuits and genes based on the hypothesis that they may be related to 'arousal and gravitaxis'. They narrow down to a line that is restricted to the PI based on the fact that it has a significant effect on the summiting behaviour and that it is known to affect locomotion. They demonstrate that when a subset of PI neurons (R19G10) is transiently activated, flies show summiting even without exposure to the pathogen. Based on Syt-eGFP staining they conclude that the PI communicates with the corpora allata (CA). They also show that the CA itself is necessary for this behavior, but cannot demonstrate the role of juvenile hormone using their pharmacological methods.

      The authors then describe an automated classifier to identify an upcoming summiting behaviour. Further, they use this real-time classifier to stage different steps of the summiting and match it to the extent of pathology observed by microscopy. They also ask whether the constituents of the hemolymph differ between the summiting and not-yet summiting flies for which they conduct metabolome analysis of the hemolymphs. They are also able to show that uninfected flies, or infected but not-yet-summiting flies, can be induced to show summiting-like behaviour upon injection with hemolymph from summiting flies. Finally, they propose the sequence by which the fungal pathogen may modulate the behaviours of the host fly so as to execute this highly gated act of increased locomotion prior to death.

      This is a good summary of our findings.

      Strengths

      • The detailed characterization of the behaviour in D melanogaster and development of the high-throughput behavioural arena.

      • Development of the automated classifier which appears to accurately predict this behaviour.

      • Narrowing down to a small group of PI neurons having a strong impact on this behaviour although sufficiency is not clearly demonstrated.

      Thank you for highlighting these areas of our paper. With respect to demonstrating sufficiency of the PI neurons, we believe this is actually an area of comparative strength for the manuscript. With thermogenetic and complementary optogenetic experiments, we demonstrated that activating the PI-CA neurons induces a burst of locomotion consistent with that seen during summiting. A similar burst of locomotion is seen when thermogenetically activating DN1ps. These experiments demonstrate that activity in these neurons is sufficient to induce a pattern of activity like that seen during summiting. In future studies, once the molecular effectors are identified, we may be able to show that fungal alteration of the physiology of these neurons alone is sufficient to induce a burst of locomotion, but that experiment is beyond our current capabilities and beyond the scope of this study.

      Weaknesses

      • The evidence of temporal (circadian) gating is weak despite the proposed DN1p - PI - CA connections.

      • The eventual modification of the behavior to enable enhanced locomotion and negative geotaxis to occur appears to be mediated by yet unknown factors.

      • The metabolite analysis did not help to narrow down to candidates that can be speculated to cause this behaviour.

      With respect to evidence for temporal gating, we did not aim to address the underpinnings of the timing of summiting behavior in this study and did not mean to suggest that the timing of summiting behavior is explained by DN1ps being fly clock neurons. As previously stated in response to high level comments from the editor, we interpret the data presented here as evidence that host neurons (which just happen to be clock neurons) are manipulated by the fungus to induce the characteristic burst of pre-death locomotor activity that we believe is the key feature of summiting. We have added the following paragraph in the discussion (see Host circadian and pars intercerebralis neurons mediate summiting) to clarify this point:

      “Our data indicate that the host circadian network is involved in mediating the increased locomotor activity that we now understand to define summiting. However, our data do not speak to how the timing of this behavior is determined in the zombie-fly system. That is, we have yet to address the mechanisms underlying the temporal gating of summiting and death. Our observation that E. muscae-infected fruit flies continue to die at specific times of day in the absence of proximal lighting cues (Fig 1-S1) suggests that the timing of death is under circadian control and aligns with previous work in E. muscae-infected house flies (Krasnoff et al., 1995). Given that molecular clocks are prevalent across the tree of life, it is likely that two clocks (one in the fly, one in E. muscae) are present in this system. Additional work is needed to determine if the host clock is required for the timing of death under free-running conditions and to assess if E. muscae can keep time.”

      We agree that there are many unknown factors at play in this behavior. These include molecular effectors produced by the fungus that alter the physiology of host neurons, and the specific mechanisms by which JH release from the CA alters locomotion. We have endeavored to transparently present what we do and don’t know at this time and hope to be able to address these additional elements in subsequent studies.

      It is true that we were unable to determine the identity of compounds driving summiting behavior. However, our analysis did serve to inform which compounds may play a role in summiting by virtue of their overabundance. While we do not yet know the structure of these compounds, their consistent detection in our samples and our new knowledge of their molecular weight with very high accuracy means that these are prime candidates to isolate, purify and functionally test moving forward.

      Reviewer #3 (Public Review):

      The fungus Entomophthora muscae infects flies and in turn manipulates the flies to produce a summiting behavior that is believed to enhance spore dispersal that happens upon the eventual death of the fly. In this study, the authors undertake a Herculean effort to identify the neural pathways that are manipulated by the fungus to cause summiting. In a major advance, the authors develop techniques that allow them to track behaviors of infected flies over the course of several days. This allows them to investigate summiting behaviors that occur just prior to death with unprecedented detail. In their analysis, the authors find that summiting flies show a burst of increased locomotion just prior to death. Importantly, they show that this burst of locomotion is not seen in flies that are dying from other causes (starvation or desiccation). The burst of locomotion is also found to coincide with an increase in elevation that occurs with summiting, but other results indicate that a change in elevation may be an indirect consequence of increased locomotion. With this new knowledge in hand, the authors screen for genes and neuronal pathways that either disrupt or enhance the burst of locomotion that is characteristic of summiting. These experiments clearly indicate that neurons and genes controlling circadian rhythms play a major role in summiting behaviors. The authors focus their attention on a particular subset of clock neurons (DN1p) as potentially mediating summiting behavior. It is worth noting that DN1p neurons have been implicated in a variety (and in some cases contradictory) of circadian processes and that the interpretation of manipulations of these neurons may be an oversimplification. In particular, prior studies have implicated these cells in temperature entrainment/compensation so interpreting thermogenetic manipulations of these cells might be complicated. 
The authors also zoom in on a specific region of the brain containing neurons of the pars intercerebralis, since they find infiltration by the fungus in this region and effects of drivers targeting the PI. Converging and convincing lines of evidence suggest that the PI neurons output to the corpora allata and that effects on summiting may be mediated by the CA. The already impressive series of experiments is further clinched by the development of a machine vision-based classifier that allows the authors to automatically identify summiting flies so that they may be collected for metabolomic analyses. The authors are automatically emailed and seemingly rouse themselves in the middle of the night in order to obtain the precious flies they need. They find a bunch of compounds that appear in summiting flies and even inject hemolymph from the infected animals into naive flies to find that circulating compounds can affect behaviors. Overall, this paper is a tour de force that addresses a system of long-standing interest and brings it into the modern age. Many new questions are now raised for the future by this fascinating study.

      Thank you for your gracious summary of our work and for recognizing the multifaceted approach we have taken to begin to understand the mechanistic basis of summiting. We agree that there are many new questions raised by this work and hope to address them in future publications.

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Gochman and colleagues reports the discovery of a very strong sensitization of TRPV2 channels by the herbal compound cannabidiol (CBD) to activation by the synthetic agonist 2-aminoethoxydiphenyl borate (2-APB). Using patch-clamp electrophysiology the authors show that the ~100-fold enhancement by micromolar CBD of TRPV2 current responses to low concentrations of 2-APB reflects a robust increase in apparent affinity for the latter agonist. Cryo-EM structures of TRPV2 in lipid nanodiscs in the presence of both drugs report two channel conformations. One conformation resembles previously solved structures whereas the second conformation reveals two distinct CBD binding sites per subunit, as well as changes in the conformation of the S4-S5 linker. Interestingly, although TRPV1 and TRPV3 are highly homologous to TRPV2 and both CBD binding sites are relatively conserved, the CBD-induced sensitization towards 2-APB is observable only for TRPV3 but not for TRPV1. Moreover, the simultaneous substitution of non-conserved residues in the CBD binding sites and the pore region of TRPV1 with the amino acids present in TRPV2 fails to confer strong CBD-induced sensitization. The authors conclude that CBD-dependent sensitization of TRPV2 channels depends on structural features of the channel that are not restricted to the CBD binding site but involve multiple channel regions.

      These are important findings that promote our understanding of the molecular mechanisms of TRPV family channels, and the data provide convincing evidence for the conclusions.

      We appreciate the supportive evaluation of the reviewer.

      Reviewer #2 (Public Review):

      In this manuscript, Gochman et al. studied the molecular mechanism by which cannabidiol (CBD) sensitizes the TRPV2 channel to activation by 2-APB. While CBD itself can activate TRPV2 with low efficacy, it can sensitize TRPV2 current activated by 2-APB by two orders of magnitude. The authors showed, via single-channel recording, that the CBD-dependent sensitization arises from an increase in Po when the channel binds to both CBD and 2-APB. The authors then used cryo-EM to investigate how CBD binds to TRPV2 and identified two CBD binding sites in each subunit, with one site being previously reported and the other being newly discovered.

      TRPV1 and TRPV3 are two channels closely related to TRPV2. All three channels can be activated by CBD and 2-APB, but only TRPV2 and TRPV3 are strongly sensitized by CBD. To understand the molecular basis of the different sensitivity to CBD, the authors compared the residues within the CBD binding sites and generated mutants by swapping non-conserved residues between TRPV1 and TRPV2. They then performed patch-clamp recordings on these mutants and found that mutations on non-conserved residues indeed influenced the CBD-dependent sensitization, thereby supporting the observed CBD binding sites.

      Unexpectedly, the authors did not identify the binding site of 2-APB, despite its robust effect in electrophysiology recordings, especially when combined with CBD. Although previous structural studies of TRPV2 have reported 2-APB binding sites, the associated densities in these studies were not well-resolved. Therefore, the authors called on the field to re-examine published structural data with regard to the 2-APB binding sites.

      Overall, this is an important study with well-designed and well-conducted experiments.

      We appreciated the supportive comments of the reviewer.

      Reviewer #3 (Public Review):

      In this paper, Gochman et al examine TRPV1-3 channel sensitization by CBD, specifically in the context of 2-APB activation. The authors primarily used classic electrophysiological techniques to address their questions about channel behavior but have also used structural biology in the form of cryo-EM to examine drug binding to TRPV2. The authors have carefully observed and quantified sensitization of the rat TRPV2 channel to 2-APB by CBD. While this sensitization has been reported previously (Pumroy et al, Nat Commun 2022), the authors have gone into much more detail here and carefully examined this process from several angles, including a comparison to some other known methods of sensitizing TRPV2. Additionally, the authors have also revealed that CBD sensitizes rat TRPV1 and mouse TRPV3 to 2-APB, which has not been reported previously. Up to this point, the work is well thought through and cohesive.

      The major weakness of this paper is that the authors' efforts to track down the structural and molecular basis for CBD sensitization neither give insight into how sensitization occurs nor provide a solid footing for future work on the topic. The structural work presented in this paper lacks proper controls to interpret the observed states and the authors do nothing to follow up on a potentially interesting second binding site for CBD. Overall, the structural work feels detached from the rest of the paper. The mutations chosen to examine sensitization are based on setting up TRPV1 in opposition to TRPV2 and TRPV3, which makes little sense as all three channels show sensitization by CBD, even if to different extents. The authors chose their mutations based on the assumption that response to CBD is the key difference between the channels for sensitization, yet the overall state of each channel or the different modes of activation by 2-APB seem to be more likely candidates. As a result, it is not particularly surprising that none of the mutations the authors make reduce CBD sensitization in TRPV2 or increase CBD sensitization in TRPV1.

      A difficulty in examining TRPV1-3 as a group is that while they are highly conserved in sequence and structure, there are key differences in drug responses. While it does seem likely that CBD would bind to the same location in TRPV1-3, there is extensive evidence that 2-APB binds at different sites in each channel, as the authors discuss in the paper. Without more basic information about where 2-APB binds to each channel and confirmation that CBD does indeed bind TRPV1-3 at the same site, it may not be possible to untangle this particular mode of channel sensitization.

      We appreciate this reviewer’s perspective and we too were disappointed that our approach did not yield more definitive answers to why some TRPV channels are more sensitive to CBD. We have revised the results and discussion sections to more clearly articulate what we think our results reveal. We have also added a section to the discussion to present the idea that the differential sensitivity of TRPV channels to CBD may have more to do with where 2-APB binds and how it activates the channel than CBD. These challenging points are all excellent and they have helped us to present our message more clearly.

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, authors examine immune signatures from patients that experienced mild, moderate, or severe COVID-19 symptoms and followed them for months to evaluate whether there was a correlation between their immune activation phenotypes, disease severity, and long COVID. Authors observed higher T cell activation/proliferation marker expression in blood samples of patients with severe disease whereas other cell types were more or less unchanged. The authors also examined the cytokine profile of the patient's serum samples to determine the potential drivers of T cell activation phenotypes. Authors then perform T-cell responses to viral peptides to determine the differences in activation phenotypes with disease severity.

      The major strengths of the paper appear in the evaluation of the appropriate cohort of human samples and following them over a period of months. Additionally, the authors perform detailed T-cell analysis in an unbiased way to determine any possible activation correlations with disease severity. The authors also perform antigen-specific T-cell analysis via peptide stimulation which adds to the overall findings. However, there are a number of drawbacks that need to be mentioned. Firstly, the phenotypes of T cells prior to the 3-month time-point are not known. Hence, there is no information on baseline or during the early phase of infection. Secondly, the response is largely obtained from blood. How well information about T cells in blood correlates with lung disease is a matter of concern. Analysis of the lungs, where the disease actually manifests, would be ideal but is close to impossible in a human cohort. Alternatively, analysis of local lymph node aspirate or nasal swabs could be useful. Thirdly, the claim that bystander T cell activation plays a role seems loose, specifically the IL-15 in vitro data. Moreover, the analysis of T cells seems very focused on activation/proliferation phenotypes. Alternative T cell phenotypes such as regulatory, IL-10 producing, or FoxP3 expression are not extensively analyzed.

      Major points

      1) In Figure 1, the CD4 T cell activation phenotypes do not seem consistent across the groups. Why does moderate vs. severe show increases in CXCR3 expression but not mild vs. severe? The same goes for other markers. Performing T cell stimulation with class II peptides specific for CoV-2 and looking at IFN etc. to determine antigen-specific T cells and then gating on these activation/proliferation markers may be a better way to observe differences.

      Figure 1 shows activation phenotypes of total CD4+ T-cells. We performed similar analysis on SARS-CoV-2 spike-specific CD4+ T cells as suggested by Reviewer 1 (using 15-mer peptides overlapping by 10 amino acids which are able to stimulate both CD4+ and CD8+ T cells; see Figure 5), but we did not observe differences between the groups (data not shown). Importantly, as reported in the discussion (page 18 from “Our data does not support the persistence of SARS-CoV-2 antigens at 3 months….”) we did not observe significant activation of spike-specific CD4+ or CD8+ T cells which suggests that T cell activation in these patients at 3 months is not driven by persistence of SARS-CoV-2 spike antigens.

      2) One major drawback is the control patients. It would have helped to include a batch of samples from uninfected patients. Or to have the plasma/blood from patients before COVID-19 symptoms. This way there is a baseline for each group that could be compared. It is difficult to draw broad conclusions across the group at 3 months if we do not know their baseline phenotypes.

      We did not have access to blood samples from these patients prior to COVID-19 infection. However, we have now added an analysis of matched samples from the same patients at 12 months post infection (N=33, see Figure 2- figure supplement 1 and also response to Reviewer 2). These data show a significant decrease in T cell activation at 12 months compared to 3 months. T cell activation has decreased to largely undetectable “baseline” levels at 12 months, that are similar between patients who had experienced mild, moderate or severe COVID-19. This lack of T cell activation at 12 months likely reflects the T cell profiles that patients will have had prior to COVID-19 infection.

      3) Although the authors focused on activating/proliferating markers to correlate with disease severity, this analysis does not consider alternate T cell phenotypes such as the ones with regulatory or anti-inflammatory phenotypes. Did authors detect differences in T cells with regulatory profiles such as expression of IL-10, FoxP3, etc. in their unsupervised UMAP analysis or otherwise flow experiments?

      Due to limited blood volumes we were unable to analyse regulatory/anti-inflammatory T cells phenotypes. Our serum cytokine data does not suggest statistically significant differences in serum IL-10 levels in patients with mild, moderate or severe disease. However, it is possible that we may have missed differences in FoxP3+ regulatory or IL-10- producing T cells.

      Reviewer #2 (Public Review):

      The manuscript is well written, the data are based on well-performed experiments, and the conclusions are supported by the data. The authors study thoroughly the global phenotype of T and NK cells and also analyze antigen-specific T cell frequencies. The data confirm that individuals who had severe COVID-19 disease (required ventilation and/or ITU admission) have slightly more activated CD4 and CD8 T cells at 3 months post-infection and report more frequently long COVID symptoms, yet the novelty of this manuscript is to show that these two are not linked to each other. Moreover, the manuscript confirms that patients across all disease severities mount and maintain memory T cell and antibody responses to SARS-CoV-2.

      The authors find that patients who recovered from severe COVID-19 3 months ago have more activated CD4+ and CD8+ T cells than patients who recovered from the mild disease. Although the difference is significant, the frequency of CD4+ T cells with an activated phenotype is increased only by about 2-fold (~2% vs ~1%), while the frequency of activated CD8+ T cells is about 6% vs 4%, which should be added to the results to better describe the extent of the activation.

      As the authors mention in the discussion, it cannot be excluded that the more activated T cell phenotype in patients who recovered from severe COVID-19 is not rather a consequence of the increased comorbidities associated with this group. However, their Luminex analysis of the serum shows that the levels of cytokines TNF-a, IL-4, IL-12, IL-15, and IL-17A decline by 8 and 12 months, suggesting that the immune activation by 3 months is most likely a consequence of the previous severe viral infection.

      To strengthen this point, PBMC is probably not available at a later time point, to see if the increased T cell activation decreases in line with the serum cytokines. Yet, the authors should at least try to repeat the experiments of coculturing CD3+ T cells from healthy volunteers with the serum of mild/severe patients at 8-12 months post-recovery (Fig. 3 D-E).

      Thank you for these suggestions. We had access to PBMCs from N=33 matched patients at 12 months post admission and have now performed analyses of these samples. Our results show that CD4+ and CD8+ T cell activation at 12 months is significantly decreased compared to that observed at 3 months (Figure 2- figure supplement 1). We show that the frequencies of Ki67+ CD38+ CD4+ and CD8+ T cells are significantly decreased at the 12-month compared to the 3-month time point. Similarly, the frequencies of CXCR3+ CD4+ and CD8+ T cells are strongly decreased at 12 compared to 3 months post admission. Activated HLA-DR+ CD38+ and granzyme B+ CD8+ T cells are also significantly decreased from 3 to 12 months post admission. Unsupervised UMAP analysis shows that the cell distribution and density of CD4+ and CD8+ T cell populations were similar across all severity groups at 12 months post infection, while major differences are observed at 3 months between the patient groups (Figure 2- figure supplement 1 K). We added this information in the manuscript at page 9 (see tracked changes) and in Figure 2- figure supplement 1.

      Thank you for suggesting we repeat the co-culture experiments of healthy donor PBMCs with serum of mild and severe patients at 12 months post admission. We co-cultured healthy PBMCs from the same donors (only 3 out of the 4 donors used for the 3 months experiment were still accessible) with serum from mild and severe patients at 12 months. Notably, we observed that IL-15R upregulation did not occur upon co-culture of healthy donor PBMCs with plasma from severe patients at 12 months. This suggests that factors inducing IL-15R upregulation present in the 3 months plasma may be absent in the 12 month plasma. We have added this new data to the manuscript (Figure 3E).

      The authors tried to find if the activated T cell phenotype or increased serum cytokines at 3 months post-infection is linked with increased long COVID symptoms. The study does not find any direct association when the data are adjusted for age, sex, and severity. This is the only novelty of this study, yet it is an important piece of information in the attempt to broaden our understanding of the underlying causes of long COVID symptoms.

      Overall, it would be important to understand if increased frequencies of T cell activation (~2-fold) and increased levels of serum cytokines at 3 months following severe COVID-19 that resulted in ventilation and/or ITU admission is specific to severe SARS-CoV-2 infection, or if similar consequences are resulting also from other severe acute viral infections. Addressing this question is beyond the scope of the manuscript, yet it should be discussed.

      We agree this is an important question. In H7N9 influenza infection persistent T cell activation was associated with fatal disease while T cell activation early during infection associated with positive clinical outcomes (Wang et al. Nature Comms 2018). Aging is also known to alter T cell function and the persisting low-grade inflammation present in elderly individuals may also facilitate the persistence of bystander activated T cells (Yunis et al Trends in Microbiology 2023). We have added these considerations in the discussion (page 18-19).

      Reviewer #3 (Public Review):

      In this paper, the authors used a cohort study to link immune signatures in blood 30 days after COVID-19 infection as possible predictors of prolonged symptomatology. The paper partially achieves its aims. While the selected analyses are comprehensive, the cohort design is appropriate and the mechanistic ex vivo work is clever and convincing, the strength of conclusions is somewhat limited by the selection of imprecise clinical endpoints, and the lack of analyses examining T regulatory signatures.

      Strengths of the paper are:

      • The paper includes a comprehensive and structured immune analysis.

      • The paper is extremely clearly written.

      • The use of manual gating and unsupervised analysis in Fig 1 is complementary and helpful.

      • Bystander T cell experiments with IL-15 are useful and attempt to explore mechanisms from human samples which are traditionally very challenging.

      • The experiments shown in Figure 4 documenting equal Cov2 T cell responses in all 3 cohorts are an extremely important result.

      Major concerns are:

      • The significance of the study is somewhat limited by the small sample size.

      • The symptomatic outcome scale for PASC is blunt and poorly captures severity. More state-of-the-art scales of symptomatic severity and heterogeneity exist for PASC. I suggest this and other papers as an example: https://pubmed.ncbi.nlm.nih.gov/36454631/

      Thank you for this suggestion. Additional outcome measures have now been included in our analysis, refer to the results section and the updated Figure 6- source data 1 A-B.

      • The omission of analyses examining T regulatory functions is a missed opportunity and these may be impaired in this population.

      We have acknowledged this as a limitation of the study.

      • This is a challenging question that can be applied to many exploratory studies of this nature: how can we rule out the possibility that statistically significant differences in Figs 1, 2 & 3 are statistically significant but biologically meaningless? All cellular and cytokine measures of immune responses shown in these figures are not routinely measured in the clinic. Are there studies that can be cited to show that these differences are sufficient to have a causal impact on prolonged symptoms and tissue damage rather than just correlations with these outcomes?

      This is a challenging question, and we were unable to find studies correlating these measurements with tissue damage and prolonged symptoms. In our study we however suggest that prolonged T cell activation is not related to ongoing long-COVID symptoms.

    1. Author Response

      Reviewer #1 (Public Review):

      This study presents an important finding on human m6A methyltransferase complex (including METTL3, METTL14 and WTAP). The evidence supporting the claims of the authors is convincing, although the model and assays need to be further modified. The work will be of interest to biologists working on RNA epigenetics and cancer biology.

      In mammals, a large methyltransferase complex (including METTL3, METTL14 and WTAP) deposits m6A across the transcriptome, and METTL3 serves as its catalytic core component. In this manuscript, the authors identified two cleaved forms of METTL3 and described the function of METTL3a (residues 239-580) in breast tumorigenesis. METTL3a mediates the assembly of the METTL3-METTL14-WTAP complex, the global m6A deposition and breast cancer progression. Furthermore, the METTL3a-mTOR axis was uncovered to mediate the METTL3 cleavage, providing a potential therapeutic target for breast cancer. This study is properly performed and the findings are very interesting; however, some problems with the model and assays need to be modified. It is widely known that METTL3 and METTL14 form a stable heterodimer with a stoichiometric ratio of 1:1 (Wang X et al. Nature 534, 575-578 (2016), Su S et al. Cell Res 32(11), 982-994 (2022), Yan X et al. Cell Res 32(12), 1124-1127 (2022)); however, the numbers of METTL3 and METTL14 in the model of Fig 7P are not equivalent and need to be modified.

      We thank the reviewer for the good suggestion. We will modify the model in Fig. 7P.

      Reviewer #2 (Public Review):

      In this study, Yan et al. report that a cleaved form of METTL3 (termed METTL3a) plays an essential role in regulating the assembly of the METTL3-METTL14-WTAP complex. Depletion of METTL3a leads to reduced m6A level on TMEM127, an mTOR repressor, and subsequently decreased breast cancer cell proliferation. Mechanistically, METTL3a is generated via 26S proteasome in an mTOR-dependent manner.

      The manuscript follows a smooth, logical flow from one result to the next, and most of the results are clearly presented. Specifically, the molecular interaction assays are well-designed. If true, this model represents a significant addition to the current understanding of m6A-methyltransferase complex formation.

      A few minor issues detailed below should be addressed to make the paper even more robust. The specific comments are contained below.

      1) The existence of METTL3a and METTL3b.

      In this study, the author found the cleaved form of METTL3 in breast cancer patient tissues and breast cancer cell lines. Is it a specific event that only occurs in breast cancer? The author may examine the METTL3a in other cell lines if it is a common rule.

      We thank the reviewer for pointing this out. We discovered the cleaved form of METTL3 in breast cancer, and we also observed this cleaved METTL3 in other cell lines such as lung cancer cell lines, renal cancer cell lines, HCT116 and MEF, suggesting that it is a common rule. We will add these results in the revised manuscript.

      2) Generation of METTL3a and METTL3b.

      1) Figure 1 shows that METTL3a and METTL3b were generated from the C-terminal of full-length METTL3. Because the sequence of METTL3a is involved in the sequences of METTL3b, can METTL3b be further cleaved to produce METTL3a?

      Although the sequence of METTL3a is involved in the sequences of METTL3b, overexpression of METTL3b in T47D, MDA-MB-231 and 293T cells did not show METTL3a expression (please see Figures 3A, 3C, 3G), suggesting that METTL3b cannot be further cleaved to produce METTL3a, and that the METTL3 cleavage may require its N-terminal region. We will add this in the discussion.

      2) Based on current data, the generation of METTL3a and METTL3b are separated. Are there any factors that affect the cleavage ratio between METTL3a and METTL3b?

      We thank the reviewer for the excellent question. In this study, we show that both METTL3a and METTL3b are produced through proteasomal cleavage, and both of them are positively regulated by the mTOR pathway. On the other hand, we indeed observed differential cleavage ratios between METTL3a and METTL3b across different cell lines. For example, the METTL3a/METTL3b ratio was greater than 1 in MDA-MB-231 cells (see Figure 7C), less than 1 in T47D and 293T cell lines (see Figure 7A and 7B), and equal to 1 in MEF cells (see Figure 7O). Based on these results, we speculate that there may be some factors that control the cleavage ratio between METTL3a and METTL3b, which warrants further investigation. We will add this in the discussion.

      3) In Figure 2G, the author shows the result that incubation of the Δ198+Δ238 METTL3 protein with T47D cell lysates cannot produce the METTL3a and METTL3b variants. The author may also show the results that Δ198 METTL3 protein or Δ238 METTL3 protein incubates with T47D cell lysates, respectively.

      Following the reviewer’s suggestion, we will perform in vitro cleavage assays by incubation of METTL3-Δ238 or METTL3-Δ198 with T47D cell lysates, and will incorporate this result in the revised manuscript.

      4) As well as many results published in previous studies, the in vitro methylation assay shows that WT METTL3 is capable of methylating RNA probe (figure 2H). The main point of this study is that METTL3a is required for the METTL3-METTL14 assembly. However, the absence of METTL3a in the in vitro system did not inhibit METTL3-METTL14 methylation activity. Moreover, the presence of METTL3a even resulted in a weak m6A level.

      The main point of this study is that METTL3a is required for the METTL3-WTAP interaction, but dispensable for the METTL3-METTL14 assembly (see Figure 4A-4B). In this in vitro methylation assay, METTL3 and METTL14 are capable of methylating the RNA probe in the absence of WTAP. Under this condition, we found that METTL3 WT as well as its different variants (METTL3-Δ238, METTL3-Δ198, METTL3b and METTL3a), except the catalytically dead mutant METTL3 APPA, showed methylation activity in vitro.

      5) In Figure 4A, the author suggests that WTAP cannot be immunoprecipitated with METTL3a and 3b because WTAP interacted with the N-terminal of METTL3. If this assay is performed in WT cells, the endogenous full-length METTL3 may help to form the complex. In this case, WTAP is supposed to be co-immunoprecipitated.

      We thank the reviewer for pointing this out. METTL3 interacts with WTAP through its N-terminal (1-33aa) (1). Consistently, we find that the two cleaved forms METTL3a and METTL3b, which lack the N-terminal region, are not able to bind with WTAP. In Figure 4A, we overexpressed METTL3 WT as well as its different variants METTL3-Δ238, METTL3-Δ198, METTL3b and METTL3a respectively in WT cells, and compared their binding abilities with WTAP or METTL14 among these overexpressed METTL3 variants. We acknowledge that the exogenous METTL3a and METTL3b interact with endogenous full-length METTL3, and the endogenous full-length METTL3 may help them to form the complex with WTAP. But it is also noteworthy that the exogenous expression levels of METTL3a and METTL3b are much higher than that of endogenous full-length METTL3 (see Figure 3A and 3C). In this case, METTL3a or METTL3b predominantly interacts with itself, METTL3, METTL14 or other potential interacting proteins through its C-terminal region, which may greatly dilute the condition for the interaction between WTAP and endogenous full-length METTL3. Moreover, in Figure 4A, the comparison is among overexpressed METTL3 variants; this weak indirect interaction through much lower expression levels of endogenous protein is not comparable to the direct interaction between the overexpressed METTL3 variant and WTAP.

      Reference:

      1. Schöller, E., Weichmann, F., Treiber, T., Ringle, S., Treiber, N., Flatley, A., Feederle, R., Bruckmann, A., and Meister, G. (2018). Interactions, localization, and phosphorylation of the m6A generating METTL3–METTL14–WTAP complex. RNA 24, 499-512.
    1. Author Response

      Reviewer #1 (Public Review):

      The authors present a study of visuo-motor coupling primarily using wide-field calcium imaging to measure activity across the dorsal visual cortex. They used different mouse lines or systemically injected viral vectors to allow imaging of calcium activity from specific cell-types with a particular focus on a mouse-line that expresses GCaMP in layer 5 IT (intratelencephalic) neurons. They examined the question of how the neural response to predictable visual input, as a consequence of self-motion, differed from responses to unpredictable input. They identify layer 5 IT cells as having a different response pattern to other cell-types/layers in that they show differences in their response to closed-loop (i.e. predictable) vs open-loop (i.e. unpredictable) stimulation whereas other cell-types showed similar activity patterns between these two conditions. They analyze the latencies of responses to visuomotor prediction errors obtained by briefly pausing the display while the mouse is running, causing a negative prediction error, or by presenting an unpredicted visual input causing a positive prediction error. They suggest that neural responses related to these prediction errors originate in V1, however, I would caution against over-interpretation of this finding as judging the latency of slow calcium responses in wide-field signals is very challenging and this result was not statistically compared between areas.

      Surprisingly, they find that presentation of a visual grating actually decreases the responses of L5 IT cells in V1. They interpret their results within a predictive coding framework that the last author has previously proposed. The response pattern of the L5 IT cells leads them to propose that these cells may act as 'internal representation' neurons that carry a representation of the brain's model of its environment, though this is rather speculative. They subsequently examine the responses of these cells to anti-psychotic drugs (e.g. clozapine) with the reasoning that a leading theory of schizophrenia is a disturbance of the brain's internal model and/or a failure to correctly predict the sensory consequences of self-movement. They find that anti-psychotic drugs strongly enhance responses of L5 IT cells to locomotion while having little effect on other cell-types. Finally, they suggest that anti-psychotics reduce long-range correlations between (predominantly) L5 cells and reduce the propagation of prediction errors to higher visual areas and suggest this may be a mechanism by which these drugs reduce hallucinations/psychosis.

      This is a large study containing a screening of many mouse-lines/expression profiles using wide-field calcium imaging. Wide-field imaging has its caveats, including a broad point-spread function of the signal and susceptibility to hemodynamic artifacts, which can make interpretation of results difficult. The authors acknowledge these problems and directly address the hemodynamic occlusion problem. It was reassuring to see supplementary 2-photon imaging of soma to complement this data-set, even though this is rather briefly described in the paper.

      We will expand on the discussion of caveats as suggested.

      Overall the paper's strengths are its identification of a very different response profile in the L5 IT cells compared to other layers/cell-types which suggests an important role for these cells in handling integration of self-motion generated sensory predictions with sensory input. The interpretation of the responses to anti-psychotic drugs is more speculative but the result appears robust and provides an interesting basis for further studies of this effect with more specific recording techniques and possibly behavioral measures.

      Reviewer #2 (Public Review):

      Summary:

      This work investigates the effects of various antipsychotic drugs on cortical responses during visuomotor integration. Using wide-field calcium imaging in a virtual reality setup, the researchers compare neuronal responses to self-generated movement during locomotion-congruent (closed loop) or locomotion-incongruent (open loop) visual stimulation. Moreover, they probe responses to unexpected visual events (halt of visual flow, sudden-onset drifting grating). The researchers find that, in contrast to a variety of excitatory and inhibitory cell types, genetically defined layer 5 excitatory neurons distinguish between the closed and the open loop condition and exhibit activity patterns in visual cortex in response to unexpected events, consistent with unsigned prediction error coding. Motivated by the idea that prediction error coding is aberrant in psychosis, the authors then inject the antipsychotic drug clozapine, and observe that this intervention specifically affects closed loop responses of layer 5 excitatory neurons, blunting the distinction between the open and closed loop conditions. Clozapine also leads to a decrease in long-range correlations between L5 activity in different brain regions, and similar effects are observed for two other antipsychotics, aripiprazole and haloperidol, but not for the stimulant amphetamine. The authors suggest that altered prediction error coding in layer 5 excitatory neurons due to reduced long-range correlations in L5 neurons might be a major effect of antipsychotic drugs and speculate that this might serve as a new biomarker for drug development.

      Strengths:

      • Relevant and interesting research question:

      The distinction between expected and unexpected stimuli is blunted in psychosis but the neural mechanisms remain unclear. Therefore, it is critical to understand whether and how antipsychotic drugs used to treat psychosis affect cortical responses to expected and unexpected stimuli. This study provides important insights into this question by identifying a specific cortical cell type and long-range interactions as potential targets. The authors identify layer 5 excitatory neurons as a site where functional effects of antipsychotic drugs manifest. This is particularly interesting as these deep layer neurons have been proposed to play a crucial role in computing the integration of predictions, which is thought to be disrupted in psychosis. This work therefore has the potential to guide future investigations on psychosis and predictive coding towards these layer 5 neurons, and ultimately improve our understanding of the neural basis of psychotic symptoms.

      • Broad investigation of different cell types and cortical regions:

      One of the major strengths of this study is its quasi-systematic approach towards cell types and cortical regions. By analysing a wide range of genetically defined excitatory and inhibitory cell types, the authors were able to identify layer 5 excitatory neurons as exhibiting the strongest responses to unexpected vs. expected stimuli and being the most affected by antipsychotic drugs. Hence, this quasi-systematic approach provides valuable insights into the functional effects of antipsychotic drugs on the brain, and can guide future investigations towards the mechanisms by which these medications affect cortical neurons.

      • Bridging theory with experiments

      Another strength of this study is its theoretical framework, which is grounded in the predictive coding theory. The authors use this theory as a guiding principle to motivate their experimental approach connecting visual responses in different layers with psychosis and antipsychotic drugs. This integration of theory and experimentation is a powerful approach to tie together the various findings the authors present and to contribute to the development of a coherent model of how the brain processes visual information both in health and in disease.

      Weaknesses:

      • Unclear relevance for psychosis research

      From the study, it remains unclear whether the findings might indeed be able to normalise altered predictive coding in psychosis. Psychosis is characterised by a blunted distinction between predicted and unpredicted stimuli. The results of this study indicate that antipsychotic drugs further blunt the distinction between predicted and unpredicted stimuli, which would suggest that antipsychotic drugs would deteriorate rather than ameliorate the predictive coding deficit found in psychosis. However, these findings were based on observations in wild-type mice at baseline. Given that antipsychotics are thought to have little effects in health but potent antipsychotic effects in psychosis, it seems possible that the presented results might be different in a condition modelling a psychotic state, for example after a dopamine-agonistic or a NMDA-antagonistic challenge. Therefore, future work in models of psychotic states is needed to further investigate the translational relevance of these findings.

      We fully agree that it is unclear how the effects of antipsychotics in mice relate to the drug effects that would be observed in schizophrenic patients. It is also correct that the reduction of the difference between closed and open loop locomotion onset response in L5 IT neurons (Figure 4) is not what we would have expected to find under the assumption that psychosis is characterized by a blunted distinction between predicted and unpredicted stimuli. We are not sure how to interpret this finding. However, it is probably important to note that the difference is only reduced when using a normalized comparison. Looking just at the subtraction of the two curves, the difference between closed and open loop locomotion onset responses remains unchanged before and after antipsychotic drug injection. The finding of a decorrelation of layer 5 activity, however, is easier to interpret under the assumption that layer 5 functions as an internal representation. If speech hallucinations, for example, are the consequence of a spurious activation of internal representations in speech processing areas of cortex, then antipsychotics might reduce the probability of these spurious activation events by reducing the lateral influence between layer 5 neurons in different cortical areas.

      We do indeed plan to address the question of how antipsychotics influence cortical processing in mouse models of schizophrenia in the future.

      • Incomplete testing of predictive coding interpretation

      While the investigation of neuronal responses to different visual flow stimuli is interesting, it remains open whether these responses indeed reflect internal representations in the framework of predictive coding. While the responses are consistent with internal representation as defined by the researchers, i.e., unsigned prediction error signals, an alternative interpretation might be that responses simply reflect sensory bottom-up signals that are more related to some low-level stimulus characteristics than to prediction errors.

      This is correct – we will expand on the discussion of this point in the manuscript.

      Moreover, this interpretational uncertainty is compounded by the fact that the experimental paradigms used were not suited to test whether behaviour is impacted as a function of the visual stimulation, which makes it difficult to assess what the internal representation of the animal actually was. For these reasons, the observed effects might reflect simple bottom-up sensory processing alterations and not necessarily have any functional consequences. While this potential alternative explanation does not detract from the value of the study, future work would be needed to explain the effect of antipsychotic drugs on responses to visual flow. For example, experimental designs that systematically vary the predictive strength of coupled events or that include a behavioural readout might be more suited to drawing conclusions about whether antipsychotic drugs indeed alter internal representations.

      We agree that much additional work will be necessary to identify internal representation neurons. However, it is difficult to envision how behavioral output could be used to make inferences about internal representations in sensory areas of cortex. In humans, for example, there is evidence that internal representations in visual cortex and behavioral output are not always directly related: binocular rivalry activates representations of both stimuli shown in visual cortex, while the conscious experience that drives behavioral output is only of one of the two stimuli. Hence, we would assume that the internal representation in visual cortex does not necessarily relate to behavioral output.

      • Methodological constraints of experimental design

      While the study findings provide valuable insights into the potential effects of antipsychotic drugs, it is important to acknowledge that there may be some methodological constraints that could impact the interpretation of the results. More specifically, the experimental design does not include a negative control condition or different doses. These conditions would help to ensure that the observed effects are not due to unspecific effects related to injection-induced stress or time, and not confined to a narrow dose range that might or might not reflect therapeutic doses used in humans. Hence, future work is needed to confirm that the observed effects indeed represent specific drug effects that are relevant to antipsychotic action.

      We agree that both dosages and a broader spectrum of non-antipsychotic compounds will need to be investigated. We are in the process of building a screening pipeline to perform exactly these types of experiments. We would however argue that the paper already includes a control condition in the form of the amphetamine data (Figure 7). While it is possible that amphetamine might have an effect that exactly cancels out potential i.p. injection- or stress-induced changes, we would argue it is more probable that these changes had no measurable effect on Tlx3 positive L5 IT neuron calcium activity per se. We will provide additional evidence that time or injection stress alone do not result in the observed effects.

      Conclusion:

      Overall, the results support the idea that antipsychotic drugs affect neural responses to predicted and unpredicted stimuli in deep layers of cortex. Although some future work is required to establish whether this observation can indeed be explained by a drug-specific effect on predictive coding, the study provides important insights into the neural underpinnings of visual processing and antipsychotic drugs, which is expected to guide future investigations on the predictive coding hypothesis of psychosis. This will be of broad interest to neuroscientists working on predictive coding in health and in disease.

      Reviewer #3 (Public Review):

      The study examines how different cell types in various regions of the mouse dorsal cortex respond to visuomotor integration and how antipsychotic drugs impact these responses. Specifically, in contrast to most cell types, the authors found that activity in Layer 5 intratelencephalic neurons (Tlx3+) and Layer 6 neurons (Ntsr1+) differentiated between open loop and closed loop visuomotor conditions. Focussing on Layer 5 neurons, they found that the activity of these neurons also differentiated between negative and positive prediction errors during visuomotor integration. The authors further demonstrated that the antipsychotic drugs reduced the correlation of Layer 5 neuronal activity across regions of the cortex, and impaired the propagation of visuomotor mismatch responses (specifically, negative prediction errors) across Layer 5 neurons of the cortex, suggesting a decoupling of long-range cortical interactions.

      The data when taken as a whole demonstrate that visuomotor integration in deeper cortical layers is different than in superficial layers and is more susceptible to disruption by antipsychotics. Whilst it is already known that deep layers integrate information differently from superficial layers, this study provides more specific insight into these differences. Moreover, this study provides a first step into understanding the potential mechanism by which antipsychotics may exert their effect.

      Whilst the paper has several strengths, the robustness of its conclusions is limited by its questionable statistical analyses. A summary of the paper's strengths and weaknesses follows.

      Strengths:

      The authors perform an extensive investigation of how different cortical cell types (including Layer 2/3, 4, 5, and 6 excitatory neurons, as well as PV, VIP, and SST inhibitory interneurons) in different cortical areas (including primary and secondary visual areas as well as motor and premotor areas) respond to visuomotor integration. This investigation provides strong support to the idea that deep layer neurons are indeed unique in their computational properties. This large data set will be of considerable interest to neuroscientists interested in cortical processing.

      The authors also provide several lines of evidence that visuomotor information is differentially integrated in deep vs. superficial layers. They show that this is true across experimental paradigms of visuomotor processing (open loop, closed loop, mismatch, drifting grating conditions) and experimental manipulations, with the demonstration that Layer 5 visuomotor integration is more sensitive to disruption by the antipsychotic drug clozapine, compared with cortex as a whole.

      The study further uses multiple drugs (clozapine, aripiprazole and haloperidol) to bolster its conclusion that antipsychotic drugs disrupt correlated cortical activity in Layer 5 neurons, and further demonstrates that this disruption is specific to antipsychotics, as the psychostimulant amphetamine shows no such effect.

      In widefield calcium imaging experiments, the authors effectively control for the impact of hemodynamic occlusions in their results, and try to minimize this impact using a crystal skull preparation, which performs better than traditional glass windows. Moreover, they examine key findings in widefield calcium imaging experiments with two-photon imaging.

      Weaknesses:

      A critical weakness of the paper is its statistical analysis. The study does not use mice as its independent unit for statistical comparisons but rather relies on other definitions, without appropriate justification, which results in an inflation of sample sizes.

      We will expand on both analyses and justifications throughout.

      For example, in Figure 1, independent samples are defined as locomotion onsets, leading to sample sizes of approx. 400-2000 despite only using 6 mice for the experiment. This is only justified if the data from locomotion onsets within a mouse is actually statistically independent, which the authors do not test for, and which seems unlikely. With such inflated sample sizes, it becomes more likely to find spurious differences between groups as significant. It also remains unclear how many locomotion onsets come from each mouse; the results could be dominated by a small subset of mice with the most locomotion onsets. The more disciplined approach to statistical analysis of the dataset is to average the data associated with locomotion onsets within a mouse, and then use the mouse as an independent unit for statistical comparison. A second example is in Figure 2L, where the independent statistical unit is defined as cortical regions instead of mice, with the left and right hemispheres counting as independent samples; again this is not justified. Is the activity of cortical regions within a mouse and across cortical hemispheres really statistically independent? The problem is apparent throughout the manuscript and for each data set collected.

      This may partially be a misunderstanding. Figures 1F-1K indeed use locomotion onsets as a unit, but there were no statistical comparisons. In these Figures we were addressing the question of whether locomotion onsets in closed loop differ from those in open loop. Thus, we quantify variability across locomotion onsets. The question of mouse-to-mouse variability of this analysis is a slightly different one. We did include the same analysis (for visual cortex) with the variability calculated across mice as Figure S2. We will expand this supplementary figure with the equivalent data of Figure 3 to further address this concern.

      For Figure 1L (we assume the reviewer means Figure 1L, not Figure 2L), the unit we used for analysis was cortical area. This was indeed not optimal, and we will replace the statistical testing with a hierarchical bootstrap (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7906290/) to account for nested data.
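
      The general idea of a hierarchical bootstrap for nested data of this kind can be sketched as follows. This is a minimal illustration and not the authors' analysis: the function names, the two-level structure (mice, then locomotion onsets within each mouse), the equal weighting of mice, and the percentile interval are all assumptions made for the example.

```python
import random
from statistics import mean


def hierarchical_bootstrap_means(data_by_mouse, n_boot=1000, seed=0):
    """Bootstrap the grand mean of nested data (hypothetical helper).

    data_by_mouse maps a mouse identifier to a list of per-onset values.
    Resampling happens at both levels: mice first, then onsets within
    each resampled mouse.
    """
    rng = random.Random(seed)
    mice = list(data_by_mouse)
    boot_means = []
    for _ in range(n_boot):
        # Level 1: resample mice with replacement.
        sampled_mice = rng.choices(mice, k=len(mice))
        mouse_means = []
        for mouse in sampled_mice:
            onsets = data_by_mouse[mouse]
            # Level 2: resample onsets within the chosen mouse.
            resampled = rng.choices(onsets, k=len(onsets))
            mouse_means.append(mean(resampled))
        # Each mouse contributes equally, however many onsets it has.
        boot_means.append(mean(mouse_means))
    return boot_means


def percentile_interval(values, lower=2.5, upper=97.5):
    """Simple percentile interval over bootstrap replicates."""
    ordered = sorted(values)
    n = len(ordered)
    lo_idx = int(lower / 100 * (n - 1))
    hi_idx = int(upper / 100 * (n - 1))
    return ordered[lo_idx], ordered[hi_idx]
```

      Averaging within each mouse before averaging across mice keeps a mouse with many locomotion onsets from dominating the estimate, which is the concern raised by the reviewer.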

      An additional statistical issue is that it is unclear if the authors are correcting for the use of multiple statistical tests (as in for example Figure 1L and Figure 2B,D). In general, the use of statistics by the authors is not justified in the text.

      We will update and improve the analysis shown in Figure 1L.

      In Figures 2B and 2D, we think adding family-wise error correction would be slightly misleading. We could add a correction – our conclusions would remain unchanged almost independent of the choice of correction (most of the significant p values are infinitesimally small, see Table S1). However, our interpretation is not focusing on one particular comparison (of many possible comparisons) that is significant - all comparisons between closed and open loop data points were significant in the L5 IT recordings and none of them were significant in the recordings in C57BL/6 mice that expressed GCaMP brain-wide.
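
      For readers unfamiliar with family-wise error correction, the Holm step-down procedure is one common choice. The sketch below is illustrative only: the response does not say which correction would be used, and the function name is hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p values, in the input order."""
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

      When the uncorrected p values are very small, as the authors state for the L5 IT recordings, multiplying them by the number of comparisons leaves them below any conventional threshold, which is consistent with the statement that the conclusions would not change.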

      Finally, it is important to note that whilst the study demonstrates that antipsychotics may selectively impact visuomotor integration in L5 neurons, it does not show that this effect is necessary or sufficient for the action of antipsychotics; though this is likely beyond the scope of the study it is something for readers to keep in mind.

      We fully agree, it is still unclear how the effects we observe in our work relate to the treatment relevant effects in patients. We will expand on this point in the discussion.

    1. Author Response:

      Reviewer #1 (Public Review):

      […] The manuscript contains a large amount of data that make a major inroad on a new type of link between telomere replication and regulation of the telomerase. Nevertheless, the detailed choreography of the events as well as the role of PCNA-SUMO remain elusive and the data do not fully explain the role of the Stn1/Elg1 interaction. The data presented do not sufficiently support the claim that SUMO-PCNA is a positive signal for telomerase activation.

      We thank the reviewer for her/his review efforts and opinion. We will resubmit a new version of the manuscript in which we will clarify some of the criticisms presented.

      Reviewer #2 (Public Review):

      […] The conclusions are largely supported by experiments examining protein-protein interactions at low resolution and ambiguous regarding directness of interactions like co-IP and yeast two-hybrid (Y2H) combined with genetics. However, some results appear contradictory and there's a lack of rigor in the experimental data needed to support claims. There is significant room for improvement and this work could certainly attain the quality needed to support the claims. The current version needs substantial revision and lacks the necessary experimental detail. Stronger support for the claims would add detail to help distinguish competing models.

      We thank the reviewer for her/his positive opinion. We will resubmit a new version of the manuscript in which we will clarify some of the criticisms presented by the referees, and add all the missing experimental details.

      Reviewer #3 (Public Review):

      This paper reveals interesting physical connections between Elg1 and CST proteins that suggest a model where Elg1-mediated PCNA unloading is linked to regulation of telomere length extension via Stn1, Cdc13, and presumably Ten1 proteins. Some of these interactions appear to be modulated by sumoylation and connected with Elg1's PCNA unloading activity. The strength of the paper is in the observations of new interactions between CST, Elg1, and PCNA. These interactions should be of interest to a broad audience interested in telomeres and DNA replication.

      We thank the reviewer for her/his positive opinion. We will resubmit a new version of the manuscript in which we will clarify some of the criticisms presented.

      What is not well demonstrated from the paper is the functional significance of the interactions described. The model presented by the authors is one interpretation of the data shown, and proposes that the role of sumoylation is to temporally regulate the Elg1, PCNA and CST interactions at telomeres. This model makes some assumptions that are not demonstrated by this work (such as Stn1 sumoylation, as noted) and are left for future testing. Alternative models that envision sumoylation as a key in promoting spatial localization could also be proposed based on the data here (as mentioned in the discussion), in addition to or instead of a role for sumoylation in enforcing a series of switches governing a tightly sequenced series of interactions and events at telomeres. Critically, the telomere length data from the paper indicates that the proposed model depicts interactions that are not necessary for telomerase activation or inhibition, as telomeres in pol30-RR strains are normal length and telomeres in elg1∆ strains are not nearly as elongated as in stn1 strains. One possibility mentioned in the paper is that the PCNAS and Elg1 interactions are contributing to the negative regulation of telomerase under certain conditions that are not defined in this work. Could it also be possible that the role of these interactions is not primarily directed toward modulating telomerase activity? It will be of interest to learn more about how these interactions and regulation by Sumo function intersect with regulation of telomere extension.

      We present compelling evidence for a role of SUMOylated PCNA in telomere length regulation. Figure 1 shows that this modification is both necessary and sufficient to elongate the telomeres, indicating that PCNA SUMOylation plays a positive role in telomere elongation. The model we present is consistent with all our results. There are, of course, possible alternative models, but they usually fail to explain some of the results. We agree that the fact that pol30-RR presents normal-sized telomeres implies that SUMO-PCNA is not required for telomerase to solve the "end replication problem", but rather is needed for "sustained" activity of telomerase. Since telomere elongation (through absence of Elg1 or over-expression of SUMO-PCNA) was the phenotype monitored, it may require sustained telomerase activity. Similar results were seen in the past for Rnr1 (Maicher et al., 2017), and this mode depends on Mec1, rather than Tel1 (Harari and Kupiec, 2018). Telomere length regulation is complex, and we may not yet understand the whole picture. It appears that solving the "end replication problem" under normal conditions may require very little telomerase activity, and spontaneous interactions at a low level may suffice. Future work may find the conditions under which telomerase switches from "end replication problem" to "sustained" activity. We will add further explanations on this subject to the Discussion section.

      We suspect, but could not prove, a role for Stn1 SUMOylation in the interactions. SUMOylation is usually transient, and notoriously hard to detect, and despite the fact that many telomeric proteins are SUMOylated, Stn1 SUMOylation could not be shown directly by us and others (Hang et al, 2011).

    1. Author Response

      eLife assessment

      This study provides valuable information on the biogenesis of eccDNAs during spermatogenesis, i.e., eccDNAs in spermatogenic cells are not derived from meiotic recombination hotspots but represent oligonucleosomal DNA fragments from apoptotic male germ cells, whose ends are ligated through microhomology-mediated end-joining. The study is currently incomplete because the method of bioinformatics needs more details and data interpretation should take the amplification bias into consideration.

      We highly appreciate the positive assessment.

      The negative assessment of our bioinformatics method is probably based on Reviewer #2’s comments. While Reviewer #1 considered that “Results from sequencing data analysis were presented elegantly”, Reviewer #2 overlooked some details and raised several critiques regarding our bioinformatics method. We respectfully disagree with many of his or her critiques: (I) Reviewer #2 considered that our method was not fully described. However, we have illustrated the principle and steps of our eccDNA detection method in Figure 4C and Figure 4-figure supplement 2, and submitted our source code to GitHub. (II) Reviewer #2 had concerns about the reliability of our method. However, we have shown that it has sensitivity and specificity comparable to those of established bioinformatics tools (Figure 4—figure supplement 2C), and even higher accuracy in the assignment of eccDNA boundaries (Figure 4—figure supplement 2A). (III) Reviewer #2 also believed that “the similarity between the eccDNA profiles of human and mouse sperm remains uncertain”. However, we believe that our Fig. 5 has clearly shown that human sperm eccDNAs have exactly the same characteristics as mouse sperm eccDNAs. Nevertheless, in the revised manuscript, we will add more description to help readers better understand our method, and perform additional analyses to further back up our claims.

      The amplification bias is indeed a problem of Circle-seq. Following the editors’ and Reviewer #1’s insightful suggestions, we will analyze other datasets generated with or without rolling circle amplification to see how our findings are affected. Additionally, we will consider adding a section to remind readers of the limitations of rolling-circle amplification-based Circle-seq and of our data interpretation.

      Reviewer #1 (Public Review):

      This study aims to address the mechanism of eccDNA generation during spermatogenesis in mice. Previous efforts for cataloging eccDNA in mammalian germ cells have provided inconclusive results, particularly in the correlation between meiotic recombination and the generation of eccDNA. The authors employed an established approach (Circle-seq) to enrich and amplify eccDNA for sequencing analyses and reported that sperm eccDNA is not associated with meiotic recombination hotspots. Rather, the authors reported that eccDNAs are widespread, and that oligonucleosomal DNA fragments from sperm undergoing apoptosis, with the ligation of DNA ends by microhomology-mediated end-joining, would be a major source of eccDNA.

      The strength of the study includes evaluating the eccDNA contents not only in sperm but also from earlier stages of cells in spermatogenesis. The differences in eccDNA size peaks between sperm and other progenitors, in particular, the unique peak in sperm around 360 bp, are intriguing. Results from sequencing data analysis were presented elegantly.

      We are grateful to Reviewer #1 for his or her recognition of the strength of this study.

      I also have critiques. First, the lack of an eccDNA quality control step is a concern. Previous studies employed electron microscopy to ensure that DNA species are mostly circular before rolling-circle amplification. Phi29 polymerase is widely used for DNA amplification, including whole genome amplification of linear chromosomal DNA. Phi29 polymerase has a high processivity and strand displacement activity. When those activities occur within a molecule, it creates circular DNA from linear DNA in vitro. In vitro-created eccDNA from linear DNA would be randomly distributed in the genome, which may explain the low incidence of common eccDNA between replicates. Therefore, it will be crucial to show that DNA prior to amplification is dominantly circular. Electron microscopy would be challenging for the study because a relatively small number of cells was processed to enrich eccDNA. An alternative method for quality controls includes spiking samples with linear and circular exogenous DNA and measuring the ratios of circular/linear control DNA before and after column purification/exonuclease digestion. eccDNA isolation procedures can be validated by a very high circular/linear control DNA ratio.

      We highly appreciate Reviewer #1’s insightful suggestions. We would like to perform eccDNA quality control by introducing circular exogenous DNA into our samples and measuring its ratio to endogenous linear DNA before and after eccDNA isolation procedures.
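
      The arithmetic behind such a spike-in control is simple and can be sketched as below. This is an illustration under stated assumptions, not a described protocol: the function names are hypothetical, and the inputs are assumed to be copy-number estimates (for example from qPCR or read counts) for the circular control and for linear DNA, measured before and after the isolation procedure.

```python
def circular_to_linear_ratio(circular, linear):
    """Ratio of circular control DNA to linear DNA in one sample."""
    if linear <= 0:
        raise ValueError("linear DNA amount must be positive")
    return circular / linear


def fold_enrichment(before, after):
    """Fold change of the circular/linear ratio across purification.

    before and after are (circular, linear) pairs of copy-number
    estimates measured in the same units.
    """
    ratio_before = circular_to_linear_ratio(*before)
    ratio_after = circular_to_linear_ratio(*after)
    if ratio_before <= 0:
        raise ValueError("circular control must be detectable before purification")
    return ratio_after / ratio_before
```

      A fold enrichment far above 1 would indicate that linear DNA was depleted much more strongly than the circular control, which is the outcome the reviewer describes as validating the isolation procedure.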

      Another critique is regarding the limitation of the study. It is important to remind the readers of the limitations of the study. As the authors mentioned, rolling circle amplification preferentially increases the copy numbers of smaller eccDNA. Therefore, the native composition of eccDNA is skewed. In addition, the candidate eccDNAs are identified by split reads or discordant read pairs. The details of the mapping process are unclear from the methods, but such a method would require reads with high mapping quality; the identification of eccDNA is expected to require sequencing reads that are mapped to genomic locations uniquely with high confidence, and reads mapped to more than one genomic location, such as highly similar repeat sequences or duplications, are eliminated. Such identification criteria would favor eccDNA formed by little or no homology at the junction sequences, and eliminate eccDNA formed by long homologies at the ends, such as eccDNA formed exclusively by satellite DNA. Therefore, it is not surprising that the authors found the dominance of microhomology-mediated eccDNA. It remains to be determined whether small eccDNA with microhomologies are the dominant species of eccDNA in the native composition. In this regard, it is noted that similar procedures of eccDNA enrichment (column purification, exonuclease digestion, and rolling circle amplification) revealed variable sizes and characteristics of eccDNA in sperm (human from Henriksen et al. or mice from this study), dependent on the methods of sequencing (long-read or short-read sequencing). Considering these limitations, the last sentence of the introduction, "We conclude that germline eccDNAs are formed largely by microhomology mediated ligation of nucleosome protected fragments, and barely contribute to de novo genomic deletions at meiotic recombination hotspots" needs to be revised.

      We thank Reviewer #1 for pointing out limitations of the study. We will take into account and integrate the perspectives of Reviewer #1 in our revised manuscript. We will also try to analyze eccDNA datasets generated by long-read sequencing to see how our conclusions might be affected. However, we envision that it might be challenging to examine the contribution of microhomology-mediated ligation to eccDNA biogenesis using long-read sequencing data as the sequencing error rate of nanopore long-read sequencing data is very high.

      Small eccDNA (microDNA) data from various mouse tissues are available from the study by Dillon et al. (Cell Reports 2015). Authors are encouraged to examine whether the notable findings in this study (oligonucleosomal-sized eccDNA peaks and the association with apoptotic cell death) are unique to sperm or common in the eccDNA from other tissues.

      We are thankful to Reviewer #1 for this suggestion. We would like to analyze additional eccDNA sequencing datasets to see whether our findings are unique to sperm or common for other tissues.

      Reviewer #2 (Public Review):

      This study presents a useful investigation of eccDNAs in mouse spermatogenesis. It provides evidence about the biogenesis of eccDNAs and suggests that eccDNAs are derived from oligonucleosomal DNA fragmentation during apoptosis by MMEJ and may not be the direct products of germline deletions. However, the methods of data analysis were not fully described and the data analysis is incomplete. The study provides additional observations about eccDNA biogenesis and can be used as a starting point for functional studies of eccDNA in sperm. However, many aspects of the data analyses and data interpretations need to be improved.

      We thank Reviewer #2 for his or her critical reading. However, we respectfully disagree with some critiques of our data analyses (see below). Nevertheless, we will provide more method details in addition to Fig. 4C and Figure 4-figure supplement 2, which illustrate the principle and steps of our method as well as its performance in comparison with established methods. We will also perform additional analyses and make some clarifications in the revised manuscript (see below).

      • Most of the conclusions made by the work are based only on the bioinformatics analyses; validation of these findings using other methods (biochemistry/molecular biology methods) is missing. For example, no QC results are presented for the eccDNA purification, which might show whether contaminants such as linear DNA or mitochondrial DNA have been fully removed. Additionally, it would also be helpful to use simple PCR to test the existence of identified eccDNAs in sperm or other samples to validate the specificity of the Circle-seq method.

      Following both this Reviewer’s and Reviewer #1’s suggestions, we will introduce circular exogenous DNA into our samples and measure its ratio to endogenous linear DNA and to mitochondrial DNA before and after eccDNA isolation procedures. We will also try to perform PCR to test the existence of identified eccDNAs.

      • The reliability of the data analysis methods is uncertain, as the authors constructed and utilized their own pipeline to identify eccDNAs, despite the availability of established bioinformatics tools such as ECCsplorer, eccFinder, and Amplicon Architect. Moreover, the lack of validation of the pipeline using either ground truth datasets or simulation data raises concerns about its accuracy. Additionally, the methodology employed for identifying eccDNA that encompasses multiple gene loci remains unclear.

      In fact, we have compared the performance of our method with that of established methods for identification of eccDNA regions, such as Circle_finder, Circle_Map and ecc_finder. Our method has sensitivity and specificity comparable to those of existing methods, especially Circle_finder and Circle_Map (Figure 4—figure supplement 2C). We also used one specific genomic region to show that existing methods identified the same eccDNA regions but misassigned the eccDNA boundaries (Figure 4—figure supplement 2A). These results have been shown in Figure 4—figure supplement 2. We will highlight this information to make it clearer in our revised manuscript. We will further detect eccDNAs by ECCsplorer for comparison. Since Amplicon Architect is more specifically designed for detection of ecDNAs, it will not be included in our comparison. We will also try to perform PCR to validate the identified eccDNAs.

      As pointed out by Reviewer #2, similar to ECCsplorer, Circle_finder, Circle_Map and ecc_finder, our method fails to identify ecDNAs that encompass multiple gene loci. We will remind readers of this limitation in our revised manuscript.

      • Although the author stated that previous studies utilizing short-read sequencing technologies may have incorrectly annotated eccDNA breakpoints, this claim requires careful scrutiny and supporting evidence, which was not provided in the manuscript.

      As mentioned above, we used one specific genomic region to show that existing methods all misassigned the eccDNA boundaries (Figure 4—figure supplement 2A). In the revised manuscript, we will provide the necessary statistics to support this claim.

      • The similarity between the eccDNA profiles of human and mouse sperm remains uncertain, and therefore, analyses of human eccDNA data and comparisons between the two are necessary if the authors claim that their findings of widespread eccDNA formation in mouse spermatogenesis extend to human sperm.

      We believe that our Fig. 5 has clearly shown that human sperm eccDNAs also originate from oligonucleosomal fragmentation (Fig. 5A-C), and are not associated with meiotic recombination hotspots (Fig. 5D and E) but are formed by microhomology-directed ligation (Fig. 5F and G). These findings are consistent with what we observed in mouse sperm eccDNAs. Nevertheless, we will analyze additional public datasets to further back up our claim in the revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This is an interesting manuscript that proposes a new approach for accounting for viral diversity within hosts in phylogenetic analyses of pathogens. Concretely, the authors consider sites for which a minor allele exists as an additional base in the substitution model. For example, if at a particular site 60% of reads have a C and 40% have a G, then this site is assigned Cg, as opposed to the C that is typical when analysing consensus sequences. Because we typically model sequence evolution as a Markovian process, as is the case here, the data naturally become more informative, given that there are more states in the Markov chain when adding these bases. As a result, phylogenetic trees estimated using these data are better resolved than those from consensus sequences. The branches of the trees are probably also longer, which is why temporal signal becomes more apparent.

      I commend the authors on their rigorous simulation study and careful empirical data analyses. However, I strongly suggest they consider whether treating minor alleles as an additional base is biologically realistic and whether this may have implications for other analyses, particularly when there is very high within-host diversity and the number of states becomes very large.

      We thank the reviewer for the helpful and thorough review. We have included a paragraph in the Discussion regarding the biological interpretation of the 16-state model (Line 344-351), as well as the consequences when there’s high within-host diversity (Line 398).
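
      The encoding described by Reviewer #1 (a site with 60% C and 40% G reads becomes Cg) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `encode_site`, the `min_freq` parameter, and the tie-breaking rule are assumptions made here, and the 2% default only mirrors the variant frequency threshold mentioned elsewhere in this response.

```python
from collections import Counter

BASES = "ACGT"

def encode_site(read_bases, min_freq=0.02):
    """Return a 16-state character for one alignment column (hypothetical helper).

    The major allele is written in upper case. If the second most common
    base reaches min_freq, it is appended in lower case (e.g. 'Cg').
    Otherwise the plain consensus base is returned (e.g. 'C').
    """
    counts = Counter(b for b in read_bases.upper() if b in BASES)
    total = sum(counts.values())
    if total == 0:
        return "N"  # no usable reads at this site
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    major = ranked[0][0]
    if len(ranked) > 1 and ranked[1][1] / total >= min_freq:
        return major + ranked[1][0].lower()
    return major

# The state space: 4 pure bases plus 12 major/minor combinations.
STATES = [a + (b.lower() if b != a else "") for a in BASES for b in BASES]

if __name__ == "__main__":
    print(encode_site("C" * 60 + "G" * 40))
    print(len(STATES))
```

      Only the two most frequent bases are kept in this sketch, which is what bounds the alphabet at 16 states. Sites with three or more alleles above the threshold would need a separate rule.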

      Reviewer #2 (Public Review):

      I agree that minor genetic variation could potentially be used to more accurately infer who-infected-whom in an outbreak scenario. Indeed, the use of minor genetic variation has proven very useful in reconstructing transmission chains for chronic infections such as HIV (e.g., see applications using Phyloscanner). To me, it seems that considering the full spectrum of viral genetic diversity within infected hosts would necessarily do the same if not better than considering only consensus-level viral sequence data. This is because there is necessarily a loss of data and potentially a loss of information when going from considering the genetic composition of viral populations within a host to only considering the consensus sequences of those viral populations. As such, Ortiz et al.'s hypothesis stated on lines 66-70 is a reasonable one, and I was looking forward to seeing this hypothesis evaluated in detail in this manuscript.

      R2.1 There are several parts of this manuscript I really like. In particular, encoding within-sample diversity as character states and using that alternative representation of sequence data for phylogenetic inference (as shown in Figure 3) is a very interesting idea, I think. There are some limitations that are not explicitly mentioned, however. For example, when using this 16-character state representation for phylogenetic inference, they assume independence between nucleotide sites. This is a major assumption that can be violated when considering longitudinal intrahost data and transmission dynamics in an outbreak setting, given genetic linkage between sites.

      We have generated another set of simulations where the starting tree was a coalescent tree rather than a random phylogeny. This is described in the Results section, Line 228, and Figure 4—figure supplement 2. By using a coalescent tree, we increase the genetic linkage between sites. For all metrics used, the 16-state model performed better than the consensus sequence model. It is also important to note, as the reviewer points out, that longitudinal isolates should be removed from transmission inference, as we do in Figure 7 and Figure 7—figure supplement 2. This point is now reflected in the Results (Line 286) and Methods (Line 534).

      I have several major concerns about the work as it stands, particularly in the context of the SARS-CoV-2 application.

      Concerns not related to the SARS-CoV-2 application:

      R2.2 Figure 4 shows that a model using within-sample diversity can more accurately reconstruct evolutionary histories than a model that uses only consensus-level genetic data. This is really interesting. The Materials and Methods section (particularly lines 351-354) indicates that the sequence data were generated using certain specified substitution rates. The rates specified seem to be chosen in such a way to facilitate finding an improvement when using within-sample diversity. I don't know whether the relative rates of these 'substitutions' at all mirror "real-life". It would be very useful to have a broader set of analyses here to examine the effect of these 'substitution' rates on the utility of incorporating within-sample diversity into phylogenetic inference. (Also, 1, 100, 200 (line 353) inconsistent with 1, 20, 200 in Supp Table 3)

      We have now corrected Supp Table 3 to reflect the rates described in the Methods section.

      We defined our model with three rates: rate of minor variant acquisition, rate of minor-major variant switch, and rate of minor variant loss. We chose the rates for the simulations (1, 100, 200) to reflect a low rate of minor variant acquisition (1) and high rates of minor-major variant switch (200) and minor variant loss (100). With these rates, pure bases (A, C, G and T) are 100 times more likely to be present than low frequency variants, as seen in the base frequencies in Supp Table 1 and 3, which in turn minimizes the effect of including minor variation. We chose these rates to reflect the high turnover of minor variation often observed in real data and the frequencies of minor alleles in the SARS-CoV-2 dataset, but we agree with the reviewer that this may not always be the case. We also agree that changing the simulation parameters changes the effect of including low frequency variation in the model. As such, we have now included simulations using different sets of rates (Figure 4—figure supplement ):

      1) With a high rate of variant switch and loss compared to acquisition (1, 10, 100), reducing the frequency of minor variation.

      2) With a lower rate of switch and loss (1, 10, 10), promoting a stable landscape of low frequency variation.

      3) With no low frequency variation (Jukes-Cantor model).

      R2.3 Figure 5 is very interesting, particularly the results at bottleneck sizes of 1-10. What are the 'substitution' rates that are inferred here from using this simulated dataset? The Material and Methods section also does not mention the within-host viral generation time anywhere, as far as I can see (~line 384 states the mutation rate per base per generation cycle but not the length of the generation cycle anywhere).

      Fastsimcoal2 is a coalescent simulator of population histories over several generations, given a population size and a mutation rate. For our purposes, transmissions are simulated as bottlenecks of constant size, and a generation is represented by each time step in the outbreak simulation, which corresponds to 1 day. This is further clarified in the Methods section (Line 475).

      Concerns related to the SARS-CoV-2 application:

      R2.4 I am very concerned about the testing of this hypothesis on the SARS-CoV-2 data presented. First, 1% is a very low variant calling threshold. Second, analysis of the 17 samples that were resequenced (out of 454) indicated that on average, 39% of iSNVS (intrahost single nucleotide variants) called between duplicate runs were only observed in one of the two runs (line 117). Their analysis in Figure 1 indicates that these discrepant (and seemingly spurious) variants occur at higher levels in high Ct samples (which makes sense; Figure 1b). They therefore decide to limit their analyses to samples with Ct values <= 30. This results in 249 samples. However, if we look at Figure 1b, only ~10% of iSNVs called across duplicate runs with Ct = 30 are shared! That means that 90% of iSNVs in the set appear to be spurious. If we assume that each duplicate run of a sample has approximately the same number of spurious iSNVs, then approximately 82% of iSNVs called in a sample with a Ct of 30 would be spurious. This fraction decreases with samples that have lower Ct values, but even at a Ct of 27, only ~60% of iSNVs called across duplicate runs are shared. All the downstream SARS-CoV-2 analyses based on within-host sample diversity therefore are based on samples where the large majority of considered sample diversity is not real. This leads to me necessarily discounting all of those downstream SARS-CoV-2 results.
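
      The arithmetic behind the reviewer's 82% figure can be made explicit. The sketch below assumes, as the reviewer does, that both duplicate runs carry the same number of run-specific iSNVs, and additionally that every shared iSNV is real and every unshared one is spurious. With S shared calls and U unshared calls per run, the shared fraction across runs is f = S / (S + 2U), and the spurious fraction within one run is U / (S + U) = (1 - f) / (1 + f). The function name is hypothetical.

```python
def per_sample_spurious_fraction(shared_fraction):
    """Fraction of iSNVs in a single run that are unshared (assumed spurious).

    shared_fraction is the proportion of all iSNVs called across two
    duplicate runs that appear in both runs. The result assumes both runs
    carry the same number of run-specific calls.
    """
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    return (1 - shared_fraction) / (1 + shared_fraction)

if __name__ == "__main__":
    for f in (0.10, 0.60):
        print(f, round(per_sample_spurious_fraction(f), 2))
```

      Under these assumptions, a shared fraction of about 10% (Ct = 30) corresponds to roughly 82% run-specific calls per sample, and a shared fraction of about 60% (Ct = 27) corresponds to roughly 25%.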

      We agree with the reviewer that, as the results show, datasets that incorporate within-sample low frequency variation are expected to have considerably more noise than using exclusively consensus sequences, and perhaps this wasn’t properly discussed in the manuscript. We have incorporated some notes about this in the Discussion section (Line 408-413).

      The 1% variant frequency threshold was used to generate the analysis of Fig. 1 and Supp. Fig. 1-4. Looking at these results, we decided to establish the Ct cut-off of 30 as mentioned by the reviewer, as well as a variant frequency threshold of 2% (as shown in the x-axis of Fig. 2). We had omitted this second variant frequency threshold from the manuscript; it has now been added. As shown in Supp. Fig 4, this variant frequency threshold will increase the concordance between technical replicates, although some level of noise persists.

      R2.5 Lines 153-167: I can't figure out how to square the quantitative results given in this paragraph with what is shown in Figure 2. To me, Figure 2 shows only that Technical Replicates have higher probabilities of sharing a variant than with 'No' relationship. What would also be helpful here so that the reader can get a better feel for the data would be to see the iSNV frequencies plotted over time for the longitudinal replicate samples in the supplement and, for the 'epidemiological' samples to show 'TV plots' in the supplement (as in Fig 3c in McCrone et al. eLife)

      Figure 2 shows that technical replicates, longitudinal replicates, epidemiological samples and, in some instances, samples from the same department have a higher probability of sharing low frequency variants than those with no relationship (also shown in Supp Figure 5). However, Figure 2 also shows that the 95% CI is very wide, and therefore in many instances low frequency variants won’t be shared between epidemiological samples or samples from the same department.

      We have also added Figure 2—figure supplement showing the low frequency variants plotted over time for longitudinal replicates. Unlike McCrone et al, we don’t have proven transmission between pairs of samples, although we believe our analysis also shows a pattern of shared low frequency variants among potential epidemiological links.

      R2.6 Figure 6 and associated text: (a) root-to-tip distance: what units is this distance in? (b) That the authors find a temporal signal in these transmission clusters (where all consensus sequences within a cluster are the same) is interesting but also a bit baffling to me. Given the inference of very small transmission bottlenecks in previous studies (e.g., Martin & Koelle - reanalysis of Popa et al.; Lythgoe et al.; Braun et al.), I don't understand where the temporal signal comes in. Do the samples become more genetically diverse over the outbreak (this seems to be indicated in lines 260-262 but never shown and unlikely given bottleneck sizes)? Additional analyses to help the reader understand WHY within-sample diversity allows for the identification of temporal signal is important. This could involve plotting genetic diversity of the samples by collection date or some other, similar analyses.

      a) The units of the y-axis (root-to-tip distance) are measured in substitutions per genome. This is now reflected in the legend of the figure.

      b) As shown in Figure 5, even at small bottleneck sizes we are able to pick up some of the diversity that evolves during the course of an outbreak. As hinted by the reviewer, the smaller the bottleneck the less diversity we can leverage for phylogenetic inference, and in fact for some epidemiological samples all the diversity will be lost during transmission, which is why many of the within-sample variants are not shared between the epidemiologically related samples. Figure 6 is indeed showing that the genetic distance (measured as number of substitutions per genome) increases with collection date. We have also added a Figure 6—figure supplement showing the increase in low frequency variants within outbreaks as the outbreaks progress in time (explained in Line 261 of the Results section), which explains in part the increasing temporal signal in clusters.
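
      Temporal signal of the kind discussed for Figure 6 is commonly assessed by regressing root-to-tip distance on collection date. The following is a generic least-squares sketch, not the authors' pipeline. The function name is an assumption, and the numbers in the usage lines are invented purely to show the call.

```python
def root_to_tip_regression(days, distances):
    """Ordinary least-squares fit of root-to-tip distance against time.

    days: collection dates expressed as days since the first sample.
    distances: root-to-tip distances (e.g. substitutions per genome).
    Returns (slope, intercept, r_squared).
    """
    n = len(days)
    if n != len(distances) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(days) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, distances))
    if sxx == 0:
        raise ValueError("all samples share the same collection date")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With no variation in distance there is no signal to explain.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared

if __name__ == "__main__":
    # Invented example values, for illustration only.
    slope, intercept, r2 = root_to_tip_regression([0, 3, 7, 10], [0.0, 1.0, 2.0, 4.0])
    print(slope, intercept, r2)
```

      When all consensus sequences in a cluster are identical, every distance is the same and the sketch returns an R² of 0. Temporal signal can only appear once within-sample variants add distance between tips.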

      R2.7 Paragraph consisting of lines 229-238 and Figure 7: This analysis stops abruptly. What are the conclusions here? Figure 7a (right) seems inconsistent to me with Figure 7b and 7C results. Also, the main hypothesis put forward in this paper is that within-sample sequence data can better resolve who-infected-whom in an outbreak setting. Figure 7b and 7c however are never compared against analogous panels that use just consensus sequences. (Even though the consensus sequences are the same, according to Figure 7a, the inferences shown in Figures 7b and 7c could use additional data such as collection times, etc. that would provide information even when using exclusively consensus-level data). Also, do the analyses in Figures 7b and 7c use the 16-character state model at all? I think Supp Figure 9 is relevant here but not sure how?)

      We have extended this section of the Results to make it more coherent and clear (Line 284-293), and expanded on it in the Discussion (Line 385-395). As added to the Discussion, we agree with the reviewer that even with identical sequences some inferences about transmission can be made with epidemiological data, especially collection dates. However, such data can’t be used to infer the genetic structure of the cluster, which complicates any analysis that takes a phylogeny as input.

      Additional concerns:

      R2.8 Some of the stated conclusions, particularly in the Discussion section and in the Abstract, do not seem to be supported by the presented results. For example, line 27: 'within-sample diversity is stable among repeated serial samples from the same host': Figure 2 does not show this conclusively. Line 28: 'within-sample diversity... is transmitted between those cases with known epidemiological links': Figure 2 also does not show this conclusively. Line 29: 'within-sample diversity... improves phylogenetic inference and our understanding of who infected whom': Figure 7b/c results using within-sample diversity is never compared against results that use only consensus, so improvement not demonstrated. Line 272-273: 'samples with shorter distance in the consensus phylogeny were more likely to share low frequency variants'. Line 287: 'We demonstrated that phylogenies... were heavily biased'.

      Line 27 and Line 28: We agree with the reviewer that the genomic analysis of SARS-CoV-2 sequences shows only partial congruence within technical replicates and epidemiological links. We have addressed this in the Abstract.

      Line 29 and Fig 7: Transmission inference using the consensus sequences in Figure 7b/c couldn’t be performed because the lack of any genetic difference between the consensus sequences meant that all sequences had the same transmission likelihood. This is now better explained in the Discussion section, lines 385-395.

      Line 272-273: We have removed this section as we did not perform this analysis, as pointed out by the reviewer.

      Line 287: The conclusion expressed in line 287 (now line 340) has been changed.

      R2.9 The manuscript at times does not cite previous work that is highly relevant and thus overstates the novelty of the current work. For example: lines 21-23: '..conventional whole-genome sequencing phylogenetic approaches to reconstruct outbreaks exclusively use consensus sequences...' Phyloscanner uses within-sample diversity, for example, as does SCOTTI. These are finally cited in the discussion section (~line 310), but because this previous work is not acknowledged earlier in the manuscript, the novelty of the work presented here is somewhat overstated.

      We have included background information in the Introduction regarding the use of within-sample diversity for transmission inference (Line 69-73), and we now emphasize that the novelty of our work lies more in the use of within-sample diversity in phylogenetic inference than exclusively in transmission inference (Line 74, and other instances throughout the manuscript).

      In sum, I think that the 16 character-state model is a very interesting model. More analyses on simulated data would be helpful to expand on when below-the-consensus level genetic data would truly be informative of phylogenetic relationships and who-infected-whom in outbreak settings. The SARS-CoV-2 analyses are very worrisome to me, given the inclusion of samples where the majority of considered within-sample genetic diversity is very likely not real. Some of the stated conclusions appear to either be at odds with the results presented or not directly evaluated.

    1. Author Response

      Reviewer #2 (Public Review):

      Targeted genetic engineering with programmable nucleases and other targetable enzymes (aka "genome editing") has emerged as a technology with curative potential in hemoglobinopathies, sickle cell disease, and beta-thalassemia. Multiple ongoing clinical trials are evaluating such editing using distinct approaches: elevation of fetal hemoglobin (HbF), direct repair of the mutation causing SCD, and engineering of a Hb variant. The present work explores a different strategy: the targeted engineering of the promoter of a paralog of adult beta-globin known as HBD. This is a timely effort because there has emerged, over the past decade, a clear and charted path for advancing any such approach to human clinical trials. The study identifies three transcription factor binding sites as divergent in the HBD promoter vs the HBB one. A homology-directed repair (HDR)-based scheme using oligonucleotide repair templates in combination with a CRISPR-Cas9-induced double-strand break (DSB) is designed and used to generate pools of human immortalized cells bearing one, two, or all three such de novo introduced TF binding sites at the HBD promoter. Only the latter scheme is shown to measurably increase HBD (following erythroid differentiation) in pools of cells and single-cell-derived clones as gauged by qPCR and HPLC. A similar analysis is performed on pools of erythroid-like cells generated from genome-edited human hematopoietic stem and progenitor cells (HSPCs), as well as genetically clonal erythroid colonies bearing the edits of interest; trends in these data support the observations made on the immortalized cells. Overall the data support the notion that HBD promoter genome editing has the potential as a strategy to normalize hemoglobin synthesis in hemoglobinopathies. Further, the data support an advance of this approach down a well-established path of preclinical development in such cases: increasing the efficiency of genome editing in HSPCs to what would be deemed therapeutically useful, assessing the genotoxic burden from the editing, evaluating the potential negative impact on stemness, and determining whether this approach would normalize hemoglobin synthesis in the erythroid progeny of patient HSPCs.

      We thank reviewer 2 for their input on our manuscript, especially sharing their insight from a clinical path perspective.

      The genome editing scheme for the "KDT" strategy in Fig 1B involves the introduction of three binding sites for transcription factors at progressively increasing distances from the site of the DSB induced by Cas9. It would be of interest to determine from the next-generation-sequencing data whether partial gene conversion tracks are observed at the edited locus (Elliott and Jasin MCB 18: 93), and if yes, whether these affect in some way the pool-level measurement by qPCR on HBD mRNA levels (Fig 1D).

      For the analysis of our NGS reads, we utilized the CRISPResso2 analysis pipeline. CRISPResso2 aligns the reads and makes allelic calls of unmodified, NHEJ, or HDR. It is important to note that the KDT of our HBD knock-in construct is not identical to the HBB promoter. Through simple searching of the CRISPResso2-aligned reads, we did not find any HBB promoter sequence present. In this regard, our CRISPResso2 analysis does not seem to find any gene conversions between HBB and HBD. However, we cannot rule out the possibility of gene conversions altogether: because our NGS primers anneal specifically to HBD, we may be unable to amplify, and therefore unable to see, the alleles in which these gene conversion events occurred.

      The data in Fig 2A show an analysis of transcription factor and RNA pol II occupancy following genome editing at HBD. The figure legend refers to these data as having been obtained on single-cell-derived clones bearing the edits in homozygous or heterozygous form, but it is unclear from fig 2A, which clones were used for which analysis.

      We have now clarified this point in the figure legend.

      The data in Fig 3C present an analysis of HBD levels in erythroid colonies derived from genome-edited HSPCs. It would be helpful to clarify whether an individual dot represents a single such colony (this would seem to be the case from the cognate figure legend). If so, what number of such colonies would one need to obtain to gain a clearer sense of the effect on HBD levels from the various genome editing strategies used?

      Indeed, each dot represents a singular colony. We have now expanded this dataset from colonies derived from n=2 HSPC donors to n=4 HSPC donors. Figure 3C and Supp Fig 3 have been updated accordingly.

      It would be helpful to comment, in the Discussion, on potential genome editing strategies to obtain high-efficiency pool-level uniform long-track gene conversion that is necessary to obtain high HBD levels in the progeny of edited CD34 cells. Would this be a good application of the AAV6 strategy developed by the Sangamo and Porteus groups? Would prime editing as developed by Liu be an option here?

      Prime editing can introduce small insertions, but is still limited by low editing efficiency (https://doi.org/10.1016/j.tibtech.2023.03.004). Additionally, our KDT construct would require a larger insertion that prime editing would not be able to facilitate easily. In light of the adverse effects reported by the biotech company Graphite Bio when using AAV6, we will not suggest this in the discussion.

      It would be equally helpful, in the Discussion, to place the level of HbA2 obtained via the strategy shown in the manuscript in the context of other genome-editing-based approaches for normalizing Hb synthesis in the hemoglobinopathies (i.e., HbF elevation by editing the BCL11A enhancer, or the gamma-globin promoter; or direct repair of the SCD mutation; or engineering of Hb Makassar).

      We have now added a new section in the discussion summarizing some of the recent genome editing approaches for hemoglobinopathies. Specifically, we mention CRISPR Therapeutics’ clinical trial on the BCL11A enhancer, David Liu’s most recent paper on base-editing to correct the SCD mutation, and Annarita Miccio’s recent paper on disrupting a repressor binding site on the gamma-globin promoter.

      Reviewer #3 (Public Review):

      This is a well-written and referenced paper from the laboratory of an outstanding senior investigator. Dr. Corn and colleagues demonstrate convincingly that correction of three transcription factor binding sites in the delta-globin gene promoter results in high levels of delta-globin expression in HUDEP-2 clonal cell populations (Fig. 2B and C) and in CD34+ HSPC (hematopoietic stem and progenitor cells) clonal cell expansions (Fig. 3C). Although correction of the mutant KLF1 binding site has previously been shown to upregulate delta-globin gene transgenes, these new data demonstrate that correction of multiple factor binding sites is required to achieve high-level expression of the delta-globin gene in the endogenous beta-globin gene locus. The results are important because high delta-globin protein levels inhibit the formation of sickle hemoglobin (HbS) polymers that cause sickle cell disease.

      We thank reviewer 3 for their feedback on our manuscript.

      Unfortunately, high levels of delta-globin gene expression were not observed after editing of pooled (non-clonal) populations of HUDEP-2 cells (Fig. 1D) or CD34+ HSPC pooled cell populations (Fig. 3B). This result suggests that correction of all 3 promoter elements on individual alleles in CD34+ HSPC populations is far below the level required to be clinically relevant.

      We have added to the discussion on ways to improve HDR efficiency. Additionally, we show new data where we utilize an HDR enhancer drug and show that we can increase HDR and overall HBD in edited pooled populations of HSPCs (Fig 3 C and D).

      Also, NHEJ is high in CD34+ HSPC (Fig. 3A); therefore, promoter deletions will inactivate many alleles, and total hemoglobin levels in erythrocytes derived from populations of edited CD34+ HSPC will be much less than normal (29 pg/cell). These cells would be extremely beta-thalassemic.

      We were not completely sure about the origin of this point, since our edits are aimed at HBD, which makes up less than 5% of total hemoglobins under normal conditions. NHEJ occurring in HBB (e.g. when doing HDR for direct correction) would potentially yield thalassemic cells. But indels in the HBD promoter might at most cause a 5% decrease in total globin levels (if delta expression was completely destroyed). We have performed a new experiment to explicitly address this point. We edited CD34+ HSPCs from n=4 donors and compared unedited populations to populations edited with Cas9+HBD gRNA but no repair template. This represents a “worst case” scenario, in which there can be no HDR-based promoter engineering and only NHEJ. These data are included in Supplementary Figure 3. We observed high editing efficiency of 61–78% in the HBD promoter. We performed qRT-PCR of the beta-like globins in edited pools and normalized to HBA, reasoning that HBA is a neutral control for absolute levels of each globin in the beta locus because HBA is located in a different locus. By qRT-PCR, HBD transcripts were decreased by half compared to mock-treated cells, while HBB and HBG1/2 were not significantly affected. But as mentioned above, HBD expression makes up less than 5% of total hemoglobins, and therefore a half reduction in HBD represents a total reduction of at most 2.5% of globins. We do acknowledge that this experiment does not specifically quantify the rates of large deletions that might span from delta to beta, and further studies would be needed to address this point. But if such large deletions do exist, they do not greatly affect beta expression. We have included this in the results and the discussion section.

    1. Author Response

      Reviewer #1 (Public Review):

      In this interesting manuscript, Nasser et al explore long-term patterns of behavior and individuality in C. elegans following early-life nutritional stress. Using a rigorous, highly quantitative, high-throughput approach, they track patterns of motor behavior in many individual nematodes from L1 to young adulthood. Interestingly, they find that early-life food deprivation leads to decreased activity in young larvae and adults, but that activity between these times, during L2-L4, is largely unaffected. Further, they show that this "buffering" of stress requires dopamine signaling, as L2-L4 activity is significantly reduced by early-life starvation in cat-2 mutants. The paper also provides evidence that serotonin signaling has a role in modulating sensitivity to stress in L1 larvae and adults, but the size of these effects is modest. To evaluate patterns of individuality, the authors use principal components analysis to find that three temporal patterns of activity account for much of the variation in the data. While the paper refers to these as "individuality types," it may be more reasonable to think of these as "dimensions of individuality." Further, they provide evidence that stress may alter the strength and/or features of these dimensions. Though the circuit mechanisms underlying individuality and stress-induced changes in behavior remain unknown, this paper lays an important foundation for evaluating these questions. As the authors note, the behaviors studied here represent only a small fraction of the behavioral repertoire of this system. As such, the findings here are an interesting and very promising entry point for a deeper understanding of behavioral individuality, particularly because of the cellular/synaptic-level analysis that is possible in this system. This paper should be of interest to those studying C. elegans behavior and also more generally to those interested in behavioral plasticity and individuality.

      We thank the reviewer for finding our results interesting.

      Reviewer #2 (Public Review):

      This paper set out to understand the impact of early life stress on the behavior and individuality of animals, and how that impact might be amplified or masked by neuromodulation. To do so, the authors built on a previously established assay (Stern et al 2017) to measure the roaming fraction and speed of individuals. This technique allowed the authors to assess the effects of early life starvation on behavior across the entire developmental trajectory of the individual. Combining this with strains carrying mutations in neuromodulatory systems enabled the authors to produce a rich dataset for analyzing the complicated interactions between behavior, starvation intensity, developmental time, individuality, and neuromodulatory systems.

      The richness of this dataset (2 behavioral measures continuous across 5 developmental stages, 3 different neuromodulatory conditions, with the dopamine system subject to decomposition by receptor types, and 4 different levels of starvation, with ~50-500 individuals in each condition) underlies the strength of this paper. This dataset enabled the authors to convincingly demonstrate that starvation triggers a behavioral effect in L1 and adult animals that is largely masked in intermediate stages, and that this effect becomes larger with increased severity of starvation. Furthermore, they convincingly show that the masking of the effect of starvation in L2-L4 animals depends on dopaminergic systems. The richness of the dataset also allowed a careful analysis of individuality, though only neuromodulatory mutants convincingly manipulated individuality, recapitulating earlier research. Nonetheless, a few caveats exist on some of their findings and conclusions:

      We thank the reviewer for the constructive comments. In the revised manuscript we include additional analyses and textual changes as detailed below, to address the points raised.

      1) Lack of quantitative analysis for effects within developmental stages. In making the argument for buffered effects of starvation on behavior during periods of larval development, the authors make claims regarding the temporal structure of behavior within specific stages. However, no formal analysis is performed and the traces are provided without confidence intervals, making it difficult to judge the significance of potential deviations between starvation conditions.

      In the revised manuscript, we include additional analyses of roaming fraction effects across shorter developmental windows, showing within-stage differences in behavioral patterns following starvation (Figure 1 - figure supplement 1E; Figure 3 - figure supplement 1C). In addition, we further temper and rewrite our conclusions to clearly describe these effects (now: “…while 1 day of early starvation modified within-stage temporal behavioral structures by shifting roaming activity peaks to later time-windows during the L2 and L3 stages…” in p. 4 and “Interestingly, during the L2 intermediate stage the effects on roaming activity patterns were more pronounced during earlier time-windows of the stage…” in p. 8).

      2) Incorrect inferences from differences in significance demonstrating significant differences. The authors claim that there is an increase in PC1 inter-individual variation in tph-1 individuals, however the difference in significance is not evidence of a significant difference between conditions (see Nieuwenhuis et al. 2011). This undermines claims about an interaction of starvation, neuromodulators, and individuality.

      In the revised manuscript we now provide a direct comparison of PC inter-individual variances between starved and unstarved populations, demonstrating significant differences in inter-individual variation in specific PC individuality dimensions following stress (Figure 6 and Figure 6 - figure supplement 1). These results include the increase in PC1 inter-individual variation in tph-1 mutants following 3 and 4 days of starvation (Figure 6A,E).
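
      To illustrate the reviewer's point that a difference in significance is not a significant difference, the sketch below compares the spread of two groups directly. It is a minimal illustration only, not the analysis used in the manuscript: the permutation scheme, the function names and the choice to centre each group on its own mean are all assumptions made for this example.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def variance_difference_permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for var(group_b) - var(group_a).

    Hypothetical helper for illustration. Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)
    # Centre each group on its own mean so that a shift in the average
    # alone cannot masquerade as a change in spread.
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    centred_a = [v - mean_a for v in group_a]
    centred_b = [v - mean_b for v in group_b]
    observed = sample_variance(centred_b) - sample_variance(centred_a)

    pooled = centred_a + centred_b
    n_a = len(centred_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = sample_variance(pooled[n_a:]) - sample_variance(pooled[:n_a])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

      Here `group_a` and `group_b` would stand for, say, PC1 scores of unstarved and starved individuals of one genotype. The test asks the question of interest directly (is the spread different between conditions?) instead of comparing two separate significance calls.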

      3) Sensitivity of analysis to baseline effects and assumptions of additive/proportional effects. The neuromodulatory and stress conditions in this paper have a mixture of effects on baseline activity and differences from baseline. The authors normalize to the roaming fraction without starvation, making the reasonable assumption that the effect due to starvation is proportional to baseline, rather than an additive effect. This confound is most visible in the adult subpanel of figure 5d, where an ~2-3 fold difference in relative roaming due to starvation is clearly noted, however, this is from a baseline roaming fraction in tph-1 animals that are ~2 fold higher, suggesting that the effect could plausibly be comparable in absolute terms.

      Unavoidably, any such assumptions on the expected interaction between multiple effects will be a gross simplification in complicated nonlinear systems, and the data are largely shown with sufficient clarity to allow the reader to make their own conclusions. However, some of the interpretations in the paper lean heavily on an assumption that the data support a direct interpretation (e.g. "neuronal mechanisms actively buffer behavioral alterations at specific development times") rather than an indirect interpretation (e.g. that serotonin reduces baseline roaming fraction which makes a fixed sized effect more noticeable). Parsing the differences requires either more detailed mechanistic study or careful characterization of the effect of different baselines on the sensitivity of behavior to perturbation. Barring that, it's worth noting that many of these interactions may be due to differences in biological and experimental sensitivity to change under different conditions, rather than a direct interaction of stress and neuromodulatory processes or evidence of differing neuromodulatory activity at different stages of development.

      In the revised manuscript we added a discussion of the potential complicated interactions between neuromodulation and stress, altering baseline levels and deviations from baseline. We also discuss the interpretation of the results in the context of non-linear systems in which sensitivity of the behavioral response to underlying variations may be modified by specific neuromodulatory and environmental perturbations, without assuming direct differences in neuromodulatory states over development or across individuals (p. 16).
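
      The baseline concern in point 3 can be made concrete with a small worked example. The roaming fractions below are invented purely for illustration (they are not values from the manuscript), and the function name is hypothetical. The sketch shows how the same absolute change looks different once it is expressed relative to two different baselines.

```python
def starvation_effect(baseline, starved):
    """Return the (absolute, relative) change in roaming fraction."""
    absolute = starved - baseline   # additive view of the effect
    relative = starved / baseline   # proportional view (normalised to baseline)
    return absolute, relative


# Invented numbers: the second genotype has a 2-fold higher baseline,
# and starvation lowers roaming by the same absolute amount in both.
low_baseline = starvation_effect(baseline=0.20, starved=0.10)
high_baseline = starvation_effect(baseline=0.40, starved=0.30)

for label, (absolute, relative) in [("low baseline", low_baseline),
                                    ("high baseline", high_baseline)]:
    print(f"{label}: absolute change {absolute:+.2f}, relative level {relative:.2f}")
```

      Under the additive reading the two genotypes respond identically, whereas under the proportional reading they appear to differ. Which reading is appropriate cannot be decided from the normalised values alone.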

      Reviewer #3 (Public Review):

      In this study, Nasser et al. aim to understand how early-life experience affects 1) developmental behavior trajectory and 2) individuality. They use early life starvation and longitudinal recording of C. elegans locomotion across development as a model to address these questions. They focus on one specific behavioral response (roaming vs. dwelling) and demonstrate that early life (right after embryo hatching) starvation reduces roaming in the first larval (L1) and adult stages. However, roaming/dwelling behavior during mid-larval stages (L2 through L4) is buffered from early life starvation. Using dopamine and serotonin biosynthesis null mutant animals, they demonstrated that dopamine is important for the buffering/protection of behavioral responses to starvation in mid-larval stages, while in contrast, serotonin contributes to early-life starvation's effects on reduced roaming in the L1 and adult stages. While the technique and analysis approaches used are mostly solid and support many of the conclusions made in the manuscript for part 1), there are some technical limitations (e.g., whether the method has sufficient resolution to analyze the behaviors of younger animals) and confounding factors (e.g., size of the animal) that the authors do not yet sufficiently address, and that can affect interpretation of the results. Additionally, much of the study is descriptive and lacks deep mechanistic insight. Furthermore, the focus on a single behavioral parameter (dwelling vs. roaming) limits the broad applicability of the study's conclusions. Lastly, the manuscript does not provide clear presentation or analysis to address part 2), the question of how early life experience affects individuality.

      We thank the reviewer for these important comments. As described below, in the revised manuscript we include new analyses (following extraction of size data), showing behavioral modifications across different conditions/genotypes also in size-matched individuals (within the same size range) (Figure 1 - figure supplement 1F; Figure 3 - figure supplement 1D,E; Figure 5 - figure supplement 1B,D). We also made edits to the text to describe these results (Methods p. 21 and Results section). In addition, while we can detect behavioral changes using our imaging method even in young L1 worms across conditions and genotypes (described in Stern et al. 2017 and this manuscript), as the reviewer correctly pointed out, we may miss some milder behavioral effects due to lower spatial imaging resolution in younger worms. We are now referring to this spatial resolution limitation in the revised manuscript (discussion part). Lastly, in the revised manuscript we added clearer and more direct analyses of changes in inter-individual variation in multiple PC dimensions following early stress, by directly comparing variation between starved and unstarved individuals within the mutant and wild-type populations (Figure 6; Figure 6 - figure supplement 1). These analyses show significant changes in inter-individual variation within specific PC individuality dimensions following early stress. Also, we made textual changes along the manuscript to increase the clarity of presentation of these results.

    1. Author Response

      Reviewer #2 (Public Review):

      Using an approach that combines synthetic genetic array (SGA) analysis with high-throughput microscopic analysis of the GFP-tagged yeast ORF collection in the budding yeast, Saccharomyces cerevisiae, this study has examined the contribution of the critical checkpoint kinases Mec1 and Rad53 to the subcellular relocalization of 322 candidate proteins in response to HU- and MMS-induced replication stress. Previous studies have established that Mec1 is required for Rad53 activation during replication stress and that Mec1 also serves checkpoint functions independent of Rad53. Unexpectedly, this study identifies groups of proteins whose stress-induced relocalization is dependent on Rad53 but not Mec1. This data indicates that Rad53 mediates some replication stress responses in a non-canonical manner that is independent of Mec1.

      The authors confirm their initial observations from the screening approach by focusing on the Rad53-dependent and Mec1-independent focus formation of GFP-Rad54. Moreover, using mass-spec analysis the authors demonstrate that some Rad53 phosphorylation sites known to be critical for Rad53 activation, including a consensus Mec1 phosphorylation site, are phosphorylated after replication stress even in the absence of Mec1. Motivated by this finding the authors screen for potential kinase and phosphatase pathways that may regulate Rad53 function during MMS-induced replication stress. Top hits identified include members of the retrograde signaling pathway, which is confirmed by conventional genetic assays while mass spec analysis supports the involvement of Rtg3 in mediating Rad53 phosphorylation during replication stress in the absence of Mec1.

      Overall this is a solid study reporting unexpected new findings that significantly advance our view of the global replication checkpoint response. The data are generally of high quality, well presented and quantified, and overall support the authors' claims. The mass spec approach used here to identify Rad53 phosphorylation sites offers an unbiased alternative to the simpler and more widely employed gel-shift method to monitor Rad53 activation. The hits identified in the various screens presented here provide a platform for potential follow-up studies by the community. The main drawback is that it remains unclear how Rtg3 promotes Rad53 activation. However, this could be considered to be beyond the scope of this study.

      We thank the reviewer for their positive assessment of our experimental data. We have made the changes requested by the reviewer to increase the clarity of Figure 5, and performed a second replicate to show that the FACS data are reproducible.

      Reviewer #3 (Public Review):

      The work by Ho et al describes the identification of Mec1/Tel1 independent activation of Rad53 after MMS treatment, which could lead to changes of GFP fusion signals for several dozens of proteins and this was partly dependent on Rtg3. Starting from an unbiased, targeted screen, the authors identified proteins whose GFP fusion signals changed intensity in rad53∆ but not in mec1∆ cells using live cell imaging, including Rad54. Using Rad54 as a readout for the subsequent experiments, a second screen amongst kinases/phosphatases and their regulators found that rtg2-3 mutants reduced Rad54-GFP intensity. Mass spectrometry data identified Rad53 phosphorylation sites in mec1∆ tel1∆ cells, consistent with the cell biological data described above. Overall, the work was well done and supported the main conclusions. The concept of Mec1/Tel1-independent and Rtg3-dependent Rad53 activation connects checkpoint signaling with the retrograde pathway.

      We thank the reviewer for their positive assessment of our experimental work and appreciate the suggestions that ultimately led to interesting new data. As outlined below, we attempted to perform most of the experiments suggested by the reviewer. Unfortunately, experiments with a MEC1-AID degron allele were inconclusive, with details summarized below. However, we have identified additional proteins whose re-localization is affected by Rtg3, providing stronger support for its role in the replication stress response. We revised the manuscript to add these new details.

    1. Author Response

      Reviewer 1 (Public Review):

      In this paper, Reato, Steinfeld et al. investigate a question that has long puzzled neuroscientists: what features of ongoing brain activity predict trial-to-trial variability in responding to the same sensory stimuli? They record spiking activity in the auditory cortex of head-fixed mice as the animals performed a tone frequency discrimination task. They then measure both overall activity and the synchronization between neurons, and link this ’baseline state’ (after removing slow drifts) of cortex to decision accuracy. They find that cortical state fluctuations only affect subsequent evoked responses and choice behavior after errors. This indicates that it’s important to take into account the behavioral context when examining the effects of neural state on behavior.

      Strengths of this work are the clear and beautiful presentation of the figures, and the careful consideration of the temporal properties of behavioral and neural signals. Indeed, slowly drifting signals are tricky as many authors have recently addressed (e.g. Ashwood, Gupta, Harris). The authors are well aware of the difficulties in correlating different signals with temporal and cross-correlation (such as in their ’epoch hypothesis’). To disentangle such slow trends from more short-lived state fluctuations, they remove the impact of the past 10 trials and continue their analyses with so-called ’innovations’ (a term that is unusual, and may more simply be replaced with ’residuals’).

      The terms ‘innovations’ and ‘residuals’ are sometimes used interchangeably. We used innovations because that’s how they were introduced in the signal processing literature (i.e., Kailath, T (1968). “An innovations approach to least-squares estimation–Part I: Linear filtering in additive white noise.” IEEE Transactions on Automatic Control). We try to be explicit in the text about the formal definition of this quantity, to avoid problems with terminology.
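
      For readers unfamiliar with the term, the following sketch shows one simple way to obtain innovations: regress each trial's value on the values of the preceding trials and keep what is left over. It assumes a plain linear autoregression on the past 10 trials fitted by ordinary least squares; the exact model in the manuscript may differ, and all function names here are hypothetical.

```python
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def innovations(signal, n_lags=10):
    """Residuals of a least-squares regression of signal[t] on its n_lags past values."""
    # Each row: intercept, then the previous n_lags values (most recent first).
    rows = [[1.0] + [signal[t - k] for k in range(1, n_lags + 1)]
            for t in range(n_lags, len(signal))]
    targets = signal[n_lags:]
    p = n_lags + 1
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(rows, targets)]


def lag1_autocorrelation(values):
    """Correlation between consecutive values, a crude index of slow drift."""
    mean = sum(values) / len(values)
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(len(values) - 1))
    return num / sum((v - mean) ** 2 for v in values)


def simulate_drifting_signal(n_trials, drift=0.9, seed=0):
    """Synthetic baseline signal with slow trial-to-trial drift (illustration only)."""
    rng = random.Random(seed)
    signal = [0.0]
    for _ in range(n_trials - 1):
        signal.append(drift * signal[-1] + rng.gauss(0.0, 1.0))
    return signal
```

      On a synthetic drifting signal, the raw values are strongly correlated from one trial to the next, while the innovations are close to uncorrelated. This is the sense in which they isolate the fast, single-trial component of the signal.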

      I do wonder if this throws out the baby with the bathwater. If the concern is statistical confound, the ’session permutation’ method (Harris) may be better suited. If the concern is that short-term state fluctuations are more behaviorally relevant (and obscured by slow drifts), then why are the results with raw signals in the supplement (Suppfig 8) so similar?

      The concern was statistical confound, although this concern is ameliorated when using a mixed model approach and focusing on fixed effects. However, our approach allowed us to assess the relative importance of slow versus single-trial timescales in the predictive relationship between cortical state (and arousal) and behavior, revealing that, in the conditions of our experiment, only the fast timescales are relevant. Because of this, we think that the baby wasn’t thrown out with the bathwater as, qualitatively, no new phenomenology was revealed when the slow components of the signals were included. In hindsight, it is true that the results we obtained suggest that maybe the effort we made to isolate the fast component of the signals was unjustified. However, this can only be known after both options have been tried, as we did. Moreover, we started using innovations based on the results in Figure 2 where, as we show, the use of innovations does make a difference, even at the level of fixed effects in a mixed model. We agree that we could have used the ‘session permutation’ method, but given the depth at which we have explored this issue in the manuscript already, and the clarity of the results, we think that adding a third method would only make reading the manuscript more difficult without adding any substantially new content.

      While the authors are correct that go-nogo tasks have drawbacks in dissociating sensitivity from response bias, they only cursorily review the literature on 2AFC tasks and cortical state. In particular, it would be good to discuss how the specific method - spikes, EEG (Waschke), widefield (Jacobs) and algorithm for quantifying synchronization may affect outcomes. How do these population-based measures of cortical state relate to those described extensively with slightly different signals, notably LFP or EEG in humans (e.g. work by Saskia Haegens, Niko Busch, reviewed in https://doi.org/10.1016/j.tics.2020.05.004)? This review also points out the importance of moving beyond simple measures of accuracy and using SDT, which would be an interesting improvement for this paper too.

      We thank the reviewer for pointing us towards the oscillation-based brain-state literature in humans. We have expanded the paragraph in the discussion where we compare our results with previous work in order to (i) elaborate on the literature on 2AFC tasks, (ii) specifically address the literature linking alpha power in the pre-stimulus baseline and psychophysical performance, and (iii) mention different methods for assessing desynchronization. Our view is that absence of low-frequency power is a robust measure which can be assessed using different types of signals (spikes, imaging, LFP, EEG). That said, the relationship between desynchronization and behavior appears subtle and variable, especially within discrimination paradigms. These issues are discussed in the paragraph starting in line 527 in the text.

      Regarding the use of SDT, we had already established that our main finding could be expressed as a significant interaction between FR/Synch and the stimulus-strength regressor, when predicting choice after errors (Supplementary Fig. 4A in original manuscript), which is equivalent to a cortical state-dependent increase in d′ after the mice made a mistake. In order to consider a possible effect of cortical state on the ‘criterion’ (i.e., an effect on the bias of the mice towards either response spout), we re-ran this GLMM, adding the cortical state regressors as main effects. The results show that the FR-Synch predictor is only significantly greater than zero as an interaction after errors (p = 0.0025). As a main effect, it is not significantly different from zero either after errors (p = 0.28) or after correct trials (p = 0.97). We have included this analysis as Figure 3-figure supplement 1B (replacing the previous Supplementary Fig. 4A) and commented on it in the text (lines 222-225).
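
      As a reminder of the two SDT quantities referred to above, the sketch below computes sensitivity (d′) and criterion from response rates using the standard equal-variance Gaussian formulas. It is not the GLMM described in this response. Treating one spout as the "signal" response, so that a hit is a correct choice of that spout and a false alarm is an incorrect choice of it, is an assumption made for the example, as is the function name.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Equal-variance SDT estimates from two response rates.

    Both rates must lie strictly between 0 and 1, otherwise the
    inverse normal CDF is undefined and a StatisticsError is raised.
    """
    z = NormalDist().inv_cdf          # probit transform
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa            # sensitivity
    criterion = -0.5 * (z_hit + z_fa)  # bias; 0 means no preference
    return d_prime, criterion
```

      A change in cortical state that enters as an interaction with stimulus strength corresponds to a change in `d_prime`, whereas a main effect on choice corresponds to a shift in `criterion`.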

      Reviewer 2 (Public Review):

      The relationship between measures of brain state, behavioral state, and performance has long been speculated to be relatively simple - with arousal and engagement reflecting EEG desynchronization and improved performance associated with increases in engagement and attention. The present study demonstrates that the outcome of the previous trial, specifically a miss, allows these associations to be seen - while a correct response appears less likely to do so. This is an interesting advance in our understanding of the relationship between brain state, behavioral state, and performance.

      This is probably just a typo, but we would like to clarify that the relevant outcome in the previous trial is not a miss, but an incorrect choice in an otherwise valid trial (i.e., a trial with a response within the allowed response window).

      While the study is well done, the results are likely to be specific to their trial structure and states exhibited by the mice. To examine the full range of arousal states, it needs to be demonstrated that animals are varying between near-sleep (e.g. drowsiness) and high-alertness such as in rapid running. The fact that the trials occurred rapidly means that the physiological and neural variables associated with each trial will overlap with upcoming trials - it takes a mouse more than a few seconds to relax from a previous miss or hit, for example. Spacing the trials further apart would allow for a broader range of states to be examined, and perhaps less cross-talk between adjacent trials. The interpretation of the results, therefore, must be taken in light of the trial structure and the states exhibited by the mice.

      We thank the reviewer for the positive assessment of our work and also for raising this point in particular. This motivated us to look more carefully at this issue, with results that, we believe, strengthen our study.

    1. Author Response

      Reviewer #1 (Public Review):

      In this work, Roche et al. study a 13-year long time series of microbiome samples from wild baboons from Kenya. The data used in this work challenge a previous finding from the same authors that temporal dynamics in microbiome changes are largely individualized. Using a multinomial logistic-normal modeling approach, the authors detect that co-variance in temporal dynamics in microbial pair-wise associations among individuals occurs more frequently between relatives. Furthermore, the authors identify that microbial phylogenetic proximity is associated with consistent co-abundance changes over time and that their metric of universal microbial relationships is robust across hosts and is detected even in human longitudinal data. The authors conduct a thorough statistical revision of publicly available results, highlighting this time (e.g. compared to Björk et al, doi: 10.1038/s41559-022-01773-4) the consistently shared microbial properties between individuals, rather than the individual microbial signatures highlighted in their previous work.

      Thank you for this summary. We would like to briefly clarify that we do not see the current work as inconsistent with our prior finding in Björk et al. that microbiome taxonomic compositions are idiosyncratic and asynchronized. However, this new analysis, which focuses on abundance correlations between pairs of taxa, indicates that the personalized compositions and dynamics we observed in Björk et al. are probably not attributable to personalized microbiome ecologies. In other words, Björk et al. showed that microbial taxa found in the guts of different baboons can be quite distinct (and remain so over time, giving rise to semi-stable individual signatures). The current study shows that, despite this taxonomic individuality, the correlations between pairs of microbes in the baboon gut are often quite consistent. To give a basic example, hot weather and ice cream, when observed, are often observed together (positively correlated), but while some places have a lot of both, some have little of either. This idea is discussed in more detail below (see response R6) and in the revised Discussion section (lines 572 to 586).

      Strengths:

      This work is foundational in its compelling effort to generate a rigorous method to evaluate co-abundance dynamics in longitudinal microbiome data. The approach taken will likely inspire developments that will sharpen the capacity to extract co-varying microbial features, taking into account seasonality, diet, age, relatedness, and more. To the best of my understanding, their hierarchical model integrated into the Gaussian process to analyze microbial dynamics is reasonably robust and they clearly explain the implementation. Furthermore, this work introduces and defines the concept of a universality score for microbial taxon pairs. Overall, the work presented is clear and convincing and provides tools for the community to benefit from both methods and results. Furthermore, conceptually, this work stresses the value of consistent and shared microbial dynamics in groups, which enriches our understanding of host-associated microbial ecology, otherwise understood to be largely dependent on external fluctuations.

      Weakness:

      It is not entirely clear the extent to which the presented results revise, refute, or support the previously published analysis performed by the authors on the same dataset (doi: 10.1038/s41559-022-01773-4), which was more focused on individuality.

      We agree the relationship between Björk et al. and the current manuscript was unclear in our original submission. We now elucidate the relationship between these papers in the Discussion (lines 572 to 586). Briefly, Björk et al. found that microbiome taxonomic compositions are idiosyncratic and asynchronized. The current analysis finds that pairwise bacterial abundance correlations are predominantly shared and not highly personalized. We think the most likely explanation is that, as mentioned by Reviewer 2 below, the current analyses do not account for the role that environmental gradients play in the gut. If these environments differ asynchronously across hosts, it could lead to shared abundance correlations, but individualized microbiome compositions and individualized single-taxon dynamics. We discuss this possibility and other potential explanations in the revised Discussion (lines 572 to 586).

      Reviewer #2 (Public Review):

      The authors of this paper identify a knowledge gap in our understanding of the generalizability of ecological associations of gut bacteria across hosts. Theoretically, it is possible that ecological associations between bacteria are consistent within a host organism but differ between hosts, or that they are universal across hosts and their environmental gradients. The authors utilize longitudinal data with a unique temporal resolution, on Amboseli baboons, 56 individuals who were sampled for gut microbiome hundreds of times over a decade. This data allows disentangling ecological dynamics within and across individuals in a way that as far as I know has never been done before. The authors show that ecological relationships among baboon gut bacteria, measured through a correlation based on covariation, are largely universal (similar within and across host individuals) and that the most universally covarying taxa are almost always positively associated with each other. They also compare these results with two sets of human data, finding similar patterns in one human data set but not in the other.

      The main aim of this paper is to establish whether gut microbial ecologies are universal across hosts, and this the authors generally show to be true in a thorough and convincing way. However, some re-assessment or re-assurance on the solidity of their chosen method of estimating co-variation would be needed to fully assess the robustness of subsequent results. Specifically, the authors measure the correlation between microbial taxa from data on their abundance co-variation across samples. While necessary steps have been taken to validate the estimates across spurious correlations due to the compositional nature and autocorrelation structures present in the data, I worry that the sparsity of the data might influence the estimation of positive and negative correlations in a slightly different manner. There exist more microbial taxa than samples in the data and some taxa are present in as few as 20% of the samples, meaning that the covariation data will have a large number of 0-0 pairs. I worry that the abundance of 0-0 pairs in the data might inflate the measures of positive co-variation, making taxa seem highly positively correlated in abundance when they in fact are missing from many samples. Of course, mutual absence is also a form of biologically meaningful covariation but taking the larger number of taxa than samples and the inability of sequencing technology to detect all low-abundance taxa in a sample, I am currently not convinced that all of the 0-0 pairs are modeled in a realistic and balanced way as a continuum of the other non-zero co-variation between taxa in the data. This may become problematic when positive and negative relationships are compared: The authors state that even though most associations between taxa were negative, the most universally correlated taxa pairs (taxa pairs with strongest correlations in abundance both within and between hosts) were enriched in positive associations. It may be possible that this is influenced by the fact that zero inflation in the data lends more weight to positive links than negative links. Whether these universal positive correlations are driven by positive non-zero abundance covariation or just 0-0 links in the data is currently unclear.

      Thank you for pointing out this weakness in our original analyses. As described in response R1 above, your hunch was correct: zero inflation biased our correlation patterns such that taxa pairs with a high frequency of joint zero observations (i.e., where both members of the pair had very low or zero abundances) tended to be positively correlated (Fig. R1). Consequently, as you suggested, zero inflation in the data lent more weight to positive links than negative links in our data set. To address this problem in the revised manuscript, we now restrict our analyses to taxon pairs whose joint zero-abundance observations were less than 5% of all samples across hosts (pairs to the left of the dashed vertical line in Fig. R1 above). We also restricted our analyses to taxa observed in at least 50% of all samples. The first of these criteria was the most restrictive. As described above, our new filtering procedure retained 1,878 of the original 7,750 ASV-ASV pairs; 57 of the original 66 phylum-phylum pairs; and 473 of the original 666 class/order/family-level pairs.
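
      The two filtering criteria described above can be sketched as follows. This is an illustrative reimplementation, not the code used for the revised manuscript: the function name and the input layout (a mapping from taxon to per-sample counts) are assumptions, and for simplicity only exact zeros are counted as joint zero observations.

```python
from itertools import combinations


def filter_taxon_pairs(counts, max_joint_zero=0.05, min_prevalence=0.50):
    """Return taxon pairs that pass both sparsity filters.

    counts: dict mapping taxon name -> list of counts, one per sample
            (all lists the same length, samples in the same order).
    """
    n_samples = len(next(iter(counts.values())))

    # Criterion 1: keep taxa observed in at least min_prevalence of samples.
    prevalent = [taxon for taxon, values in counts.items()
                 if sum(v > 0 for v in values) / n_samples >= min_prevalence]

    kept = []
    for a, b in combinations(prevalent, 2):
        # Criterion 2: fraction of samples in which both taxa are absent.
        joint_zero = sum(x == 0 and y == 0
                         for x, y in zip(counts[a], counts[b])) / n_samples
        if joint_zero < max_joint_zero:
            kept.append((a, b))
    return kept
```

      In this sketch a pair is dropped as soon as the two taxa are jointly absent in 5% or more of samples, even if each taxon on its own is common, which is why the joint-zero criterion tends to be the more restrictive of the two.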

      Another result that would benefit from clearer context is the result that taxa correlation patterns were more similar between phylogenetically close taxa and between genetically close host individuals. The former notion is to be expected if taxa abundances are driven by environmental (or host physiology-related) selective forces that favor bacteria with similar phenotypes. This yields more support to the idea that covariation is environmentally driven rather than driven by the ecological network of the bacteria themselves, and this could be more clearly emphasized. The latter notion of covariation being more similar in genetically related hosts is currently impossible to disentangle from the notion that covariation patterns were more similar with individuals harboring a more similar baseline microbiome composition since microbiome composition and genetic relatedness were apparently correlated. To understand if something about relatedness was actually influential over correlation pattern similarity, one would need to model that effect on top of the baseline similarity effect. Currently, it is not clear if this was done or not.

      We agree that shared responses to environmental gradients within hosts—especially immune profiles and pH—could explain both of these findings. These ideas are now described in the Discussion in lines 559 to 562.

      We also now report partial Mantel tests to control for baseline similarity in microbiome composition when testing for shared microbial correlation patterns among genetic relatives. Controlling for baseline similarity had little effect on the results, and we now report the statistics for this partial Mantel (Fig. 5B; Table S7; r2=0.009; partial Mantel p-value=0.002). See lines 391-392.

      The authors also slightly overemphasize the generalizability of their results to humans, given that only one of the human data sets they compare their results to shows similar patterns. While they mention that the other human data set (that was not similar in patterns to theirs) was different in some key aspects (sampling frequency was much higher), the other human data set was also dissimilar to the other two (it only contained infants, not adults). Furthermore, to back up the statement that higher sampling frequency would be the reason this data set had dissimilar covariation between taxa, one would need to show that the temporal variation in this data set was different from the baboon one and show that these covariation patterns were sensitive to timescale by subsampling either data to create mock data sets with different sampling frequency and see how this would change the inference of ecological associations.

      We have revised the text to tone down the generalizability of our results to humans. For instance, the abstract (line 58) now states that “universality in baboons was similar to that in human infants, and stronger than one data set from human adults” but does not state that our results are generalizable to humans.

      We also considered sub-sampling the data set from Johnson et al., from daily to monthly scales, but unfortunately that data set is only 17 days long, so doing so is impossible. This is now stated in the Discussion in line 619, which states, “However, without the ability to subsample Johnson et al. [7] to monthly scales (this data set is only 17 days long), it is impossible to test this prediction.”

      To the extent that the results are robust, particularly regarding the main result of the universality of gut microbial ecological associations, the impact of this paper is not small. This question has never been so thoroughly and convincingly addressed, and the results as they stand have the power to strongly influence the expectations of gut microbial ecology across many different systems. Moreover, as the authors point out, evidence for universal gut microbial ecology is important for the future development of probiotics. An important point here, underemphasized by the authors, is that universal gut microbe ecologies will allow specific interventions that use gut microbe ecology to manipulate emergent community properties of microbiomes to be more beneficial for the host, rather than just designing compositional cocktails that should fit all. In addition to the main finding of this study, the unique data set and the methods developed as part of this study (e.g. the universality score, the enrichment measures, the model of log-ratio dynamics, the assessment of covariation from time-ordered abundance trajectories) will doubtlessly be translatable to many other studies in the future.

      Thank you for these suggestions. We now mention these implications in the introduction (lines 82-84) and in the discussion (lines 537-539 and line 630).

      Reviewer #3 (Public Review):

      This is a well-executed study, offering thorough analysis and insightful interpretations. It is well-written, and I find the conclusions interesting, important, and well-supported.

      Thank you for your supportive comments.

      References

      1. Silverman JD, Roche K, Holmes ZC, David LA, Mukherjee S. Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes. Journal of Machine Learning Research. 2022;23:1-42.
      2. Quinn TP, Richardson MF, Lovell D, Crowley TM. propr: An R-package for Identifying Proportionally Abundant Features Using Compositional Data Analysis. Scientific Reports. 2017;7:16252.
      3. Cao Y, Lin W, Li H. Large covariance estimation for compositional data via composition-adjusted thresholding. J Am Stat Assoc. 2019:759-72.
      4. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687. doi: 10.1371/journal.pcbi.1002687. PubMed PMID: 23028285; PubMed Central PMCID: PMCPMC3447976.
      5. Risely A, Schmid DW, Muller-Klein N, Wilhelm K, Clutton-Brock TH, Manser MB, et al. Gut microbiota individuality is contingent on temporal scale and age in wild meerkats. Proc Biol Sci. 2022;289(1981):20220609. Epub 20220817. doi: 10.1098/rspb.2022.0609. PubMed PMID: 35975437; PubMed Central PMCID: PMCPMC9382201.
      6. Wilmanski T, Diener C, Rappaport N, Patwardhan S, Wiedrick J, Lapidus J, et al. Gut microbiome pattern reflects healthy ageing and predicts survival in humans. Nat Metab. 2021;3(2):274-86. Epub 20210218. doi: 10.1038/s42255-021-00348-0. PubMed PMID: 33619379; PubMed Central PMCID: PMCPMC8169080.
      7. Johnson AJ, Vangay P, Al-Ghalith GA, Hillmann BM, Ward TL, Shields-Cutler RR, et al. Daily Sampling Reveals Personalized Diet-Microbiome Associations in Humans. Cell Host & Microbe. 2019;25(6):789-802. Epub 2019/06/14. doi: 10.1016/j.chom.2019.05.005. PubMed PMID: 31194939.
      8. Franzosa EA, Huang K, Meadow JF, Gevers D, Lemon KP, Bohannan BJM, et al. Identifying personal microbiomes using metagenomic codes. Proceedings of the National Academy of Sciences. 2015;112(22):E2930-E8. doi: 10.1073/pnas.1423854112. PubMed PMID: WOS:000355832200014.
      9. Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term stability of the human gut microbiota. Science. 2013;341(6141):1237439. Epub 2013/07/06. doi: 10.1126/science.1237439. PubMed PMID: 23828941; PubMed Central PMCID: PMC3791589.
      10. Bik EM, Costello EK, Switzer AD, Callahan BJ, Holmes SP, Wells RS, et al. Marine mammals harbor unique microbiotas shaped by and yet distinct from the sea. Nat Commun. 2016;7:10516. Epub 20160203. doi: 10.1038/ncomms10516. PubMed PMID: 26839246; PubMed Central PMCID: PMCPMC4742810.
      11. Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, et al. Moving pictures of the human microbiome. Genome Biology. 2011;12(5):R50. doi: 10.1186/gb-2011-12-5-r50. PubMed PMID: ISI:000295732700014.
      12. Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009;326(5960):1694-7. doi: 10.1126/science.1177486. PubMed PMID: ISI:000272839000053.
      13. Dolinsek J, Goldschmidt F, Johnson DR. Synthetic microbial ecology and the dynamic interplay between microbial genotypes. Fems Microbiology Reviews. 2016;40(6):961-79. doi: 10.1093/femsre/fuw024. PubMed PMID: WOS:000387995000010.
      14. Louca S, Polz MF, Mazel F, Albright MBN, Huber JA, O'Connor MI, et al. Function and functional redundancy in microbial systems. Nat Ecol Evol. 2018;2(6):936-43. Epub 2018/04/18. doi: 10.1038/s41559-018-0519-1. PubMed PMID: 29662222.
      15. Rainey PB, Quistad SD. Toward a dynamical understanding of microbial communities. Philos Trans R Soc Lond B Biol Sci. 2020;375(1798):20190248. Epub 2020/03/24. doi: 10.1098/rstb.2019.0248. PubMed PMID: 32200735; PubMed Central PMCID: PMCPMC7133524.
      16. Martiny JB, Jones SE, Lennon JT, Martiny AC. Microbiomes in light of traits: A phylogenetic perspective. Science. 2015;350(6261):aac9323. doi: 10.1126/science.aac9323. PubMed PMID: 26542581.
      16. Debray R, Herbert RA, Jaffe AL, Crits-Christoph A, Power ME, Koskella B. Priority effects in microbiome assembly. Nat Rev Microbiol. 2022;20(2):109-21. Epub 20210827. doi: 10.1038/s41579-021-00604-w. PubMed PMID: 34453137.
      17. Gloor GB, Reid G. Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data. Can J Microbiol. 2016;62(8):692-703. Epub 2016/06/18. doi: 10.1139/cjm-2015-0821. PubMed PMID: 27314511.
      18. Joseph TA, Pasarkar AP, Pe'er I. Efficient and Accurate Inference of Mixed Microbial Population Trajectories from Longitudinal Count Data. Cell Syst. 2020;10(6):463-9 e6. Epub 20200624. doi: 10.1016/j.cels.2020.05.006. PubMed PMID: 32684275.
      19. Aijo T, Muller CL, Bonneau R. Temporal probabilistic modeling of bacterial compositions derived from 16S rRNA sequencing. Bioinformatics. 2018;34(3):372-80. doi: 10.1093/bioinformatics/btx549. PubMed PMID: 28968799; PubMed Central PMCID: PMCPMC5860357.
      20. Coyte KZ, Rao C, Rakoff-Nahoum S, Foster KR. Ecological rules for the assembly of microbiome communities. PLoS Biol. 2021;19(2):e3001116. Epub 20210219. doi: 10.1371/journal.pbio.3001116. PubMed PMID: 33606675; PubMed Central PMCID: PMCPMC7946185.
      21. Coyte KZ, Schluter J, Foster KR. The ecology of the microbiome: Networks, competition, and stability. Science. 2015;350(6261):663-6. doi: 10.1126/science.aad2602. PubMed PMID:
      22. Palmer JD, Foster KR. Bacterial species rarely work together. Science. 2022;376(6593):581-2. Epub 20220505. doi: 10.1126/science.abn5093. PubMed PMID:
      23. Reese AT, Pereira FC, Schintlmeister A, Berry D, Wagner M, Hale LP, et al. Microbial nitrogen limitation in the mammalian large intestine. Nat Microbiol. 2018. Epub 2018/10/31. doi: 10.1038/s41564-018-0267-7. PubMed PMID: 30374168.
      24. Firrman J, Liu L, Mahalak K, Tanes C, Bittinger K, Tu V, et al. The impact of environmental pH on the gut microbiota community structure and short chain fatty acid production. FEMS Microbiol Ecol. 2022;98(5). doi: 10.1093/femsec/fiac038. PubMed PMID: 35383853.
      25. de Vos WM, Tilg H, Van Hul M, Cani PD. Gut microbiome and health: mechanistic insights. Gut. 2022;71(5):1020-32. Epub 20220201. doi: 10.1136/gutjnl-2021-326789. PubMed PMID: 35105664; PubMed Central PMCID: PMCPMC8995832.
      26. Tamames J, Sanchez PD, Nikel PI, Pedros-Alio C. Quantifying the Relative Importance of Phylogeny and Environmental Preferences As Drivers of Gene Content in Prokaryotic Microorganisms. Front Microbiol. 2016;7:433. Epub 20160331. doi: 10.3389/fmicb.2016.00433. PubMed PMID: 27065987; PubMed Central PMCID: PMCPMC4814473.
      27. Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: and this is not optional. Front Microbiol. 2017;8:2224. Epub 2017/12/01. doi: 10.3389/fmicb.2017.02224. PubMed PMID: 29187837; PubMed Central PMCID: PMCPMC5695134.
      28. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences. 2011;108:4516-22. doi: 10.1073/pnas.1000080107. PubMed PMID: ISI:000288451300002.
    1. Author Response

      Reviewer #1 (Public Review):

      This study was designed to examine the bypass of Ras/Erk signaling defects that enable limited regeneration in a mouse model of hepatic regeneration. The authors show that this hepatocyte proliferation is marked by expression of CD133 by groups of cells. The CD133 appears to be located on intracellular vesicles associated with microtubules. These vesicles are loaded with mRNA. The authors conclude that the CD133 vesicles mediate an intercellular signaling pathway that supports cell proliferation. These are new observations that have broad significance to the fields of regeneration and cancer.

      The primary observation is that the limited regeneration observed in livers with Ras/Erk signaling defects is associated with CD133 expression by groups of cells. The functional significance of CD133 was tested using Prom1 KO mice - the data presented are convincing.

      The major weakness of the study is that some molecular mechanistic details are unclear - this is, in part, due to the extensive new biology that is described. Nevertheless, the data used to support some key points in this study are unclear:

      We fully agree that some details of the molecular mechanisms are yet to be elucidated for the CD133+ vesicles (which we have named intercellsomes). This is the first report of a new direct cell-cell communication mechanism triggered as a stress response to a deficit in proliferative signaling.

      Despite a huge body of literature, many questions remain open for the molecular mechanisms of exosomes/EVs.

      a) What is the evidence that the observed CD133 groups of cells are not due to clonal growth? Is this conclusion based on the time course (the groups appear more rapidly than proliferation) or is this based on the GFP clonal analysis?

      This is indeed a very critical point for this study. Our initial thought and efforts were indeed on finding evidence that supports clonal expansion of progenitor cells. However, the experiments showed that the CD133+ cells were negative for all other stem/progenitor cell markers and that they are mature hepatocytes. CD133 expression was upregulated dramatically in regenerating livers and disappeared upon completion of liver regeneration. Furthermore, suppression of Ras-Erk signaling by Shp2 and Mek inhibitors robustly induced CD133 expression in a variety of cancer cell lines in culture in vitro.

      At 2 days after PHx, we already observed big colonies, which were unlikely to be derived from a single initiating cell (Figure 1). The GFP clonal analysis unambiguously demonstrated the heterogeneous origin of the clustered cells (Figure 3). We detected mixed GFP-positive and -negative cells within each colony, without a single colony consisting entirely of GFP-positive cells. The original colony sizes were estimated to be 10 cells or more (Figure 3G and Figure 3–figure supplement 1B). Thus, both the sizes and compositions in the GFP clonal analyses support the assertion that CD133+ cell clusters originated from multiple mature hepatocytes.

      b) What is the evidence that the CD133 vesicles mediate intercellular communication? This is an exciting hypothesis, but what is the evidence that this happens? Is this inferred from IEG mRNA diversity, or from some other data? Is there direct evidence of transfer? For example, does the GFP clonal analysis show transfer of GFP that is not mediated by clonal proliferation? Moreover, since the hepatocytes are isogenic, what distinguishes the donor and recipient cells?

      Increased clarity concerning what is hypothesis and what is directly supported by data - would improve the presentation of this study.

      Per the reviewer’s advice, we have clarified these points in the revised version. Our proposal that CD133 vesicles mediate intercellular communication was supported by these experimental results.

      A). Data in Fig. 5 suggest direct trafficking of the vesicles, as CD133 existed on the filaments that bridge the tightly contacting cells. This was confirmed by two different CD133 antibodies in mouse and human. We are now conducting correlative light and electron microscopy to characterize these bridges and the exchange event at the cell-cell border. Of note, CD133+ vesicles are negative for CD9, CD63 or CD81, markers for exosomes/EVs. We could only isolate CD133+ vesicles from cell lysates in vitro and mouse tissue lysates, but not from cell supernatants from which exosomes/EVs are isolated.

      B). More direct evidence of the transfer was presented in Fig. 6H, showing Myc-tagged CD133 molecules transferred from one cell to another. We are engineering a knock-in system to track the endogenous CD133.

      C). Further experimental evidence was provided in the single and double gene KO experiments in Fig. 8E-G, suggesting the functional significance of CD133 in intercellular communication.

      D). In addition to the data above, the IEG mRNA diversity analyses based on scRNA-seq support the mRNA exchange model. The isogenic CD133+ SKO hepatocytes were found to lack different IEG transcripts randomly. This is why we propose a mutually sharing model, rather than a donor and recipient model. Importantly, the mRNA diversity (entropy) model also illustrates the association of CD133 and “stemness", as described in the discussion.
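
      As a generic illustration of how transcript diversity can be summarized as an entropy, the sketch below computes the Shannon entropy of a single cell's transcript counts. It is not necessarily the measure used in the manuscript; the function name and the example counts are hypothetical. A cell with reads spread evenly across several IEGs scores high, while a cell in which all reads come from one IEG scores zero.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a vector of non-negative transcript counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Hypothetical per-cell counts for four IEG transcripts.
even_cell = [5, 5, 5, 5]     # all four transcripts equally represented
skewed_cell = [20, 0, 0, 0]  # three of the four transcripts missing
print(shannon_entropy(even_cell), shannon_entropy(skewed_cell))
```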

      In sum, we believe that the most reasonable interpretation of the current data set is a model of direct cell-cell communication via CD133+ vesicles. We take the reviewer’s point and have made changes to the text to better distinguish conclusions from hypotheses, which will be validated in future studies.

      Reviewer #2 (Public Review):

      The manuscript by Kaneko set out to understand the mechanisms underlying cell proliferation in hepatocytes lacking Shp2 signals. To do this, the authors focused on CD133 as the proliferating clusters of cells in the Shp2 knockout (SKO) livers are CD133 expressing. After excluding the contribution of progenitors that are CD133 to this cell population, the authors focused on the intrinsic regulation of CD133 by Met/Shp2 regulated Ras/Erk pathway and showed upregulation of CD133 to be a compensatory signal to overcome loss of Ras/Erk signal and suggested Wnt10a in the regulation of CD133 signal. The study then focused on the observed filament localization of CD133 in the CD133+ cluster of cells. The study went on to identify the CD133+ vesicles that contain primarily mRNA vs. microRNA like other EVs. Specifically, the authors identified several mRNA species that encode IEGs, indicating a potential role for these CD133+ vesicles in cell proliferation signal transmission to neighboring cells via delivery of the IEG mRNAs as cargos. Finally, they showed that the induction of CD133 (and by derivative, the CD133+ vesicles) is necessary for maintaining cell proliferation in the cell cluster with high proliferation capacities in the SKO livers; and in intestinal crypt organoids treated with Met inhibitors to block Ras/Erk signal.

      1) The identification of CD133+ vesicles is largely based on staining and costainings. Though the experiments are very well done with many controls and approaches, the authors may want to perform one or two key experiments with EM to definitively demonstrate the colocalization. For example, the mCherry experiment in Fig6H and the colocalization experiments for CD133 and HuR in Fig 7.

      Many thanks for the suggestion. We are now establishing a correlative light and electron microscopy system, as the classic immunogold staining method was not sufficient for this purpose. Further characterization of CD133+ vesicles is now a major focus of research in our lab, to establish, substantiate or modify the model and hypothesis presented in this article. The long-term goal is to elucidate how cells strive to proliferate under insufficient proliferative signal, which is likely relevant to drug resistance and tumor relapse.

      2) Since CD133+ marks the 50 nm intercellsome defined by the authors, it is unclear what the CD133- vesicles used as controls are. Are they regular EVs that are larger in size? This needs better clarification as they are used as a control for many experiments such as Fig 7A.

      Per the advice, we added more explanation to the revised text. We used regular EVs as the control, since they are the well-studied intercellular communication vesicles. Since the EVs are highly heterogeneous, we chose not to select a specific subpopulation of EVs. We used the well-established polymer-based precipitation method to isolate the EV fraction from cell culture supernatant for RNA-seq analysis. We did detect the enrichment of micro-RNAs in the isolated EVs, consistent with reports in the literature. Strikingly, the CD133 vesicles isolated from cell lysates showed a completely distinct RNA profile, relative to the EVs.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper raises an interesting question about learning signals. The most intriguing property of this system is the one-to-one convergence, plasticity, and apparently linear input/output function of the SFL-to-SAM relay. These properties suggest that, unlike structures like the insect mushroom body or mammalian cerebellum, in which the intermediate layer is thought to increase the dimensionality of the representation, the SAMs should be thought of more like the weights of a linear readout of the SFL inputs by the LNs.

      What learning signal guarantees appropriate weight changes? In a few places (the section on "associativity" and the section on AFs), it is suggested that SAMs can themselves, through coordinated local activity, cause LTP, which the authors call "self LTP-induction." But what is the purpose of such plasticity? It doesn't seem like it would permit, for example, LTP which associates a pattern of SFL activity with the appropriate LNs for the correct vs. the incorrect action. Presumably, appropriately routed information from the NMs and AFs sends the appropriate learning signals to the right places. Does the pattern of innervation of NMs and AFs reveal how these signals are distributed across association modules? Does this lead to a prediction for the logic of the organization of the association modules?

      We extended the discussion section to clarify some of these points. One paragraph describes our idea of “self-LTP induction” (L712-744). In addition, we address the potential role of the neuromodulatory fibers (NMs) and ascending fibers (AFs) in a paragraph titled "Perspective on the involvement of the ascending fibers (AF) and the neuromodulatory fibers (NM) in the supervision of learning" (L786). Answering how these signals manifest across different association modules requires a larger reconstruction.

      One challenge for a reader who is not an expert on the VL is that the manuscript in its present form lacks discussion about the impact (or hypothesized impact) of the VL on behavior. There is a reference to a role for LNs suppressing attack behavior, but a more comprehensive picture of what the readout layer of this system is likely controlling would be helpful.

      To contextualize how the VL circuitry can allow for the coincident detection of visual stimuli and environmental cues (punishing or rewarding) to control the stereotypic attack behavior of the octopus, we added two discussion sections: "Perspective on the VL involvement in octopus associative learning" (L774) and "Perspective on the involvement of ascending fibers (AF) and neuromodulatory fibers (NM) in the supervision of learning" (L786).

      The authors do a thorough job of characterizing the "fan-out" architecture from SFL axons to SAMs and CAMs. A few key numbers remain to characterize the "fan-in" architecture of LNs. There appears to be a 400:1 convergence from AMs to LNs. Is it possible to estimate the approximate number of presynaptic inputs per LN? The text around Figure 7 states a median of 162 sites per 100μm dendrite length. One could combine this with an estimate of the total dendritic length for one of these cells from previously available data to estimate the number of inputs per LN. This would help determine the degree of overlap of different association modules in Figure 11, which would be interesting from a computational perspective.

      Due to the limitations associated with a small EM volume, our study focused on the fan-out of the VL network. We agree that a better understanding of the fan-in part of the VL network is crucial. To the best of our knowledge, previous data have not provided estimates of the dendritic length of the LNs, due to low-resolution images or lack of 3D imaging (Hochner/Shomrat/Young experiment). We intentionally avoided making a largely inaccurate estimate of the fan-in part of the network based on our data. We believe that future research can aim to combine neuron labeling with EM, or other super-resolution techniques, to allow for detailed assessment of the large neuron arborization.

      Reviewer #2 (Public Review):

      Octopuses are known for their abilities in solving complex tasks and numerous apparently complex cognitive behaviours such as astonishment at octopuses learning how to open jars by watching others and the mind-boggling camouflage. They are very clever molluscs. The octopus shows the famously advanced brain plan but it is one that has little research progress due to its large size and structural complexity. This was originally recognised by the work of BB Boycott, JZ Young, EG Gray, and others in mid last century. Since then, however, little progress has been achieved towards a modern-day description of the octopus neural network particularly in the higher-order brain lobe, despite intense interest and indeed research progress concerning their complex behavioural and cognitive abilities.

      This study applied a combination of EM-based imaging, neural tracing, and analyses to start revealing a further detailed view of a part of the lateral gyrus of the vertical lobe (learning and memory centre) of the common European octopus. It is a long overdue contribution and starts to bring octopus neuroscience a step closer to the level of detail achieved in some vertebrates. The new findings of neurons and the associated network provide new insights into this very complex but unfamiliar brain, allowing the authors to propose a functional network that may link to octopus memory formation. Also, this work could be of potential interest to a broad audience of neuroscientists and marine biologists as well as those in bio-imaging and deep-learning fields.

      Strengths:

      Current knowledge of the neuroanatomy and the associating network of the octopus vertical lobe (learning and memory centre) remains largely based on the pioneering neuroanatomical studies in the '70s, this work indeed provides a rich and new dataset using modern-day imaging technology and reveals numerous previously-unknown neuron types and the resulting further complex network than we thought before. This new dataset reveals hundreds of cell processes from seven types of neurons located in one gyrus of the vertical lobe and can be useful for planning further approaches for advanced microscopy and other approaches including electrophysiological and molecular studies.

      Another strength of this study is to apply the current fashion of the deep learning technique to accelerate the imaging process on this octopus complex neural network. This could trigger some inventions to develop new algorithms for further applications on those non-model animals.

      Weakness/limitations:

      In an effort to match the key claims of the first connectome of the octopus vertical lobe, mapping up an entire vertical lobe is essential. However, also understandably, given challenges in imaging a large-sized brain region, this study managed to image a very small proportion of the anterior part of the lateral gyrus. Along with the current limited dataset, a partially reconstructed neural network of one gyrus, it is unclear whether the wiring pattern found in this study would appear as a similar arrangement throughout an entire lateral gyrus. Furthermore, it is also unknown if the other 4 gyri might keep a similar pattern of neural network as that found in the lateral gyrus. Considering some recent immunochemistry evidence that showed distinctly different signals in different gyri in terms of heterogeneity of neuron types amongst gyri, to assume this newly discovered network can represent the wiring pattern across an entire 5-gyrus vertical lobe is inadequate.

      We revised the introduction (L106-113) to address this important point, and added a discussion section titled "How well does a partial connectome of a small portion of one VL lateral lobule represent the connectivity patterns across all five VL lobuli?" (L894). We clarify what we believe is likely conserved across VL gyri and what is more likely to differ.

      As this study is the first big step to reveal the complex network in the octopus vertical lobe system, the title may be changed to "Toward the connectome of the Octopus vulgaris vertical lobe - new insights into a memory acquisition network".

      We appreciate the reviewer's suggestion for a new title for our manuscript. We feel, however, that the current title reflects the scope of our work and the significant step it makes toward understanding the neuronal network of the octopus vertical lobe. We do not claim to provide the complete connectome of the octopus VL, but rather to show how its connectomics reveals its workings and underlying principles. After deliberation, we decided to leave the title unchanged.

    1. Author Response

      Reviewer #1 (Public Review):

      In this study the authors sought to address the issue of whether the Steller's sea cow -- a massive extinct sirenian ("sea cow") species that differs from its living relatives (manatees and dugongs) not only in body mass but also in having inhabited cold climates in the northern Pacific -- had hemoglobin adaptations that enhanced the species' thermoregulatory capacities relative to those of the extant species, which are restricted to relatively warm waters. To do so, the authors synthesized recombinant hemoglobin proteins of all the major sea cow lineages and used these data to assess differences in O2 binding, Hb solubility, responses to allosteric effectors, and thermal sensitivity. The work presented is very innovative and in my opinion convincingly demonstrates that the Steller's sea cow had remarkable hemoglobin adaptations that allowed for an extreme range extension into cool waters despite several physiological constraints that are inherent to the sirenian (and paenungulate, afrotherian, etc.) clade. I did not detect any obvious weaknesses of the paper, whereas the use of ancient DNA to resurrect 'extinct' hemoglobins, and the various analyses of these extinct hemoglobins alongside those of extant relatives is very exciting and are major strengths of the paper that make this study a very important advance for our understanding of Steller's sea cow's paleophysiology, as well as our understanding of the potential for extreme hemoglobin phenotypes that have not been documented among living species. Moving forward, these methods can be used to study aspects of the paleophysiology of other recently extinct mammals. I applaud the authors on an excellent and innovative study that significantly augments our understanding of the Steller's sea cow.

      We sincerely appreciate the constructive comments of this reviewer.

      Reviewer #2 (Public Review):

      This manuscript is an impressive "resurrection" of physiology regarding an enigmatic though unfortunately extinct species, and their potential adaptation to cold-water environments. I am largely convinced of their findings, which I feel are very straightforward and thorough.

      One place where the authors perhaps fell a bit short was regarding some conclusions associated with maternal/fetal oxygen delivery. The sirenian versions of fetal & embryonic hemoglobin genes have been identified and assessed to some degree in previously published work by the same research group. I feel the manuscript would have benefited from actual analysis of the fetal & embryonic hemoglobin (epsilon, gamma, zeta) to strengthen their assertions.

      Again, we appreciate the kind words and valid concern of this reviewer regarding a potential shortcoming of the maternal/fetal gas exchange discussion. As noted above, we previously collected physiological data from two pre-natal Steller’s sea cow Hb isoforms that were initially intended to form a stand-alone publication on this topic. However, we have elected to include this data here to better support our claims and provide a more complete picture of maternal/fetal oxygen delivery in this extinct species.

      Reviewer #3 (Public Review):

      Signore et al. synthesized and functionally characterized the recombinant adult hemoglobin (Hb) proteins of extant, extinct, and ancestral sirenians to explore the putative role of Hb in helping Steller's sea cows adapt to life in extremely cold waters. The functional comparisons show that the Hb of the subarctic Steller's sea cows differs in multiple biochemical properties relative to the Hbs of the two extant sirenians in the study, the Florida manatee, and the dugong and also from the Hb inferred for the common ancestor of Steller's sea cow and dugong. Specifically, the Steller's sea cow shows reduced oxygen binding affinity, reduced sensitivity to the allosteric cofactors DPG, Cl-, and H+, increased solubility, and reduced thermal sensitivity. DPG plays an important role in regulating Hb oxygen affinity in mammals, and the lack of sensitivity to it is unique to the Hb of Steller's sea cow. Sequence comparisons show that the Hb of the Steller's sea cow differs at 11 amino acids from that of its sister group, the dugong, one of which is intriguing because it occurs in a position that is invariable among mammals at a site that is critical for DPG binding, a change from Lys to Asn in position 82 of the mature β/δ globin chain. To test the significance of this change, the authors use site directed mutagenesis to insert back a Lys in the Steller's sea cow Hb background (β/δ82Asn→Lys) and test its biochemical properties. The functional assays with the β/δ82Asn→Lys mutant indicate that reverting this position to its ancestral state drastically altered the biochemical properties of the Steller's sea cow Hb, making it functionally similar to the Hbs of manatee, dugong, and the Hb inferred for the common ancestor of Steller's sea cow and dugong.

      The study's strength lies in comparing the different recombinant Hbs in an explicit evolutionary framework. The conclusions are supported by the analyses, and the results are relevant in the fields of evolutionary biology, physiology, and biochemistry because they suggest that a single amino acid substitution in a protein can have profound biochemical consequences that impact whole organism physiology.

      We concur with the excellent synopsis of this reviewer. The finding that most of the functional differences between Steller’s sea cow and other sirenian Hbs can be attributed to a single amino acid replacement mirrors earlier sentiments of hemoglobin adaptation by pioneers in the field (e.g. Max Perutz). By contrast, more recent studies highlight the importance of multiple causative replacements of smaller effect and the significance of genetic background in hemoglobin evolution/adaptation (which is also evident for Steller’s sea cow Hb). We hope that the present work helps to bridge these two important evolutionary forces.

    1. Author Response

      Reviewer #3 (Public Review):

      The authors describe one family showing autoinflammatory phenotypes with the L236P variant of the TNFAIP3 gene. The variant has not been reported previously, and they evaluated its function using in vitro and in silico methods. I think this is a well-written manuscript and I agree with their interpretation of the pathogenicity of this variant, but the novelty is limited: the variant itself is the only new finding.

      I recommend the revision of the following points.

      In Table 2, T647P seemed to be pathogenic which was evaluated with in vitro assay by Kadowaki.

      The Kadowaki study indeed showed reduced NFκB activity for the Thr647Pro variant. This information has now been added to Table 2. However, the variant is highly frequent in the control population. According to the ACMG guidelines, the variant does not fulfil all the conditions to be considered pathogenic, as its allele frequency is not compatible with the disease frequency. Therefore, as we cannot conclude on the pathogenic effect of the variation, we have described it as a variant of unknown significance.

      Two other missense variants, V377I (Niwano, Rheumatology 2022) and T602S (Jiang W, Cellular Immunol 2022) were recently reported. These should be included in the discussion.

      We have analyzed all additional missense variations, including the V377M (we found the report of a variation involving V377, but it was V377M and not V377I) and T602S variations and have added them to Table 2.

    1. Author Response

      Reviewer #2 (Public Review):

      Granell et al. investigated genetic factors underlying wheezing from birth to young adulthood using a robust data-driven approach with the aim of understanding the genetic architecture of different wheezing phenotypes. The association of 8.1 million single nucleotide polymorphisms (SNPs) with wheeze phenotypes derived from birth to 18 years of age was evaluated in 9,568 subjects from five independent cohorts from the United Kingdom. This meta-genome-wide association study (GWAS) revealed the suggestive association of 134 independent SNPs with at least one wheezing subtype. Among these, 85 genetic variants were found to be potentially causative. Indeed, some of these were located nearby well-known asthma loci (e.g., the 17q21 chromosome band), although ANXA1 was revealed for the first time to play an important role in early-onset persistent wheezing. This was strongly supported by functional evidence. One of the top ANXA1 SNPs associated with wheezing was found to be potentially involved in the regulation of the transcription of this gene due to its location at the promoter region. This polymorphism (rs75260654) had been previously evidenced to regulate the ANXA1 expression in immune cells, as well as in pulmonary cells through its association as an eQTL. Protein-protein network analyses revealed the interaction of ANXA1 with proteins involved in asthma pathophysiology and regulation of the inflammatory response. Additionally, the authors conducted a murine model, finding increased anxa1 levels after a challenge with house dust mite allergens. Mice deficient in anxa1 showed decreased lung function, increased eosinophilia, and Th2 cell levels after allergen stimulation. These results suggest the dysregulation of the immune response in the lungs, eosinophilia, and Th2-driven exacerbations in response to allergens as a result of decreased levels of anxa1. This coincides with evidence of lower plasmatic ANXA1 levels in patients with uncontrolled asthma, suggesting this locus is a very promising candidate as a target of novel therapeutic strategies.

      Limitations of this piece of work that need to be acknowledged:

      (1) the manual and visual inspection of Locus Zoom plots for the refinement of association signals and identification of functional elements does not seem to be objective enough;

      This is an important observation, and we have now added the following text to the Discussion, which can be found on lines 400-402 of the Revised Main Manuscript:

      “Finally, the manual and visual inspection of Locus Zoom plots for the refinement of association signals and identification of functional elements was a subjective approach which might have undermined the findings.”

      (2) the sample size is limited, although the statistical power was improved by the assessment of very accurate disease sub-phenotype;

      This point was already mentioned as a limitation, and it can now be found on lines 349-365 of the Revised Main Manuscript:

      “By GWAS standards, our study is comparatively small and may be considered to be underpowered. The sample size may be an issue when using an aggregated definition (such as “doctor-diagnosed asthma”) but is less likely to be an issue when primary outcome is determined by deep phenotyping. This is indirectly confirmed in our analyses. Our primary outcome was derived through careful phenotyping over a period of more than two decades in five independent birth cohorts, and although comparatively smaller than some asthma GWASs, our study proved to be powered enough to detect previously identified key associations (e.g. chr17q21 locus). Precise phenotyping has the potential to identify new risk loci. For example, a comparatively small GWAS (1,173 cases and 2,522 controls) which used a specific subtype of early-onset childhood asthma with recurrent severe exacerbations as an outcome, identified a functional variant in a novel susceptibility gene CDHR3 (SNP rs6967330) as an associate of this disease subtype, but not of doctor-diagnosed asthma(51). This important discovery was made with a considerably smaller sample size but using a more precise asthma subtype. In contrast, the largest asthma GWAS to date had a ~40-fold higher sample size(7), but reported no significant association between CDHR3 and aggregated asthma diagnosis. Therefore, with careful phenotyping, smaller sample sizes may be adequately powered to identify larger effect sizes than those in large GWASs with broader outcome definitions(52).”

      (3) association signals with moderate significance levels but with strong functional evidence were found;

      We do not think of this as a limitation but as a strength. We were able to support our genetic results with evidence from experimental mouse models.

      (4) no direct replication of the findings in independent populations including diverse ancestry groups was described.

      This point was already mentioned as a limitation, and it can now be found on lines 375-391 and 392-399 of the Revised Main Manuscript:

      “We are cognisant that there may be a perception of the lack of replication of our GWAS findings. We would argue that direct replication is almost certainly not possible in other cohorts, as phenotypes for replication studies should be homogenous(56). However, there is a considerable heterogeneity in LCA-derived wheeze phenotypes between studies, and although phenotypes in different studies are usually designated with the same names, they differ between studies in temporal trajectories, distributions within a population, and associated risk factors(57). This heterogeneity is in part consequent on the number and the non-uniformity of the timepoints used, and is likely one of the factors responsible for the lack of consistent associations of discovered phenotypes with risk factors reported in previous studies(58). This will also adversely impact the ability to identify phenotype-specific genetic associates. For example, we have previously shown that less distinct wheeze phenotypes in PIAMA were identified compared to those derived in ALSPAC(59). Thus, phenotypes that are homogeneous to those in our study almost certainly cannot readily be derived in available populations. This is exemplified in our attempted replication of ANXA1 findings in PIAMA cohort (see OLS, Table E12). In this analysis, the number of individuals assigned to persistent wheezing in PIAMA was small (40), associates of this phenotype differed to those in STELAR cohorts, and the SNPs’ imputation scores were low (<0.60), which meant the conditions for replication were not met.”

      “Our study population is of European descent, and we cannot generalize the results to different ethnicities or environments. It is important to highlight the under-representation of ethnically diverse populations in most GWASs(9). To mitigate against this, large consortia have been formed, which combine the results of multiple ethnically diverse GWASs to increase the overall power to identify asthma-susceptibility loci. Examples include the GABRIEL(6), EVE(60) and TAGC(7) consortia, and the value of diverse, multi-ethnic participants in large-scale genomic studies has recently been shown(61). However, such consortia do not have the depth of longitudinal data to allow the type of analyses which we carried out to derive a multivariable primary outcome.”

      Nonetheless, the robustness and consistency of the findings supported by different analytical and experimental layers is the major strength of this study.

      The authors successfully achieved the aims of the study, strongly supported by the results presented. This study not only provides an exciting novel locus for wheezing with potential implications in the development of alternative therapeutic strategies but also opens the path for better-powered research of asthma genetics, focused on accurate disease phenotypes derived by innovative data-driven approaches that might speed up the process to disentangle the missing heritability of asthma, making use of still useful GWAS approaches.

    1. Author Response

      Reviewer #1 (Public Review):

      This study aimed to estimate contact parameters associated with the transmission of SARS-CoV-2 in unvaccinated South African households over one year. The authors found no correlation between the frequency or duration of contacts and infection risk. Similar parameters (e.g., sharing a room with the index patient) also failed to yield an association. Reassuringly, a robust association was found with the Ct of the index case; female sex and individuals aged 13-17 years were also associated with increased risk. In a more general analysis, obesity, age >5 and <60 y, and non-smoking status were associated with increased risk.

      Strengths of the study are its relatively large size (131 households involving 497 people) with detailed proximity data; frequent testing to enable high ascertainment of infections; and ability to exclude individuals seropositive at baseline. Additionally, several outcomes were evaluated in the models, partly to accommodate uncertainty in the index case. Different model structures were evaluated to gauge robustness.

      Limitations of the study include the fact that many index cases were likely enrolled after their infectious period, and it is possible that apparent secondary cases in the household arose from a shared exposure with the index case but had a longer latent period. Each of these factors could weaken the perceived effect of close contacts. Statistically, there is the vexing question of what age (gender, smoking, etc.) really represents mechanistically, and whether the models may be conditioning on a collider. Another statistical consideration is that many household contacts were excluded from the study because they were seropositive at baseline. In effect, their households may already have been "challenged" with the virus, and there may be heterogeneities in household susceptibility that are not fully considered by the simple exclusion of individuals with evidence of prior infection. Separating these household types in the analysis might have yielded different results.

      Although conditioning on a collider in the multivariable analysis is theoretically possible, it is important to consider these covariates, as they have been found to be associated with transmission in both this and previous studies. The fact that there is also no signal for the contact parameters on univariate analysis supports these findings.

      We added another sensitivity analysis that excluded any household in which individuals had been excluded due to seropositivity. The results showed that none of the contact patterns were significantly associated with SARS-CoV-2 transmission in the household.

      All that said, it is telling that in these households, infection is not clearly linked to typically defined close contacts. This is an important result that complements other strong evidence that aerosols are the dominant route of transmission for SARS-CoV-2. This information is critical for the design of effective intervention strategies. Additionally, the authors outline how future studies can be designed to improve on this work.

      Reviewer #3 (Public Review):

      The manuscript by Kleynhans et al analyzes data from household contacts of SARS-CoV-2 cases at two sites in South Africa. Proximity sensors were distributed to household members following diagnosis of the "index case" and measured the frequency and duration of close contacts (defined as being face-to-face within 1.5 meters for at least 20 seconds). The authors then examined the association between the duration, frequency, and average duration of contacts and the risk of a diagnosis of SARS-CoV-2 among household members in the subsequent two weeks, for both contact with the index case and all cases within the household. The risk of infection among household members was high (~60%), but was not significantly associated with the contact metrics examined. The findings may indicate that aerosols may be the predominant mode of SARS-CoV-2 transmission within households; however, there are also a number of limitations associated with the design and analysis of the study, which the authors acknowledge and which may limit the interpretability of the conclusions of this study.

      One important study limitation has to do with the design of the study: Sensors were not distributed to household members until a day or two after the diagnosis of the index case. Since individuals are most infectious with SARS-CoV-2 just prior to symptom onset, contact patterns were measured only after most transmission from the index case likely occurred. Furthermore, household members may have limited their contact with the index case, particularly if the index case attempted to isolate following their diagnosis, so the contact patterns measured are unlikely to be representative of typical mixing within the household.

      Another important limitation has to do with the analytical approach: The logistic regression model assumes that the first person in the household to test positive for SARS-CoV-2 (i.e. the index case) infected all subsequent cases within the household. However, this approach does not account for chains of transmission within the household or transmission from outside the household (possibly from the same source that infected the index case). While this concern is partially addressed by also assessing the association between the risk of infection and contact with all infected household members, more sophisticated methods could be used to infer the most likely infector of each case. The possibility of multiple introductions of the virus from outside the household is also only partially addressed by excluding households in which more than one variant was detected. While these limitations (and others) are appropriately acknowledged by the authors in the Discussion, nevertheless they limit the conclusions that can be drawn from the study results.

      We will be considering an analytic framework to distinguish the chains of transmission in future analyses.

      It is also worth noting that the contact metrics as defined and analyzed in the model may not be the measures that are most relevant to transmission. The authors examined three different contact measures: the median daily duration of contact, the median daily frequency of contact, and the median daily average duration of contact (i.e. the ratio of the two previous measures). They chose to examine the median daily values because contact duration was heavily skewed and the number of days of follow-up varied after data cleaning, but it may be that longer-duration contacts important to transmission are not appropriately captured by these metrics. Indeed, the median daily duration of the contact is quite short (only ~18 minutes on average). It would be useful to also evaluate a measure such as the total cumulative duration of contact and frequency of contacts divided by the number of days of follow-up, which differs from the measures they calculate and would take into account more prolonged and frequent contacts.

      Additional contact parameters have been added, as described in our response to Reviewer 2.
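      To make the distinction between the contact measures discussed above concrete, the following is a minimal sketch, not the study's actual analysis code. It assumes contact events are available as hypothetical (day, duration in seconds) pairs for one pair of household members; the function name, the input layout, and the choice to skip contact-free days when averaging are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import median


def contact_metrics(events, follow_up_days):
    """Summarise contacts for one pair of household members.

    events: iterable of (day, duration_seconds) tuples (hypothetical layout).
    follow_up_days: list of every day with valid sensor data, including
    days on which no contact was recorded.
    """
    per_day = defaultdict(list)
    for day, seconds in events:
        per_day[day].append(seconds)

    # Daily totals and counts; days without contact contribute zeros.
    daily_duration = [sum(per_day[d]) for d in follow_up_days]
    daily_frequency = [len(per_day[d]) for d in follow_up_days]
    # Mean length of a single contact per day; assumption: days with no
    # contact are skipped because the mean is undefined there.
    daily_mean = [sum(per_day[d]) / len(per_day[d])
                  for d in follow_up_days if per_day[d]]

    n_days = len(follow_up_days)
    return {
        # Median-based measures, as described in the review.
        "median_daily_duration_min": median(daily_duration) / 60,
        "median_daily_frequency": median(daily_frequency),
        "median_daily_mean_duration_min": median(daily_mean) / 60,
        # Cumulative measures normalised by days of follow-up, which
        # retain the influence of a few long or frequent contacts.
        "cumulative_duration_per_day_min": sum(daily_duration) / n_days / 60,
        "cumulative_frequency_per_day": sum(daily_frequency) / n_days,
    }


# Hypothetical example: one long contact on day 2 dominates the total.
example = contact_metrics(
    events=[(1, 30), (1, 90), (2, 600), (3, 20)],
    follow_up_days=[1, 2, 3, 4],
)
```

      In this made-up example the single 10-minute contact barely moves the median daily duration but dominates the cumulative duration per day, which is the behaviour the reviewer's suggested measure is meant to capture.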

      Lastly, the measures of association reported in the manuscript are the odds ratios (ORs) associated with one additional second of contact per day. This is not a very biologically meaningful unit of measure, and when rounded to two significant digits, the ORs are not surprisingly 1.0 with 95% confidence intervals that also round to 1.0. It would be more interpretable to report the ORs associated with a 1-minute (rather than 1-second) increase in the duration of contact, and the biological interpretation of the ORs should be described in the text.

      All time-based contact parameters are now expressed in minutes.
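      As an illustration of why the unit matters: in a logistic regression the log-odds are linear in the exposure, so an odds ratio per second converts to an odds ratio per minute by raising it to the 60th power, and the confidence limits rescale the same way. The sketch below uses a made-up per-second odds ratio, not a value from the study, and the function name is hypothetical.

```python
import math


def rescale_odds_ratio(or_per_unit: float, factor: float) -> float:
    """Convert an odds ratio per 1 unit of exposure into an odds ratio
    per `factor` units, assuming log-odds are linear in the exposure."""
    return math.exp(factor * math.log(or_per_unit))


# Hypothetical per-second odds ratio that would round to 1.0 when reported.
or_per_second = 1.0005
or_per_minute = rescale_odds_ratio(or_per_second, 60)  # about 1.03
```

      An effect that is invisible after rounding on the per-second scale becomes readable on the per-minute scale, without any change to the underlying model fit.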

    1. Author Response

      Reviewer #1 (Public Review):

      Rapan et al. analyzed the cytoarchitecture of the prefrontal cortex based on observer-independent analysis, confirming previous parcellations based on cyto-, myelo-, and immunoarchitectonic approaches, but also defining novel subdivisions of areas 10, 9, 8B, and 46, and identified the receptor density "fingerprint" of each area and subdivision. Furthermore, they analyzed the functional connectivity of the prefrontal cortex with caudal frontal, cingulate, parietal, and occipital areas to identify specific features for the various prefrontal subdivisions. Altogether, this study corroborates previous parcellations of the prefrontal cortex, adds new cortical subdivisions, and provides a neurochemical description of the prefrontal areas useful for comparative considerations and for guiding functional and clinical studies.

      Strengths:

      • This study provides a detailed cytoarchitectonic map of the prefrontal cortex enriched with receptor density and functional connectivity data.

      • The authors shared the data via repositories and applied their map to a macaque MRI atlas to further facilitate data sharing.

      Weaknesses:

      • The temporal cortex should be included in the functional connectivity analysis as it is known from anatomical studies that most prefrontal areas display rich connectivity with temporal areas. The aim of creating a comprehensive view of the frontal cortex makes the manuscript data-rich but cursory in discussing the relevant anatomical and functional literature.

      One of the main concerns pointed out by the reviewers was that the functional connectivity analysis is incomplete without temporal lobe areas. Although our initial decision was to use only our parcellation scheme, we fully agree with the reviewers. Thus, we have extended our functional connectivity analysis and combined our frontal, parietal, cingulate and occipital parcellations with temporal areas as defined in the atlas of Kennedy and colleagues (Markov et al., 2014). In the revised version of the manuscript, old Figures 13-17, and related Supplementary Figures 12 and 13, have been replaced with new Figures 12-15 and Figures 12-15 – Figure supplements 12, 13 and 14, and the results are described in the updated Results, chapter 3.4 Functional connectivity analysis. The Discussion has also been adjusted to reflect the updated results.

      Reviewer #2 (Public Review):

      Rapan and colleagues did perform an impressive multi-modal parcellation of the macaque frontal cortex. In addition to qualitative cytoarchitectonic and resting-state functional fMRI data analyses, the authors based their parcellation on quantitative receptor density analysis of 14 receptors. Compared with the classic Walker map of the macaque frontal cortex, the authors produced a more refined map. Those results should be discussed in light of previous work on the same topic (Petrides et al. 2012 Cortex; Reveley et al. 2017 Cerebral Cortex; Saleem and Logothetis 2012).

      In the Discussion, under chapter “4.1 Comparison with previous architectonic maps of macaque prefrontal region” (pages 44-52), we compared our parcellation to previously published maps, including the work of Petrides and colleagues (i.e., Petrides and Pandya 1984, 1994, 1999, 2002, 2006, 2009; Petrides 2000, 2005; Petrides et al. 2012). With the exception of Caminiti et al. (2017), which integrates work by Belmalih et al. (2009), Borra et al. (2011, 2019) and Gerbella et al. (2010, 2013), we had restricted our citations to original mapping studies because we find it important to discuss their reliability and objectivity, since they have been widely used in tract-tracing and neuroimaging studies, as well as in the parcellation maps depicted in 3D atlases. Indeed, Saleem and Logothetis (2012) use the maps of Carmichael and Price (1994), Petrides and Pandya (1999, 2002), Petrides (2005) and Preuss and Goldman-Rakic (1991) for the parcellation of the prefrontal cortex in their atlas, and Reveley et al. (2017) use the map of Saleem and Logothetis (2012) in their 3D atlas. We now provide this information in the Introduction (lines 126-132):

      “In recent years, several digital macaque atlases have been created (Bezgin et al., 2012; Frey et al., 2011; McLaren et al., 2009; Moirano et al., 2019; Reveley et al., 2017; Van Essen et al., 2012) based on the previous parcellations. Indeed, maps of Carmichael and Price (1994), Petrides and Pandya (1999, 2002), Petrides (2005) and Preuss and Goldman-Rakic (1991), used in atlas of Saleem and Logothetis (2012), have been brought into stereotaxic space by Reveley et al. (2017).”

    1. Author Response

      Reviewer #1 (Public Review):

      The authors have studied the mechanism of maturation of the retroviral lattice, of which the Gag polyprotein is the main component. The Gag polyprotein is common to all retroviruses and makes up most of the observed lattice underlying the virion membrane. Within the lattice, 95% of the monomers are Gag, and 5% are Gag-Pol, which has the 6 domains of Gag followed by protease, reverse transcriptase and integrase domains (coming from Pol) embedded within the same polyprotein. For the maturation and infectivity of the HIV retrovirus, the Gag proteins within the immature lattice must be cleaved by the protease formed from a dimer of Gag-Pol. Importantly, the lattice covers only 1/3 to 2/3 of the available space on the membrane. The incompleteness of the lattice results in a periphery of Gag monomers with unfulfilled intermolecular contacts. Recently, the structure of the immature lattice has been partially resolved using sub-tomogram averaging cryotomography (cryoET), and it has been shown that the incompleteness of the lattice provides more accessible targets for the protease (Tan A. et al. 2021). Based on these findings, the authors asked: does the incompleteness of the lattice allow for dynamic rearrangements that ensure that protease domains embedded within the lattice can find one another to dimerize and activate? To answer this, they started from experimental cryoET data and used reaction-diffusion simulations of assembled Gag lattices with varying energies and kinetic rates to test how lattice structure and stability can support the dimerization of the Gag-Pols. They found that although Gag-Pols represent only 5% of the monomers that assemble into the lattice, the stochastic assembly ensures that at least one pair of them is adjacent within the lattice. They next showed that if the molecules are distant from one another, they would need to detach, diffuse, and reattach stochastically at the site of another Gag-Pol molecule.
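      A back-of-the-envelope calculation illustrates why a 5% Gag-Pol fraction can still yield adjacent pairs. The sketch below is not the authors' reaction-diffusion model: it assumes each monomer is independently Gag-Pol with probability 0.05 and that neighbouring pairs are independent, and the number of neighbouring pairs used in the example is a hypothetical placeholder rather than a value from the manuscript.

```python
def prob_at_least_one_gagpol_pair(n_pairs: int,
                                  gagpol_fraction: float = 0.05) -> float:
    """Probability that at least one of `n_pairs` neighbouring monomer
    pairs consists of two Gag-Pol molecules.

    Assumptions (illustrative only): every monomer is Gag-Pol
    independently with probability `gagpol_fraction`, and pairs are
    treated as independent of one another.
    """
    p_pair = gagpol_fraction ** 2          # both partners are Gag-Pol
    return 1.0 - (1.0 - p_pair) ** n_pairs


# Hypothetical lattice with 1,000 neighbouring pairs.
p_adjacent = prob_at_least_one_gagpol_pair(1000)  # about 0.92
```

      Under these simplifying assumptions, a single pair is unlikely to be Gag-Pol on both sides, yet a lattice with many neighbouring pairs is likely to contain at least one such pair.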

      I consider the work very interesting, which could contribute to a very important aspect of retroviruses maturation such as their infectivity. However, the observations made by the authors do not necessarily answer their initial question which seemed to be focused on studying the possible role of the incompleteness of the lattice on the protease activation rather than the mechanism of Pol activation itself. Maybe this is only a nuance to be polished in the writing.

      The weakness of the work comes both from the fact that the entire study was done by computational methods and from the exclusion, in the computational approach, of well-known cellular components with a role in retrovirus maturation. The latter may reflect a choice to keep the models as simple as possible, since handling atomistic models is already a heavy task. Complementary molecular or structural studies might strengthen the results.

      We appreciate Reviewer #1’s comments and interest in our work. In the revised manuscript, we have clarified our writing to emphasize that our primary goal was to quantitatively interrogate the mechanism by which dimerization of two protease domains can occur, as it is essential for activation of the protease and the subsequent maturation process. We do not address the steps that follow the essential dimerization process. The molecular details concerning how this (dimerized) protease enzyme initiates cleavage and ultimately cleaves the entire lattice off the membrane would be the subject of a separate study (see page 4).

      Regarding the concern about the computational nature of our work, we agree that the model is a (necessarily) simplified representation of the true biological system, and as we elaborate further in the Discussion section, including more components in the model would strengthen the connection to experiment, which we plan to do in future work (see page 26). However, we have not relied solely on our computational results, but have made several direct and quantitative comparisons to experimental structural (now included), biochemical, and imaging data. We have also validated the parameters of our model against theory.

      We agree that it would be interesting and worthwhile to also make more direct comparisons to more molecular models of the immature lattice, particularly in future work with the inclusion of more specific co-factors like IP6, which we discuss in the Discussion section. We show that, already in its current form, our model provides new quantitative evidence on the mechanisms that would allow two protease domains to find one another to initiate maturation, in a way that is consistent with structural and biochemical data. (See revisions on pages 26 and 28 of the manuscript.)

      Reviewer #2 (Public Review):

      Immature lattice assembly remains an arcane topic, and these simulations provide high resolution data such as assembly kinetics and large-scale lattice rearrangement. Further, the authors extend their model to compare directly with experiments, e.g. SNAP-HALO dimerization, which provides a basis to interpret their conclusions. The manuscript is difficult to read, as it is a technical manuscript that overuses jargon; overall, it seems written for a specialized audience. Additionally, there are several aspects of the model design that remain opaque, such as the implicit lipid method and the suppression of multi-site nucleation. Further, analyses such as time auto-correlation and mean first passage time are not given much context by the authors. Altogether, it is the opinion of this reviewer that several revisions to the manuscript should be incorporated to improve clarity and strengthen the significance of the authors' efforts.

      We appreciate the constructive comments from Reviewer #2. We've revised the text in multiple places to minimize the use of technical jargon and provide clearer explanations for specialized terms or concepts. Specifically, we've provided more detailed descriptions of the implicit lipid method and the rationale behind the suppression of multi-site nucleation to help readers better understand our model design. Additionally, we've added more context for the analyses such as time auto-correlation and mean first passage time, describing their significance and relevance to our study.

      Reviewer #3 (Public Review):

      The manuscript concerns the cleavage of the Gag polyprotein lattice from the HIV virion membrane, a key stage in HIV lifecycle, and one that is required for HIV to become infectious. Since cleavage requires homodimerization between the small fraction (5%) of such Gag polyproteins that carry a protease domain, referred to as Gag-Pol, this raises questions regarding how such homodimerization can take place, and whether it can happen on the required timescales, given that Gag-Pol is typically embedded in a lattice that is observed to form one large connected component.

      The authors address these questions in silico, using particle-based reaction diffusion simulations. Such simulations are rigid-body and "structure-resolved" meaning that they rigidly incorporate the geometry of the polyproteins, and their various binding interfaces, based on existing structural data. Other aspects of the simulations are also in-line with available data, including copy numbers, lattice curvature, and dissociation rates. This focused approach is a strength of the work and allows the authors to make credible claims that their simulations have relevance to HIV (as does their commitment to comparison with HALO-SNAP-based measurements of dimerization kinetics as well as iPALM experiments that characterize lattice dynamics).

      A central part of the model is that it allows for the "possibility of imperfect alignment of molecules in the lattice", presumably due to the incompatibility of regular hexagonal tiling and surfaces with non-zero Gaussian curvature, such as a sphere. This is implemented via the ad-hoc imposition of a free-energy penalty when complete hexamers are formed, implying that hexamers are less stable than six ideal bonds. By varying this strain penalty, the authors can change the stability of the lattice independently of individual binding affinities, allowing its use as an effective fitting parameter when comparing to HALO-SNAP data. In the latter case, agreement between simulation and experiment can only be found at moderate levels of lattice stability.
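      The effect of such a penalty can be sketched in a few lines. The following is an illustrative reading of the description above, not the authors' implementation: it assumes the free energy of a closed hexamer is the sum of six identical bond free energies plus a positive strain term, with all energies in units of k_BT. The function names and the example values are hypothetical.

```python
import math


def hexamer_free_energy(dg_bond_kT: float, dg_strain_kT: float,
                        n_bonds: int = 6) -> float:
    """Free energy of a closed hexamer (units of k_B*T), assumed to be
    n_bonds ideal bonds plus a strain penalty for closing the ring."""
    return n_bonds * dg_bond_kT + dg_strain_kT


def strain_weight(dg_strain_kT: float) -> float:
    """Boltzmann weight of the strained hexamer relative to a hexamer
    made of six ideal bonds; values below 1 mean reduced stability."""
    return math.exp(-dg_strain_kT)


# Hypothetical values: each bond contributes -10 k_B*T, strain adds +5 k_B*T.
dg_hex = hexamer_free_energy(-10.0, 5.0)   # -55.0 instead of the ideal -60.0
weight = strain_weight(5.0)                # well below 1
```

      Varying the strain term changes the stability of complete hexamers while leaving the individual bond free energy untouched, which is what allows it to act as a fitting parameter independent of the pairwise binding affinities.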

      However, such energetic penalties are present whenever the polyprotein structure must undergo deformations which, on surfaces with nonzero Gaussian curvature, should be the case for partial tilings as well as complete ones (where all six interfaces form bonds). This, therefore, appears to be a weakness of the work. An elastic implementation of polyprotein structure, for example, would permit strain to accumulate (and therefore stresses to propagate) throughout the lattice naturally, irrespective of whether complete hexamers were formed, and might reasonably be expected to impact the likelihood of different lattice structures. Whilst it is not clear how or whether this would lead to qualitatively or quantitatively different results, it is nevertheless worth remarking upon since the authors' high-level claim is that lattice structure is an important determinant of the mean first-passage times to dimerization.

      Overall, I find this to be a valuable study, carried out in a solid and comprehensive manner. The primary impact of the work appears to be twofold: the unification of different experimental measurements under a single model, and the further identification of the salient parts of that model that most impact biological function. The results advance the understanding of one of the steps of the HIV lifecycle, via a better description of the mechanisms underpinning Gag-Pol dimerization. Notably, the authors stop short of drawing parallels to many related concepts and models in statistical physics, such as those concerning percolation and diffusion limited aggregation as well as the notions of dislocations and defects in crystalline matter on curved surfaces. These might reasonably have provided a basis for better understanding and quantification of the authors' simulations, as well as improving the scope for extensions and conceptual clarity.

      We appreciate the constructive and positive comments from Reviewer #3. We have revised the text and expanded the discussion given the points the reviewer has brought up.

      We have quantified the number and organization of incomplete hexamers or defects in the simulated lattices. This allows us to compare with experimental structures and also quantify how the assembly parameters would impact this organization. As we now remark on pages 10-11, with reversible binding during assembly, we see that fewer defects are present in the lattice, indicating that the hexameric lattice can improve its organization and stability when unbinding reactions can correct for weak contacts in the lattice. We thus speculate that because our lattices are statistically in good agreement with experiment even when binding is irreversible, that the assembly process does not rely on a significant amount of annealing. Otherwise the lattice structures would be more ideal.

      We further discuss on page 27 that a model that also incorporates forces to control interactions would be important to measure the mechanical stability of the lattice, particularly as it couples to membrane bending. However, models that naturally incorporate forces (via interaction potentials) can be difficult to tune with respect to their kinetics and free energies, which are nontrivial to calculate, unlike our modeling approach here.

      In these sections we have now drawn parallels, as suggested by the reviewer, to the literature on defects in crystalline matter on curved surfaces, and their possible consequences for the mechanics of the lattice (Ref 41 Negri et al, Deformation and failure of curved colloidal crystal shells. Proc Natl Acad Sci U S A, 2015. 112(47)) (Ref 59 Zandi, R. and D. Reguera, Mechanical properties of viral capsids. Phys Rev E Stat Nonlin Soft Matter Phys, 2005. 72(2 Pt 1): p. 021917.)

      We did not include a connection to diffusion-limited aggregation, as this process seems to result in fractal-like structures that lack the specific hexagonal order of the Gag protein lattice. The proteins impose a significant orientational order on the assembly process that makes growth significantly more compact, at least under the assembly conditions we used. Even for our fastest rates, the process is still only moderately diffusion influenced (ka << 10^9 M^-1 s^-1), with typically multiple collisions needed before successful binding occurs, consistent with most protein-protein interactions.

    1. Author Response

      Reviewer #2 (Public Review):

      The manuscript by Mohebi et al. examines a critical open question regarding the interaction of cholinergic interneurons of the striatum and transmitter release from dopaminergic axons in behaving animals. Activation of cholinergic interneurons in the striatum can evoke dopamine release in brain slices and in vivo as measured with voltammetry. However, it remains an open question in what context and to what extent this acetylcholine-mediated dopamine occurs in behaving animals. Here, the authors argue that CIN activity triggers dopamine release in the nucleus accumbens which encodes the motivation to obtain a reward through increasing "ramps" of dopamine release. Their data suggest that the ramps are not reflected in the firing of dopaminergic neurons. Rather, they provide compelling evidence that the ramps of dopamine release correlate with ramps in cholinergic interneuron activity as measured with GCaMP6. What's more, the authors show that ACh-mediated dopamine release has no paired-pulse depression, a striking result that differs from all prior ex vivo brain slice data. The manuscript is extremely well written and the data are of very high quality. Overall, this study represents an important step forward in our understanding of how ACh-mediated dopamine release regulates behavior, and more broadly how axons can generate behaviors independently from somatic activity.

      Major comments

      1) The complete absence of any short-term plasticity in CIN-mediated dopamine release is a striking result that is important for the field. The authors should strengthen this result with additional quantitative analysis demonstrating the lack of STP. They have analyzed paired-pulse ratios, but they should analyze this for stimuli at the higher frequencies (4 Hz, etc) that are more physiologically relevant. For example, Fig 1e shows a CIN-evoked DA release at many optically-stimulated frequencies. The authors should quantify short-term plasticity by generating fits of the single stimulus signal and comparing the mathematical sum predicted from 4 stim DA signals at different frequencies to the recorded data. A similar analysis has been done with Ca signals (Koester and Sakmann, 2000).

      Thank you for this very helpful suggestion. We have performed this analysis as recommended, and now confirm the lack of STP even at the higher frequencies (see new Supplementary Figure 1).
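
      The linear-summation check requested by the reviewer can be sketched as below. This is only an illustration under stated assumptions: the difference-of-exponentials kernel, its time constants, and the function names are hypothetical stand-ins for a fit to the measured single-stimulus dLight response, and the "recorded" trace is synthetic.

```python
import math

def single_pulse_response(t, amplitude=1.0, tau_rise=0.05, tau_decay=0.5):
    """Hypothetical fitted single-stimulus kernel (difference of exponentials); zero before t = 0."""
    if t < 0:
        return 0.0
    return amplitude * (math.exp(-t / tau_decay) - math.exp(-t / tau_rise))

def predicted_train(times, n_pulses, freq_hz, kernel=single_pulse_response):
    """Linear prediction: sum of kernel copies shifted by the inter-pulse interval."""
    interval = 1.0 / freq_hz
    return [sum(kernel(t - k * interval) for k in range(n_pulses)) for t in times]

def summation_index(recorded, predicted):
    """Recorded peak over predicted peak: about 1 means no STP, below 1 depression, above 1 facilitation."""
    return max(recorded) / max(predicted)

# Synthetic example: 4 pulses at 4 Hz sampled every 10 ms for 3 s.
times = [i * 0.01 for i in range(300)]
predicted = predicted_train(times, n_pulses=4, freq_hz=4.0)
recorded_linear = list(predicted)                  # a response that sums perfectly
recorded_depressed = [0.5 * v for v in predicted]  # a response at half the linear prediction
index_linear = summation_index(recorded_linear, predicted)
index_depressed = summation_index(recorded_depressed, predicted)
```

      Comparing peaks is one simple choice; a comparison of the full time courses (for example, a residual between recorded and predicted traces) would serve the same purpose.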

      2) The authors show that optical activation of CINs results in DA release as measured by dLight. To clearly establish that these signals are generated by DA release driven by nicotinic receptors (and not a partial effect of some unknown artifact), it would be useful to show that the optical CIN-evoked dLight signals shown in Fig. 1 are inhibited by nicotinic receptor antagonists such as DHbE. This control experiment would significantly strengthen the result shown here.

      We agree that combining drug manipulations with photometry would be useful, but as noted above this is not a methodology in our current technical repertoire.

      3) Similarly, the authors show clear correlations between CIN activity and DA release during behavior. The authors should consider determining whether CINs play a causal role in triggering DA release during behavior. For example, does infusion of DHbE in the NAc prevent the light-mediated DA release during behavior? As an alternative hypothesis, some groups have been suggesting that CIN activity has almost no direct influence over DA. Therefore, testing whether a causal relationship exists between CINs and DA release would be an important experiment in addressing these two opposing viewpoints.

      As noted above we are not currently able to combine drug manipulations with photometry in behaving animals.

      4) The ramps that are described in this manuscript are an order of magnitude faster (increasing over 100s of milliseconds) than ramps described in other studies that occur over seconds. In fact, the two signals may be completely different functionally. Discussion of this topic would be helpful.

      Dopamine ramps have indeed been reported over multiple different time scales, and as discussed in Berke 2018, this seems to reflect the duration of the approach behavior. We think further discussion of this topic is better saved for another paper, especially as we are now actively studying ramping over longer time scales (Krausz et al. 2023).

      Reviewer #3 (Public Review):

      This report by Mohebi et al. provides new answers to old questions by showing that the activity of striatal cholinergic interneurons (CINs) escalates progressively during specific reward-related behaviors and that this correlates with previously observed ramps in dopamine (DA) release in the nucleus accumbens core. The report is strong and provides evidence for the authors' hypothesis that DA ramps are independent of DA neuron activity, but are instead the result of CIN activity and corresponding acetylcholine (ACh) release. The authors further demonstrate that the fidelity of CIN activation and consequent driving of DA release is even more robust in vivo than observed in ex vivo slice preparations, which is fundamental for understanding the role of ACh-DA interactions in behavior. The findings complement the authors' previous evidence that ventral tegmental area (VTA) DA neuron firing patterns do not show a ramping pattern; the previously reported VTA data are appropriately included here (in Fig. 3) to illustrate the absence of VTA firing during the time-locked increases in CIN activity and DA release. The present studies stop short of showing a direct link between CIN activity and DA release, however, which would require examining DA release during behavior in the presence of an antagonist of nicotinic ACh receptors. The authors also extend the understanding of the regulation of DA release by acetylcholine (ACh) by showing that optical activation of CINs in vivo promotes DA release responses that do not attenuate with repetitive stimulation. This contrasts with previous results in ex vivo striatal slices in which ACh-evoked DA release has been found to decline progressively from rundown and/or receptor desensitization. The authors propose that in vivo, AChE may be more effective in curtailing local ACh levels than in slices because of the slightly lower temperature typically used for slice studies, as well as the use of superfusion that might facilitate some AChE washout (AChE inhibitors are still effective in slices, of course). Overall, the report not only provides evidence for the cellular substrate for DA ramps but also shows the robustness of ACh-driven DA release in vivo. A few points to strengthen the report are listed below.

      1) The authors give a few details about how CINs were activated at the beginning of the results, but say only that DA dynamics were monitored using fiber photometry. Given that the methods are at the end, a brief summary should be given here to indicate whether this means direct monitoring of DA or indirect via GCaMP, for example. It would be helpful to note the sensor used in the abstract, as well. In this light, as it were, RdLight1 should be described upon the first mention.

      We have now clarified in both abstract and text that we are using the direct DA sensor RdLight1.

      2) The authors show that infusion of DHbE in the NAc decreased the likelihood of decisions to approach the center port, as did antagonism of DA receptors. This supports the authors' argument that ramping of CIN activity and consequent ACh release underlies observed ramps in DA release. However, to show a causal interaction requires testing whether the observed DA ramps are absent after DHbE infusion in the NAc, under the same conditions that attenuated behavior.

      As noted above we are not currently able to combine drug manipulations with photometry in behaving animals.

      3) In Fig. 3, the y-axis title for the upper panels should specify VTA, not simply "rate". This is stated in the legend, but should also be specified in the figure panel.

      We have updated the y-axis titles in this figure.

      4) A recent preprint in BioRxiv by AC Krok, NX Tritsch et al. shows a related correlation between ACh and DA release in vivo in a reward task, as well as differences in other conditions. This report shows also that cortical input to CINs indeed plays a role, as suggested in the concluding sections of the present report. Consideration of the data in the preprint in the context of the present results could be valuable for the field.

      We have also noted those pre-prints with interest, even though they investigated different brain regions using different approaches. There are established differences between CIN-DA interactions in dorsal vs. ventral striatum that we suspect are relevant here. But given the rapid pace of developments in this subfield, we prefer not to speculate too much at this point and instead review the overall body of work once it is published.

    1. Author Response

      Reviewer #1 (Public Review):

      “The abstract does not adequately summarize the content of the paper. There is no mention of stimulation, or bilateral connectivity, which is a large part of the paper. The names of all five species should appear in the abstract, not just X. laevis.”

      In the revised manuscript, we have included all the names of the species and types of stimuli used to elicit fictive vocalizations in the abstract. In regard to bilateral connectivity, we believe that the reviewer was referring to the rostral-caudal connections between the parabrachial nucleus and nucleus ambiguus, which are critical for fast, but not for slow trill production. We have added this piece of information in the abstract. Furthermore, we have clarified the bilateral nature of the two central pattern generators (CPGs) in male X. laevis. In our previous study (Yamaguchi et al., 2017), we demonstrated that transections of the two commissures (one at the parabrachial nuclei level and the other at the nucleus ambiguus level) did not eliminate fictive advertisement calls in male X. laevis brains, indicating the presence of fast and slow trill CPGs in left and right hemi-brains of male X. laevis. This information was originally included in the results section (“Unilateral transection desynchronizes the fast clicks, but not the slow clicks across species”). We have now added this information to the introduction section to provide a clearer description of the anatomical organization of the two CPGs (p5, ln10 – 14).

      “The conclusion that the "fast and slow CPGs identified in male X. laevis are conserved across species." is contradicted by the last paragraph, which states, "Fast trill-like CPGs are likely present only in fast clickers..." This inherent contradiction needs to be resolved.”

      To resolve this contradiction, we have revised sentences in the abstract to clarify our findings. Specifically, we now state that “We found that even though the courtship calls of different Xenopus species vary in their click rates and duration, the CPGs used to generate clicks are conserved across species. The fast CPGs found in male X. laevis, which critically rely on reciprocal connections between the parabrachial nucleus and the nucleus ambiguus, are conserved among species that produce fast clicks. Similarly, the slow CPGs found in the caudal brainstem of male X. laevis are shared among species that produce slow clicks” (p2, ln 10 – 15). By making this change, we hope to provide greater clarity regarding our findings and to resolve any contradictions.

      “The testosterone results are over-emphasized.” “The conclusion that there is differential expression of testosterone receptors in the brain of each species is completely speculative and not supported by the data presented here.”

      We have extensively revised our manuscript to ensure a more accurate interpretation of the results regarding testosterone experiments. Revised conclusions are outlined below:

      Abstract: “In addition, our results suggest that testosterone plays a role in organizing fast CPGs in fast-click species, but it does not appear to have the same effect in slow-click species.” (p2, ln 15 – 17)

      Introduction: “Additionally, we found that fast trill-like CPGs are present only in species that produce fast clicks and their presence appears to be regulated by testosterone in these species.” (p6, ln 2 – 4)

      Discussion: “However, this effect of testosterone appears to be limited to the fast clicker species. Male X. tropicalis, a slow clicker species, has been shown to have comparable plasma levels of testosterone to male X. laevis (mean plasma levels of testosterone of male X. laevis: 13 to 22 ng/ml (Hecker et al., 2005; Hayes et al., 2010), male X. tropicalis: ~20 ng/ml (Olmstead et al., 2009)), yet the synapses between the PBN and laryngeal motoneurons in male X. tropicalis remained weak, and PBN showed little activity during fictive advertisement calls. These results suggest that testosterone acts differently on the central vocal pathways of fast and slow clickers, promoting the emergence of fast trill-like CPGs in X. laevis but not in X. tropicalis. Although further experiments with controlled testosterone levels are necessary to validate these results, we hypothesize that changes in the androgen receptors (e.g., expression patterns, ligand affinity) may have contributed to the divergence of fast and slow clickers.” (p26, ln 13 – 24)

      “The use of the word "development" implies embryology. Here, adults were treated and looked at 13 weeks later. There is no data presented about development.”

      In our revised manuscript, we have replaced the term “development” with “presence” or “acquisition” of neural circuitry to enhance clarity and prevent any potential misunderstandings.

    1. Author Response

      Reviewer #1 (Public Review):

      By the in vitro DNA damage response (DDR) assay with a defined DNA substrate using Xenopus extracts and in vitro binding assays with purified proteins, the authors nicely showed the role of APE1 (APEX1) in ATRIP recruitment for DDR activation, particularly a non-enzymatic (structural) role of APE1 in the binding to both ssDNAs and ATRIP. The results described in the paper are very convincing to support the authors' claim. However, these studies lack the quantification with proper statistics (and/or mentioning the reproducibility of the results). And, given the important discovery of APE1 in the DDR activation in vitro, it would be nice to demonstrate the role of APE1(APEX1) in ATR activation in vivo using siRNA-mediated knockdown of mammalian cells or yeast cells.

      Thanks for the suggestion. As shown in our response to the #2 Essential Revisions, we have addressed this question with an additional experiment and added a description to our revised manuscript showing that APE1 is important for the ATR DDR following oxidative stress in cultured human U2OS cells (Figure 1-figure supplement 1B). In addition, we have performed at least three independent experiments and statistical analysis to support our claims.

      Reviewer #2 (Public Review):

      ATM and Rad3-related (ATR) interacts with ATRIP and plays a central role in DNA damage response. Previous studies have established the idea that ATR is recruited to RPA-covered ssDNA via ATRIP-RPA interaction. In this paper, the authors propose a new RPA-independent mechanism for ATR recruitment.

      Thanks for the nice summary of our major findings from the manuscript.

      Reviewer #3 (Public Review):

      In this manuscript, the authors explore the mechanism of ATRIP recruitment to single-stranded DNA (ssDNA), which is important for ATR activation and the subsequent control of DNA repair and cell cycle progression. Using Xenopus egg extracts, in vitro interaction assays, and ssDNA constructs, the authors found that AP endonuclease 1 (APE1) plays a role in the recruitment of ATRIP to ssDNA independently of RPA. Moreover, APE1 domains are characterized for ssDNA, ATRIP, and RPA interaction, determining that the nuclease activities of APE1 are not required for this new mode of ATRIP recruitment. Overall, the work presented makes a compelling case for a novel role for APE1 in ATRIP recruitment that seems crucial for ATR activation (at least in the Xenopus system). The results are likely to have an important impact on our understanding of the determinants for activation of ATR signaling and cellular responses to DNA damage and replication stress. It remains unclear whether the findings will be extended to other organisms and be relevant for different types of DNA lesions. Also, there are several points of concern in the manuscript that require further clarification, especially regarding some of the quantitative analyses presented and the claimed importance of the RPA-independent mode of ATRIP recruitment for ATR activation.

      We thank the reviewer for the overall positive evaluation of our initial submission. We have included additional experimental data using mammalian cells showing the significance of APE1 in the ATR DDR, and also additional discussion of other studies in the literature. We also provide further clarifications or responses to the major/minor concerns (please see the detailed responses below). In particular, we revised the proposed model of APE1 in ATRIP recruitment and ATR DDR (please see revised Figure 5).

    1. Author Response

      Reviewer #1 (Public Review):

      The data on embryonic "ventral nerve cord" glia are generated from whole embryos, and even provided that the ventral nerve cord harbors 75% of all glia and thus the majority is ventral nerve cord, the data should not be called vnc-specific. The vnc-specific data set (adult CNS) that is already published (Allen et al., 2020) is strangely not even mentioned in the current manuscript. The idea of having a comprehensive description of glial transcriptional profiles is great - but I was missing the integration of the midline glial cells, which can be considered as ensheathing glial cells that - as the cortex glia - also express wrapper (Stork et al., 2009).

      • We agree with Reviewer 1 that the embryonic glia dataset represents all glia and not just VNC glia. We have amended the text accordingly.

      • We now cite Allen et al., 2020. Apologies for this omission.

      • Midline Glia:

      The embryonic glial cells analysed in the previous version of our manuscript included only repo+ glia and therefore did not include midline glia, which do not express repo (Jacobs, 2000). In the revised manuscript, we reanalysed the complete embryonic dataset and identified the midline glia based on known markers and in vivo validation (Figure 3 – figure supplement 1). We also provide a list of genes that show enriched expression in the midline glial cluster as a supplementary file (Source data file 1).

      We performed hierarchical cluster analysis on midline glia, all embryonic repo+ glial clusters and embryonic neuronal clusters to determine the relationship of midline glia to other glia. Interestingly, midline glia formed an outgroup to both neurons and repo+ glia (Figure 3 – figure supplement 1F), suggesting that they are quite distinct from other (repo+) glial classes. This is expected given their mesectodermal origin (Kosman et al., 1991; Thomas et al., 1988). Indeed, although midline glia express wrapper, otherwise known as a cortex glia marker (Banerjee et al., 2017; Noordermeer et al., 1998; Stork et al., 2009), they do not resemble cortex glia in form or function but instead ensheath commissural axons and play critical roles in axon guidance and VNC morphogenesis (Jacobs, 2000). Midline glia have been characterised extensively by several groups (Hartenstein, 2011; Hidalgo, 2003; Jacobs, 2000; Kearney et al., 2004; Vasenkova et al., 2006; Wheeler et al., 2006), therefore, given their distinct origin and the ambiguity surrounding their functional classification, we instead focused our analyses on repo+ glia in this manuscript.

      Unfortunately, I found most of what is reported in this work not to be entirely new. The classification of glial diversity in the adult brain was presented by the Meinertzhagen and Gaul labs (Edwards and Meinertzhagen, 2010; Edwards et al., 2012; Kremer et al., 2017). The description of two astrocyte-like cell types is a reduction of data that defined three morphologically distinct astrocyte-like cells (Peco et al., 2016), which is not discussed. Some other aspects were ignored, too. Two other morphologically distinct types of ensheathing glia exist, ensheathing glia and ensheathing/wrapping or tract-associated glia were described but this is not discussed (Kremer et al., 2017; Peco et al., 2016).

      We respectfully disagree with Reviewer 1’s assessment that much of the work presented is not new. This work represents the first Drosophila glial cell atlas with thorough validation of cluster marker expression in vivo. It is also the first systematic exploration of the relationship between glial morphology and transcriptional signature, a controversial topic in the field of glial biology. We fully agree that much of the adult glial morphology had been characterised previously by the Meinertzhagen and Gaul labs among many others and we acknowledge this explicitly in our manuscript and in references to Figure 2 (one out of a total of 9 main figures). Indeed, it is because Drosophila glial morphology has been so well characterised that a comprehensive exploration of the relationship between morphology and transcriptional signature was even feasible. Moreover, our revised manuscript also provides more in-depth morphological characterisation and quantification of glial morphology and defines subclasses and morphologies not described previously (e.g. channel perineurial glia and astrocyte morphologies of the lobula and lobula plate). Indeed, even the channel subperineurial glia, which were identified based on lineage relationships, nuclear position and molecular markers, were not described in morphological terms.

      The 3 distinct astrocyte populations defined in Peco et al., (2016) refer to cell body position and neuropil domains covered by astrocytes. We now include this categorisation in our quantification of astrocyte morphology (See response to (6) and Figure 1 – figure supplement 2) and discuss their relationship to the type 1 and type 2 astrocyte morphologies that we observed.

      We also now include the ensheathing/wrapping or tract ensheathing glia as a morphological category of ensheathing glia in the manuscript (Figure 1A,N,O).

    1. Author Response

      Reviewer #1 (Public Review):

      This is a simulation study comparing the performance of two major approaches for dealing with “population structure” when carrying out Genome-wide Association Studies - Principal Component Analysis and Linear Mixed-effects Models - a subject of considerable practical importance. The author correctly notes that previous comparisons have been quite limited. In particular, any study not concluding that LMM was superior has relied on very simple models of structure.

      The paper is clearly written and beautifully reviews the theoretical underpinnings (albeit in a manner that will be difficult to penetrate without deep knowledge of several fields). The simulations are well-designed and far better than previous studies. From a theoretical point of view, the work is somewhat limited by being strongly anchored in a very classical quantitative genetics framework that is focused on allele frequencies and inbreeding coefficients, and totally ignores coalescent theory, but this is a minor quibble. The simulations are limited by utilizing ridiculously small sample sizes by the standards of modern human GWAS. And of course, they do not include all the complexities of real data.

      The quantitative genetics framework we used was ideal for motivating and interpreting LMMs in particular, since they model relatedness with a kinship matrix which consists of IBD probabilities, all of which arose from quantitative genetics.

      We also added the following text to the discussion: “However, our conclusions are not expected to change with larger sample sizes, as cryptic family relatedness will continue to be abundant in such data, if not increase in abundance, and thus give LMMs an advantage over PCA (Henn et al., 2012; Shchur et al., 2018; Loh et al., 2018).”

      The main conclusion of the study is that LMM really are generally superior - as expected on theoretical grounds. However, the authors do not address whether switching to LMM really is practicable given the sample size and lack of data sharing that characterize human genetics. Nor is it clear whether the difference in performance matters in real life given that the entire framework used is an idealized one - the fact that real human data suffers from environmental confounders that are correlated with “ancestry” is not addressed, to take the most obvious example. That said, it is surely important to note that the approach routinely used by the majority of users (PCA with 10 PCs) is mostly used for historical reasons and has little theoretical or empirical justification.

      We added simulations with environment effects correlated with ancestry, which we hope will make our study more relevant, as they make our evaluations more realistic than before. In the presence of environment effects, LMM without PCs remains among the best approaches, although occasionally LMM with PCs or PCA will perform slightly better. However, modeling environment directly (with the true variables) improves performance much more than using PCs to model environment indirectly, so we believe that is not a strong reason for continuing to use PCs (in LMMs or otherwise) unless there is no choice.

      We also added the following text to the discussion: “However, recent approaches not tested in this work have made LMMs more scalable and applicable to biobank-scale data (Loh et al., 2015; Zhou et al., 2018; Mbatchou et al., 2021), so one clear next step is carefully evaluating these approaches in simulations with larger sample sizes.” As stated earlier, we believe that the difference in performance between LMM and PCA will remain in larger sample sizes because cryptic relatedness is more prevalent in that setting.

      We excluded the “lack of data sharing” point from our discussion because it does not align well with the goals of our manuscript. The current solution to the lack of data sharing is meta-analysis, but its use does not give PCA or LMM an inherent advantage, since it can be applied to the summary statistics of either (or even a combination of models, in theory). There is interesting recent work on “federated” PCA and LMM association (both versions exist), that allow a single model to be fit jointly to separate datasets (residing in different buildings across the world) as if they were combined into a single dataset. Thus, these issues do not explain or motivate why PCA or LMM should be used.

      Reviewer #2 (Public Review):

      Yao and Ochoa present a very nice paper examining the age-old question of whether LMM or PCA is a better way to adjust for structure (population, family, admixture). The authors provide a very nice and detailed overview of the previous research addressing this question, summarizing it in a table. They find that LMMs are generally better at accounting for population structure. However, I feel there are a couple of important factors that are missing. One is the consideration of environmental structure. Another is that the relationship between PCA and LMM is usually a bit more complicated in practice than depicted here, where the devil really lies in the details. Also, I think there are a couple of key reasons why LMMs haven’t been adopted as quickly as one might have expected, including case-control imbalance and cohort meta-analyses, which I feel the authors could point out. In fact, I believe LMMs have become sort of popular in recent years (e.g. Japan Biobank GWAS results).

      We added environment simulations, which we agree was an important shortcoming of the previous version of our work.

      We now discuss how the PCA and LMM connection can be more complicated in practice, but as the main difference is in how LD is handled, once that is correctly adjusted, PCs and random effects are still mostly modeling the same relatedness signals. Ultimately, our main conclusion is unchanged, namely that only LMMs can model family relatedness, which is their key advantage.

      We briefly commented on case-control imbalance in our discussion (now made more clear), but since this involves binary traits, which we did not explicitly test in this work, it is out of scope.

      Cohort meta-analysis does not influence whether to use PCA or LMM, since it can be performed with summary statistics from either model (and in theory even a combination of different models per cohort). The broad use of meta-analysis does not in itself prevent users from using PCA or LMM within individual cohorts. The use of meta-analysis is very interesting in its own right, but it is outside the scope of this work.

      Reviewer #3 (Public Review):

      This paper examines the relative performance of linear mixed models (LMMs), principal components (PCA), and their combination (PCA-LMM) for genetic association studies in human populations. The authors claim that previous papers examining this question are inadequate and that: (i) there remains confusion on which method is best and in which context, (ii) that the metrics used in previous evaluations were insufficient, and (iii) that the simulation settings used in previous papers were not comprehensive. To fix these problems the authors perform an extensive set of simulations within several frameworks and suggest two new metrics for evaluating performance.

      Strengths:

      The simulation framework used in this paper and the extensive number of simulations provide an opportunity to examine the relative properties of the three approaches (LMM, PCA, PCA-LMM) in a variety of contexts.

      The parameters of the simulation framework are based on highly diverged populations, which is an increasingly common analysis choice that has not been examined in detail via simulation previously.

      The evaluation metrics used in this paper are AUC and a test of the uniformity of the p-value distribution under the null. This is an improvement over some previous analyses which did not examine power and relied on less sensitive tests of type I error.

      Weaknesses:

      This paper has a limited set of population frameworks just like all papers before it. The breakdown of which method is best (LMM, PCA, PCA-LMM) will be a function of the simulation framework chosen.

      Ameliorating this issue, we added additional simulations with low heritability and with environment effects. We are pleased to report that all of our conclusions hold at low heritability (h2 = 0.3), and for the most part under environment effects (which occasionally give LMM with PCs and PCA a small advantage, but often LMM with no PCs remains best, and we show PCs are no replacement for directly modeling these environment effects).

      The frameworks chosen for this paper are certainly not comprehensive in contemporary human genetic studies. In fact, the authors make a number of unusual choices. For example, the populations in the simulated study have extremely large Fsts. While this is also a strength, the lack of more standard study designs is a weakness. More importantly, there is no simulation of family effects, which is the basis of many of the PCA-LMM papers reported in Table 1.

      We now better motivate in the introduction our focus on association studies of multiethnic and admixed individuals, which are nowadays very common and which have greater FST values than earlier studies. In reference to higher simulated FSTs, we also now cite our recent work, which has found that many previous FST estimates are downwardly biased (Ochoa and Storey, 2021, 2019). We simulated data that was fit to each of our three real datasets using our unbiased methods, so those values that (understandably) appear high are actually more correct (for multiethnic populations such as those in 1000 Genomes, HGDP, etc.) than previous estimates in the literature. In our previous work we also determined that only previous pairwise FST estimators are unbiased (under some conditions), and using a previous pairwise FST estimator (from Bhatia et al., 2013) we obtained equally high values between the most diverged human populations (values from a revised version of Ochoa and Storey, 2019 that isn’t on bioRxiv yet): in HGDP, the largest pairwise FST is 0.479, between Pima and PapuanSepik; in Human Origins, the largest estimate is 0.396, between Cabecar and Baining_Malasait; lastly, in 1000 Genomes, the largest estimate is 0.135, between YRI and JPT. (1000 Genomes was generally less structured than HGDP and Human Origins, because the latter include more diverse populations.) Several previous estimates from the literature, all between one hunter-gatherer Sub-Saharan African subpopulation and one non-African subpopulation, resulted in values of about 0.25 (Bowcock et al., 1991, Henn et al., 2011, Bergstrom et al., 2020). FST estimates are also greater from whole-genome sequencing versus array data (revised version of Ochoa and Storey, 2019).
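
      For readers unfamiliar with pairwise FST estimation, the following is a minimal sketch of the Hudson-type ratio-of-averages estimator as it is commonly described following Bhatia et al. (2013). It is not the authors' code: the function name, the input layout, and the convention that n1 and n2 count sampled alleles are all assumptions made for illustration.

```python
def hudson_fst(p1, p2, n1, n2):
    """Pairwise FST between two populations (illustrative sketch).

    p1, p2: per-locus sample allele frequencies in populations 1 and 2
    n1, n2: sample sizes (assumed here to be numbers of sampled alleles)
    Returns the ratio of the summed numerators to the summed denominators.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, minus a sampling-variance correction
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles, one from each population, differ
        denominator += a * (1 - b) + b * (1 - a)
    return numerator / denominator
```

      Summing numerators and denominators across loci before dividing (rather than averaging per-locus ratios) is the design choice that keeps low-frequency loci from dominating the estimate.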

      Family (household) effects are a case where PCA is not expected to outperform LMM, though standard LMMs do not model this effect explicitly either and may not do much better. As this is a feature of family studies that ought to be absent in population studies (as usually only siblings are in the same household, and not more distant relatives), it is also not entirely relevant to the majority of our simulations. For these reasons, including such a feature in our simulations does not align with the goals of the present work, but we agree this is an important framework that deserves more attention in future evaluations.

      The discussion (and simulations) of LMM vs PCA, particularly LMMs with PCs as fixed effects misses the critical distinction of whether PCs are in-sample (in which case including PCs as fixed effects effectively serves as a preconditioner for the kinship matrix, speeding up iterative methods such as BOLT), or projections of individuals onto out-of-sample principal axes. There is also no discussion of LOO methods to address “proximal contamination”, also quite relevant in evaluating power as a function of the number of PCs.

      We added the following to our discussion concerning out-of-sample PC projections: “We do not consider the case where samples are projected onto PCs estimated from an external sample (Prive et al., 2020), which is uncommon in association studies, and whose primary effect is shrinkage, so if all samples are projected then they are all equally affected and larger regression coefficients compensate for the shrinkage, although this will no longer be the case if only a portion of the sample is projected onto the PCs of the rest of the sample.”

      We also added the following to the discussion concerning the LOCO approach: “Similarly, the leave-one-chromosome-out (LOCO) approach for estimating kinship matrices for LMMs prevents the test locus and loci in LD with it from being modeled by the random effect as well, which is called “proximal contamination” (Lippert et al., 2011, Yang et al., 2014). While LOCO kinship estimates vary for each chromosome, they continue to model family relatedness, thus maintaining their key advantage over PCA.”
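
      The LOCO idea can be sketched as follows. This is a hedged illustration only: it uses a simple standardized genetic relationship matrix, which is an assumption made for brevity and not the kinship estimator used in the manuscript, and the function names and input layout are hypothetical.

```python
def standardize(locus):
    """Center and scale one locus (allele counts per individual)."""
    n = len(locus)
    mean = sum(locus) / n
    var = sum((x - mean) ** 2 for x in locus) / n
    if var == 0:
        return None  # monomorphic locus: carries no relatedness signal
    sd = var ** 0.5
    return [(x - mean) / sd for x in locus]


def loco_kinship(genotypes, chromosomes, left_out):
    """Kinship matrix from all loci except those on chromosome `left_out`.

    genotypes: list of loci, each a list of per-individual allele counts
    chromosomes: chromosome label of each locus, same order as genotypes
    """
    n = len(genotypes[0])
    kinship = [[0.0] * n for _ in range(n)]
    used = 0
    for locus, chrom in zip(genotypes, chromosomes):
        if chrom == left_out:
            continue  # skip loci on the chromosome being tested
        z = standardize(locus)
        if z is None:
            continue
        used += 1
        for i in range(n):
            for j in range(n):
                kinship[i][j] += z[i] * z[j]
    if used == 0:
        raise ValueError("no polymorphic loci outside the left-out chromosome")
    return [[value / used for value in row] for row in kinship]
```

      When testing a locus on chromosome c, the mixed model would use `loco_kinship(genotypes, chromosomes, c)`, so the test locus and anything in LD with it on the same chromosome never enter the random effect.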

      The same new discussion paragraph closes with the following thoughts concerning LOCO and related approaches: “LD effects must be adjusted for, if present, so in unfiltered data we advise the previous methods be applied. However, in this work, simulated genotypes do not have LD, and the real datasets were filtered to remove LD, so here there is no proximal contamination and LD confounding is minimized if present at all, so these evaluations may be considered the ideal situation where LD effects have been adjusted successfully, and in this setting LMM outperforms PCA. Overall, these alternative PCs or kinship matrices differ from their basic counterparts by either the extent to which LD influences the estimates (which may be a confounder in a small portion of the genome, by definition) or by sampling noise, neither of which are expected to change our key conclusion.”

      Lastly, we added the following to a different discussion paragraph: “A different benefit for including PCs was recently reported for BOLT-LMM, which does not result in greater power but rather in reduced runtime, a property that may be specific to its use of scalable algorithms such as conjugate gradient and variational Bayes (Loh et al., 2018).”

      There is no discussion/simulation of spatial/environmental effects or rare vs common PCs as raised in Zaidi et al 2020. There are some open questions here regarding relative performance the authors could have looked at. Same for LMMs with multiple GRMs corresponding to maf/ld bins and thresholded GRMs. For example, it would be helpful to know if multiple-GRM LMMs mitigate some of the problems raised in the Zaidi paper.

      We added simulations with environment effects, which are based on a two-level hierarchy of population labels, so they are spatial to the extent that these labels capture spatial relationships between populations. However, our small-sample-size data are not well suited to studying rare variants and their structure, so that is out of scope. (The sample size limitation is also covered in a new discussion paragraph.) We hope to tackle this very interesting question in future work.

      We added the following paragraph to our discussion: “Another limitation of this work is ignoring rare variants, a necessity given our smaller sample sizes, where rare variant association is miscalibrated and underpowered. Using simulations mimicking the UK Biobank, recent work has found that rare variants can have a more pronounced structure than common variants, and that modeling this rare variant structure (with either PCA or LMM) may better model environment confounding, improve inflation in association studies, and ameliorate stratification in polygenic risk scores (Zaidi and Mathieson, 2020). Better modeling rare variants and their structure is a key next step in association studies.”

  3. Apr 2023
    1. Author Response:

      We thank the editors and reviewers for their assessment of our manuscript, and their agreement that we present compelling evidence for post-transcriptional regulation of AURKA through the 3’UTR.

      In response to Reviewer 1, we acknowledge that much of our study is performed exclusively in U2OS cells, and that study of alternative polyadenylation in additional cell lines would serve to further generalize our findings. However, as U2OS cells are a well-known model cell line for cell cycle studies, we believe our demonstration of cell cycle regulation of AURKA through its 3’UTR offers a depth of understanding that is perhaps of greater interest than confirming the existence of alternative AURKA 3’UTRs in additional cell lines using our methods. We note that the recent rapid growth in RNA-seq data resources allows easy confirmation of the broad existence of alternative polyadenylation events on a genome-wide scale. For example, AURKA-specific data extracted from a recent benchmark study of Nanopore long-read RNA sequencing (Chen et al., 2021) clearly show the existence of two distinct AURKA 3’UTRs differentially expressed between a number of different cancer cell lines. In addition, a recent study investigating the landscape of APA at single-cell resolution detected AURKA APA isoforms in HeLa and MDA-MB-468 cell lines (Wang et al., 2022). Their study further identifies AURKA among genes showing negative correlation between generalized distal polyA site usage index (gDPAU) and expression levels, meaning a preference for the proximal polyA site when expression levels increase, and includes AURKA in the gene cluster showing a slight increase in usage of the distal polyA site from G1 to M phase (Wang et al., 2022). Both studies support the evidence presented in our manuscript.

      We agree with Reviewer 2 that better information on translation rates would improve our understanding of the impact of translation regulation on AURKA levels. Some insight into the translation rate of AURKA in the cell cycle can be derived from inspection of the ribosome profiling dataset published by Tanenbaum et al., 2015. From their analysis, the translation efficiency of AURKA mRNA in G2 is 1.59 times that in G1, in G1 it is 0.69 times that in M phase, and in G2 it is 1.10 times that in M phase. Such data reveal a reversible increase in translation of AURKA mRNA, alongside other mitotic regulators, in preparation for M phase (Tanenbaum et al., 2015). These results are in accordance with our findings that translation rates contribute modestly to cell cycle changes in AURKA levels in normal cells, and we concur with Reviewer 3’s comment that the contribution of increased translation rate to AURKA levels at mitosis is less than the change in mRNA levels at this point in the cell cycle.
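
      As a quick arithmetic check, the three ratios quoted above are mutually consistent: multiplying the G2/G1 ratio by the G1/M ratio should recover the G2/M ratio. The variable names below are illustrative only.

```python
# Translation efficiency ratios quoted in the response (Tanenbaum et al., 2015)
g2_over_g1 = 1.59
g1_over_m = 0.69
g2_over_m_reported = 1.10

# Chaining the first two ratios gives the implied G2/M ratio
g2_over_m_implied = g2_over_g1 * g1_over_m
print(f"{g2_over_m_implied:.2f}")  # prints 1.10
```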

      We think the significance of the regulatory mechanism we describe lies rather in the large effect it has on AURKA levels in interphase (when AURKA expression is normally repressed at both mRNA and translation rate). We hypothesise that it is interphase regulation that may be relevant to roles of AURKA in cancer (and to the association of APA with cancer) (Bertolin and Tramier, 2020; Naso et al., 2021). It is indeed the case that (i) AURKA regulation by miRNA, (ii) cooperation between APA and translation and (iii) cell-cycle dependent control of AURKA at the translation level, are already known. We believe the novelty of our study lies in drawing together these elements to provide new insight into AURKA regulation, using tools that allow similar investigation of other APA events, and contributing new ideas for future therapeutic interventions for disease proteins regulated via APA.

    1. Author Response:

      We would like to thank you for your thorough review of the manuscript. We will take all comments into account in the revised version of the manuscript. Please find below our provisional responses to your comments.

      eLife assessment

      This study reports useful information on the limits of the organotypic culture of neonatal mouse testes, which has been regarded as an experimental strategy that can be extended to humans in the clinical setting for the conservation and subsequent re-use of testicular tissue. The evidence that the culture of testicular fragments of 6.5-day-old mouse testes does not allow optimal differentiation of steroidogenic cells is compelling and would be useful to the scientific community in the field for further optimizations.

      Thank you for this assessment. We will carefully consider all comments and make the requested revisions to improve the manuscript.

      Public Reviews

      Reviewer #1 (Public Review):

      In this manuscript, the authors aimed to compare, from testis tissues at different ages from mice in vivo and after culture, multiple aspects of Leydig cells. These aspects included mRNA levels, proliferation, apoptosis, steroid levels, protein levels, etc. A lot of work was put into this manuscript in terms of experiments, systems, and approaches. However, as written the manuscript is incredibly difficult to follow. The Introduction and Results sections contain rather loosely organized lists of information that were altogether confusing. At the end of reading these sections, it was unclear what advance was provided by this work. The technical aspects of this work may be of interest to labs working on the specific topics of in vitro spermatogenesis for fertility preservation but fail to appeal to a broader readership. This may be best exemplified by the statements at the end of both the Abstract and Discussion which state that more work needs to be done to improve this system.

      As explained below, we will rework and reorganize the manuscript to make it clearer, more meaningful and more precise. We believe that this work may be of interest to a broader readership. Indeed, the development of a model of in vitro spermatogenesis could be of interest for labs working on the specific period of puberty initiation, on germ and somatic cell maturation and on steroidogenesis during this period, and could even be useful for testing the toxicity of cancer therapies, drugs, chemicals and environmental agents (e.g. endocrine disruptors) on the developing testis.

      Reviewer #2 (Public Review):

      Preserving and restoring the fertility of prepubertal patients undergoing gonadotoxic treatments involves freezing testicular fragments and waking them up in a culture in the context of medically assisted procreation. This implies that spermatogenesis must be fully reproduced ex vivo. The parameters of this type of culture must be validated using non-human models. In this article, the authors make an extensive study of the quality of the organotypic culture of neonatal mouse testes, paying particular attention to the differentiation and endocrine function of Leydig cells. They show that fetal Leydig cells present at the start of culture fail to complete the differentiation process into adult Leydig cells, which has an impact on the nature of the steroids produced and even on the signaling of these hormones.

      The authors make an extensive study of the different populations of Leydig cells which are supposed to succeed each other during the first month of life of the mouse to end up with a population of adult and fully functional cells. The authors combine quantitative in situ studies with more global analyses (RT-qPCR, western blot, hormonal assays), which range from gene to hormone. This study is well written and illustrated, the description of the methods is honest, and the analyses are systematic and accompanied by multiple relevant control conditions.

      Since the aim of the study was to study Leydig cell differentiation in neonatal mouse testis cultures, the study is well conceived, the results answer the initial question and are not over-interpreted.

      My main concern is to understand why the authors have undertaken so much work when they mention, for RNA extractions and western blot, that the necrotic central part had to be carefully removed. There is no information on how this parameter was considered for immunohistochemistry and steroid measurements. The authors describe the initial material as a quarter testis, but they don't mention the resulting size of the fragment. A brief review of the literature shows that while the culture medium is often crucial for the quality of the culture (and in particular the supplementations, as discussed by the authors here), the size of the fragments is also a determining factor, especially for long cultures. The main limitation of the study is therefore that the authors cannot exclude that central necrosis can have harmful effects on the survival and/or the growth and/or the differentiation of the testis in culture. In this sense, the general interpretation that the authors make of their work is correct: the culture conditions are not optimized.

      When using the organotypic culture system at a gas-liquid interface, the central part of the testicular tissue becomes necrotic. As previously reported (Komeya et al., 2016), the central region receives insufficient nutrients and oxygen. In vitro spermatogenesis therefore only occurs in the seminiferous tubules present in the peripheral region. As in our previous publications and recent RNA-seq analyses (Dumont et al., 2023), the central necrotic area was removed so that transcript and protein levels in the healthy part of the samples (i.e. where in vitro spermatogenesis occurs) could be measured and compared with in vivo controls. For histochemical and immunohistochemical analyses, only seminiferous tubules located at the periphery of the cultured fragments (outside of the necrotic region) were analyzed. Steroid measurements were performed on the entire fragments.

      The initial material was indeed a quarter testis, which represents approximately 0.75 mm3. No growth of the fragments was observed during the organotypic culture period. We agree with the reviewer that the composition of the culture medium is not the only parameter to be considered for the quality of the culture and that the size of the fragments is also a determining factor. We do not exclude that central necrosis can have harmful effects on the survival and/or the growth and/or the differentiation of the testis in culture. Optimization of the culture medium and culture design (so that the tissue center receives sufficient nutrients and oxygen) will be necessary to increase the yield of in vitro spermatogenesis.

      Organotypic culture is currently trying to cross the doors of academic research laboratories to become a clinical tool, but it requires many adjustments and many quality controls. This study shows a perfect example of the pitfall often associated with this approach. The road is still long, but every piece of information is useful.

      Reviewer #3 (Public Review):

      Moutard, Laura, et al. investigated the gene expression and functional aspects of Leydig cells in a cryopreservation/long-term culture system. The authors found that critical genetic markers for Leydig cells were diminished when compared to the in vivo testis. The testis also showed less androgen production and androgen responsiveness. Although they did not produce normal testosterone concentrations in basal media conditions, the cultured testes still remained highly responsive to gonadotrophin exposure, exhibiting a large increase in androgen production. Even after the hCG-dependent increase in testosterone, genetic markers of Leydig cells remained low, which means there is still a missing factor in the culture media that facilitates proper Leydig cell differentiation. Optimizing this testis culture protocol to help maintain proper Leydig cell differentiation could be useful for future human testis biopsy cultures, which will help preserve fertility in child cancer patients.

      Methods: In line 226, there is mention that the central necrotic area was carefully removed before RNA extraction. This is particularly problematic for the inference of these results, especially for the RT-qPCR data. Was the central necrotic area consistent between all samples and variables (16 and 30FT)? How big was the area? This makes the in-vivo testis not a proper control for all comparisons. Leydig cells are not evenly distributed throughout the testis. A lot of Leydig cells can be found toward the center of the gonad, so the results might be driven by the loss of this region of the testis.

      When using the organotypic culture system at a gas-liquid interface, the central part of the testicular tissue becomes necrotic. As previously reported (Komeya et al., 2016), the central region receives insufficient nutrients and oxygen. In vitro spermatogenesis therefore only occurs in the seminiferous tubules present in the peripheral region. As in our previous publications and recent RNA-seq analyses (Dumont et al., 2023), the central necrotic area was removed so that transcript levels in the healthy part of the samples (i.e. where in vitro spermatogenesis occurs) could be measured and compared with in vivo controls. The transcript levels of the selected genes were normalized to housekeeping genes (Gapdh and Actb) or to the Leydig cell-specific gene Hsd3b1.

      The central necrotic area was consistent between all samples and variables: it represents on average 16-27% of the explants.

      Moreover, we would like to point out that the gonads were cut into four fragments before in vitro cultures. It is therefore the central part of these explants that was removed and not the central part of the gonads. The central part of the gonads was thus included in our analyses.

      What did the morphology of the testis look like after culturing for 16 and 30 days? These images will help confirm that the culturing method is like the Nature paper Sato et al. 2011 and also give a sense of how big the necrotic region was and how it varied with culturing time.

      This point will be addressed in the detailed responses to reviewers.

      There are multiple comparisons being made. Bonferroni corrections on p-value should be done.

      This point will be addressed in the detailed responses to reviewers.
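
      For reference, the Bonferroni correction requested by the reviewer amounts to multiplying each p-value by the number of comparisons (capped at 1) before comparing it with the significance level. The sketch below is a generic illustration; the function name and the default level of 0.05 are assumptions, and no values from the manuscript are used.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and rejection decisions.

    Each p-value is multiplied by the number of tests and capped at 1;
    a null hypothesis is rejected when its adjusted p-value is <= alpha.
    """
    n_tests = len(p_values)
    adjusted = [min(1.0, p * n_tests) for p in p_values]
    rejected = [p <= alpha for p in adjusted]
    return adjusted, rejected
```

      With three comparisons, for example, a raw p-value must fall at or below 0.05 / 3 to remain significant at the 0.05 level.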

      Results: In the discussion, it is mentioned that IGF1 may be a missing factor in the media that could help Leydig cell differentiation. Have the authors tried this experiment? Improving this existing culturing method will be highly valuable.

      The decreased Igf1 mRNA levels found in the present study are in line with the RNA-seq data of Yao et al., 2017. As mentioned in the Discussion section, the addition of IGF1 in the culture medium led to a modest increase in the percentages of round and elongated spermatids in cultured mouse testicular fragments (Yao et al., 2017). However, the effect of IGF1 supplementation on Leydig cell differentiation was not investigated. The supplementation of organotypic culture medium with IGF1 is currently being tested in our research team.

      Add p-values and SEM for qPCR data. This was done for hormones, should be the same way for other results.

      p-values and SEM are shown for both qPCR and hormone data.

      Regarding all RT-qPCR data: there is a switch between 3bHSD and Actb/Gapdh as housekeeping genes. There does not seem to be a consistent rule, as some have 3bHSD and others do not. Why do Igf1 and Dhh not use 3bHSD for housekeeping? If this is the method to be used, then 3bHSD should be used as housekeeping for the protein data, instead of ACTB. Also, based on Figure 1B and Figure 2A (Hsd3b1), there does not seem to be a strong correlation between Leydig cell number and the gene expression of Hsd3b1. If Hsd3b1 is to be used as a housekeeper and a proxy for Leydig cell number, a correlation between these two measurements is necessary. If there is no correlation, a housekeeping gene that is stable among all samples should be used. Sorting Leydig cells and then conducting qPCR would be optimal for these experiments.

      Hsd3b1 was used as a housekeeping gene only to normalize the mRNA levels of Leydig cell-specific genes. Therefore, Igf1 and Dhh transcript levels were not normalized with Hsd3b1 since Igf1 is expressed by several cell types in the testis (Leydig cells, Sertoli cells, peritubular myoid cells) and Dhh is expressed by Sertoli cells.  

      Regarding western blots, the expression of AR, CYP19 and FAAH could not be normalized with 3bHSD since AR is expressed by Leydig cells, Sertoli cells and peritubular myoid cells, CYP19 is expressed by Leydig cells and germ cells and FAAH is expressed by Sertoli cells. We will review the western blot results for CYP17A1.

      As shown in Figure 1B, the number of Leydig cells per cm2 of testicular tissue is not significantly different between the different time points in vivo (6 dpp, 22 dpp and 36 dpp), in vitro (D16 FT and D30 FT) and between the in vivo and in vitro conditions (22 dpp versus D16 FT, 36 dpp versus D30 FT). Similarly, our data in Figure 2A show that Hsd3b1 mRNA levels are not significantly different between the different time points in vivo (6 dpp, 22 dpp and 36 dpp), in vitro (D16 FT and D30 FT) and between 22 dpp and D16 FT. However, Hsd3b1 mRNA levels were significantly lower in D30 FT tissues compared to 36 dpp. We will measure the correlation between the number of Leydig cells per cm2 of testicular tissue and Hsd3b1 mRNA levels, as suggested by the reviewer.

      Figure 2A (CYP17a1): It is surprising that the CYP17a1 gene and protein expression is very different between D30FT and 36.5dpp, however, the immunostaining looks identical between all groups. Why is this? A lower magnification image of the testis might make it easier to see the differences in Cyp17a1 expression. Leydig cells commonly have autofluorescence and need a background quencher (TrueBlack) to visualize the true signal in Leydig cells. This might reveal the true differences in Cyp17a1.

      This point will be addressed in the detailed responses to reviewers.

      Figure 3D: there are large differences in estradiol concentration in the testis. Could it be that the testis is becoming more female-like? Leydig and Sertoli cells with more granulosa and theca cell features? Were any female markers investigated?

      We show in the present study that the expression level of the Sertoli cell-specific gene Dhh is not reduced in organotypic cultures. We also previously found that the expression level of the Sertoli-cell specific gene Amh was not reduced in in vitro matured testicular tissues (Rondanino et al., 2017). Moreover, our recent unpublished data show that Sox9, a testis-specific transcription factor, is expressed in Sertoli cells in organotypic cultures. These results suggest that Sertoli cells are not becoming granulosa-like cells and that the testis is not becoming more female-like. Markers of granulosa and theca cells were not investigated.

      Figure 3D and Figure 5A: It is hard to imagine that intratesticular estradiol is maintained for 16-30 days without sufficient CYP19 activity or substrate (testosterone). 6.5 dpp was the last day with abundant CYP19 expression, so is most of the estrogen synthesized on this first day and it sticks around? Are there differences in estradiol metabolizing enzymes? Is there an alternative mechanism for E production?

      This point will be addressed in the detailed responses to reviewers.

    1. Author Response

      Reviewer #2 (Public Review):

      In a neonatal model of bacterial meningitis induced by s.c. injection of E. coli, transcriptional changes were found across all major cell types including endothelial cells, fibroblasts and macrophages. Among macrophages, they describe 2 resident subsets and 2 inflammatory subsets. By immunohistochemistry of arachnoid and dura flatmounts, they show vascular changes upon infection, including clustering of CLDN5 and PECAM1, and disorganized capillary morphology, which was dependent on Tlr4 signaling but independent of arachnoid macrophages.

      The manuscript would benefit from rewriting, it is not written in a concise manner and the rationale for experiments, time points for analyses and their conclusions are not clear. The model of s.c. bacterial infection is not well introduced and overall changes in the periphery, survival curves or bacterial counts (in the KO models) in the meninges/brain are not mentioned.

      Thank you for those comments. We hope that the text is now more readable. We have added a separate section to describe the meninges model and added data on survival and E. coli counts (Supplemental Figure 3).

    1. Author Response

      Reviewer #1 (Public Review):

      This work puts forward a comprehensive characterisation of colorectal cancer (CCCRC), by classifying it into 4 subtypes with distinct TME features. It uses 10 public databases: 8 microarray datasets for the training of molecular classification and 2 RNAseq for validation (CRC-RNAseq) to identify the 4 subtypes using unsupervised machine learning (consensus clustering). These 4 subtypes were found to be somewhat distinct in terms of immune response and the possibilities for effective treatments. They found that one subtype may be more sensitive to chemotherapy, two to WNT pathway inhibitor SB216763 and Hedgehog pathway inhibitor vismodegib, and one to ICB treatment. They show an association with patient outcome in terms of PFS, validated in the validation cohort. They used histology to correspond the subtypes to known pathological types, as well as investigating their T cell makeup. They also investigated the genetic tumour evolution that may occur between the subtypes. A single-sample gene classifier was put forward as a way of identifying the class of cancer. The evidence for the main results of the work is convincing, but a few areas need to be clarified and extended.

      In the determination of the 4 subtypes (C1-C4) the methodology is clear, and the definition of the training and validation data are clear and well presented. The techniques used are well suited to the problem. The performance of the classification as a predictor of prognosis is presented as KM curves of PFS and OS for the training and validation sets. The training data shows a significant log-rank p-value in both PFS and OS. The validation data shows a significant effect in PFS.

      What follows is quite an exhaustive process of finding differences between the cohorts using a multitude of techniques and datasets, including genomics, epigenetics, transcriptomics, and proteomics. These sections are mainly descriptive and do add understanding to the classification, especially with regard to the T-cell populations that are invasive.

      Improvements could be made to the latter sections of the main paper. The basis for the potential clinical responses of the subtypes is arrived at via a "pre-clinical model" based on 81 genes. It would benefit from clarification on what genes were used in model training and details of the final model. Similarly the description of the "Single-sample gene classifier" could be enhanced similarly with a better description of which genes are in the final classifier.

      Thank you for taking the time to review our article and for your positive feedback. Your thorough evaluation of our work has been invaluable to us, and we appreciate your recognition of the effort we put into it.

      1) The basis for the potential clinical responses of the subtypes is arrived at via a "pre-clinical model" based on 81 genes.

      The exact details of the filtering criteria used to obtain the list of pre-clinical model genes have been added to the Methods section of the study (Lines 1061-1088, Lines 503-511) (Supplementary file 3a). To explore the treatment for each CCCRC subtype using cancer cell line drug-sensitivity experiments, we developed a pre-clinical model based on subtype-specific, cancer cell-intrinsic gene markers according to a previously published study (Eide et al., 2017). Firstly, the “limma” package was used to identify DEGs with FDR < 0.05 between each of the four subtypes and the remaining subtypes in the CRC-AFFY cohort. To identify subtype-specific genes in one of the subtypes, we excluded those that were found to be differentially expressed in comparisons between one of the other subtypes and the remaining subtypes. The upregulated subtype-specific genes (log2 (fold change, FC) > 0 and FDR < 0.05) were ranked by their log2FC, and the top 500 genes were selected for further gene screening. Secondly, the GEP of human CRC tissues versus patient-derived xenografts (PDX) in the GSE35144 dataset was compared using the R package “limma” to remove those genes associated with stromal and immune components. DEGs with FDR > 0.5 and log2 FC < 1 between human CRC tissues versus PDX were considered as cancer cell-intrinsic genes. Thirdly, we also utilized human CRC cell lines to obtain cancer cell-intrinsic genes. A total of 71 human CRC cell lines with RNAseq data (log2TPM) were obtained from the Genomics of Drug Sensitivity in Cancer (GDSC) database (https://depmap.org/portal/download/all/), 43 of which had dose-response curve (area under the curve, AUC) values. The MSI status, FGA and TMB information of the CRC cell lines was obtained from the cBioPortal website (https://www.cbioportal.org/study/summary?id=ccle_broad_2019). 
RNAseq data for the 71 human CRC cell lines were used to further determine the cancer cell-intrinsic genes and genes among the top 25% within (i) the 10−90% percentile range of the largest expression values and (ii) the highest expression in at least three samples. The subtype-specific genes and cancer cell-intrinsic genes were intersected to generate the gene list for developing the pre-clinical model. The pre-clinical model was developed using the nearest template prediction (NTP) function of the R package “CMScaller”, which can be applied to cross-tissue and cross-platform predictions (Hoshida, 2010). The Z-score-normalized GEP (log2TPM) of the 71 human CRC cell lines was input into the pre-clinical model, and the cell lines were divided into four CCCRC subtypes. (Lines 1061-1088)
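
      The gene-selection logic described above can be illustrated with a minimal sketch. It is written in Python for illustration only (the analysis itself used R and “limma”), and the table layout, the function name `subtype_specific_genes`, and the toy gene names and statistics are all assumptions, not the authors' code or data.

```python
def subtype_specific_genes(deg_tables, subtype, top_n=500, fdr_cutoff=0.05):
    """Return upregulated genes specific to `subtype`, ranked by log2FC.

    deg_tables maps subtype -> {gene: (log2fc, fdr)} from a one-vs-rest
    comparison (hypothetical layout, not the limma output format).
    """
    # Genes significant in any other subtype's comparison are not specific.
    significant_elsewhere = set()
    for name, table in deg_tables.items():
        if name != subtype:
            significant_elsewhere |= {
                gene for gene, (_, fdr) in table.items() if fdr < fdr_cutoff
            }

    own = deg_tables[subtype]
    candidates = [
        gene
        for gene, (log2fc, fdr) in own.items()
        if log2fc > 0 and fdr < fdr_cutoff and gene not in significant_elsewhere
    ]
    # Rank by effect size and keep the strongest top_n genes.
    candidates.sort(key=lambda gene: own[gene][0], reverse=True)
    return candidates[:top_n]


# Toy input with made-up gene names and statistics.
DEG = {
    "C1": {"A": (2.0, 0.01), "B": (1.0, 0.01), "C": (3.0, 0.20),
           "D": (-1.0, 0.01), "E": (2.5, 0.001)},
    "C2": {"A": (0.1, 0.90), "B": (0.5, 0.01)},
}
INTRINSIC = {"A", "X"}  # stand-in for the cancer cell-intrinsic gene set

specific = subtype_specific_genes(DEG, "C1")
# Intersect with the cell-intrinsic genes to obtain the model gene list.
model_genes = [gene for gene in specific if gene in INTRINSIC]
print(specific, model_genes)
```

      In this toy case, gene B is dropped because it is also significant for C2, gene C fails the FDR cutoff, and gene D is downregulated, so only E and A remain before the intersection.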

      We note that we switched from the xgboost algorithm to the NTP algorithm to build our pre-clinical model. We evaluated the reliability of the final pre-clinical model based on the genomic features of the cell lines and found that the model built using the NTP algorithm was more reliable. As expected, the C4 subtype cell lines demonstrated the highest TMB values and MSI frequency while exhibiting the lowest FGA scores when compared to other subtypes (Figure 6-figure supplement 1G-I). In contrast, C1 and C3 subtype cell lines showed significantly higher FGA scores and significantly lower TMB values and MSI frequency. The C2 subtype cell lines had median FGA scores, TMB values, and MSI frequency. The pre-clinical model is publicly available at https://github.com/XiangkunWu/pre_clinical_model. (Lines 503-511)

      2) Similarly the description of the "Single-sample gene classifier" could be enhanced similarly with a better description of which genes are in the final classifier.

      We apologize for any confusion regarding the derivation of the CCCRC classifier. In the revised manuscript, we have added more details on the derivation of the model genes and the establishment of the model, and ensured the availability of the CCCRC classifier. The method details and results of deriving the model genes and building the model are described next. (Lines 1102-1121) (Lines 562-579) (Supplementary file 3c)

      To facilitate the widespread application of the CCCRC classification system, we established a simple gene classifier to predict CCCRC subtypes. Firstly, we filtered genes based on their mean expression and variance in the CRC-AFFY cohort, and genes whose expression and variance fell in the bottom 25% were removed. Then, we applied the Random Forest algorithm (RF) in the R package "caret" to perform feature selection on the CCCRC subtype-specific genes of the CRC-AFFY cohort. The top 20 most informative features for each subtype were ranked and selected based on the impurity measure generated by the algorithm. This allowed us to identify critical genes that are strongly associated with each CCCRC subtype and develop the CCCRC classifier. Next, we randomly divided the CRC-AFFY cohort into training and validation sets at a ratio of 7:3 using the “createDataPartition” function provided in the R package "caret" (seed=123). The GEP was normalized with Z-scores prior to model training and validation. The CCCRC classifiers were trained with the top 80 subtype-specific genes using the RF, Support Vector Machine (SVM), eXtreme Gradient Boosting (xgboost), and Logistic Regression algorithms implemented in the R package "caret". Finally, we validated the CCCRC classifier on the GSE14333 and GSE17536 datasets, as well as the CRC-AFFY cohort. We assessed the predictive performance of the CCCRC classifier using measures such as accuracy and F1 score, which were generated using the "confusionMatrix" function provided in the R package "caret". (Lines 1102-1121)
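
      As a rough illustration of the two preprocessing steps named above (Z-score normalization and a 7:3 split that preserves subtype proportions), here is a minimal Python sketch. It is not the "caret" implementation: the helper names `zscore` and `stratified_split` are hypothetical, the sample standard deviation is an assumed convention, and a seed of 123 in Python will not reproduce the partition obtained in R.

```python
import random
from statistics import mean, stdev


def zscore(values):
    """Center and scale one gene's expression values (sample SD assumed)."""
    centre = mean(values)
    spread = stdev(values)
    return [0.0 if spread == 0 else (v - centre) / spread for v in values]


def stratified_split(labels, train_fraction=0.7, seed=123):
    """Split sample indices so each class keeps roughly the same ratio."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Toy cohort: 10 samples per subtype, labels only.
labels = ["C1"] * 10 + ["C2"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(zscore([1.0, 2.0, 3.0]))
```

      Splitting within each subtype, rather than over the pooled cohort, keeps rare subtypes represented in both the training and validation sets.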

      We established the CCCRC classifier on the training set by utilizing multiple machine learning algorithms based on the GEP of 80 upregulated subtype-specific genes (Supplementary file 3c). When applied to the test set and the GSE14333 and GSE17536 datasets, the eXtreme Gradient Boosting (xgboost) algorithm performed best, with the highest accuracy values and F1 scores compared to the Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression algorithms (Figure 6-figure supplement 4). Notably, the CCCRC classifier based on the xgboost algorithm displayed robust performance across gene expression platforms (Affymetrix and RNA-sequencing), exhibiting a balanced accuracy of > 80% for all subtypes (Supplementary file 3d). These findings demonstrated the stability and cross-platform applicability of our classifier. The CCCRC classifier based on the xgboost algorithm is publicly available at https://github.com/XiangkunWu/CCCRC_classifier, and the CCCRC subtype information of CRC patients can be obtained by directly inputting the GEP of 80 upregulated subtype-specific mRNA genes. The CCCRC classifier might facilitate the discovery of new biomarkers and the personalized treatment of clinical patients with CRC. (Lines 562-579)
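
      For readers unfamiliar with the per-subtype metrics quoted above, the following Python sketch shows how balanced accuracy (the mean of sensitivity and specificity) and the F1 score can be computed one class at a time. The function name `per_class_metrics` and the toy labels are assumptions for illustration; the reported values came from the "confusionMatrix" function in R.

```python
def per_class_metrics(truth, predicted):
    """Compute one-vs-rest balanced accuracy and F1 for every class."""
    pairs = list(zip(truth, predicted))
    metrics = {}
    for label in sorted(set(truth) | set(predicted)):
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = len(pairs) - tp - fn - fp

        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denominator = precision + sensitivity
        f1 = 2 * precision * sensitivity / denominator if denominator else 0.0

        metrics[label] = {
            "balanced_accuracy": (sensitivity + specificity) / 2,
            "f1": f1,
        }
    return metrics


# Toy labels: one C1 sample is misclassified as C2.
truth = ["C1", "C1", "C2", "C2"]
predicted = ["C1", "C2", "C2", "C2"]
result = per_class_metrics(truth, predicted)
print(result)
```

      Balanced accuracy is less sensitive to unequal subtype sizes than plain accuracy, which is presumably why it is reported per subtype.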

      Reviewer #2 (Public Review):

      This study aimed to classify colorectal cancer (CRC) samples based on the expression of genes in selected gene lists, where the gene lists were chosen to represent aspects of the tumour microenvironment, tumour-associated immune cells, and tumour cells. The resulting clusters were then used to define a classifier, followed by a detailed description of molecular features of the tumours and tumour microenvironments assigned to each cluster. The authors claim this study is more "holistic" than previous work on CRC clustering/classifiers because they aimed to explicitly include additional components of the tumour microenvironment in both the clustering/classifier definition and in the subsequent description of molecular characteristics.

      The CCCRC clustering and the resulting classifier presented in this paper are derived from published RNAseq studies. The multi-omics aspect of the work is restricted to smaller sample numbers for which both transcriptomic and another omics dataset were available in public resources and comprises a description or correlative analysis of each omics data type within each of the assigned CCCRC subtypes.

      By applying solid computational methods to a compendium of published RNAseq datasets (n~1500 tumours), they found that tumour samples from colorectal cancers clustered into 4 subtypes ("CCCRC" subtypes) on the basis of 61 pre-defined gene expression signatures. These subtypes correlated with but did not correspond to, the previously described Consensus Molecular Subtypes (CMS) of colorectal tumours.

      Other types of molecular data were available for some tumours, obtained from the same published resources: whole-slide images, mutations, tumour proteomics, and/or scRNAseq. The authors reanalysed these datasets using standard methods and drew correlations with the CCCRC subtypes they had assigned in this work. To (semi-)quantify immune infiltration characteristics from whole-slide images (WSI), they additionally performed automated segmentation in addition to review by pathologists, which in combination produced a convincing WSI-derived dataset.

      In combination with existing CRC classifications, this study could facilitate future biomarker discoveries. This appears to be the authors' main claim, and the data and methods broadly support this claim.

      Thank you for taking the time to review our article and for your positive feedback. Your thorough evaluation of our work has been invaluable to us, and we appreciate your recognition of the effort we put into it.

      Some aspects of the work need to be clarified: 1) This work relies on the definition of 4 clusters of CRC tumours based on consensus clustering of the 61 gene lists, which in turn depends on the choice of clustering method and the choice of gene lists. Sufficient detail is provided about the gene lists and resulting clusters, but this paper does not show how robust the 4 clusters are to these choices; for example, the "Energy" gene list appears to be a relatively strong component of clusters C2 and C3.

      Thank you very much for providing such detailed and insightful feedback.

      1.1. The reviewer has raised a valid concern about the impact of gene list selection on the robustness of the clusters. To address this issue, we used the “pamr.predict” function of the R package “pamr” (Tibshirani et al., 2002) to extract the centroids that best represent each subtype and establish a PAMR classifier. PAM (Prediction Analysis of Microarrays) is a statistical technique that identifies the subsets of features that best characterize each class using nearest shrunken centroids (Tibshirani et al., 2002). The technique is general and can be used in many other classification problems. As shown in Figure 1-figure supplement 2E, a threshold of 0.566 with minimum 10-fold cross-validation error was selected to identify the 61 TME-related signatures that exhibit at least one non-zero difference between each subtype (seed = 11). These signatures were then used to construct a PAMR classifier with superior predictive capability, exhibiting an overall error rate of 15%. We used the established PAMR classifier to predict the CCCRC subtypes in the CRC-RNAseq cohort, and the same four CCCRC subtypes were revealed, with similar patterns of differences in the TME components (Fig. S2F, G). This indicated that the 61 TME-related signatures best represent each subtype and are indispensable for identifying the four CCCRC subtypes. (Lines 161-168)
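
      The core of the nearest shrunken centroid idea is a soft-thresholding step: each signature's standardized difference from the overall centroid is moved toward zero by the threshold, and a signature whose differences become zero for every subtype no longer contributes to classification. The Python sketch below illustrates only that step, using the threshold of 0.566 quoted above; the helper names and the input values are made up, and the standardization performed inside “pamr” is not reproduced.

```python
import math


def soft_threshold(value, delta):
    """Shrink value toward zero by delta, never crossing zero."""
    return math.copysign(max(abs(value) - delta, 0.0), value)


def shrink_signatures(differences, delta):
    """Shrink per-subtype differences and list signatures that survive.

    differences maps signature -> standardized differences, one entry per
    subtype (hypothetical values, assumed to be already standardized).
    """
    shrunk = {
        name: [soft_threshold(d, delta) for d in per_subtype]
        for name, per_subtype in differences.items()
    }
    retained = [
        name for name, per_subtype in shrunk.items()
        if any(d != 0 for d in per_subtype)
    ]
    return shrunk, retained


# Made-up differences for two signatures across four subtypes.
differences = {
    "sigA": [1.2, -0.3, 0.1, -1.0],
    "sigB": [0.2, -0.1, 0.3, -0.4],
}
shrunk, retained = shrink_signatures(differences, delta=0.566)
print(retained)
```

      Here the hypothetical signature sigB is shrunk to zero in all four subtypes and drops out, while sigA is retained because two of its differences exceed the threshold.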

      1.2. The reviewer has raised a valid concern about the impact of the clustering method selection on the robustness of the clusters.

      We explored the data extensively during our unsupervised clustering analysis, trying various clustering methods, including K-means clustering, non-negative matrix factorization (NMF) clustering, and hierarchical clustering, and substituting different sources and categories of the TME-related signatures. To determine the optimal clustering method and TME panel, we evaluated whether the TME panel could reproduce the heterogeneity of the TME, the stability of the clustering itself, the biological characteristics of the subtypes, the correlation between subtypes and prognosis, and the correlation between subtypes and microsatellite instability (MSI), the consensus molecular subtypes (CMS) classification system, and other molecular subtype systems. Because of the large number of exploratory results, we ultimately present only the best combination of clustering method and TME panel.

      1.3. We also performed a sensitivity analysis of the effect of the TME-related signatures on the clustering results. Because the effect of removing a single TME-related signature on the clustering results could not be evaluated well, we instead removed entire categories. We performed consensus clustering analysis again using the same parameters (partitioning around medoids (pam) clustering; "Pearson" distance; 1,000 iterations; from 2-6 clusters). When we conducted consensus clustering analysis using only immune-related signatures, we identified three subtypes: low (C2), moderate (C3), and high (C1) immune infiltration subtypes. When we included both immune-related and tumor-related signatures, we identified four subtypes: immunomodulatory (S1), cold (S2/S3), and immune-excluded (S4) subtypes. It appears that the immunosuppressed subtype in the CCCRC classification system may have been assigned to both S1 and S4 subtypes. Limiting the consensus clustering analysis to only immune-related or immune- and stroma-related signatures, as done in previous studies (Bagaev et al., 2021; He et al., 2018), did not allow reliable identification of all four CCCRC subtypes. These sensitivity analyses underscored the necessity of our well-designed TME panel for identifying the four CCCRC subtypes. (Lines 172-176) (Figure 1-figure supplement 4)

      2) The authors examined whether their CCCRC classification showed differential disease progression in available retrospective cohorts of people treated with anti-PDL1 therapy. The authors presented this work as "significance of CCCRC in guiding the clinical treatment of colorectal cancer", but the data presented in this section cannot support clinical treatment decisions, which would require prospective studies and clinical trial designs. However, this section is potentially useful for generating hypotheses about potential biomarkers related to the CCCRC subtypes, and might, in the future with additional evidence, contribute to the design of a trial. The authors point out that additional experimental evidence would be required.

      Thank you for your constructive suggestions. We agree that our retrospective analysis of the CCCRC classification in relation to disease progression under immune checkpoint blockade treatment does not directly support clinical treatment decisions. We acknowledge that additional experimental evidence would be required to fully support the use of the CCCRC classification as a clinical tool for guiding treatment decisions. We have highlighted in the corresponding section of the article that this research is pre-clinical and still requires substantial basic experiments and clinical trials to validate. (Lines 536, 751)

      3) Other prognostic or predictive clinicopathological variables for colorectal cancer are not discussed in detail in the present work but are important for further work on the prognostic and predictive value of CRC molecular subtypes and biomarker derivation. Discrepancies in treatment response have previously been observed in separate CRC trials of biologically targeted agents with different chemotherapy backbones and other authors have suggested that treatment interactions with the tumour microenvironment might in part explain these discrepancies (e.g. Aderka (2019) PMID:31044725, and others).

      3.1) Other prognostic or predictive clinicopathological variables for colorectal cancer are not discussed in detail in the present work but are important for further work on the prognostic and predictive value of CRC molecular subtypes and biomarker derivation.

      Thank you for bringing up this point. We apologize for not analyzing other clinicopathological variables for colorectal cancer in more detail in our original work. We agree that these variables are important for further study of our CCCRC classification system to guide biomarker derivation and clinical treatment decisions. We have added to the article the relationship between CCCRC subtypes and clinicopathological variables, as well as the comparison with CMS subtypes (Lines 256-262, 661-666). In addition, we have identified a clerical error in our manuscript and have corrected it accordingly. Specifically, the use of PFS as the endpoint in some parts of the manuscript was a mistake and has been corrected to DFS. We would like to clarify that the endpoints for the CRC-AFFY and CRC-RNAseq cohorts are DFS and OS, while the endpoints for the GSE104645 dataset are PFS and OS. For the immune checkpoint blockade therapy cohorts, the endpoints for PRJEB23709 (Gide) are PFS and OS, and the endpoint for the GSE135222 (Jung) dataset is PFS. Progression-Free Survival (PFS) refers to the time from randomization (or treatment initiation) to the first occurrence of disease progression or death from any cause. Disease-Free Survival (DFS) is defined as the time from randomization to the appearance of evidence of disease recurrence.

      We further analyzed the association of CCCRC subtypes with clinicopathological characteristics (Supplementary file 1f, Supplementary file 1g). We found that the C4 subtype was mostly diagnosed in right-sided CRC lesions and in females, which was consistent with the CMS1 subtype. The C1 and C3 subtypes were mainly observed in left-sided CRC lesions and in males, consistent with the CMS2 and CMS4 subtypes. The C3 subtype was strongly associated with more advanced tumor stages, similar to the CMS4 subtype, while the C4 subtype was associated with higher histopathologic grade, similar to the CMS1 subtype. Furthermore, our analysis using the Kaplan-Meier method demonstrated that patients with the C4 subtype had significantly better disease-free survival (DFS) and overall survival (OS) compared to those with the C2 and C3 subtypes in the CRC-AFFY (Figure 1I, Figure 1-figure supplement 7A) and CRC-RNAseq cohorts (Figure 1-figure supplement 7B, C). Multivariate Cox proportional hazard regression analysis showed that the C4 subtype was an independent predictor of the best OS and DFS, whereas the C3 subtype was an independent predictor of the worst OS and DFS after adjustment for age, gender, tumor site, TNM stage, grade, adjuvant chemotherapy or not, MSI status, BRAF and KRAS mutations, and the CMS classification system in the combined cohort (the CRC-AFFY and CRC-RNAseq cohorts) (Supplementary file 1h). Considering that the C1, C2/C3, and C4 subtypes partially overlap with the CMS2, CMS4, and CMS1 subtypes, respectively, we also analyzed the prognostic differences between them in the combined cohort. 
We found that the DFS/OS of patients with the C1 subtype was worse than those with the CMS2 subtype (Figure 1-figure supplement 7D, E), the DFS/OS of patients with the C2 subtype was better than those with the CMS4 subtype (Figure 1-figure supplement 7F, G), the DFS/OS of patients with the C3 subtype was not significantly different from those with the CMS4 subtype (Figure 1-figure supplement 7F, G), and the DFS/OS of patients with the C4 subtype was significantly better than those with the CMS1 subtype (Figure 1-figure supplement 7H, I). Notably, the C2 subtype within the CMS4 subtype also had a better prognosis than the C3 subtype within the CMS4 subtype (Figure 1-figure supplement 7J, K). The above analysis demonstrated that the CCCRC classification system was closely associated with clinicopathological characteristics, was able to refine the CMS classification system and MSI status, and contributed to the understanding of the mechanisms underlying the different clinical phenotypes resulting from TME heterogeneity.

      3.2) Discrepancies in treatment response have previously been observed in separate CRC trials of biologically targeted agents with different chemotherapy backbones and other authors have suggested that treatment interactions with the tumour microenvironment might in part explain these discrepancies (e.g. Aderka (2019) PMID:31044725, and others).

      The reviewer's comments greatly contributed to the quality of our study. Aderka et al. discussed the reasons for the differences in the results of the CALGB/SWOG 80405 and FIRE-3 clinical trials, which may be related to differences in the chemotherapy backbone used and TME heterogeneity (Aderka et al., 2019). Both trials evaluated the combination of cetuximab or bevacizumab with a different chemotherapy backbone: in the CALGB/SWOG 80405 trial, 75% of patients received oxaliplatin, while in the FIRE-3 trial, all patients received irinotecan. The CCCRC classification system also facilitates the understanding of the differences in the results of the CALGB/SWOG 80405 and FIRE-3 clinical trials (Heinemann et al., 2014; Lenz et al., 2019). We have added this content to the discussion section of the article (Lines 753-777). Based on our examination of the results summarized in Figure 4 of the work by Aderka et al. (Aderka et al., 2019), we found that differences in the treatment outcomes of the CMS1 and CMS4 subtypes were the crucial factor behind the divergent results observed in the two clinical trials. The CMS1 and CMS4 subtypes have a microenvironment rich in CAFs. Our CCCRC classification results also showed that CMS1, in addition to mainly consisting of the C4 subtype, also contains a considerable number of the C2 subtype, while the CMS4 subtype mainly consists of the C2 and C3 subtypes. Furthermore, our study results indicated that the C2 subtype is suitable for chemotherapy in combination with bevacizumab, possibly because the combination can inhibit the CAFs and abnormal blood vessel formation in the microenvironment, thus alleviating the immune suppression of the immune cells. However, the C3 subtype is not suitable for chemotherapy in combination with bevacizumab because it only accumulates CAFs and abnormal blood vessel formation but lacks T cell infiltration. 
Therefore, we speculate that the CMS1 and CMS4 subtypes in the CALGB/SWOG 80405 clinical trial may contain a larger proportion of C2 subtype tumors than those in the FIRE-3 clinical trial, which would make the CMS1 and CMS4 patients in the CALGB/SWOG 80405 trial more suitable for chemotherapy in combination with bevacizumab than with cetuximab, relative to the FIRE-3 trial. Overall, the integration of the CCCRC and CMS classification systems provides valuable insights for understanding the divergent outcomes of the two clinical trials (Lines 753-777).

      Reviewer #3 (Public Review):

      In their study: Comprehensive characterization of tumor microenvironment in colorectal cancer via histopathology-molecular analysis, Wu et al., aim to examine the contribution of the tumour microenvironment (TME) on biological and clinical heterogeneity in colorectal cancer (CRC).

      To achieve this the authors use a vast array of publicly available datasets across a variety of biological modalities (transcriptomic, epigenetic, mutational). Using thoughtfully curated genesets the authors classify CRC into 4 holistic comprehensive characterised CRC (CCCRC) subtypes which comprise immune, stromal, and tumour features of CRC biology.

      The authors investigate the association of their novel CCCRC subtypes with current "gold standard" classification schemes.

      The authors' integration of deep learning methods for HE classification and subsequent association with "Tumor level" CCCRC subtypes is a refreshing addition to the study. Comment on the degree of heterogeneity observed in HE samples and correlation to the heterogeneity of CCCRC subtypes would be a welcomed addition. It is likely publicly available datasets from such platforms as 10X Genomic Visium would be available for this type of analysis.

      Whilst one of the main outcomes of the study is the addition of another classification scheme to the field of colorectal cancer, the CCCRC scheme represents a holistic perspective on CRC classification.

      The authors provide a welcomed graphical overview of the complex narrative of the study in Figure 7.

      The authors focus on the classification of inter-patient heterogeneity and its associated predictive and prognostic utility. There appears to be a significant degree of overlap between immunosuppressive and immune excluded, and proliferative and immuno-modulatory signatures in Figure 1A. One of the major limitations of patient response to treatment is intra-patient heterogeneity, it would be nice for the authors to comment briefly on the degree of intra-patient heterogeneity of the CCCRC subtypes.

      Overall the authors succeed in providing a holistic deep characterization of CRC from the perspective of a variety of biological modalities. The authors provide a novel classification scheme for the field of CRC which demonstrates prognostic and predictive utility, which would benefit from further validation from external datasets. The authors demonstrate a pathway for integration and interpretation of complex high-dimensional data into clinically translatable currency such as the H&E.

      Thank you for taking the time to review our article and for your positive feedback. Your thorough evaluation of our work has been invaluable to us, and we appreciate your recognition of the effort we put into it.

      1) Comment on the degree of intra-patient heterogeneity of CCCRC subtypes would be nice.

      We have added an intratumor heterogeneity analysis for each subtype (Lines 196-198). The level of intratumor heterogeneity (ITH) has been significantly linked to poor prognosis and drug resistance (Caswell and Swanton, 2017). The ITH data used in our study for the CRC-RNAseq cohort were obtained from a previous study by Thorsson et al. (Thorsson et al., 2018). As expected, the ITH of the C2 and C3 subtypes was higher than that of the other subtypes, while the ITH of the C4 subtype was the lowest (Figure 1F). Our analysis using the Kaplan-Meier method demonstrated that patients with the C4 subtype had significantly better overall survival (OS) and disease-free survival (DFS) compared to those with the C2 and C3 subtypes. Furthermore, the C3 subtype was resistant to chemotherapy, cetuximab, bevacizumab, and ICB therapy. Our investigation of drug sensitivity data of cell lines also indicated that the C2 and C3 subtypes were generally not responsive to most drugs.

      2) A significant degree of overlap between immunosuppressive and immune excluded, and proliferative and immuno-modulatory signatures in Figure 1A is apparent and should be commented upon.

      Our research revealed that both C2 and C3 subtypes exhibited a high level of tumor stroma, while C1 and C4 subtypes were characterized by active DNA damage and repair and high tumor proliferation. Additionally, C2 and C4 subtypes had an abundance of immune components. This was consistent with our finding that there may be interconversion between the C1 and C4 subtypes, between the C4 and C2 subtypes, and between the C2 and C3 subtypes in this evolutionary pattern. The interconversion between C2 and C4 subtypes in this evolutionary pattern was the rarest situation, indicating that once the tumor enters the C2 subtype, it is difficult to reverse and will progress to the C3 subtype. (Lines 637-644)

      3) It is likely publicly available datasets from such platforms as 10X Genomic Visium would be available for this type of analysis.

      To investigate the spatial relationships among the four CCCRC subtypes of tumor cells, T cells, and stromal cells, we re-analyzed publicly available CRC spatial transcriptomics (ST) data obtained from the 10X website (https://www.10xgenomics.com/resources/datasets). The Space Ranger output files were processed with Seurat (V4.1.1) (Hao et al., 2021) using SCTransform for normalization (Hafemeister and Satija, 2019). RunPCA was used for dimensionality reduction and RunUMAP for visualization. We used the “ssGSEA” method implemented in the R package “GSVA” to score the six cell types (C1-C4 subtype cancer cells, mesenchymal cells, and T cells) (Hänzelmann et al., 2013). The “ssGSEA” method has been previously demonstrated to be highly reliable and suitable for ST data analysis (Wu et al., 2022). A spot was assigned to a cell-type-rich region when its ssGSEA score for that cell type exceeded the 75% quantile of that cell type's scores. The markers for the six cell types are listed in Supplementary file 1a and Supplementary file 3a. (Lines 1090-1102)
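
      The 75% quantile rule for defining cell-type-rich regions can be sketched as follows. This is an illustrative Python version with made-up spot identifiers and scores; the function name `rich_spots` is hypothetical, and the "inclusive" quantile method is an assumption chosen because it matches the default interpolation of R's quantile function.

```python
from statistics import quantiles


def rich_spots(scores):
    """Return spot IDs whose score exceeds the 75% quantile of all spots.

    scores maps spot ID -> enrichment score for one cell type
    (hypothetical values standing in for ssGSEA scores).
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    cutoff = quantiles(list(scores.values()), n=4, method="inclusive")[2]
    return sorted(spot for spot, score in scores.items() if score > cutoff)


# Made-up scores for five spots and a single cell type.
t_cell_scores = {"s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4, "s5": 0.5}
print(rich_spots(t_cell_scores))
```

      Because the cutoff is computed separately for each cell type, a single spot can belong to more than one cell-type-rich region, which is what allows overlapping regions to be compared.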

      The Cytassist and Visium samples had a total of 9080 and 2660 spots, respectively. We used the “ssGSEA” method to quantify the six cell subpopulations of each spot and also visualized only the spots corresponding to the top 25% of the score ranking for each cell type (Figure 6-figure supplement 2AB, Figure 6-figure supplement 3AB). In Cytassist samples, we observed different spatial distribution patterns of the four subtypes of tumor cells (Figure 6-figure supplement 2B). Specifically, the C3 subtype of tumor cells was predominantly located in the tumor periphery with an enrichment of mesenchymal cells and T cells (areas selected by black dashed circles). In contrast, the C4 subtype of tumor cells was mainly present in the center of the tumor, accompanied by the presence of T cells. The C1 and C2 subtypes of tumor cells were distributed in relatively uniform areas, mainly in the tumor periphery, with fewer mesenchymal cells and T cells. However, the distribution areas of the C2 and C3 subtypes of tumor cells also partially overlapped (the area selected by red dashed circles). The same distribution patterns can also be observed in the Visium sample (Figure 6-figure supplement 3B). We further analyzed the correlation between the ssGSEA scores of each cell type in the cell-type-rich regions and those of other cell types (Figure 6-figure supplement 2D, E, Figure 6-figure supplement 3D, E). We found that in the C3 subtype-rich region of tumor cells, the C3 subtype score of tumor cells was significantly positively correlated with the mesenchymal cell score, while in the T cell-rich region, the C3 subtype score of tumor cells was significantly negatively correlated with the T cell score. The C4 subtype score of tumor cells was significantly positively correlated with the T cell score and negatively correlated with the mesenchymal cell score in the C4 subtype-rich, T cell-rich, and mesenchymal cell-rich regions. 
The C1 subtype and C2 subtype scores of tumor cells were negatively correlated with mesenchymal cell and T cell scores. Overall, these results were generally consistent with previous histopathologic analysis findings. (Lines 538-562)

    1. Author Response

      Reviewer #1 (Public review):

      1.0) This paper investigates the metabolic basis of a node, posterior cingulate cortex (PCC), in the default mode network (DMN). They employed sophisticated MRI-PET methods to measure both BOLD and CMRglc changes (both magnitude and dynamics) during attention-demanding and working memory tasks. They found uncoupling of BOLD and CMRglc in PCC with these different tasks. The implications of these findings are poorly interpreted, with a conclusion that is purely based on other work independent of this study. Various suggestions could allow them to place some speculations in line with a stronger interpretation of their results.

      This is one of several papers in recent years investigating the metabolic underpinnings of activated (or task-positive) and deactivated (or task-negative) cortical areas in the human brain. In this study, they used BOLD fMRI and glucose PET scan to examine the metabolic distinction of the default mode network (DMN), which is known to be deactivated during attention-demanding tasks, with different types of cognitively demanding tasks. Unlike the BOLD response in posteromedial DMN which is consistently negative, they found that CMRglc of the posteromedial DMN (a task-negative network) is dependent on the metabolic demands of adjacent task-positive networks like the dorsal attention network (DAN) and frontoparietal network (FPN). With attention-demanding tasks (like Tetris) the BOLD and CMRglc are both downregulated in DMN (specifically the posterior cingulate cortex, PCC, a task-negative node of DMN), but working memory induces a CMRglc increase in PCC, which is decoupled from the negative BOLD response in PCC.

      We thank the reviewer for the constructive feedback and the possibility to improve our manuscript. We agree that the interpretation of the results should be strengthened to provide a stronger focus on our data. Regarding the uncoupling of BOLD and CMRGlu during working memory, we acknowledge the need to further elaborate on this topic in our discussion. These suggestions and comments have been incorporated into the revised manuscript as outlined below.

      1.1) These complicated results are the main findings, and to provide a biological basis to these data they rather surprisingly, but without their own experimental evidence, conclude that the negative BOLD and negative CMRglc in PCC during attention-demanding tasks is due to decreased glutamate signaling (which was not measured in this study) and the negative BOLD and positive CMRglc in PCC during working memory is due to increased GABAergic activity (which was not measured in this study). It is rather surprising that without measurement, a conclusion is made which would at best be considered a hypothesis to be tested. Thus, independent of these hypothesized mechanisms, they need to summarize their results based on their own measurements in this study (see 3 for a hint).

      Thank you for bringing up this point and for the insightful suggestion concerning point 3. We have now explicitly stated that the interpretation regarding glutamate and GABAergic signaling is speculative, as these were not measured in the current work. Moreover, we have substantially reduced this section. As such, we agree with the reviewer that this represents an interesting hypothesis to be tested in future work. For further details please see response to comments 1.3 and 1.4.

      Discussion, page 16, line 341:

      On the neurotransmitter level, one of the current hypotheses regarding BOLD deactivations proposes that CMRO2 and CBF are affected by the balance of the excitatory and inhibitory neurotransmitters, specifically GABA and glutamate (Buzsáki et al., 2007; Lauritzen et al., 2012; Sten et al., 2017). In the PCC, glutamate release prevents negative BOLD responses (Hu et al., 2013), whereas a lower glutamate/GABA ratio is associated with greater deactivation (Gu et al., 2019). As glutamate elicits proportional glucose consumption (Lundgaard et al., 2015; Zimmer et al., 2017), decreases in glutamate signaling in the pmDMN could indeed explain both the decreased BOLD response and decreased CMRGlu during the Tetris® task. Conversely, increased GABA supports a negative BOLD response in the PCC (Hu et al., 2013), as do working memory tasks (Koush et al., 2021) and pharmacological stimulation with GABAergic benzodiazepines (Walter et al., 2016). In consequence, the observed dissociation between BOLD changes and CMRGlu during working memory could indeed result from metabolically expensive (Harris et al., 2012) GABAergic suppression of the BOLD signal (Stiernman et al., 2021). However, we need to emphasize that glutamate and GABAergic signaling were not measured in the current study; thus, the above interpretations are of speculative nature. Nonetheless, future work may test this promising hypothesis, e.g., using pharmacological alteration of GABAergic and glutamatergic signaling or optogenetic approaches modulating GABAergic interneuron activity.

      Furthermore, to maintain a more concise discussion that is closer aligned with the measured results, we have removed the following paragraph:

      Discussion, page 15, line 309:

      The associations of these metabolic demands between the DMN and task-positive networks is also reflected in their distance along a connectivity gradient, which is hierarchically organized from unimodal sensory/motor to complex associative functions and the DMN being at the end of the processing stream (Margulies et al., 2016; Smallwood et al., 2021). A corresponding decrease in pmDMN glucose metabolism was observed for tasks that activate unimodal networks and the DAN, but not for the FPN. The inverse influence of attention and control networks on the pmDMN may therefore suggest that connectivity gradients are supported by the underlying energy metabolism.

      1.2) It is mentioned that the FDG-PET scans allow quantitative CMRglc, both in terms of units of glucose use but also with high time resolution. Based on the method described, it isn't clear how this is possible. Important details of either prior work or their own work have been excluded that show how the time course of CMRglc (regardless of whether it's absolute or relative) can be compared with the BOLD time course. Furthermore, it is extremely difficult to conceive that quantitative CMRglc can be estimated without additional measurements (e.g., blood samples, etc). Significant methodological details have to be provided, which even should make their way to results given the importance of their BOLD-CMRglc coupling and decoupling in the same region.

      We thank the reviewer for this important comment and apologize for the lack of clarity. We would like to emphasize that in the current work only spatial patterns of CMRGlu and BOLD signal changes were compared, but not the time course of these signals. The manuscript was edited throughout to clarify this point.

      Introduction, page 5, line 110:

      Studies using simultaneous fPET/fMRI have shown a strong spatial correspondence between the BOLD signal changes and glucose metabolism in several task-positive networks and across various tasks requiring different levels of cognitive engagement (Hahn et al., 2020, 2016; Jamadar et al., 2019; Rischka et al., 2018; Stiernman et al., 2021; Villien et al., 2014).

      Introduction, page 5, line 123

      Specifically, it is unknown whether the observed dissociation between patterns of metabolism and BOLD changes in the DMN generalizes for complex cognitive tasks, and whether this in turn depends on the brain networks supporting the task performance and their interaction with the DMN.

      Results, page 7, line 143:

      From this dataset (DS1) we evaluated the spatial overlap of negative task responses in the cerebral metabolic rate of glucose (CMRGlu quantified with the Patlak plot) and the BOLD signal specifically in the pmDMN. […] After that, the distinct spatial activation patterns across different tasks were used to quantitatively characterize the CMRGlu response of the pmDMN in DS1.

      The method of functional PET (fPET) imaging indeed enables the evaluation of changes in glucose metabolism with a relatively high temporal resolution. That is, a conventional bolus application and subsequent quantification yield a single CMRGlu image per scan of about 60 min (typical frame length ~1-5 min) or a single SUV image from a static scan. In contrast, the constant infusion employed in fPET allows the assessment of baseline metabolism and changes induced by different tasks in a single scan by using a frame length currently down to 6-30 s (Rischka et al., 2018), where the latter was also used in the current study. A general description of the fPET approach is now also included in the manuscript.

      Introduction, page 5, line 99:

      In this context, functional PET (fPET) imaging represents a promising approach to investigate the dynamics of brain metabolism. fPET refers to the assessment of stimulation-induced changes in physiological processes such as glucose metabolism (Villien et al., 2014; Hahn et al., 2016) and neurotransmitter synthesis (Hahn et al., 2021) in a single scan. The temporal resolution of this approach of 6-30 s (Rischka et al., 2018) is considerably higher than that of a conventional bolus administration. This is achieved through the constant infusion of the radioligand, thereby providing free radioligand throughout the scan that is available to bind according to the actual task demands. Here, the term “functional” is used in analogy to fMRI, where paradigms are often presented in repeated blocks of stimulation, which can subsequently be assessed by the general linear model.

      Regarding the absolute quantification of CMRGlu, arterial blood samples were obtained from all subjects of DS1. These were used for absolute quantification of CMRGlu with the Patlak plot. Full details were already provided in the methods section and are now also mentioned in the results.

      Results, page 7, line 140:

      Simultaneous fPET/fMRI data and arterial blood samples were acquired from 50 healthy participants during the performance of the video game Tetris®, a challenging cognitive task requiring rapid visuo spatial processing and motor coordination (Hahn et al., 2020; Klug et al., 2022). From this dataset (DS1) we evaluated the spatial overlap of negative task responses in the cerebral metabolic rate of glucose (CMRGlu quantified with the Patlak plot) and the BOLD signal specifically in the pmDMN.

      Methods, page 19, line 399:

      For glucose metabolism, these changes are absolutely quantified in μmol/100g/min with the arterial input function and the Patlak plot.

      Methods, blood sampling, page 24, line 536:

      Before the PET/MRI scan blood glucose levels were assessed as triplicate (Gluplasma). During the PET/MRI acquisitions manual arterial blood samples were drawn at 3, 4, 5, 14, 25, 36 and 47 min after the start of the radiotracer administration (Rischka et al., 2018). From these samples whole-blood and plasma activity were measured in a gamma counter (Wizard2, Perkin Elmer). The arterial input function was obtained by linear interpolation of the manual samples to match PET frames and multiplication with the average plasma-to-whole-blood ratio.

      Methods, cerebral metabolic rate of glucose metabolism, page 25, line 561:

      Quantification was carried out with the Patlak plot (t* fixed to 15 min) and the influx constant Ki was converted to CMRGlu as CMRGlu = Ki * Gluplasma / LC * 100 with LC being the lumped constant = 0.89 (Graham et al. 2002, Wienhard 2002).
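
The quantification step described above can be sketched in Python as follows. This is a minimal illustration of the standard Patlak graphical analysis and the stated conversion CMRGlu = Ki * Gluplasma / LC * 100, not the authors' pipeline. The function names, the synthetic curves, and the unit convention (Ki in mL/g/min and plasma glucose in µmol/mL, giving µmol/100g/min) are assumptions made for the example.

```python
import numpy as np

def patlak_ki(t, c_tissue, c_plasma, t_star=15.0):
    """Estimate the influx constant Ki with a Patlak plot.

    t        : frame mid-times in minutes
    c_tissue : tissue activity at each time
    c_plasma : arterial plasma input function at each time
    t_star   : start of the linear phase (15 min, as stated in the text)
    """
    t = np.asarray(t, dtype=float)
    ct = np.asarray(c_tissue, dtype=float)
    cp = np.asarray(c_plasma, dtype=float)
    # Cumulative trapezoidal integral of the plasma curve
    steps = np.diff(t) * (cp[1:] + cp[:-1]) / 2.0
    int_cp = np.concatenate(([0.0], np.cumsum(steps)))
    late = t >= t_star
    x = int_cp[late] / cp[late]   # "normalized time"
    y = ct[late] / cp[late]       # tissue-to-plasma ratio
    ki, intercept = np.polyfit(x, y, 1)  # slope is Ki
    return ki, intercept

def cmrglu_from_ki(ki, glu_plasma, lumped_constant=0.89):
    """Convert Ki to CMRGlu = Ki * Glu_plasma / LC * 100 (units assumed, see lead-in)."""
    return ki * glu_plasma / lumped_constant * 100.0

# Synthetic, noise-free data: constant plasma input and irreversible uptake
t = np.linspace(0.0, 50.0, 51)
cp = np.ones_like(t)
ct = 0.03 * t + 0.5               # true Ki = 0.03 in this toy example
ki_hat, _ = patlak_ki(t, ct, cp)
cmrglu = cmrglu_from_ki(ki_hat, glu_plasma=5.0)  # hypothetical glucose level
print(ki_hat, cmrglu)
```

In practice the fit would be run per voxel or region on measured curves, with the arterial input function interpolated to the PET frame times as described in the blood sampling section.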

      1.3) It is surmised that the glutamatergic/GABAergic involvement of these metabolic differences in PCC is from another study, but what mechanism causes the BOLD signal to decrease in both stimuli? This is where the authors have to divulge the biophysical basis of the BOLD response. At the most basic level, the BOLD signal change (dS) can be positive or negative depending on the degree of coupling with changed blood flow (dCBF) and oxidative metabolism (dCMRO2) from resting condition. Unfortunately, neither CBF nor CMRO2 was measured in this study. In the absence of these additional measurements, the authors should at least discuss the basis of the BOLD response with regard to CBF and CMRO2. If we assume that both attention-demanding and working memory tasks decreased BOLD response in PCC in the same way, we have identical dCBF/dCMRO2 in PCC with both tasks, i.e., their results seem to suggest an alteration in aerobic glycolysis with different tasks. With attention-demanding tasks, CMRglc decreases similarly to CMRO2 decreases in PCC, whereas with working memory tasks, CMRglc increases differently from CMRO2 decreases. This suggests that the oxygen-to-glucose index (OGI = CMRO2/CMRglc) would rise in PCC with attention-demanding tasks, but fall in PCC with working memory tasks. This is obviously an implication rather than a conclusion as CBF or CMRO2 were not measured.

      1.4) Given the lack of attention to the mechanism that gives rise to BOLD contrast, it is almost necessary to discuss the biophysical basis of BOLD contrast and specifically how metabolic changes have been linked to both increases and decreases in neuronal activity in the past. Although this type of work has largely been conducted in animal models, it seems that this topic needs to be discussed as well.

      We would like to thank the reviewer for sharing these insightful ideas and for bringing up these aspects that indeed appear to be essential for the manuscript. Since points 1.3 and 1.4 complement each other, we have combined them and created a shared response. To fully address the points, the following paragraphs were added to the manuscript.

      Discussion, page 15, line 310:

      Metabolic and neurophysiological considerations

      The distinct relationships between BOLD and CMRGlu signals that emerge during specific tasks highlight the different physiological processes contributing to neuronal activation of cognitive processing (Goyal and Snyder, 2021; Singh, 2012). While CMRGlu measured by fPET provides an absolute indicator for glucose consumption, the BOLD signal reflects deoxyhemoglobin concentration, which depends on various factors, such as cerebral blood flow (CBF), cerebral blood volume (CBV) and the cerebral metabolic rate of oxygen (CMRO2) (Goense et al., 2016). In simple terms, the BOLD signal relates to the ratio of ∆CBF/∆CMRO2. Assuming that the observed BOLD decreases during Tetris® and WM emerge from the same mechanisms, this would result in a comparable ∆CBF/∆CMRO2 in the pmDMN for both tasks. Given that these types of tasks (external attention and cognitive control) elicit a reduction in CBF in the pmDMN (Shulman 97, Zou 2011), CMRO2 also decreases albeit to a lesser extent (Raichle 2001). Therefore, the respective metabolic processes can be described by their oxygen-to-glucose index (OGI), the ratio of CMRO2/CMRGlu. Accordingly, our results suggest two distinct pathways underlying BOLD deactivations in the pmDMN that differ regarding their OGI. During Tetris® there is a BOLD deactivation with a high OGI, resulting from a larger decrease in CMRGlu than CMRO2. This metabolically inactive state is in line with electrophysiological recordings in humans (Fox et al., 2018) and in non-human primates showing a decrease of neuronal activity in the pmDMN that covaries with the degree of exteroceptive vigilance (Shmuel et al., 2006; Bentley et al., 2016; Hayden et al., 2009). Therefore, we suggest that the negative BOLD response during external tasks reflects a reduction of neuronal activity and their respective metabolic demands. 
On the other hand, the relatively increased CMRGlu without the corresponding surge in CMRO2 hints at another kind of BOLD deactivation with a low OGI in the pmDMN during working memory, indicating energy supply by aerobic glycolysis (Vaishnavi et al., 2010; Blazey et al., 2019). Previous work in non-human primates has indeed suggested a differential coupling of neuronal activity to hemodynamic oxygen supply in this region (Bentley et al., 2016). Furthermore, tonic suppression of PCC neuronal spiking during task performance was punctuated by positive phasic responses (Hayden et al., 2009), which could indicate differences between both tasks also at the level of electrophysiologically measured activity.
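
The OGI argument above can be made concrete with a small Python sketch. Since OGI is the ratio CMRO2/CMRGlu, its change relative to rest follows directly from the fractional changes of both quantities. The percentages below are purely hypothetical (CMRO2 was not measured in the study) and only illustrate the direction of the effect; the function name `relative_ogi` is likewise an assumption.

```python
def relative_ogi(delta_cmro2, delta_cmrglu):
    """Task OGI divided by resting OGI.

    delta_cmro2, delta_cmrglu: fractional changes from rest
    (e.g. -0.05 for a 5% decrease). Values > 1 mean OGI rises.
    """
    return (1.0 + delta_cmro2) / (1.0 + delta_cmrglu)

# Hypothetical scenario A: CMRGlu falls more than CMRO2 (Tetris-like case)
ogi_external = relative_ogi(delta_cmro2=-0.05, delta_cmrglu=-0.10)

# Hypothetical scenario B: CMRO2 falls while CMRGlu rises (working-memory-like case)
ogi_memory = relative_ogi(delta_cmro2=-0.05, delta_cmrglu=0.10)

print(ogi_external, ogi_memory)
```

Under these assumed numbers the first ratio exceeds 1 (higher OGI) and the second falls below 1 (lower OGI, consistent with a shift toward aerobic glycolysis).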

      Reviewer #2 (Public Review):

      2.0) This paper provides an important and insightful investigation into patterns of activations that emerge in external task states. The authors use state-of-the-art methods and novel analytic approaches to establish that deactivations in the default mode network during external tasks are driven by activity in brain regions that are important in the current tasks (such as the visual or dorsal attention networks). It will be important in the future to understand whether this is a symmetrical phenomenon by studying this behaviour in states that maximize activity within the default mode network and also drive reductions in networks that are not relevant to these situations.

      We thank the reviewer for the encouraging feedback and the constructive comments on our manuscript. We particularly appreciate the interest in the research and the insightful suggestions for future work.

      Reviewer #3 (Public Review):

      3.0) The authors report a study where, using multiple datasets with [18F]FDG PET bolus + continuous infusion ("functional PET") and BOLD fMRI data, they re-evaluate the metabolic and hemodynamic properties of the default mode network (DMN) in a task-evoked context, with a focus on posteromedial DMN due to its relevance for across-network integration. They show how posterior DMN is differently engaged depending on the chosen task: while visual and motor tasks lead to BOLD deactivations and glucose metabolic decrease, specifically in the dorsal posterior cingulate cortex (PCC) area, working memory tasks produce BOLD deactivations but metabolic increases, specifically in ventral PCC, as shown in their previous paper (Stiernman et al. 2021, https://doi.org/10.1073/pnas.2021913118). This aims to solve the controversies elicited by findings of both increased and decreased glucose consumption in the presence of BOLD deactivation in the DMN.

      Additionally, they show how task-evoked glucose metabolism in posterior DMN seems to be shaped by that of the corresponding task-positive networks, with a positive link with dorsal attention and a negative link with frontoparietal network metabolism. This is explored using a type of directional connectivity analysis called "metabolic connectivity mapping", drawn from their previous work (Riedl et al. 2016, https://doi.org/10.1073/pnas.1513752113; Hahn et al. 2020, https://doi.org/10.7554/eLife.52443). They go on to speculate that concomitant BOLD deactivation and reductions in glucose expense might relate to decreased glutamatergic signaling, while BOLD deactivations accompanied by increased glucose consumption might depend on increased GABAergic neuronal activity.

      This is a relevant topic because it not only shows how the DMN is flexibly engaged in different tasks but also allows us to better understand the complex relationships between BOLD fMRI and [18F]FDG PET signals, which are still not fully characterized to this day. Of course, while in resting state the situation is further complicated by the more uncertain physiological meaning of the resting BOLD signal, task-evoked states are expected to provide a more interpretable intermodal link between metabolism and hemodynamics, due to the known major changes in blood flow, blood volume, and glucose metabolism - which underlie BOLD and [18F]FDG signal changes - in response to neural activation. However, even in task states, there is not always a strong association between the two responses, as previously shown by the authors themselves (Rischka et al. 2018, https://doi.org/10.1016/j.neuroimage.2018.06.079). This is something I think the authors should stress a little more, as they have previously done (Rischka et al. 2018, https://doi.org/10.1016/j.neuroimage.2018.06.079), both in the introduction and in reference to Figure 1, which shows clear differences between BOLD and [18F]FDG activations/deactivations (e.g., widespread negative responses in the cerebellum for [18F]FDG).

      Overall, the analyses reported in the manuscript are simple and seem mostly sound, drawing from well-established methods in PET and fMRI activation studies, with additional approaches previously developed by some of the authors themselves (e.g., "metabolic connectivity mapping", Riedl et al. 2016, https://doi.org/10.1073/pnas.1513752113). Moreover, a clear strength of the paper is the high number of subjects, at least from a PET perspective, i.e., n = 50 for the Tetris task, plus group averages of previously published data for working memory (Stiernman et al. 2021, https://doi.org/10.1073/pnas.2021913118) and motor tasks (Hahn et al. 2018, https://doi.org/10.1007/s00429-017-1558-0).

      The conclusions are in line with the results, and, though a little speculative, are potentially relevant for further exploration aimed at characterizing the neurotransmitter pathways underlying positive and negative BOLD and [18F]FDG responses. Moreover, the language is sufficiently clear to allow a proper understanding of the aims and the results, as well as the details of the analyses. As a side note, the title should probably be adjusted to "Task-evoked metabolic demands of the posteromedial default mode network are shaped by dorsal attention and frontoparietal control networks", to emphasize that the findings do not necessarily generalize to the resting state.

      In conclusion, I am overall quite positive about this manuscript, which seems to nicely position itself within the existing literature, making some additional contributions.

      We thank the reviewer for the thorough evaluation and the positive feedback on our manuscript; we appreciate the constructive and insightful suggestions. We agree that the differential spatial patterns of activation between the BOLD signal and CMRGlu response require further attention. To address this point in more detail, we have added the following information to the manuscript.

      Introduction, page 5, line 110:

      Studies using simultaneous fPET/fMRI have shown a strong spatial correspondence between the BOLD signal changes and glucose metabolism in several task-positive networks and across various tasks requiring different levels of cognitive engagement (Hahn et al., 2020, 2016; Jamadar et al., 2019; Rischka et al., 2018; Stiernman et al., 2021; Villien et al., 2014). […]. However, regional differences in activation patterns between these modalities have also been observed in these and previous studies (Wehrl et al., 2013). Moreover, a dissociation between BOLD changes (negative) and glucose metabolism (positive) has recently been observed even in the same region of the DMN during working memory (Stiernman et al., 2021), namely the posteromedial default mode network (pmDMN).

      Results, caption Figure 1, page 8, line 173

      White clusters represent the intersection of significant CMRGlu and BOLD signal changes, irrespective of direction. Note that relevant differences between both imaging parameters can also be observed, such as decreased CMRGlu in the cerebellum (in both datasets), without changes in the BOLD signal.

      We appreciate the reviewer’s proposal for the title as it raises awareness that the activation patterns reflect task-specific inference.

      Title:

      Task-evoked metabolic demands of the posteromedial default mode network are shaped by dorsal attention and frontoparietal control networks

      We have limited the discussion of underlying neurotransmitter effects and explicitly mention that these are of speculative nature. For the manuscript adaptations on this point, we refer to points 1.1, 1.3, and 1.4, which address this topic as well.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors assessed the association between exposures and obesity by environment-wide and epigenome-wide association studies. The strength of this study is that exposures, body mass index, and waist-hip ratio were measured three times from adolescence to early adulthood, and the associations were repeatedly evaluated. A weakness of this study is that a loose significance threshold was used for the epigenome-wide association study and only a small number of study subjects were measured in early adulthood. Since this is an observational study, the confounding effect should be considered when interpreting the exposures associated with obesity reported in this study.

      Thank you very much for your positive comments and helpful suggestions. We agree that the study has the limitation of the loose significance threshold used for the epigenome-wide association study and the limited sample size in early adulthood. Following the reviewer’s suggestion, we have revised the threshold of significance in the epigenome-wide association study to 1×10⁻⁶. We have added more discussion on confounding, and we are more cautious in the interpretation of the results.

      Reviewer #2 (Public Review):

      Since this study is a long-term cohort study in children and adolescents, it is advisable to decide whether to highlight differences by age group or to show consistent effect after exposure. In particular, obesity and related diseases are closely related to socio-economic environmental factors, and its impact might be different according to age (group) at exposure.

      Thank you very much for your insightful suggestions. We agree that the associations of exposures, including socio-economic and environmental factors, with obesity might vary by age, which is why we examined the associations of early life exposures with BMI and WHR at different ages. It is possible that the same exposure may impact obesity differently by age, so we also assessed the associations of exposures selected at earlier ages of BMI/WHR with BMI/WHR at older ages, and compared consistency in the direction of association at different time points. We have added more explanation in the introduction and methods:

      (Introduction, paragraph 4)

      “Considering that exposures related to obesity at the outset and at the end of puberty may be different, and the associations of the same exposure with obesity may vary by age, we conducted the EWAS at different ages.”

      (Methods, Statistical analysis)

      “Third, to assess whether the associations differed by age, we checked the associations for the selected exposures from the earlier age groups (~11.5 years and ~17.6 years) in the follow-up survey (n=308) at age ~23 years and compared the direction of associations with those at earlier age groups (~11.5 years and ~17.6 years). Associations with consistent directions of associations in earlier age groups (~11.5 years or ~17.6 years) with those at ~23 years suggest a consistent association by age.”
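
The direction-consistency check described in this passage can be sketched in Python. This is only an illustration under stated assumptions, not the authors' analysis code: the function name `consistent_direction` and the effect estimates are hypothetical, and an estimate of exactly zero is treated as having no direction.

```python
import numpy as np

def consistent_direction(beta_early, beta_late):
    """Flag exposures whose association has the same sign at both ages.

    beta_early: effect estimates at an earlier age (e.g. ~11.5 or ~17.6 years)
    beta_late : effect estimates for the same exposures at ~23 years
    """
    early = np.asarray(beta_early, dtype=float)
    late = np.asarray(beta_late, dtype=float)
    # The product is positive only when both estimates share a sign
    return early * late > 0

# Hypothetical estimates for three exposures
same_sign = consistent_direction([0.20, -0.10, 0.30], [0.10, 0.05, 0.40])
print(same_sign)
```

A sign comparison of this kind says nothing about statistical significance at the later age, which matters given the smaller follow-up sample (n=308).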

      The part described in comparison with previous studies is a good attempt. However, some results are consistent with those of previous studies and some are not. This may be related to the time difference in socio-economic environmental factors rather than simply the difference between the West and China (Hong Kong). According to modernization/urbanization, changes in living environment, changes in family relationships, and changes in the care environment can also be factors especially in children.

      Thank you very much for your positive comments and raising this interesting point. We totally agree that inconsistency with results of previous studies is not merely due to the difference between the West and China (Hong Kong), but also related to changes in structural socio-economic and environmental factors, as well as changes in living environment, family relationships, social and community network, housing and care environment, that affect individuals’ health. Hence, we provided the necessary clarification by adding the following sentences:

      (Discussion, Strengths and Limitations)

      “Fourth, the inconsistency between some of our findings and previous studies, such as chocolate, sweets, tea and coffee consumption, should be interpreted cautiously. It may not only reflect differences between the West and China (Hong Kong), but also may be due to changes in structural socio-economic and environmental factors, as well as changes in living environment, family relationships, social and community networks, housing and the care environment.”

      In studying the effect of environment on gene expression, it can be thought that the influence of genes and the degree of expression might differ depending on the age of the subject (newborn, infant, adolescent, adult) and the duration of exposure, and these still need to be elucidated.

      Thank you very much for raising this important point. We fully agree with you. It would be interesting to examine the association of gene expression at different ages with obesity. However, we only collected blood samples at the Biobank Clinical follow-up (age ~17.6 years), so in this study we only conducted the epigenome-wide association study for DNA methylation at ~17.6 years with BMI and WHR at ~23 years. We have added this to the limitations:

      (Discussion, Strengths and limitations)

      “Fifth, we only collected blood samples at the Biobank Clinical follow-up (age ~17.6 years), so we only conducted the epigenome-wide association study for DNA methylation at ~17.6 years. It would be worthwhile to examine the association of DNA methylation at different ages with obesity.”

    1. Author Response

      Reviewer #1 (Public Review):

      This study presents a valuable comparison of fibre orientation estimates from three different modalities: diffusion MRI, scattered light imaging, and x-ray scattering. The comparison is interesting as each modality is sensitive to different aspects of tissue microstructure - water anisotropy, micron-scale structural coherence, and myelin lamella respectively. Where scattered light and x-ray imaging can be only applied ex vivo, diffusion MRI has in vivo applications but suffers from being an indirect estimate of the microstructure of interest. By acquiring all modalities in both a vervet monkey and human brain sample, the authors provide quantitative, pixel/voxel-wise comparisons of fibre orientation estimates within the same tissue samples. The authors show convincing agreement in fibre orientations from all three methods, giving confidence in the fidelity of the methods for neuroanatomical investigations. Differences are also observed: SLI is shown to have less reliable estimates of fibre inclination, and the CSD analysis presented overestimates the number of crossing fibre populations when compared to the microscopy methods, particularly in single fibre regions such as the corpus callosum, a known artefact in some diffusion analyses.

      In the current PDF, it is very difficult to see fibre orientations in figures due to low resolution, limiting the reader's ability to assess the results. Higher-resolution images would provide more information and easier comparisons.

      The methods are generally clear though some additional information is needed:

      1) to specify the resolution at which the orientations are compared in each figure and how data was up-/down-sampled for these comparisons respectively. For example, each SAXS pixel contains many SLI pixels. It is currently unclear whether the mean SLI orientation from a neighbourhood is equivalent to the SLI compared, or whether a comparison was made for each SLI pixel. Similarly, for the dMRI-microscopy comparisons.

      2) I also could not follow why two SLI methods are presented in the methods: SLI scatterometry relating to Figure 2, and angular SLI relating to all other results. Further clarification is needed.

      3) Since the quality of the data co-registration can strongly impact pixel/voxel-wise comparisons, quantification of the registration accuracy or overlays demonstrating the quality of the co-registration would be valuable.

      A primary weakness of the work as a diffusion MRI validation study is that though diffusion MRI supports many different models to extract fibre orientations with different outputs, here only a single model is compared to the microscopy data, which may affect the generalisability of the results. Further, it only compares the primary orientations from the diffusion MRI and does not consider each fibre population's magnitude (density of fibres) or the orientation dispersion, both of which can influence downstream analyses.

      The paper could be strengthened by a more detailed discussion on the differences between the imaging modalities - e.g. in terms of imaging resolution, signal-generating mechanisms, and sensitivity to specific aspects of the tissue microstructure - and how these differences may limit their application to specific neuroanatomical investigations, or ability to validate one another. For example, the microscopy sections are 80 microns thick whilst the diffusion voxel is 200 microns. I expect this could contribute to the difference in the number of fibre populations per voxel.

      The hypothesis that dMRI signal contributions from extra-axonal water result in additional fibre populations could be investigated by running CSD on both low and high-b-value data (for example using the openly available MGH dataset, Fan 2016) where fewer secondary fibre populations should be observed at high b-value.

      We sincerely thank Reviewer #1 for the constructive feedback, which helped us to significantly improve our manuscript. We hope to have done our best to address all concerns:

      First, we regret the insufficient resolution of figures. The resolution must have been reduced during the submission process, when generating the pdf version of our manuscript. We have now submitted all figures as separate files with the highest possible resolution. In addition, all parameter maps are publicly available and can be opened and zoomed in, e.g. with ImageJ, to see the fiber orientations of individual image pixels.

      As requested by the reviewer, we have modified our manuscript and added additional methods information.

      1) Concerning the data up-/downsampling: We have now specified in each figure caption at which resolution the images were compared and added the following explanation to the newly named Methods section “Image registration and pixel-wise comparison”: To minimize loss of information, the pixel/voxel-wise comparisons were performed at the spacing of the highest resolution image, i.e. the lower-resolution diffusion MRI (dMRI) and small-angle X-ray scattering (SAXS) images were upscaled to match the higher-resolution scattered light imaging (SLI) images. As a result, the fiber orientation of one SAXS pixel (px=150µm) was compared to the fiber orientations of 50x50 SLI pixels (px=3µm), and not to the mean; similarly for comparisons with dMRI.
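
      The comparison described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used for the manuscript: the nearest-neighbour upscaling, the function names `upsample_nearest` and `axial_difference_deg`, and the toy orientation maps are all assumptions. Only the pixel sizes (150µm for SAXS, 3µm for SLI) are taken from the text. In-plane orientations are treated as axial data with a period of 180°.

```python
import numpy as np


def upsample_nearest(low_res, factor):
    """Repeat every low-resolution pixel factor x factor times."""
    return np.repeat(np.repeat(low_res, factor, axis=0), factor, axis=1)


def axial_difference_deg(a_deg, b_deg):
    """Signed difference of in-plane orientations (period 180 deg), in [-90, 90)."""
    return (np.asarray(a_deg) - np.asarray(b_deg) + 90.0) % 180.0 - 90.0


# Pixel sizes taken from the text: SAXS 150 um, SLI 3 um.
factor = 150 // 3

# Hypothetical toy maps of in-plane orientation in degrees.
saxs = np.array([[10.0, 170.0], [45.0, 90.0]])
rng = np.random.default_rng(0)
saxs_up = upsample_nearest(saxs, factor)
sli = (saxs_up + rng.normal(0.0, 5.0, size=saxs_up.shape)) % 180.0

# One angular difference per SLI pixel, not per SAXS pixel.
diff = axial_difference_deg(sli, saxs_up)
print(saxs_up.shape, float(diff.mean()))
```

      Each SAXS pixel is thereby compared against all 50x50 SLI pixels it covers, so the spread of SLI orientations within a SAXS pixel stays visible in the distribution of differences.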

      2) Concerning the two SLI methods: We have added the following explanation to the Methods section “Scattered Light Imaging” to clarify why we used two different methods: To generate the scattering patterns (upper Figure 2C), a time-consuming SLI scatterometry measurement was performed in which the sample was illuminated from 6,400 different angles, as described in Menzel et al. (2021b). This was necessary to achieve sufficiently resolved scattering patterns for a visual comparison with SAXS scattering patterns. The fiber orientations can also be extracted from the peak positions in the azimuthal profiles (cf. bottom Figure 2C), without taking the overall shape of the scattering patterns into account. Therefore, all other results were obtained from more time- and data-efficient angular SLI measurements in which the sample was illuminated from 24 different angles around a circle and the fiber orientations were derived from the peak positions in the resulting line profiles, as described in Menzel et al. (2021a).

      3) Concerning the quality of the co-registration: We thank the reviewer for this comment. We agree that the accuracy of image registration has a high impact on pixel/voxel-wise comparisons and determines the quality of our cross-validation study. We have added a new Discussion section “Quality of cross-validation” and inserted a new figure (Figure 4–figure supplement 1) to demonstrate the accuracy of image registration, both for the vervet and human brain samples: The reference and registered images are shown both in direct comparison (top and middle images, respectively) and as overlays (bottom images), as suggested by the reviewer. Reference and registered images show good correspondence (white/gray matter boundaries coincide). Only the fornix of the vervet brain section is not aligned (it moved when re-mounting the sample) so that this region was evaluated separately, as described in the manuscript. We found standard linear transformations (scaling, rotation, and translation) to be sufficient for achieving a fair comparison between the different modalities, demonstrating the experimental feasibility of our approach. There might still be individual voxels that were not sufficiently well aligned, especially when comparing sections (SLI/SAXS) to volumetric measurements (dMRI). However, this would only increase the angular differences between the fiber orientations. Our results can therefore be considered as an upper bound. Using standard linear transformations, we could already show that in-plane crossing orientations from SAXS and SLI, and through-plane orientations from SAXS and dMRI correspond very well to each other.

      We see the focus of our work as the cross-validation/evaluation of light and X-ray scattering, in comparison to the much longer established dMRI, rather than as a “diffusion MRI validation study”: the myelin-specific SAXS orientations and crossings were cross-validated with the high-resolution SLI orientations, and SLI out-of-plane fibers were validated using SAXS/dMRI as ground truth data.

      The reviewer rightly noted that we used a single analysis method to extract fiber orientations from dMRI data (based on the MRtrix3 dwi2response and dwi2fod commands, using the dhollander and msmt_csd algorithms, respectively). Although to our knowledge this method is one of the most widely used for deriving fiber orientations for subsequent tractography, it is true that other methods might yield different results and that we cannot draw conclusions for diffusion MRI in general. We have included these considerations in the newly named Discussion section “Comparison of SAXS and SLI fiber orientations to dMRI”.

      It is also true that our comparison focused on primary dMRI orientations without taking fiber density or dispersion into account. We decided to do so because deriving such metrics from SLI or SAXS data has not been implemented yet. However, we expect this to happen in the following years, enriching future studies. We have also included these aspects in the Discussion section.

      We agree with the reviewer that our paper could be strengthened by a more detailed discussion on the differences between the imaging modalities. We have added a paragraph to the new Discussion section “Quality of cross-validation”: We compared results from three different imaging techniques (SLI, SAXS, dMRI) which all have different signal-generating mechanisms and resolutions. The different resolutions should be taken into account when interpreting the comparative studies. To investigate the relationship between SLI peak distance and fiber inclination, we used dMRI/SAXS images with at least 50 times lower in-plane resolution as reference (Figure 6). This is sufficient to validate the theoretical predictions, but insufficient to validate individual pixel values. To validate crossing fiber orientations from SAXS, we used SLI images with 30 times higher in-plane resolution, leading to a broad distribution of angular differences (depending on the region), but the mean difference around zero is evidence for a good overall correspondence (Figure 4). Finally, when comparing fiber orientations in SAXS and SLI to dMRI (Figure 5), it should be taken into account that dMRI voxels (with 200µm size) contain more fiber layers than the corresponding SAXS or SLI voxels (with 80µm section thickness), so that dMRI voxels might include additional fiber populations not present in SAXS or SLI data. On the other hand, fiber orientations that occur both in dMRI and SAXS voxels – like the out-of-plane fiber orientations from SAXS and dMRI (e.g. Figure 6B-C) – can be considered as reliable, given the substantially different contrast-generating mechanisms.

      Finally, we thank the reviewer for the suggestion to study different b-values (last comment). We agree that an analysis based on different b-values might yield different results. In particular, an analysis with high b-values is expected to be more specific to the fiber orientations, as most other components of the signal would have already been attenuated. To investigate this hypothesis, we have run a separate analysis with high b-values only (5 and 10 ms/µm²) and added a new supplementary figure (Figure 5–figure supplement 4) that compares the results for all b-values to high b-values only. We found that the fiber orientation distributions are almost identical between all b-values and high b-values only.

      Reviewer #2 (Public Review):

      This work is a cross-validation of an x-ray tomography technique (SAXS) and an optical microscopy technique (SLI) for imaging axonal orientations ex vivo. These innovative methods were introduced in recent papers by the authors, who have teamed up here to compare them side-by-side on the same tissue samples for the first time. The two methods are both label-free (do not require staining) and they are quite complementary. SAXS can provide full 3D orientation measurements on intact tissue, but it operates at a mesoscopic resolution and requires access to a synchrotron. SLI can measure the orientations of multiple fascicles per voxel at a microscopic resolution and relies on more widely accessible equipment, but its accuracy suffers for fiber orientations perpendicular to the imaging plane and it requires tissue to be sectioned before it is imaged. Therefore it makes a lot of sense to explore the complementary strengths of these two techniques, and to use one to "fill in the blanks" of the other. The paper also compares the orientation measurements obtained with SAXS and SLI to those obtained with diffusion MRI. The latter provides only indirect measurements based on water diffusion, at a mesoscopic resolution somewhat lower than that of SAXS, but has the benefit of being feasible in vivo.

      A limitation of this study is that conclusions on the comparison between SAXS and SLI are drawn from only 2 sections of a partial monkey brain sample and 2 sections of a partial human brain sample. Conclusions on diffusion MRI are drawn only on the 2 human sample sections. This is particularly an issue for the comparison to diffusion MRI, as the diffusion MRI voxels are wider than the section thickness, hence one cannot preclude that any orientations detected with diffusion MRI but not with SAXS and SLI come from the portion of the voxel that is missing from the corresponding SAXS/SLI section.

      The stated aim of the paper is to provide a framework for combining the complementary benefits of SAXS and SLI, rather than simply presenting the results of a cross-validation study. This is a significant and ambitious aim. However, in order for this to serve as a framework, there would have to be clear prescriptions for how researchers interested in obtaining ground-truth measurements of axonal orientations would do so by using these two methods in tandem. This is not adequately developed in the paper in its present form. For example, the results show reasonable agreement between SAXS and SLI orientations when fibers lie within the SLI imaging plane and decreasing agreement for fibers with increasing through-plane inclination. How would the two methods be combined in voxels where they disagree? Would one use SLI orientations in voxels with fewer through-plane fibers and SAXS orientations in voxels with more through-plane fibers? How would voxels be assigned to each category? How would the orientation vectors from the two modalities be composed and how would the resolution difference between the two be handled? When the through-plane measurement of SLI is unreliable, is its in-plane measurement still reliable? That is if there were one mainly in-plane and one mainly through-plane fiber population, would the orientation of the former still be measured correctly by SLI? There is also considerable agreement reported here between through-plane orientations obtained with SAXS and diffusion MRI. Would this mean that diffusion MRI itself could be used to supplement SLI with through-plane orientations? Any clear set of prescriptions along these lines would represent a framework for imaging orientations by combining modalities. This, however, would require detailed steps for how to perform the combination and use the multi- vs. uni-modal framework to reconstruct connectional anatomy.

      A key advantage of SAXS is that it can be performed on intact samples, i.e., before any nonlinear distortions of the tissue are introduced by sectioning. Thus it can provide an undistorted reference, with contrast on axonal orientations that would be absent in, say, a structural MRI of comparable resolution. This contrast could be used to drive registration of the distorted SLI sections to an undistorted SAXS volume, and therefore is a key way in which the two techniques can complement each other. Here, however, this is not explored, as SAXS is performed after sectioning. It is not clear if this is the authors' prescription for how a combined SAXS/SLI framework would be implemented, or if it was done specifically for this study.

      First, it would seem that SAXS on the intact sample would be lower maintenance, requiring less setup time and hence potentially less overall beamtime than performing SAXS on each section separately. This would make it more practical for routine deployment beyond a few sections.

      Second, because the SAXS data are now nonlinearly distorted, they cannot be affinely aligned to the MRI volumes. While, in principle, performing both SAXS and SLI on the sections may facilitate the comparison between the two, having to unmount, rehydrate, and remount the sections in between may negate this advantage, as now there is no guarantee that SAXS and SLI can be affinely registered to each other. Here all these registration steps are performed affinely, so it is unclear to which extent the computed errors between modalities are characterizing the inherent limitations of the respective contrasts, or limitations of the registration technique. Some of the alignment is performed manually, for example, specific regions of the images are realigned by hand, and the slice of the diffusion MRI volume that is aligned to the SAXS/SLI sections is chosen by hand. Again, for this to serve as a framework that can be deployed on whole samples, there would have to be clear prescriptions for how to perform these steps robustly, how to ensure that the MRI can be acquired in a coordinate frame parallel to the sections, etc.

      Finally, the paper puts forth a general conclusion that diffusion MRI overestimates the number of fiber populations per voxel, on the basis of small ODF peaks appearing perpendicular to the main ODF peaks. Of all conclusions in the paper, this is the least convincingly supported by evidence. First, these small perpendicular peaks are a known artifact, which would be typically eliminated by ignoring ODF peaks below a certain amplitude, a common practice in diffusion tractography algorithms. The authors refrain from using an amplitude threshold, with the rationale that it may also remove true diffusion orientations. However, they apply a threshold when they detect SLI peaks (a rather stringent 8% of the maximum). Second, the explanation that these artifactual peaks may appear due to vessel walls is not convincing. Vasculature is sparse. A single vessel wall will not impact the diffusion signal in the same way as a bundle of parallel axons. In an axon bundle, water molecule displacements are restricted in all directions except parallel to the axons. A single vessel wall in a voxel will not have the same effect on displacements (which are much smaller than the size of the voxel). From Figure 5, it looks like there would be at most 1-2 of these vessels in a diffusion MRI voxel, and they would not be in all voxels. This cannot explain the widespread appearance of these small artifactual peaks. Third, many ODF reconstruction methods have parameters that can be adjusted to make these artifactual peaks more or less prominent. The default parameters may be optimal for in vivo but not ex vivo data, due to the effects of fixation. In light of these concerns, I would caution against making such a general statement about all diffusion MRI in the human brain, especially on the basis of a single diffusion reconstruction method applied to a single location in one brain.

      We sincerely thank Reviewer #2 for the constructive feedback, which helped us to significantly improve our manuscript. We hope to have done our best to address all concerns:

      First, regarding the limited number of tissue sections used for our study (second paragraph):

      It is true that we only evaluated a limited number of samples – mainly due to the limited beam time available for SAXS experiments. We believe that the main conclusions concerning the cross-validation of SAXS crossing fibers and SLI out-of-plane fibers still remain valid.

      The reviewer correctly points out that the dMRI voxels (with 200µm size) are wider than the section thickness (80µm) so that additional fiber orientations detected with dMRI might come from the portion of voxels missing in the corresponding SAXS/SLI measurement. We have added a clarifying paragraph in the newly named Discussion section “Comparison of SAXS and SLI fiber orientations to dMRI” as well as in the new Discussion section “Quality of cross-validation”. Nevertheless, we do not expect additional fiber orientations in comparable homogeneous regions like the corpus callosum, and fiber orientations that occur both in dMRI and SAXS/SLI – like the out-of-plane fiber orientations from dMRI and SAXS (e.g. Figure 6B-C) – can be considered as reliable, given the substantially different contrast mechanisms of the microscopy and dMRI techniques.

      Concerning the aim of our paper and the questions raised by the reviewer in the third paragraph:

      We understand that the term “framework” is not the appropriate word in this context, as it can raise false expectations. Our aim was rather to provide a basis (“groundwork”) to enable combined measurements of SLI/SAXS (and dMRI) on the same tissue samples and cross-validate the techniques (the crossing fiber orientations in SAXS and the through-plane fiber orientations in SLI have not been validated using other techniques so far). We have changed the wording throughout the manuscript, explaining that we focused on laying the “groundwork” instead of providing a “framework”, and reformulated the corresponding sentences.

      Our aspiration was to provide a protocol for how the complementary imaging techniques can be performed on the same tissue sample. When talking about a “combination” of techniques, we were referring to combined measurements (i.e. measurements on the same sample), and not to a combined analysis (e.g. in the form of combined parameter maps and fiber orientation vectors). The latter, while very much needed in the field, would require many more and heterogeneous samples, and work beyond the scope of this manuscript, which we hope to perform in the future. Along these lines, we have removed the term “combined” throughout the manuscript, and wrote e.g. “measurements of SLI and 3D-sSAXS on the same tissue sample” instead of “combined measurements of SLI and 3D-sSAXS” to avoid confusion.

      However, it is of course a valid question how SAXS and SLI can be combined in voxels where they disagree, how the orientation vectors can be composed, and how the resolution difference between the methods can be handled. We have added a new Discussion section “Towards a combination of SLI, SAXS, and dMRI” to elaborate on how a combined analysis (e.g. in form of combined fiber orientation maps) can be achieved and what challenges we are facing.

      Concerning the reviewer’s question if the orientation of an in-plane fiber population would be correctly measured by SLI if there was another through-plane fiber population: We only evaluated regions belonging to a single fiber population (SLI azimuthal profiles with one or two dominant peaks) and regions belonging to two in-plane crossing fiber populations (SLI azimuthal profiles with two dominant peak pairs). Voxels containing both in-plane and through-plane fibers were excluded from the analysis. The determined in-plane SLI orientations can thus be considered as reliable. We have added these aspects to the new Discussion section “Quality of cross-validation”.

      Regarding the reviewer’s question if dMRI itself could be used to supplement SLI with through-plane orientations: Diffusion MRI could indeed be used as a reference to enhance the interpretation of through-plane fiber orientations from SLI measurements. One disadvantage over SAXS is the lower resolution and that it cannot directly be performed on the same tissue section as SLI. These aspects have also been added to the new Discussion section.

      Concerning the reviewer’s suggestion to perform SAXS before sectioning and the problem of image registration (fourth paragraph):

      It is true that SAXS tensor tomography can be applied to larger tissue volumes and that it is not limited to tissue sections. However, the reconstruction of crossing fibers has so far only been realized in sections (Georgiadis et al., 2022) and not in intact samples. As we wanted to cross-validate these fiber crossings using SLI as reference, we decided to perform the SAXS measurements on the same tissue sections as the SLI measurements. A comparison to results from SAXS tensor tomography might still be interesting in the future. We have added these considerations to the new Discussion section “Towards a combination of SLI, SAXS, and dMRI”.

      It is also true that cutting a section from a brain tissue sample might introduce non-linear distortions; in particular, it is challenging to identify this particular section in the original tissue volume. Unmounting and remounting an already existing section introduces far fewer distortions. We have added a new figure (Figure 4–figure supplement 1) which shows that a co-registration with linear transformations (scaling, rotation, and translation) is already sufficient to allow for a fair comparison between the different image modalities, both for vervet and human brain samples. Only the fornix of the vervet brain section moved during remounting of the sample, and was therefore evaluated separately, as described in the manuscript. In any case, even if the angular differences in some image pixels were larger due to an imperfect co-registration, a perfect co-registration would only yield even smaller differences. Hence, the reported angular differences can be considered as an upper bound, demonstrating that SAXS and SLI fiber orientations already show a very good correspondence. We have added a corresponding paragraph to the new Discussion section “Quality of cross-validation”.

      Finally, we agree that a clear prescription would be necessary to enable combined analysis on whole tissue samples. As mentioned further above, our aim was to provide the groundwork for combined measurements on the same tissue sample and cross-validate the different techniques, and not to provide combined fiber orientation maps or similar. We have added our thoughts on how to combine the different image modalities to the new Discussion section “Towards a combination of SLI, SAXS, and dMRI”.

      Concerning the final concern of the reviewer that an overestimation of the number of fiber populations per voxel is not sufficiently supported (last paragraph):

      We understand this concern and have removed all phrases that could be understood as generalized claims for MRI, including any reference to an overestimation of fiber orientations. Furthermore, we have extended the Discussion to indicate the non-generalizability of our results.

      Regarding the first point that the minor perpendicular ODF peaks could be removed by applying a suitable amplitude threshold: This is a valid remark and was discussed partly in the first version of the manuscript, when referring to increasing the threshold of secondary lobes prior to running tractography algorithms and to the problem that it might decrease the sensitivity for the cases where there exist actual but less prominent secondary fiber populations. We have extended the Discussion to address the concerns of the reviewer.
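
      The trade-off mentioned above can be made concrete with a small sketch. It is illustrative only: the function `keep_peaks`, the relative thresholds, and the peak amplitudes are hypothetical and are not taken from our data or from any tractography package.

```python
import numpy as np


def keep_peaks(amplitudes, rel_threshold):
    """Mask of peaks with amplitude >= rel_threshold times the largest peak."""
    amps = np.asarray(amplitudes, dtype=float)
    return amps >= rel_threshold * amps.max()


# Hypothetical peak amplitudes in one voxel: primary lobe, a small
# perpendicular lobe (possible artifact), and a weaker true crossing.
amps = [1.0, 0.05, 0.3]

lenient = keep_peaks(amps, 0.1)  # drops only the smallest lobe
strict = keep_peaks(amps, 0.4)   # also drops the weaker crossing
print(lenient, strict)
```

      With the lenient threshold only the smallest lobe is discarded, whereas the stricter threshold also removes the weaker secondary population, which is the loss of sensitivity we were concerned about.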

      Regarding the second point that the minor ODF peaks are probably not caused by vessel walls: We thank the reviewer for the valid remarks and have removed all mentions of blood vessels in the manuscript, including the arrows in Figure 5H.

      Regarding the third point that parameters can be adjusted to make the artifactual peaks more/less prominent, and that default parameters might be optimal for in vivo but not ex vivo data: We have added the remark that model parameters can be fine-tuned to decrease the percentage of false-positives to the Discussion.

      Finally, it is true that we only used a single diffusion reconstruction method and measured only a single location in one human brain with dMRI. As mentioned at the very beginning, the number of samples was limited, and we included the reviewer’s concerns in the newly named Discussion section “Comparison of SAXS and SLI fiber orientations to dMRI”. For the main purposes of the paper like the cross-validation of out-of-plane fibers in SAXS/SLI, the dMRI data was still sufficient as we could show a good correspondence between dMRI/SAXS in these regions.

    1. Author Response

      Reviewer #1 (Public Review):

      The study tackles the topic of male harm (sexual selection favoring male reproductive strategies that incur a reduction of female fitness) from an interesting angle. The authors put emphasis on using wild-collected populations and studying them within their normal thermal range of reproductive conditions. Where previous studies have used temperature variation as a proxy for stressful environmental change, this approach should instead clarify what can be the role of male harm on female fitness in natural conditions. A minor caveat regarding this point is the fact the polygamy treatment also has a heavily male-biased sex ratio (3:1). The authors argue that this sex ratio is within the range of normal variation in that species, but it is likely that the average is still (1:1) in natural populations and using a male-biased sex ratio could magnify the intensity of male harm. This does not undermine the conclusions regarding the temperature sensitivity of sexual conflict but should be acknowledged.

      The authors find that varying temperature within a range found in natural conditions affects the reproductive interactions between males and females, particularly through male-harm mechanisms. Male harm, measured as a reduction in lifetime reproductive success (LRS) from monogamy to polygamy settings is present at 20°C, stronger at 24°C, and absent or undetectable at 28°C. Female senescence is always faster in the polygamy mating systems as compared to monogamy, but the effect appears strongest at 20°C. Mating behaviors of males and females in these different settings are used to attempt to uncover underlying mechanisms of the sensitivity of male harm to temperature.

      A weakness of the manuscript in its current form is the lack of clarity about the experimental design, which makes understanding the results a long and involved procedure, even for someone who is familiar with the field. If the authors consider revising the manuscript, I suggest giving a better overview of the experimental design(s) earlier in the manuscript, perhaps supported by a diagram or flowchart. I also suggest structuring the results better to aid the reader (e.g., make clearer distinctions between results that come from the different experiments). Finally, some additional figures and statistical tests corrected for multiple testing would help get a better feel of some aspects of the dataset.

      I believe that the conclusions are generally justified and the results overall convincing. Overall, this is an impressive study with a lot of dimensions to it. Its complexity is a challenge and may require additional effort from the authors to make it easier to access. The core of the question is answered by LRS measures, but the authors have also provided a wealth of behavioral data as well as other fitness components. The manuscript could be greatly improved by putting more effort into linking the different metrics together to track down potential mechanisms for the observed variation in male-harm-induced reduction in female LRS. The discussion would also benefit from considering the female side of the sexual conflict coevolution arms race.

      We are thankful for the nice words and constructive appraisal of our work. As stated above, reviews like this are extraordinarily helpful. The reviewer mentions four main points that we have addressed:

      1. We now expand a bit on the justification to use a (3:1) male-biased sex ratio in the methods section (lines 150-155). We also acknowledge potential limitations of this design in the discussion (lines 563-571).
      2. To clarify the methods, we have placed this section before the results. This, in itself, has significantly improved the clarity of the manuscript. We have also substantially re-written the methods and results (including adding some tables) to streamline the text while providing all the necessary details, and have also included several diagrams to illustrate all our experiments (in the SM, see Figs. S1.1 to S1.5) along with a general schematic figure of the general design that we present early on in the main text (in the introduction, see Fig. 1).
      3. As suggested, we have re-run all analyses using the Benjamini-Hochberg procedure in order to correct for inflation of type I error rate due to multiple testing. We have also included in the SM a complementary set of models that also test for this via post hoc Tukey contrasts. Both of these approaches corroborate our initial findings, and thus contribute to strengthening our results.
      4. We now explicitly discuss the female side of things in the discussion (lines 636-647).
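
      For readers unfamiliar with the Benjamini-Hochberg procedure mentioned in point 3, the following is a minimal sketch of how adjusted p-values can be computed. It is not the code used for our analyses: the function name `benjamini_hochberg`, the significance level of 0.05, and the example p-values are assumptions chosen for illustration.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return BH-adjusted p-values and a mask of rejected hypotheses."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted_sorted = np.clip(adjusted_sorted, 0.0, 1.0)
    # Put the adjusted values back in the original order.
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted, adjusted <= alpha


# Hypothetical raw p-values from four tests.
adjusted, rejected = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adjusted, rejected)
```

      In this toy case only the smallest p-value remains below 0.05 after adjustment, which shows how the procedure guards against false positives when several tests are run on the same dataset.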

      Reviewer #2 (Public Review):

      Londoño-Nieto et al. investigated the influence of temperature on the form and intensity of sexual conflict in Drosophila melanogaster. They aimed to test the effect of naturally occurring temperature fluctuations on a wild population of Drosophila while disentangling pre- and postcopulatory episodes of sexual conflict. To this end, they exposed females to males under monogamy or polyandry, hence manipulating the degree of male harm experienced by females. The effect of temperature was explored by exposing these groups to 20, 24, or 28°C. They found that female fitness suffered from male harm most at 24°C and less at the other two temperatures. Interestingly, pre- and postcopulatory episodes of sexual conflict were affected differently by temperature. Overall, these data suggest that the relationship between sexual conflict and temperature can be strong and complex. Hence, these results can have important implications for the impact of sexual conflict on population viability, especially in light of the climate crisis.

      We want to thank the reviewer for the time invested in reading and reviewing our work. We are glad to read that the reviewer found our results interesting and considered our study to be of importance to the field.

      This paper tackles a highly relevant question using an established model organism for sexual conflict and contains a rich dataset obtained using a series of carefully planned experiments and analysed in an appropriate way. Importantly, the authors used biologically meaningful temperatures and mating treatments, which increases the relevance of the data. The main conclusions are well supported by the data. Nevertheless, the devil is in the detail, and given the way the authors frame their study (i.e. testing a natural population under naturally occurring temperature fluctuations) and their results (i.e. sexual conflict is buffered by temperature effects in the wild) there are some limitations to be considered:

      We appreciate the positive feedback! The reviewer identified potential limitations and made good suggestions that have only served to improve our manuscript considerably, for which we are very grateful. Details follow on how we have dealt with each specific comment.

      1) The authors frame their study as addressing the question of how sexual conflict reacts to naturally occurring temperature fluctuations in the wild. Nevertheless, the population used in this experiment had been kept for nearly 3 years in the laboratory prior to the experiment. Importantly, the authors ensured that the laboratory population maintained genetic diversity, by regularly crossing wild lines into it. Nevertheless, this population remained for some time in the laboratory under standardized conditions. The applied temperature fluctuations are in a biologically meaningful range (though only during the reproductive season), but it remains unclear if the applied fluctuations were in a standardized way (i.e. pre-programmed) or included random fluctuations (i.e. a more natural setting). This laboratory setup has certainly clear advantages, for example, it enables the exclusion of any effects other than the temperature on sexual conflict. Nevertheless, how these will then ultimately play out in the wild could be a different story.

      Agree. We clarify now that we meant pre-programmed fluctuations and acknowledge this limitation in the methods (lines 124-131).

      2) The authors highlight clearly that temperature fluctuations in the wild might play an important part in how sexual conflict plays out in natural populations. This very interesting and highly relevant point might lead the reader to assume that this is what was actually tested in the experiment. Nevertheless, in the experiments, different constant temperatures were applied to the flies, while only the stock population was kept at a fluctuating temperature regime. Hence, the influence of fluctuations during episodes of sexual conflict remains untested. While the present data show that sexual conflict can be modulated by temperature, the effect of naturally occurring fluctuations on the net cost of sexual conflict to a population remains unclear.

      Again, a fair point that we acknowledge in the current version (lines 571-575). “Second, our treatment temperatures were stable, designed to study how coarse-grain changes in temperature across the adult lifespan of flies may influence how sexual conflict unfolds in nature. Thus, future studies will need to encompass how fine-grained fluctuation (i.e., repeated variation of temperature across an adult’s lifespan) may affect male harm for a more comprehensive picture of temperature effects on sexual conflict in the wild”.

      3) The authors conclude that the effect of sexual conflict can be buffered by temperature in the wild. In general, I agree with this, although a more conservative way of framing this would be to say that temperature modulates or moderates sexual conflict instead of buffers it. If there really is a buffering effect of temperature in the wild remains to be tested, I believe. This will depend on how actual changes in temperature affect this dynamic (see point 2). In addition, I think another interesting open question is what the mechanism behind the observed differences might be. Are male and female interests really more aligned at different temperatures (i.e. males plastically reduce harm)? This would really buffer the harm of sexual conflict at those temperatures. Nevertheless, alternatively, males might not be perfectly adapted to manipulate the female optimally at lower or higher temperatures. This would mean that if the temperatures change, males might evolve to increase the manipulation of females, and hence the scope for sexual conflict might not change in the end under this scenario. Nevertheless, as the authors themselves state: 'An intriguing possibility is thus that SFPs are more effective at lowering female re-mating rates at warm temperatures, thereby buffering these costs.' Therefore, a temperature-dependent increase in the effectiveness of male manipulation might counterintuitively reduce sexual conflict in this species.

      We echo both points in the current version of the paper (see lines 633-655).

      4) In the end the authors argue that the climate crisis might have 'unexpected positive consequences via its effect on male harm'. Sexual conflict is indeed widespread, but it takes many different forms (as has been nicely described in the introduction of this paper). Because the studied system seems to be quite a specific example, it is questionable how far spread this phenomenon is in nature. In addition, it remains unclear how male harm will evolve in response to the climate crisis (see point 3). Finally, the relative fitness of females increased in the present experiment, as the tested range was within the reproductive optimum of the species. Nevertheless, the relative importance of the positive effect of sexual conflict on fitness outside of optimal temperatures seems questionable.

      Agree. Altogether, we have tried to tone down our conclusions regarding the implication of our results for a climate change scenario, and acknowledge all the points highlighted by the reviewer in the current version of the manuscript (see lines 563-575).

      Nonetheless, I believe these results to be of exceeding interest to the scientific community and of importance to the field. It opens up many potential research directions and adds further data to the fascinating field of sexual conflict, SFPs, and male harm in Drosophila.

      We are thrilled to read that the reviewer found our study of exceeding interest.

      Reviewer #3 (Public Review):

      In this paper, the authors explore the effects of the environment, specifically temperature, on male harm to females. Male harm is the phenomenon where males reduce female fitness in polyandrous systems, where a single female may mate with multiple males. The selection of males to increase their reproductive success in male-male competition can lead to genetic conflict that increases male fitness at the expense of female fitness. Typically, male harm has been studied in single environments under optimal conditions. However, there is an increasing focus on the effect of the environment on fitness costs of male harm to females, as a way to better understand the effect of male harm on population fitness in more realistic ecological contexts. In this paper, the authors add to these studies by exploring the effect of temperature on male harm and female fitness, using the fruit fly Drosophila melanogaster, as a model system. They find that temperature affects the impact of male harm on female fitness, with male harm having the greatest effect at 24˚C relative to 20˚C and 28˚C. The authors then go on to disentangle how temperature affects the various components of male harm that impact female fitness (e.g. harassment, ejaculate toxicity). The paper demonstrates that male harm depends on ecological context, which has implications for understanding its impact on population fitness under realistic ecological scenarios, particularly with respect to climate change.

      The strength of the paper is that it demonstrates that male harm (presented as differences in female life reproductive success between monogamous and polyandrous matings) changes with temperature. The authors dissect this general observation by showing that different aspects of precopulatory reproductive behavior, for example, male-male aggression, copulation rate, and female rejection rate, also change with temperature. Further, they demonstrate that correlates for male ejaculate quality also change with temperature, suggesting that temperature also affects postcopulatory mechanisms of male harm.

      The weakness of the paper is that the method and results section are difficult to follow, which negatively impacts the interpretation of the data. The experiments are complex and need to be for what the authors are studying. Nevertheless, the paper is written in a way that makes it challenging for the reader to fully understand how precisely the experiments were conducted. Further, the authors do not explain clearly how some of the experiments relate to the phenomenon ostensibly being assayed. For example, a more detailed explanation of why mating duration and remating latency are assays for ejaculate quality in the context of sperm competition would be very helpful in interpreting the data. Further, a clearer explanation of the statistical analyses conducted

      Thank you for the positive, detailed and constructive review. We agree with all the weaknesses laid out and we have strived to address all of them in the current version. This includes a major rearrangement, restructuring and re-write of the methods and results sections, and extra statistical analyses. Please find the details below.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors generated a detailed single-cell RNAseq dataset for the microfilariae stage of the human nematode parasite Brugia malayi. This is an impressive and important achievement, given that it is difficult to obtain sufficient material from human parasites and the microfilariae are protected by a chitin sheath. The authors collected microfilariae from jirds and carefully worked out a protocol of digestion, dissociation and filtering, to obtain single-cell material for sequencing.

      The single-cell resource was complemented with a dataset derived from FACS-sorted large secretory cells, allowing the identification of several specific proteins expressed in this unique microfilarial cell-type important for immune evasion.

      The authors also generated new data for secretory cells of Caenorhabditis elegans and concluded that there is limited similarity between the composition of Brugia and C. elegans secretory cell types.

      In a further set of experiments, the authors analysed gene expression changes in dissociated Brugia cells in response to the commonly used anthelminthic drug ivermectin. This revealed specific gene expression changes across various cell types, providing new insights into how the drug affects the parasite.

      Finally, the authors developed a method to keep dissociated Brugia cells alive in culture for two days. This method will aid cellular studies of this parasite.

      The authors may want to explore the new resource in more detail to reach more specific biological conclusions. For example, the authors mention that the large secretory cells are critical to parasite survival and immune evasion. With a more complete list of genes expressed in these cells the authors could try to reach more specific conclusions or predictions. Are there newly identified secreted factors that could contribute to immune evasion? It would be important to read in more detail about such proteins (including an analysis of the sequences and phylogenies), especially if the authors could identify new candidates as potential vaccine or diagnostic targets. Likewise, can the data be used to understand in more detail the mechanism of immune evasion or ivermectin action?

      Thank you for this comment. We have since added a source data file with the list of secretory cell DEGs along with gene ontology (GO) analysis. We have added a main figure to the revised manuscript that takes a deeper look at transcripts enriched in the secretory cell compared to other annotated cell types. Lastly, we included a deeper look at the paralogous expansion of C2H2 transcription factors that localize near exclusively to the secretory cell. This family of transcription factors is diverse and the significant presence in the secretory cell may play a role in adapting to varying host environmental conditions or in the expression of proteins contributing to immune evasion. Our single-cell data show specific transcriptional shifts in cells expressing putative IVM targets and recapitulate changes identified in whole-parasite drug exposure experiments and highlight the importance of cell connectivity to the in vitro phenotype. These supplemental analyses of the secretory cell will seed future lines of investigation about secretion and aid in further dissecting anthelmintic mode of action.

      The authors searched for known secreted proteins, including antigens, vaccine targets, and diagnostic markers and mapped the expression of these to the single-cell atlas. It is not clear from the paper how comprehensive previous studies to identify secretory proteins were. With the new resource in hand, the authors could look at all secreted proteins (with a signal peptide) expressed in the ES and other cells. The paper would benefit from a more comprehensive overview of the classes of secretory proteins and their expression.

      Thank you for this suggestion. We have completed a computational prediction of signal peptides in differentially expressed secretory cell transcripts (Figure 4) and show that although signal peptide-containing sequences are enriched in the secretory cell compared to other cell types, less than half of the proteins identified contained signal peptide sequences.

      This was unsurprising as most secreted proteins identified in the literature (diagnostic and vaccine targets) do not have signal peptides. The routes of exit for these prominent circulating targets remain murky. We also carried out transmembrane prediction on protein-coding genes that are differentially expressed in the secretory cell (and other cell types) and note that some of these are established components of exosome-like vesicles, emerging as important players in host modulation. This additional analysis has been added to a new figure (Figure 4) and the accompanying results section.

      The authors show that an abundance of C2H2 transcription factors is localizing almost exclusively to the secretory cells. It would be useful to see a classification of these proteins and phylogenetic analysis relating them to C2H2 from C. elegans and other animals.

      The C. elegans genome contains 106 annotated C2H2 zinc finger transcription factors. Based on a reverse phylogenetic approach, we identified a total of 241 orthologous C2H2 zinc finger transcription factors in B. malayi, many of which exhibit strong and/or exclusive expression in the secretory cell. This analysis has been added to an additional figure (Figure 4) describing the secretory cell in more depth alongside signal peptide and transmembrane domain analysis of differentially expressed genes in the secretory cell compared to other identified major cell types.

      In general, a more detailed bioinformatic analysis of secretory products and more discussions of potential functions (e.g. serpins etc.) would make the paper more interesting and could stimulate more mechanistic thinking.

    1. Author Response

      Reviewer #1 (Public Review):

      Tunneling nanotubes, contrary to exosomes, directly connect remote cells and have been shown to allow the transfer of material between cells, including cellular organelles and RNAs. However, whether sorting mechanisms exist that allow to specifically transfer subspecies of RNAs, especially of mRNA, has not been shown, and the transcriptional consequences of RNA transfer have not been addressed yet.

      Using cocultures (or mix or single cultures as controls) of human MCF7 breast cancer cell line, and immortalized mouse embryo fibroblasts (MEFs), followed by separation of human and mouse cells by cell sorting, the authors performed deep sequencing of the human mRNAs detected in mouse cells. An accurate analysis of the transferred material shows that all donor cell mRNAs transfer in a manner that correlates with their expression level, with less than 1% of total mRNA being transferred in acceptor cells.

      These results show that the process of RNA transfer is nonselective and that the consequences on the cells receiving the RNAs should depend on the phenotype of the sending cells.

      Although we did not address this last point in the original paper, we concur with this statement, since we presented evidence to this effect in our previous publication (Haimovich et al., 2017), which we discussed in the original Discussion section (lines 498-508 in the original manuscript; lines 529-539 in the revised manuscript). We have now amended the Introduction (line 91 of the modified manuscript) to reflect this idea.

      These results are complemented by the last part of the manuscript where the authors convincingly show that the coculture of the two cell lines results in significant transcriptomic changes in acceptor MEF cells that could become CAF-like cells.

      Reviewer #2 (Public Review):

      In this manuscript, the authors characterize the extent of RNA transfer between cells in culture, with an emphasis on trying to identify RNAs that are transferred through tunneling nanotubes (TNTs). They use an in vitro human-mouse cell co-culture model, consisting of mouse embryonic fibroblasts and human MCF7 breast cancer cells. They take advantage of the CD326 cell surface molecule, which is specifically expressed on MCF7 cells, to separate the two cell populations using magnetic beads conjugated to anti-CD326 antibodies, followed by deep sequencing to identify human RNAs present in mouse cells. They identify many 'transferred' RNAs. Further analysis of sequencing data together with experiments using synthetic reporters indicate that RNA transfer is non-selective, that the amount of transfer strongly correlates with the level of expression in donor cells, and does not appear to require specific RNA motifs. The authors also note that co-culture with MCF7 cells leads to significant changes in the MEF transcriptome.

      The experiments are overall carefully designed, and the data are clearly and quite carefully presented to point out limitations in interpretation and to distinguish speculations from experimental conclusions.

      We thank the reviewer for this comment.

      It should however be kept in mind that it is unclear to what extent these limitations influence the conclusions reached. For example, the identification of transferred RNAs relies on the purity of the isolated cell populations and, while the authors provide some supporting evidence for this, nevertheless potential caveats remain. For instance, the isolated MEF samples used for analysis appear to lack single MCF7 cells, but still contain components, labeled as 'double stained' and 'unstained' cells, which are uncharacterized. The authors present some arguments as to why these would not contribute to 'transferred' reads, but given the low level of detectable transferred RNAs, and the unclear origin of these components, whether they influence the results could be debatable.

      It is unlikely that these populations contributed to the human mRNA signals in the MEFs, since the percentage of these populations was substantially higher in the “Mix” samples than in the “Co-culture” samples. We now added the following text (lines 174-181 in the revised manuscript) which clarifies this point: “In addition, we found small sub-populations of double-stained and unstained cells within the purified populations that we suspect are mostly MEFs (see Methods). These sub-populations were greater in the Mix-derived MEFs vs. the Co-culture-derived MEFs (i.e. 0.08% and 0.03% double-stained, and 2.8% and 2.67% unstained in Mix samples vs. 0% and 0.03% double-stained, and 1% and 1.9% unstained in the Co-culture samples). As a consequence, if these double-stained and unstained cells had contributed to the background of human reads in the MEFs, we would’ve expected to have many more human reads in the Mix-derived MEFs.” However, this was not the case, rather we observed a 6.6-fold increase in human RNA presence in the Co-culture-derived MEFs (versus that in the Mix-derived MEFs) after subtraction of the single culture background. In addition, we note that the level of detectable human RNAs in the MEFs is not low, rather it is the percentage of human RNA that undergoes transfer that is low.
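      The comparison described above (human RNA signal in Co-culture-derived versus Mix-derived MEFs, each after subtraction of the single-culture background) amounts to a simple ratio. The sketch below only illustrates that arithmetic: the function name `background_subtracted_fold` and the input numbers are hypothetical and are not values from the manuscript.

```python
def background_subtracted_fold(coculture, mix, single):
    """Fold change of co-culture over mix after removing background.

    All three inputs are human-read signals measured in mouse cells;
    `single` is the single-culture background. Hypothetical helper.
    """
    signal_coculture = coculture - single
    signal_mix = mix - single
    if signal_mix <= 0:
        # No signal above background in the mix control: ratio undefined.
        raise ValueError("mix signal does not exceed background")
    return signal_coculture / signal_mix


# Made-up read counts, for illustration only.
fold = background_subtracted_fold(coculture=50, mix=14, single=2)
print(fold)
```

      Subtracting the same background from both conditions before taking the ratio keeps reads that appear even in single cultures from deflating the apparent fold change.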

      Furthermore, the small number of replicates (2 replicates for the genome-wide studies and 1 replicate for most of the subsequent experiments) minimizes the confidence in the conclusions.

      We apologize for not stating clearly that the smFISH, RT-qPCR, and quadrapod experiments were all performed in 2 replicates. This information has now been added to the figure legends.

      In this context, it is also notable that the profile of transferred RNAs between the two replicates of co-cultured samples appears quite different by PCA analysis. It is thus conceivable that there might be specificity in the RNA 'transferome', influenced by unknown experimental variables, which is though masked when averaging those samples in subsequent analyses.

      We have replied to Reviewer #1 on this issue. PCA analysis (Figure 2B) of the heat map data (Figure 2A) reveals the similarity between the different samples, whereby 78% of the variability in the data is revealed by PC1 and 6.7% by PC2. Given that PC2 measures only 6.7% of the variation in the data, it likely results from small differences in the individual co-culture samples (such differences are often observed within replicas of RNA-seq experiments) and not via major differences in the measured transferomes. This indicates that the co-culture samples were overall quite similar as can also be observed from the heat map shown in Figure 2A, as differentiated from the controls (e.g. Mix, Single culture). Thus, we do not believe that further replicas will greatly change the results showing the abundant presence of human RNAs in the mouse cells after subtraction of the Mix background. We included additional sentences in the text and figure legend to clarify this point (lines 208-212 in the revised manuscript).

      While the manuscript emphasizes the role of TNTs in RNA transfer, the actual involvement of TNTs relies solely on the observation that potential TNTs form between co-cultured cells. Other means of transfer, such as through engulfment or phagocytosis of cell fragments, could still possibly contribute.

      While it is possible that transfer might occur through other means, our earlier paper (Haimovich et al., 2017) showed that engulfed apoptotic bodies rarely contribute to mRNA transfer, even upon near-100% of donor cell death. Moreover, RNAs in apoptotic bodies found in acceptor cells can be clearly identified by smFISH, as the RNAs are tightly clumped together. Likewise, our quadrapod experiments (Figure 6-figure supplement 1) might have revealed RNA transfer if engulfment of cell fragments had occurred.

      Furthermore, the dependence of mRNA transfer on direct cell-to-cell contact is demonstrated for 5 RNAs and extrapolated to transcriptome-wide RNA transfer, an assumption which might, or might not, be valid.

      We concur that we extrapolate from the few validated examples and have now added the following text (lines 604-611 in the revised manuscript): “We validated several examples of transferred mRNAs that transfer via a contact-dependent mechanism, likely TNTs (Figure 6 and Figure 6-figure supplements 1 and 3), and extrapolate from them to the entire transcriptome. Although it is possible that some or many mRNAs transfer by means other than TNTs, we think it unlikely, since the results on TNT-mediated cell-to-cell transfer in both this and our previous publication (Haimovich, 2017), as well as by others (Ortin-Martinez et al., 2021; Su and Igyarto, 2019), tested a variety of mRNAs from different families and which localize to various sub-cellular localizations. This indicates that the pathway we have uncovered is more general than the few examples presented here.” In addition, we now cite in the Discussion (lines 611-621 in the revised manuscript) a new pre-print recently posted to bioRxiv that shows similar results of mRNA transfer in a human-mouse cell co-culture model.

      Finally, the results on gene expression changes induced by co-culture (Figures 7, 8) are of unclear relevance. As the authors point out, it is uncertain whether RNA transfer or other paracrine or adhesion-mediated signaling events, underlie these changes. It is therefore not easy to see how these results relate to the rest of the presented work. Furthermore, while the authors expand on the potential significance of changes observed in genes related to cancer-associated fibroblasts or to immunity-related genes, these remain speculative and untested.

      We concur that the part of the paper regarding the consequences of co-culture (upon the endogenous transcriptome) does not clarify the specific contribution of the “transferome” to the phenomenon. Future co-culture studies measuring transcriptome-wide transfer using the quadrapod co-culture system versus cell-cell contact co-culture could be performed. Yet, to make the distinction between TNT-dependent and -independent effects when cells are in contact will require further mechanistic knowledge of TNT-mediated mRNA transfer, which is beyond the scope of this paper. Nevertheless, we believe that the data on the endogenous gene expression in co-culture is important and could be useful to the cancer research community outside the context of the transferome information.

      Overall, the manuscript presents evidence indicating that RNA is transferred non-selectively in co-cultured cells, under specific conditions and between the cell types tested. The impact of the work is reduced by the lack of mechanistic understanding underlying this transfer and the uncertainty of whether this phenomenon has any subsequent physiological relevance.

      Our global analysis of TNT-mediated transfer (the transferome) is only a second step towards understanding this important and only recently identified process (i.e. the first step). Obviously, we would be happy to gain more mechanistic insight and knowledge of physiological relevance. We are currently working on several projects to try and answer some of these questions, but as one can understand, these are technically challenging, and have not yet come to fruition.

    1. Author Response

      Reviewer #1 (Public Review):

      The human genetic variant Dantu increases the surface tension of red blood cells making it hard for malaria parasites to invade. This was shown beautifully by Kariuki et al in 2020 (doi.org/10.1038/s41586-020-2726-6) by analysing blood from children using in vitro assays with cultured malaria parasites. Now Kariuki et al show that parasite growth is indeed restricted in vivo by infecting Dantu adults under controlled conditions with cryopreserved Plasmodium falciparum sporozoites and analysing parasite growth by qPCR. The authors compare parasite growth, peak parasitaemia and if / when treatment was sought for malaria symptoms between non-Dantu (111) and Dantu heterozygous (27) and homozygous (3) participants. Dantu either completely prevented malaria parasite detection in the blood (for 21 days) or slowed down parasite growth considerably.

      The authors present compelling in vivo evidence that Dantu conveys protection by preventing malaria parasites from establishing a blood-stage infection. Because the effect on parasite growth is crystal clear, the link to uncomplicated malaria follows: no or fewer parasites leads to fewer participants experiencing malaria symptoms and seeking treatment. It should however be noted that the paper does not show that Dantu reduces symptomatology at identical parasite densities to non-Dantu. Its protective effect seems to be purely parasitological.

      Given that all volunteers were exposed to malaria prior to being experimentally infected (in various transmission settings ranging from low to high) the authors state that they adjusted for factors like schizont antibody concentration in their multi-variate analysis. More details on the assumptions and which dependent / independent variables were included would benefit interpretation. It would be also good to see if Dantu individuals were spread homogeneously across all transmission settings - if e.g. they all had history of intense malaria exposure and thus strong pre-existing anti-malaria immunity this might account in part for reduced parasite growth when compared to non-Dantu from lower transmission settings. Being able to de-convolute the effect of pre-existing immunity from Dantu would strengthen the paper.

      Thank you for the positive feedback and summary of the key findings. We absolutely agree that breaking down the impact of Dantu genotype by transmission would have been very interesting, but the sample numbers for some of the genotypic groups were simply too small to make stratification by area of residence meaningful. Instead, to address the core issue of whether prior immunity is a complicating factor in our analysis, we used measurements of antibodies to whole schizont extract as a proxy indicator of transmission setting or “malaria exposure” in our multivariate analyses. There was no difference in anti-schizont antibody levels across Dantu genotype groups – these data are now included in Figure 3 – figure supplement 1, as requested. This suggests that differences in pre-existing anti-malaria immunity between Dantu and non-Dantu cannot explain the differences seen in our current study. Regarding the comment about assumptions and variables in the multivariate analysis, we have added more details as requested, as outlined in further detail in subsequent points below.

      The authors also present data on other red cell polymorphisms known to modulate malaria infection and improve outcome: G6PD, blood group O, alpha thalassaemia and ATP2B4. However, no statistically significant differences between non-carriers and hetero/homozygous individuals were observed. This is probably because these mutations exert their effect not directly on parasite growth but modulate disease symptoms when parasite burden is high, which cannot be investigated in controlled human malaria infection settings as ethical considerations mandate treatment of all volunteers at parasite densities >500 parasites/µl or any parasitaemia with symptoms. Controlled infections need to be complemented with other methods to understand the protective impact of genetic polymorphisms.

      We thank the reviewer for this helpful observation with which we completely agree. To acknowledge this issue, we have added some consideration of this point to the Discussion section of the revised manuscript, within the sub-section that discusses protective mechanisms of other red cell polymorphisms on page 14.

    1. Author Response

      Reviewer #2 (Public Review):

      Despite high bone mineral density, increased fracture risk has been associated with T2D in humans. In this study, the authors established a model that could mimic some aspects of T2D in mice and then study bone turnover and metabolism in detail.

      Strengths

      This is an exciting study, the methods are detailed and well done, and the results are presented coherently and support the conclusions.

      Previous work from Dr. Long's group over this last decade has established a requirement for glycolysis in osteoblast differentiation. They showed the requirement for glycolysis not only for the anabolic action of PTH but also as an effector downstream of Wnt signaling. Using the T2D mouse model they have generated, they test if manipulating glycolysis and oxidative phosphorylation can rescue some of the detrimental effects on bone in this model. They use several novel approaches, including glucose-labeling studies, which are relatively underutilized and provide some insight into a defective TCA cycle. They also utilize BMSCs that have been sorted for performing single-cell sequencing studies to identify specific populations modified with T2D. Unfortunately, the results are modest and need some clarification on what these populations add to the story.

      We appreciate the positive comments. Although T2D had only modest effect on the relative pool size of each cell population, the changes in metabolic pathways (glycolysis and oxphos) in several clusters were notable and provided support to the central notion that T2D altered cellular metabolism in osteoblast-lineage and other bone marrow cells.

      The authors use two approaches: a drug (Metformin) and a number of mouse genetic models to over-express genes involved in the glycolytic pathway using Dox inducible models. The results with overexpressing HIF1 and PFKFB3 show a potential rescue of bone defects with T2D, and Glut1 overexpression does not rescue T2D-induced bone loss.

      Concerns

      The authors have generated several overexpression models to manipulate the glycolytic pathway to rescue T2D-induced bone loss. The use of DOX in drinking water has been shown to affect mitochondrial metabolism. Did the authors control for these effects? Since both groups of mice received DOX in drinking water, there is an internal control.

      The experiments were controlled for any potential effects of DOX per se as all animals were subjected to the same DOX regimen.

      Only one of the rescue experiments had a control with the chow diet. Some studies have shown a high-fat diet to be protective against bone loss in T1D models.

      We have now added the chow diet control for the Hif1a rescue experiment as well (Fig. 7).

      The use of metformin to correct metabolic dysfunction and, thereby, bone mass is an exciting result. Did the authors test whether this phenotype was rescued through a reduction in ROS levels? The decrease in OxPhos seen in the Seahorse experiments suggests there could be mitochondrial dysfunction, which is often associated with ROS generation.

      We appreciate the reviewer’s insight here. We have not examined ROS levels but agree that changes in ROS levels could potentially contribute to the bone phenotype in diabetes.

      All of the experiments used male mice (because of STZ use and the ease of T2D establishment in males). It would be better if this were made clear in the title.

      The title has been revised to specify male mice.

      Does the T2D model presented really represent what is observed in humans? Some experiments to test the other factors implicated in T2D and whether those are modulated in the rescue experiments might help address this.

      Our T2D model exhibited all typical features of T2D patients, including obesity, glucose intolerance and insulin resistance. We have shown that metformin modestly improved glucose tolerance and insulin sensitivity in the T2D mice (Fig. 6C, E). We have not examined whether those global metabolic features were modulated in the genetic rescue experiments, which targeted only osteoblasts.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper establishes a strong case for the post-translational modification of C/EBPalpha to play a strong role in its effects, in this case, to promote macrophage differentiation in collaboration with PU.1. The cellular system being used for most of the experiments here takes advantage of the dual roles of PU.1 in B cells, which normally do not express C/EBP family factors, and in myeloid cells, which normally do express C/EBP family factors. The authors and others have previously shown that PU.1 and C/EBPalpha are very powerful collaborators, both needed to establish a macrophage identity. Thus, the title of the paper provocatively implies that the C/EBP modification that keeps it from being methylated on Arg35 works by increasing the re-distribution of PU.1 from B cells to myeloid gene sites in combination with C/EBP. Indeed, the authors show proximity ligation data to show that PU.1-C/EBPalpha juxtaposition is more frequent in the nucleus if C/EBPalpha cannot be Arg-methylated. The paper also shows careful and thorough characterization of the B to myeloid lineage conversion gene expression changes and the mapping of the Arg residues in C/EBPalpha that are most important to keep demethylation. Similarly, the paper provides strong evidence that it is Carm1, and not another protein arginine methyltransferase, that is responsible for the regulatory modification. This is a valuable and well-characterized demonstration of a mechanism that should be considered more generally as a regulator of transcription factor action.

      The mechanism proposed by the authors is that C/EBPalpha relocates PU.1 to macrophage sites and that C/EBPalpha R35A binds and relocates PU.1 more efficiently than wildtype, and this seems likely and appealing. However, it is not as strongly supported by data within the paper itself as the other points in the paper are. There is a puzzling gap in the data: no direct evidence is shown that C/EBPalpha is really relocating PU.1 from B cell to macrophage regulatory elements at all. Despite the figure titles (Fig. 4 and Fig. S4), there is no ChIP-seq data to show PU.1 binding sites before and after interaction with either wildtype or R35A mutant C/EBPalpha, just accessibility data. There is also a question of whether such a redistribution would occur fast enough to account for the impressive speed of the R35A mutant's other effects. These questions seem fairly straightforward to address. If relevant data could be added, it would greatly increase the impact and generality of the paper. The paper could be published with this claim converted to a suggestion, based on the current data, or it could be published in a higher-impact form if additional data could be provided to demonstrate the relocation more directly. The authors would be more expert about the logistics of the experiment, but it seems that a direct ChIP-seq-based comparison should be feasible and powerful for the argument of the paper.

      We have now included PU.1 and C/EBPa ChIP-seq experiments, using C/EBPaWT- and C/EBPaR35A-induced cells, replacing the virtual ChIP-seq experiments. Integrating the data obtained with our dynamic ATAC-seq data, the new findings largely support the previously proposed PU.1 redistribution (‘theft’) model. To make the data easier to understand, we now first show PU.1 and C/EBPa binding to distinct B cell- and macrophage-restricted GREs contained in a single genomic fragment (new Fig. 5). The findings nicely visualize how PU.1 becomes redistributed from B-GREs to M-GREs, in a C/EBPa mutant-accelerated manner. We were also happy to see that a genome-wide analysis of the data again shows the accelerated redistribution of PU.1 by C/EBPaR35A (new Fig. 6). Finally, the comparison of the ChIP-seq and ATAC-seq data also added more mechanistic detail, such as by revealing that chromatin remodeling of lineage-restricted GREs can be uncoupled from the regulation of associated genes.

      Finally, the effect of the mutation is assumed to be only on the interface for interaction between C/EBPalpha and PU.1 (or other co-factors). However, C/EBPalpha is such a short-lived protein that any modification that slightly increased its half-life could increase its potency. It seems important to present some quantitative protein staining evidence to clarify whether the steady-state level of C/EBPalpha in C/EBPalpha R35A-expressing cells is really unchanged from C/EBPalpha wild-type-expressing cells.

      We agree that this is an important issue and have therefore now performed a cycloheximide experiment with 3T3 cells expressing inducible forms of the two proteins. The data in Figure S4C show that C/EBPaR35A exhibits stability similar to that of the wild-type protein and is expressed at 20-30% lower levels under steady-state conditions in uninduced cells. They also show that C/EBPa is surprisingly stable. These new findings are in line with the comparison of the two proteins by Western blots of mutant- and wild-type-transfected 293T cells and of infected B cells, which also show similar levels of the two proteins (Fig. 7C and D). Therefore, the finding that expression of C/EBPaR35A is similar to or slightly lower than that of the wild type argues against the possibility that an elevated expression level of the mutant could explain the effects observed.

      Finally, although not requested by the reviewer, we have now addressed the possibility that the effect of the alanine replacement of R35 is mostly due to a change from a charged to a non-charged hydrophobic residue. This is not the case, as replacement of arginine 35 by the charged amino acid lysine still leads to accelerated BMT induction (Figure S7).

    1. Author Response

      Reviewer #1 (Public Review):

      This paper is based on the premise that ketamine exerts rapid antidepressant effects by increasing glutamatergic transmission. However, the authors note that how this effect occurs is unclear because ketamine antagonizes the NMDA receptor, a glutamatergic receptor. Others have suggested a compensatory change in glutamatergic transmission, and the authors suggest how this might occur. The authors should clarify whether prior studies suggested a mechanism different from theirs and, if so, which might be correct.

      There are also other mechanisms, such as the block of NMDA receptors on interneurons and the disinhibition of principal cells. It is important to clarify whether this has already been addressed in the literature, and whether their cultures are primarily glutamatergic neurons or also include interneurons and glia.

      The authors show calcineurin is reduced after ketamine exposure and this increases AMPA receptor GluA1 phosphorylation. They also show that Calcium permeable AMPA receptors (CP-AMPARs) increase.

      They also suggest that the CP-AMPARs and other changes lead to enhanced synaptic plasticity, which could lead to antidepressant effects.

      Although a lot of work is done in cultured hippocampal neurons, 14 days in vitro, they show effects in vivo that are consistent with the data from cultures. For example, ketamine increases GluA1 phosphorylation. Also, blocking CP-AMPARs in vivo reduces anxiety/depressive behaviors in assays such as the open field and tail suspension tests.

      Overall the study appears to be done well and the presentation, writing, and references are good. There are important concerns regarding statistics, behavior, and pharmacology and several minor concerns.

      Major concerns

      1) Statistics.

      What was the stat test if the control was always 1? Often the control group is 1.00 with no SD but in other tests, the control group is 1.000 with an SD.

      In the previous submission, we neglected to include this information. Immunoblotting data have variable raw values; hence, the control group was used to normalize each group and was compared to the experimental groups. Thus, the control value for immunoblotting was always 1.000 without SD. Similarly, for imaging data, the average peak amplitude in control cells was used to normalize the peak amplitude in each cell and was compared to the experimental groups' average; thus, the control group is 1.000 with SD. The Franklin A. Graybill Statistical Laboratory at Colorado State University has been consulted for statistical analysis in the current study, including sample size determination, randomization, experiment conception and design, data analysis, and interpretation. Grouped results of single comparisons were tested for normality with the Shapiro-Wilk normality or Kolmogorov-Smirnov test and analyzed using the unpaired two-tailed Student’s t-test when data are normally distributed. Differences between multiple groups with normalized data were assessed by nonparametric Kruskal-Wallis test with the Dunn’s test.
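
      The two normalization schemes described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and all numeric values are hypothetical, and the only point is to show why per-blot normalization yields a control of exactly 1 with no spread, whereas normalizing each cell to the mean control amplitude yields a control mean of 1 with a nonzero SD.

```python
from statistics import mean, stdev


def normalize_per_blot(blots):
    """Divide every lane in a blot by that blot's own control lane."""
    return [
        {group: value / blot["control"] for group, value in blot.items()}
        for blot in blots
    ]


def normalize_to_control_mean(control_cells, treated_cells):
    """Divide every cell's peak amplitude by the mean control amplitude."""
    reference = mean(control_cells)
    return (
        [c / reference for c in control_cells],
        [t / reference for t in treated_cells],
    )


# Hypothetical raw values in arbitrary units, for illustration only.
blots = [
    {"control": 120.0, "ketamine": 180.0},
    {"control": 80.0, "ketamine": 136.0},
    {"control": 200.0, "ketamine": 290.0},
]
norm_blots = normalize_per_blot(blots)
control_lanes = [b["control"] for b in norm_blots]

control_cells = [0.8, 1.1, 1.0, 1.3]
treated_cells = [1.5, 1.7, 1.2, 1.9]
norm_control, norm_treated = normalize_to_control_mean(control_cells, treated_cells)

# Immunoblot control: every value is exactly 1, so there is no SD to report.
print(control_lanes, stdev(control_lanes))
# Imaging control: the mean is 1, but individual cells still vary.
print(mean(norm_control), stdev(norm_control))
```

      Under this reading, a control group that is identically 1 carries no variance of its own, which is worth keeping in mind when choosing the test applied to the normalized data.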

      2) Behavior.

      It is not clear that the open field and tail suspension tests measure antidepressant actions. Why were more standard tests such as forced swim or sucrose preference, novelty-suppressed feeding, etc not used?

      We agree with the Reviewer’s concern. However, the open field test and the tail suspension test have long been used to assess anxiety-like and depression-like behaviors, respectively, in rodents (Seibenhener and Wooten, 2015; Ueno et al., 2022). Specifically, the open field test has been widely used to measure the effects of ketamine on anxiety-like behavior in rodents (Guarraci et al., 2018; Pitsikas et al., 2019; Shin et al., 2019; Akillioglu and Karadepe, 2021; Yang et al., 2022; Acevedo et al., 2023). The tail suspension test has also been used to examine the effects of ketamine on depression-like behavior in animals (Fukumoto et al., 2017; Yang et al., 2018; Ouyang et al., 2021; Rawat et al., 2022; Viktorov et al., 2022). Studies suggest that the forced swim test and the tail suspension test are based on the same principle: measurement of immobility duration while rodents are exposed to an inescapable situation (Castagne et al., 2011). Importantly, it has been suggested that the tail suspension test is more sensitive to antidepressant agents than the forced swim test because the animal will remain immobile longer in the tail suspension test than in the forced swim test (Cryan et al., 2005). For this reason, we chose to use the tail suspension test instead of the forced swim test. This information has now been included in the revised manuscript. Additionally, because ketamine produces antidepressant effects within one hour after administration in humans (Berman et al., 2000; Zarate et al., 2006; Liebrenz et al., 2009), our study aims to understand the mechanism underlying ketamine's rapid (less than an hour) antidepressant effects. Given that the sucrose preference test and the novelty-suppressed feeding test require multiple days, they would not be suitable for our goals.

      3) Pharmacology.

      The conclusions rest on the specificity of drugs.

      Is 5 uM FK506 specific?

      20 μM 1-naphthyl acetyl spermine (NASPM)?

      10 mg/kg IEM-1460?

      We neglected to add the rationale for the drug concentrations in the previous submission. Previous research, including our own, has employed FK506 at a variety of different concentrations to inhibit neuronal calcineurin activity (1 - 50 μM) (Hsieh et al., 2006; Schwartz et al., 2009; Kim and Ziff, 2014). Specifically, we have shown that 5 μM FK506 treatment for 12 hours significantly reduces neuronal calcineurin activity to increase GluA1 phosphorylation, which induces the expression of CP-AMPARs to elevate AMPAR-mediated synaptic activity (Kim and Ziff, 2014). Moreover, previous studies, including our own, have used NASPM at a variety of different concentrations to inhibit CP-AMPARs (3 - 250 μM) (Tsubokawa et al., 1995; Koike et al., 1997; Noh et al., 2005; Nilsen and England, 2007; Hou et al., 2008; Kim and Ziff, 2014). In fact, we have shown that 20 μM NASPM significantly reduces CP-AMPAR-mediated synaptic and Ca2+ activity (Kim and Ziff, 2014; Kim et al., 2015b). Finally, multiple reports demonstrate that 10 mg/kg IEM-1460 significantly reduces in vivo CP-AMPAR activity (Wiltgen et al., 2010; Szczurowska and Mares, 2015; Adotevi et al., 2020). This information has now been included in the revised manuscript.

      Reviewer #3 (Public Review):

      Ketamine has been shown to be effective at producing a rapid antidepressant effect at low doses, but the underlying molecular mechanism of this effect is still not clear. Previous studies have suggested that the effect of low-dose ketamine may occur by promoting neuronal plasticity in the hippocampus. However, this goes against the findings that ketamine acts as a noncompetitive NMDA receptor antagonist, which should prevent NMDAR-dependent plasticity. Furthermore, a therapeutic dose of ketamine has been shown to increase neuronal Ca2+ signaling, which again does not conform to its antagonistic action on NMDA receptors. In this paper, the authors provide evidence that therapeutic low-dose ketamine increases the expression of Ca2+-permeable AMPA receptors (CP-AMPARs) by increasing phosphorylation of the GluA1 subunit of AMPARs and surface expression of GluA1-containing CP-AMPARs. They further provide evidence that this is likely mediated by a decrease in calcineurin activity and that blocking CP-AMPARs prevents the antidepressant effect of ketamine in mice. One interesting finding of this study is that the authors see heightened sensitivity to ketamine in female mice, both at the level of behavioral readout and for molecular correlates. This finding is interesting in light of the different pharmacokinetics of ketamine reported in females and that ketamine metabolites can bind estrogen receptors.

      Based on their data and previous findings, the authors outline a plausible molecular signaling mechanism for the antidepressant effect of ketamine. Specifically, the authors propose that reduced neuronal activity, which could be triggered by ketamine-induced NMDAR antagonism, causes homeostatic plasticity to upregulate GluA1-containing CP-AMPARs. Their data would support this idea, as phosphorylation of GluA1 as well as increased surface expression and functional incorporation of CP-AMPARs at synapses have been shown before in models of homeostatic plasticity.

      1) Overall, the study is well-done and the data presented support the main conclusions. One main question is whether the current finding provides a conceptual advancement in our understanding of the molecular signaling involved in ketamine's antidepressant effects.

      We thank the reviewer for this critique. Research suggests multiple potential mechanisms of ketamine-induced neural plasticity. The main mechanism by which ketamine produces its therapeutic benefits on mood recovery is the enhancement of neural plasticity in the hippocampus (Miller et al., 2016; Aleksandrova et al., 2020; Kavalali and Monteggia, 2020; Grieco et al., 2022). However, ketamine is a noncompetitive NMDAR antagonist that inhibits excitatory synaptic transmission (Anis et al., 1983). One hypothesis to explain these paradoxical effects is that ketamine acts via direct inhibition of NMDARs localized on inhibitory interneurons, leading to disinhibition of excitatory neurons and a resultant rapid increase in glutamatergic synaptic activity that activates the Ca2+ signaling pathway (Deyama and Duman, 2020; Gerhard et al., 2020). This stimulates the brain-derived neurotrophic factor (BDNF) signaling pathway, which subsequently increases the translation and synthesis of synaptic proteins to enhance AMPAR-mediated synaptic plasticity (Deyama and Duman, 2020). Another potential explanation is that ketamine inhibits NMDARs on excitatory neurons, which induces a cell-autonomous form of homeostatic synaptic plasticity resulting in increased excitatory synaptic drive onto these neurons (Miller et al., 2016; Kavalali and Monteggia, 2020). Homeostatic synaptic plasticity is a negative-feedback response employed to compensate for functional disturbances in neurons and is expressed via the regulation of AMPAR trafficking and synaptic expression (Wang et al., 2012). According to this hypothesis, ketamine disrupts basal activation of NMDARs on excitatory neurons, which engages a mechanism of homeostatic synaptic plasticity that results in a rapid compensatory increase in synaptic AMPAR expression in these neurons in a protein synthesis-dependent manner (Kavalali and Monteggia, 2023).
      Additionally, there is an NMDAR inhibition-independent mechanism mediated by hydroxynorketamine (HNK), the ketamine metabolite that lacks NMDAR inhibition properties (Carrier and Kabbaj, 2013; Franceschelli et al., 2015; Zanos et al., 2016). The current study offers a new neurobiological basis for ketamine’s actions that depends on the NMDAR inhibition-mediated elevation of GluA1-containing AMPAR trafficking, which is likely independent of the previously described mechanisms, including the BDNF-induced protein synthesis-dependent pathway (Deyama and Duman, 2020) and the NMDAR inhibition-independent pathway (Carrier and Kabbaj, 2013; Franceschelli et al., 2015; Zanos et al., 2016). Nonetheless, there are still many important questions surrounding the molecular mechanisms of ketamine's actions. This new information has now been included in the revised manuscript.

      2) There are previous studies that showed an increase in CP-AMPARs in the nucleus accumbens and an increase in the expression of GluA1 in the hippocampus with low-dose ketamine. In addition, ketamine's antidepressant effect has been shown to require GluA1 phosphorylation. The main contribution of this paper might be that it provides the potential molecular signaling within the same preparation (i.e. hippocampal neurons) and provides a causal link of CP-AMPARs in mediating the behaviorally measured antidepressant effect of ketamine.

      The study showing that ketamine induces the insertion of CP-AMPARs in the nucleus accumbens did not examine whether this change resulted in antidepressant behaviors (Skiteva et al., 2021). Therefore, it is difficult to conclude that the ketamine-induced expression of CP-AMPARs in the nucleus accumbens plays a role in behaviors. Moreover, as described above, a recent study shows that the hippocampus is selectively targeted by ketamine (Davoudian et al., 2023). We thus chose the hippocampus as our experimental model to test our hypothesis. However, we are unable to rule out the potential role of nucleus accumbens in ketamine’s antidepressant actions.

      3) Another question is whether the behavioral effect of ketamine is due to molecular changes in the hippocampus as outlined in this paper. A more targeted inhibition of CP-AMPAR function could resolve this issue. With the systemic application of a CP-AMPAR antagonist as done in this study, it would be hard to know the role of CP-AMPAR upregulation in the hippocampus in mediating ketamine's effect, especially considering that low-dose ketamine has been shown to upregulate CP-AMPARs in the nucleus accumbens. While it would have been nice to know the site of action, this does not alter the conclusion that CP-AMPARs are involved in mediating the antidepressant effect of ketamine on behavioral readouts.

      We agree with this point. We have thus removed “the hippocampus” in the title and have further made equivalent revisions in the other parts of the revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors have used computational models and protein design to enhance antibody binding, which should have broad applications pending a few additional controls. The authors' new method could have a broad and immediate impact on a variety of diagnostic procedures that use antibodies as sensitivity is often an issue in these kinds of experiments and the sensitivity enhancement achieved in the two test cases is substantial. Affinity maturation is a viable approach, but it is laborious and expensive. If the catenation method is generalizable, it will open up opportunities for antibody optimization for cases where affinity maturation is either not feasible or otherwise impractical. Less clear is how this method might enhance therapeutic potency. Issues that arise when using therapeutic antibodies are often multifactorial and vary depending on the target and disease state. Many issues that occur with antibody-based therapies will not be rectified with affinity enhancement.

      We agree with the limitation.

      Reviewer #2 (Public Review):

      The paper presents an interesting design approach to having homodimeric IgGs with higher binding affinity to the antigens on a surface by fusing a weakly homodimerizing protein (a catenator) to the C-terminus of IgG. Considering the homodimeric IgGs with likely enhanced antigen binding ability and their stabilization with a reversible catenation when bound to the surface is an interesting idea. With agent-based modeling (simulations based on Markov Chain Monte Carlo (MCMC) sampling) and proof-of-concept experiments, it has been possible to show that the antigen-binding ability of the homodimeric IgGs is enhanced many fold, with the weak homodimerization of the catenator indicated to play a central role by enabling proximity-driven catenation on antigen-bound surfaces. While the results demonstrate the enhanced binding affinity of the catenated homodimeric IgGs, the study would benefit from a more elaborate interpretation and discussion of the results.

      The following discussion is now stated in the revision (pages 19-20, in the revision); “While we demonstrated that dual catenator-fused heterodimeric IgGs can enhance binding avidity, the oligomer formation or potential intramolecular homodimerization of the catenator necessitates the development of a more robust catenator for application to conventional homodimeric IgGs. Specifically, the ideal catenator should geometrically disallow intramolecular homodimerization, exhibit fast association kinetics, and be able to withstand the standard low pH purification step. On the other hand, our demonstration indicates that this approach can be applied to bispecific antibodies employing a heterodimeric Fc.”

      One interesting basis for discussion may be how fusion of the catenator affects the intrinsic binding behavior and/or the global structure of IgGs (monomeric and homodimeric (catenated)) per se, beyond its proximity-driven contribution. Would it lead to a structure with more restricted mobility in the unbound state, so as to decrease the entropic cost of binding and thus increase the binding avidity/affinity (in addition to the external proximity-driven association)? In other words, what would be the role of entropy in the free energy of binding, given that the enthalpic contributions remain the same? Possible effects of the length of the catenator should also be related in part to the entropy. For example, if a longer and more flexible catenator is considered, what would the resulting observations be, experimentally and computationally?

      The binding site occupancy depends on [catAb]/KD. Figure 4-figure supplement 2 shows the binding site occupancy and (KD)eff as a function of (KD)catenator. In this simulation, [catAb] was fixed (10⁻⁹ M) while KD was varied (from 10⁻⁸ to 10⁻⁶). In the figure legend and in the main text, we now explicitly state that KD was varied from 10⁻⁸ to 10⁻⁶ (page 30, in the revision). To address this comment, we set KD = 10 nM (as used for simulation in Figures 3 and 4), and varied [catAb] from 0.1 to 10 nM. The binding site occupancy and (KD)eff as a function of [catAb] are plotted for three different set values of (KD)catenator (1 μM, 10 μM and 100 μM). The new figures are now presented as Figure 4-figure supplement 3. This simulation shows that the enhancement of (KD)eff by increasing the concentration of catAb is much less dramatic than that by increasing the affinity for catenator homodimerization at [catAb] > 10 nM.
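
      The dependence of occupancy on [catAb]/KD can be illustrated with the standard single-site binding isotherm. This is only a sketch under a strong simplifying assumption: it treats binding as simple 1:1 equilibrium with no catenation, so it is not the agent-based simulation described in the response, and the function name `occupancy` is hypothetical. The KD of 10 nM and the 0.1 to 10 nM concentration range are the values quoted above.

```python
def occupancy(ligand_conc, kd):
    """Fractional site occupancy for simple 1:1 binding: [L] / ([L] + KD).

    Both arguments must be in the same concentration units.
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_conc / (ligand_conc + kd)


KD = 10e-9  # 10 nM, as quoted in the response

for conc_nM in (0.1, 1.0, 10.0):
    theta = occupancy(conc_nM * 1e-9, KD)
    print(f"[catAb] = {conc_nM} nM -> occupancy = {theta:.3f}")
```

      In this simplified picture the occupancy is a function of the ratio [catAb]/KD alone, so lowering the effective KD and raising the antibody concentration by the same factor are equivalent; any departure from that in the simulation reflects the catenation step, which this sketch leaves out.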

      On the other side, simple simulation approaches have a high value with a level of abstraction while still keeping the physical and biological relevance. In the simulations, i.e. in the sampling of various states, three main terms/rules to govern the behavior are implemented. One is a term favoring an increase in the ability to bind (preventing to unbinding) to the surface upon the catenation of IgGs. This may need to be substantiated for the simulations not imposing a preassumed ability to increase the binding (or decrease the unbinding) ability upon the catenation.

      We agree with the reviewer that the third rule favors the binding ability of catenated IgGs, because it assumes that catenated antibodies are not allowed to dissociate from the binding site. While this assumption is not exactly correct, we think that it is valid, considering the behavior of a multivalent ligand. When the IgG portion dissociates completely from the binding site, it is still anchored by the catenation arm, and thus it will rebind the same binding site immediately. This postulation agrees with the quantitative analysis showing that a multivalent ligand exhibits an orders-of-magnitude increase in binding likelihood when the ligand size is comparable to the stretch length of a conjugating linker [Liese, S. & Netz, R. R., ACS Nano, 12, 4140 (2018)].

      The weakly homodimerizing state of the catenator appears to be one of the important aspects of the proposed design strategy. Is it also possible that the experimental observations reflect a higher binding ability of the catenator-fused IgG even without homodimerization on the surface (due to the reduced entropic cost of binding)? Presenting evidence of homodimerization of the catenator and of the catenated IgGs on the surface would strengthen the findings and discussion.

      To fully address this comment, we would need to consider the detailed molecular behavior of the IgG part, the catenator and the linker, probably using molecular dynamics simulation, which we think is outside the scope of the current work. We would like to describe qualitatively what we think about the issues raised. Fused to the C-terminus of Fc, the catenator will not affect the complementarity-determining region (CDR) of Fab, which is located on the opposite side from the C-terminus of Fc. This notion is supported by the observation that the SDF-1α-fused antibodies exhibited association kinetics similar to those of the mother antibodies (Figure 5).

      Regarding the mobility of the structure, we presume that the fused catenator would not interact with the antibody portion and thus it would not affect the intrinsic structural mobility of the antibody.

      Since the catenator is fused to the C-terminus of Fc by a flexible linker, the homodimerization of catenator would decrease the entropy upon catenation. However, the enthalpic contribution would overcome the entropic loss, and result in negative free energy of the catenator homodimerization.
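
      The sign argument in the preceding paragraph can be written out with the Gibbs relation. This is a generic thermodynamic identity rather than a calculation from the manuscript, and the subscript "dimer" is introduced here only as a label for catenator homodimerization.

```latex
\begin{align}
\Delta G_{\mathrm{dimer}} &= \Delta H_{\mathrm{dimer}} - T\,\Delta S_{\mathrm{dimer}},
  \qquad \Delta S_{\mathrm{dimer}} < 0 \;\Rightarrow\; -T\,\Delta S_{\mathrm{dimer}} > 0, \\
\Delta G_{\mathrm{dimer}} < 0 &\iff \Delta H_{\mathrm{dimer}} < T\,\Delta S_{\mathrm{dimer}}
  \iff \Delta H_{\mathrm{dimer}} < 0 \ \text{and}\ \lvert \Delta H_{\mathrm{dimer}} \rvert > T\,\lvert \Delta S_{\mathrm{dimer}} \rvert .
\end{align}
```

      In words, homodimerization is favorable only if it is exothermic and the enthalpy gain exceeds the entropic penalty at the working temperature.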

      Figure 2-figure supplement 1 (in the revision) shows the simulation for five different values of the reach length (R), which is the sum of the linker length and half of the catenator length. The simulation results show that the likelihood of catenation decreases as the linker length increases over the distance (d) between the two adjacent catAb-2Ag complexes, while it is maximum when the reach length equals d. Since the catenator length is fixed, increasing the linker length (such that R > d) will lower the catenation effect.

      Reviewer #3 (Public Review):

      The authors proposed an antibody catenation strategy by fusing a homodimeric protein (catenator) to the C-terminus of IgG heavy chain and hypothesized that the catenated IgGs would enhance their overall antigen-binding strength (avidity) compared to individual IgGs. The thermodynamic simulations supported the hypothesis and indicated that the fold enhancement in antibody-antigen binding depended on the density of the antigen. The authors tested a catenator candidate, stromal cell-derived factor 1α (SDF-1α), on two purposely weakened antibodies, Trastuzumab(N30A/H91A), a weakened variant of the clinically used anti-HER2 antibody Trastuzumab, and glCV30, the germline version of a neutralizing antibody CV30 against SARS-CoV-2. Measured by a binding assay, the catenator-fused antibodies enhanced the two weak antibody-antigen binding interactions by hundreds- to thousands-fold, largely through slowing down the dissociation of the antibody-antigen interaction. Thus, the experimental data supported the catenation strategy and provided proof-of-concept for the enhanced overall antibody-antigen binding strength. Depending on specific applications, an enhanced antibody-antigen binding strength may improve an antibody's diagnostic sensitivity or therapeutic efficacy, thus holding clinical potential.

      Thanks for the favorable comments.

    1. Author Response

      Reviewer #1 (Public Review):

      The introduction does not clearly set up the background for the key questions that the manuscript addresses. One of the key parts of the manuscript is to attempt to determine whether locomotory behaviour evolves because of direct or indirect selection of the traits. However, the authors don't provide an argument for why a salty environment would select for locomotory traits. Indeed, in the discussion, the authors point out that it is likely an unmeasured trait (body size) correlated with locomotory traits that are under selection. They present arguments for why this might be the case and point to un-included data that show body size significantly genetically covaries with all of the traits studied. Since the authors appear to have these data, and one of their key questions is comparing direct vs. indirect responses to selection, it would be more powerful to include the body size data and estimate selection on all traits together.

      We now include body size in all of our phenotypic and genetic analyses. We also include estimates of selection gradients from the ancestral selection differentials and the G-matrix. We detail in the Introduction the biological significance of locomotion traits and their potential relationship with body size, in low and high salt environments. The experimental results show that divergence in locomotion traits (Figure 6) correlates with adaptation (Figure 5), because of direct and indirect selection (Figure 9).

      Phenotypic plasticity was estimated from a series of univariate models, with estimates arranged in a vector. As the authors point out in the manuscript, traits that are not included in a model but covary with traits that are can largely bias estimates of the traits that are included. For this reason, it would make sense to estimate phenotypic plasticity using a multivariate model, as has been done for G matrices.

      We analyze the ancestral phenotypic plasticity and the phenotypic divergence during evolution using a multivariate approach (MANOVA). This approach simplifies the text as from the eigen decomposition of the SSCP matrices we can estimate canonical traits of ancestral phenotypic plasticity (pmax; see Table 1 with notation definitions) and phenotypic divergence in the new target high salt environment (dmax). We continue to do the univariate analysis as it allows us to estimate BLUPs for each inbred line (used for visual representation), as well as the significance of phenotypic divergence at each replicate population relative to the ancestral population (delta_q). Both multivariate and univariate approaches led to similar results (shown as supplementary figures).

      The estimation and interpretation of G matrices are a critical part of the manuscript. The authors state that broad sense estimates of G are a good proxy for additive genetic variation in this system, but in the Discussion they also state that overdominance was likely important during evolution to the salt environment, leading to some lack of clarity on whether dominance is important or not.

      We are sorry for the lack of clarity. We have eliminated the discussion on overdominance as it was peripheral to our results. Broad-sense genetic variances should be a good proxy for additive genetic variances when there is no inbreeding depression and no directional dominance or dominance epistasis; cf. Lynch and Walsh 1998. We previously showed that there is no inbreeding depression for the trait we use as surrogate for relative fitness (self-fertility) and also that there is no directional dominance for locomotion behavior traits. We now explain our use of broad-sense genetic (co)variances as a proxy for additive genetic (co)variances in the Introduction and Methods.

      It is also unclear how uncertainty in estimated G matrices was assessed. Showing that G differs from noise is critical to the majority of the results presented. The authors cite Morrissey and Bonnet (2019) as providing the method for generating the null distribution of G, however, this paper does not appear to propose or describe a method to do this.

      Thanks for this comment. Morrissey and Bonnet (J Heredity, 2019) was incorrectly cited and the explanation for finding the expected noise distributions was misleading. In brief, we produced a set of 1000 G-matrices, each computed after shuffling the line ID and the block ID in the phenotypic dataset. This was done to produce random expectations of the genetic variances, as the MCMC estimates are positive-definite. We computed the posterior mode for each of these 1000 G-matrices to obtain a null distribution (shown in orange). To infer significance, we compared the posterior mode of the empirical estimate with the 95% CI of the posterior mode distribution obtained from the randomized G-matrices. When determining which eigenvectors explain standing genetic variation we also used the distribution of posterior modes of the randomized G-matrices. However, as pointed out by Sztepanacz and Blows (Genetics, 2017), the eigenvalues of the eigenvectors do not follow a uniform distribution, as would be expected by chance. Because of this we asked whether the amount of variance in the eigenvectors of the empirical G-matrix (gmax, g2, etc.) was expected, by projecting the random G-matrices onto these eigenvectors. This is a null that is conditional on the observed data. We show these results in Figure 2 - supplement figure 3. Both approaches are similar, particularly for the first 2 eigenvectors. There is now a paragraph in the Discussion about the potential consequences for adaptation of traits with little genetic variance.
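
      The two nulls described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the covariance of per-line trait means stands in for the MCMCglmm posterior mode, only the line ID is shuffled (block is ignored), and the toy data and the function names `line_mean_cov`, `sorted_eigen` and `shuffled_matrices` are hypothetical.

```python
import numpy as np

def line_mean_cov(traits, line_ids):
    """Covariance of per-line trait means: a crude stand-in for a G-matrix."""
    lines = np.unique(line_ids)
    means = np.vstack([traits[line_ids == line].mean(axis=0) for line in lines])
    return np.cov(means, rowvar=False)

def sorted_eigen(g):
    """Eigenvalues in descending order with matching eigenvectors as columns."""
    vals, vecs = np.linalg.eigh(g)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def shuffled_matrices(traits, line_ids, n_perm=1000, seed=0):
    """Recompute the matrix after shuffling line labels across observations."""
    rng = np.random.default_rng(seed)
    return [line_mean_cov(traits, rng.permutation(line_ids)) for _ in range(n_perm)]

# Toy data (assumption): 40 lines, 5 measurements each, 3 traits.
# Traits 1 and 2 carry among-line variance; trait 3 is pure noise.
rng = np.random.default_rng(1)
n_lines, n_reps, n_traits = 40, 5, 3
line_ids = np.repeat(np.arange(n_lines), n_reps)
line_effects = rng.normal(size=(n_lines, n_traits)) * np.array([2.0, 1.0, 0.0])
traits = line_effects[line_ids] + rng.normal(size=(n_lines * n_reps, n_traits))

g_obs = line_mean_cov(traits, line_ids)
obs_vals, obs_vecs = sorted_eigen(g_obs)
null_gs = shuffled_matrices(traits, line_ids, n_perm=200)

# Null 1: eigenvalues of each shuffled matrix, taken on its own axes.
null_vals = np.array([sorted_eigen(g)[0] for g in null_gs])

# Null 2 (conditional on the data): variance of each shuffled matrix
# along the eigenvectors of the observed matrix, i.e. v_i' G_null v_i.
null_proj = np.array(
    [np.einsum("ij,jk,ki->i", obs_vecs.T, g, obs_vecs) for g in null_gs]
)

# An eigentrait counts as exceeding the null when its observed variance
# lies above the upper bound of the 95% interval of the null distribution.
upper = np.percentile(null_proj, 97.5, axis=0)
significant = obs_vals > upper
```

      Projecting the shuffled matrices onto the observed eigenvectors keeps the comparison on a fixed set of axes, which avoids comparing the leading observed eigenvalue against leading eigenvalues that are inflated by sampling alone.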

      Although the figure captions state that they are showing estimates of genetic variances, it appears to be heritability (bounded between 0 and 1). Whether the authors are studying heritability or genetic variance is an important difference, particularly in the context of a changing environment and phenotypic plasticity, where environmental variation is important and expected to change. For example, the result that G is smaller in evolved populations could simply be due to there being larger environmental variance in the salt environment (as you would expect). This is unrelated to an evolutionary response.

      There might have been some confusion because transition rates are positive and not normally distributed. To achieve normality they were log transformed. We have not reported estimates of heritability, all estimates presented are of genetic variances, unscaled. The only exception is body size where the raw data was multiplied by 50 in order to have a similar phenotypic scale as the transition rates when estimating genetic (co)variances, not heritability. We agree that the evolution of environmental stochastic variance is interesting but not immediately relevant to the questions we address.

      It seems that comparisons to the ancestral population were done for A160, not the founding population for each evolved line at G0. It is not clear whether the founder effects of each replicate are important and if this is the most appropriate comparison (the Discussion suggests that founder effects are important).

      We now describe the derivation of the experimental populations in more detail in the Methods and in an introductory section of the Results. The population acronyms might have been misleading. The A6140 is a population that was domesticated to the lab conditions for 140 generations (replicate #6 of the domestication process). We report the evolution of 3 GA populations, which were all derived from A6140 with minimal sampling problems for the estimated effective population sizes (we sampled 10^4 individuals from A6140 for each GA, for an Ne of 1000 during domestication; Chelo and Teotónio, Evolution 2013). Therefore, GA populations after 50 generations of evolution are appropriately compared with their (unique) ancestor population. We no longer discuss potential founder effects.

      Overall, there is much interesting data collected and analysed in this manuscript, addressing a valuable question. However, it is not obvious whether the estimates of G matrices are different from noise, and heritability may not be the most appropriate scale to ask questions about phenotypic plasticity and evolution in a novel stressful environment that may affect levels of environmental variation.

      Please see previous replies. Our ancestral G-matrix estimates indicate that at least 3 eigentraits are different from random expectation in both environments (Figure 2, supplement figure 3), and in high salt evolved populations continue to have more than expected genetic variance at 3-5 eigentraits (Figure 7, supplement 2). We are conservative in these estimates as depending on the null we could consider more eigentraits. In the previous version of the manuscript we concluded that only 2 ancestral eigentraits were orthogonal due to an error in the code (we did not divide by 2 the null expectations). But even presuming that only one eigentrait (gmax) has genetic variance in the ancestral population, we previously reported that mutational variance is not in the same trait (see Mallard et al., G3, 2023; and mmax in Table 3), and further that the trait under selection is neither gmax nor mmax (compare in Table 3 the selection gradients with gmax or mmax). At a minimum there are 3 genetically or environmentally independent traits. As noted in previous replies, we estimate and present genetic variances throughout. We do not present estimates of environmental variances and feel that doing so would make the manuscript overly complicated.

      Reviewer #2 (Public Review):

      Response to selection: It was not clear to me that it was appropriate to interpret locomotor behavior as having evolved in response to the salinity environment. Specifically, where is the evidence that any change in trait means is a (direct or indirect) response to selection imposed by increased salinity rather than the neutral drift of a trait due to the reduction in population size caused by the salinity? Strong evidence of adaptive evolution would be provided by all 3 replicates significantly diverging from the ancestor in the same direction. Model 2 seems to aim to test the null hypothesis that the three replicates diverged from one another via a random effects model - but with only three replicates, there is very low power, and variance is likely to be estimated as zero. I'm not sure what is shown in Tables 3 & 4, or how these results relate to models 2 & 3, so my interpretation of the information may be incorrect. Nonetheless, and noting that the errors around estimates are not presented, there seems to be considerable heterogeneity in size and direction of divergence between replicates for most of the traits. Is this study really dissecting responses to directional selection, or is it dissecting drift?

      We have modified the statistical modelling of the phenotypic data. Model 2 is no longer presented. We provide a MANOVA multivariate analysis equivalent to model 2 (with replicate populations as fixed effects) but now including both environments, together with the univariate models. MANOVA results show that all traits are significantly different across populations (i.e., at least two populations differ from one another). The fitted estimates from the MANOVA are not reported with errors in R but it is obvious that not all traits evolved in each replicate GA population (Figure 6). We therefore tested the difference between each of the evolved populations and the ancestral population using a univariate approach (Figure 6, supplemental source data table 2). In this univariate analysis, block was modeled as having random effects (which we could not model with MANOVA). In the high salt environment, the replicates GA 1, 2 and 4 differed significantly for 4, 6 and 4 transition rates, respectively (out of 6). The traits are all evolving in the same direction, even when the trait difference between evolved and ancestral populations is not significant. We provide compelling evidence of parallel evolution and thus selection (see review about how to infer selection in evolution experiments in Teotónio et al. Genetics 2017). We tried to be exhaustive in our statistical reporting but would happily provide additional details if requested.

      What are the traits, and what is the confidence in G? My outsider's interpretation of these results is that defining 6 transition states is a way of getting at a single behavioral trait, and I was not convinced that these data were suitable for addressing questions about multivariate evolution. Genetic parameters were estimated using MCMCglmm, which imposes boundaries on estimates. The authors state that they followed Morrissey and Bonnet 2019, but I was unable to infer what this means with respect to accounting for the contribution of sampling error to covariances (or how they accounted for the positive variance constraint). Because I was unsure how sampling error was being assessed for G, I was not confident about the interpretation of statistical support for individual parameters, or for eigenvalues of G. Following this forward, if the measured characteristics constitute a single trait, with an entirely shared genetic basis, then the results of strong alignment of everything with gmax makes complete sense - there is a single trait, that is heritable and plastic, and for which the mean evolved.

      Our initial draft was misleading and we now provide more detailed description (see also replies #5 and #12 above). We computed 1000 randomized G matrices to account for the constraints imposed by the MCMCglmm algorithm. This should account for the bias inherent with variance estimation and the eigen decomposition we did given our sample sizes. You will find that all 6 transition rates show genetic variance (Figure 2, supplement figure 2) and that up to three eigentraits have more genetic variance than the randomized G-matrices (Figure 2, supplement figure 3).

      The 6 transition rates are the mathematical description of changing movement states in 1-dimensional space (under memoryless assumptions). A priori we do not know how many relevant traits there are, if they are genetically or environmentally independent. To help the reader, we provide a Table 3 with the trait loadings for the several canonical traits of phenotypic plasticity, divergence and selection. The first canonical trait of standing genetic variation, gmax, is indeed aligned with phenotypic divergence (dmax; Figure 8, panels A and B) and with the axis of genetic variance reduction during evolution (emax; Figure 8, panels C and D), but not with ancestral plasticity (pmax; Figure 3) or mutational variance (mmax, from Mallard et al. G3 2023). pmax, for example, is aligned with g3, the third eigenvector of the ancestral G matrix. Note, however, that we do not have any power to detect the influence of g2 or g3 on phenotypic divergence or genetic divergence (Figure 8), though they together explain about 15% of the genetic variance. This is because performing such a test would require an alignment of the deviations in divergence not explained by gmax with g2 or g3. We now mention this issue in the Discussion. Overall, however, there are clearly several behavioral traits.
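
      One common way to quantify whether two canonical traits are "aligned" is the angle between their loading vectors. The sketch below is an illustration only: the responses do not state which alignment metric was used, so the angle measure, the function name `alignment_angle_deg` and the example vectors are all assumptions.

```python
import numpy as np

def alignment_angle_deg(a, b):
    """Angle in degrees between two trait-loading vectors, ignoring sign.

    Eigenvectors are defined only up to sign, so the absolute value of the
    cosine is used and the result lies between 0 (aligned) and 90 (orthogonal).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos = abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

# Hypothetical loadings on 6 transition rates (not values from the study).
gmax_like = np.array([0.5, 0.5, 0.4, 0.4, 0.3, 0.3])
dmax_like = -gmax_like                                  # same axis, opposite sign
pmax_like = np.array([0.5, -0.5, 0.0, 0.0, 0.0, 0.0])   # orthogonal to gmax_like

angle_g_d = alignment_angle_deg(gmax_like, dmax_like)
angle_g_p = alignment_angle_deg(gmax_like, pmax_like)
```

      With a measure of this kind, a small angle between gmax and dmax would correspond to the alignment reported in Figure 8, and an angle near 90 degrees to the lack of alignment reported for pmax.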

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Zheng et al. examined the disease-causing mechanisms of two missense mutations within the homeodomain (HD) of CRX protein. Both mutations were found in humans and can produce severe dominant retinopathy. The authors investigated the two CRX HD mutants via in vitro DNA-binding assay (Spec-seq), in vivo chromatin-binding assay (ChIP-seq), in vivo expression assay of downstream target genes (RNA-seq), and retinal histological and functional assays. They concluded that p.E80A increased the transactivation activity of CRX and resulted in precocious photoreceptor differentiation, whereas p.K88N significantly changed the binding specificity of CRX and led to defects in photoreceptor differentiation and maintenance. The authors performed a significant amount of analyses. The claims are sufficiently supported by the data. The results not only uncovered the underlying disease-causing mechanisms, but also can significantly improve our understanding of the interaction between HD-TF and DNA during development.

      Thank you for summarizing the key findings and strengths of our manuscript.

      Minor concerns:

      1) The E80A, K88N and R90W (previously reported by the same group) mutations are located very close to each other in the homeodomain (Figure 1A), but had distinct effects on the activity of CRX. Has the structure of the homeodomain (of CRX) been resolved? If so, could the authors discuss this phenomenon (mutations close to each other but have distinct effects) based on the HD-DNA structure?

      In paragraphs 2, 4, 5 of the discussion section, we have added explanations on how each mutation could affect CRX HD-DNA interactions differently based on published structural studies. And we further explain how these biochemical changes relate to the molecular perturbations and cellular phenotypes seen in vivo.

      In addition, has this phenomenon been observed in other homeodomain TFs?

      Disease-associated missense mutations at residues HD50 (K88) and HD52 (R90) have also been reported in other HD TFs implicated in CNS development (see discussion paragraph 7). Distinctively, different substitutions at the CRX E80 residue have been reported in multiple CoRD cases, suggesting its essential role in HD-DNA-mediated regulation during retinal development. These new points are now included in the discussion section.

      2) The authors should briefly summarize the effects/disease-causing-mechanisms of all the reported CRX mutations in the discussion part. The readers can then have a better overview of the topic.

      We have added a concise summary of the previously proposed CRX mutation classification scheme, all characterized Crx mutant mouse models and their pathogenic mechanisms. Please see paragraph 9 in the discussion section.

      3) CRX can also function as a pioneer factor (reported by the same group). Would these HD mutations distinctively affect chromatin accessibility (which then leads to ectopic binding on the genome)?

      Prior evidence has demonstrated that regulatory regions for many photoreceptor genes failed to stay accessible upon loss of CRX in the Crx-/- model (PMID: 30068366). It is unclear with the existing data whether CRX could initiate the chromatin remodeling (true pioneering function) of these regions, or it simply maintains the accessibility once these regions became accessible. Future studies comparing epigenomic landscape changes in mutant Crx KI models at various ages can be informative, particularly for the CRX K88N ectopic binding events. Determining how the CRX K88N mutant protein alters chromatin landscape important for photoreceptor fate and/or differentiation during development would shed light on the nature of these ectopic binding events.

      4) The discussion part can be shortened and simplified.

      We have re-written the discussion section to make it concise and to incorporate discussions on mutant CRX HD structures. Please see the revised manuscript.

      Reviewer #2 (Public Review):

      Zheng et al., investigated the molecular and functional mechanisms of two homeodomain missense mutations causing human retinal photoreceptor degeneration diseases in photoreceptor development regulated by the CRX transcription factor. They analyzed the E80A mutation associated with dominant cone-rod dystrophy (CRD) and the K88N mutation associated with dominant Leber Congenital Amaurosis (LCA). The authors found that E80A CRX binds to the same target DNA sites as WT CRX, but the binding specificity of K88N CRX is altered from that of WT in an in vitro assay. They generated Crx(E80A) and Crx(K88N) KI mice and performed ChIP assay and observed that K88N CRX binds to novel genomic regions from the WT-binding sites, while E80A binds to the WT sites. In addition, using the KI mice, they found that E80A and K88N differently affect the expression of Crx target genes. This study is well executed with proper and solid methodologies, and the manuscript is clearly written. This study gives us the insights how single missense CRX mutations lead to different types of human retinal photoreceptor degeneration diseases.

      We greatly appreciate the reviewer’s summary and positive comments.

      While the study has strengths in principle, it has a couple of weaknesses. One is that it is not clearly shown how well E80A KI mice function as a pathological model of dominant CRD, in which cones are mainly first affected. More data from histological, molecular, and physiological analyses of how cones are affected would be helpful. For example, in the Discussion, the authors describe the result that E80A associates with the S-cone opsin promoter as "data not shown". These data must be presented for the readers. In addition, more molecular insights as to how E80A affects cones will strengthen this study.

      The mouse retina is rod dominant and contains only a small number of cones (3% of all photoreceptors) that are born prenatally. This poses technical challenges to appropriately assess cone-specific changes during disease initiation/progression. We are in the process of developing cellular/molecular tools to investigate how cones are being affected in Crx E80A KI model, but this is beyond the scope of the current study.

      At the same time, we have added a supplemental panel showing that, based on P0 retinal immunostaining of the early cone marker RXRγ, cones were initially born, and fate specified in CrxE80A retinas (see Figure S7A). Since the E80A protein also hyper-activated S-cone opsin promoter-luciferase (Sop-luc) reporter in HEK293 cells (see Figure S7B), we predict that CRX E80A affects cone photoreceptor differentiation in a similar manner as rod photoreceptors. Furthermore, the cone transcriptional program might be more prone to perturbations by abnormal CRX activities. These possibilities require future investigations. For this manuscript, we have included all these points in the discussion section.

      Another point is that it will be very valuable if the authors could show how E80A and K88N differently affect the 3D structure of the CRX homeodomain. Even a simulation model would be valuable.

      Please see our answer to Point 1 of Reviewer #1. In short, we have added in the discussion section our explanations on how each mutation could affect CRX HD-DNA interactions differently based on structural studies. We further explain how these biochemical changes relate to the molecular perturbations and cellular phenotypes seen in vivo. Additionally, since TF-DNA interactions are diverse and dynamic across binding sites with different sequence features and genomic environments, future studies that systematically and quantitatively evaluate CRX transcriptional activity at different regulatory sequences would be important.

    1. Author Response:

      We thank the reviewers for their insightful comments and will resubmit a revised version where we address most of the issues raised. At this time, our immediate responses are as follows.

      1. We have data to confirm the presence of the merodiploid strain by PCR but did not show the data in the original version for brevity. We will show that data in the revised version.

      2. We also have, of course, a no ATC control in our CRISPRi experiments and will also show that data in the resubmission.

      3. As a loading control for the SecA2 strains, we will show PknG blots (a protein secreted by SecA2; PMID: 29709019) that we have with us.

      4. In the nanoluc assays, the construct we made that was fused to CFP10 was generated so that there was a long linker between the C-terminus of CFP10 and nanoluc. We also have other controls in that experiment to show that the CFP10-nanoluc protein was secreted in the ΔRD10 strain and not in the ΔSecA2 strain. We will attempt to show fusion protein secretion using CFP10 antibodies in the revised version of the manuscript.

      5. We will perform experiments with the inhibitor using the merodiploid strain and in partial knockdown strains to confirm that the inhibitor does indeed specifically act on Rv1636.

      6. We will modify the discussion to talk more about the role and processes of cAMP synthesis and degradation in the revised version of the paper. Further, the manuscript will be checked for spelling and grammatical errors before resubmission, and the arrangement of data modified as suggested by the reviewers.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript describes the differences in the plasma proteome and metabolome in healthy Tanzanian and healthy Dutch adults. The inflammatory plasma proteome was measured using the Olink 92 Inflammation panel, while the plasma metabolome was analyzed using a mass spectrometry-based untargeted approach. The plasma metabolome was measured only in the Tanzanian cohort. This study aimed to link the pro-inflammatory proteome of Tanzanian and Dutch healthy individuals with environmental factors and dietary lifestyles.

      The correlation between the plasma proteome and food-derived metabolome profiles can shed light on the development of non-communicable diseases. This observation stresses the importance of dietary transition and lifestyle changes in expressing inflammation-related molecules. Moreover, this study describes the inflammatory proteome profile in healthy Tanzanian individuals covering a cohort with limited studies. The molecular differences in circulating biomolecules between healthy individuals living in East Africa and individuals living in Western Europe and the correlations with intrinsic and environmental features are novel.

      This study lacks a robust and solid validation of some of the differentially regulated circulating proteins and correlations between food-derived metabolites and proteins in a selected cohort. The discovery-driven approach in this manuscript highlights potential findings that need to be supported by a validation phase. According to this reviewer, the lack of such validation impacts the robustness of the results and the hypotheses generated. Due to that, the manuscript should incorporate validation experiments.

      We acknowledge that our study was limited by the lack of a validation phase. To address this issue, we have undertaken additional analyses to validate our key findings related to the proteins associated with mTOR and Wnt/β-catenin pathways. These analyses involved data from a proof-of-concept intervention study conducted at the same site. Our response below provides more information on these validations.

    1. Author Response

      Reviewer #1 (Public Review):

      For PRLR, the question being asked is whether and how the intracellular domain (ICD) interacts with the cellular membrane or how the disordered ICD can relay and transmit information. The authors show that PI(4,5)P2 in the membrane localizes around the transmembrane domain (TMD) due to charge interactions and facilitates binding of the ICD to the membrane, even in the absence of the TMD. Furthermore, the ICD and PI(4,5)P2 form a co-structure with JAK2 which locks a disordered part of the ICD into an extended conformation, allowing for signal relay and, through multiple complex conformations, may enable switching signalling on and off.

      Strengths:

      • NMR paired with MD is a powerful way to probe an interaction especially when peaks disappear and become difficult to probe by NMR.

      • Using NMR and MD to formulate hypotheses which are then tested by cell studies is quite informative. The combination of MD, NMR, and cell biology is a strength.

      • The authors are diligent in testing MD simulations on systems with and without PIP2.

      • The use of Pep1 and Pep2 to differentiate the KxK region that interacts with PIP2 is helpful.

      • The four utilized mutants help illustrate the co-dependence of the respective regions in the formation of the co-structure.

      Weaknesses:

      • In Figure 2G, there is a big change in CSP between 280 and 290, which the authors do not comment about.

      The region referred to contributes to binding but is on the edge of the main binding site and where the local affinities are weaker. Therefore, the exchange rate is high and allows for following the chemical shift changes. In support of this, we see an almost inverse correlation between the CSPs and the changes in intensities. For the main binding site, the exchange rate between bound and free states is slower because the affinity is stronger. Therefore, we cannot follow the chemical shifts to extract the CSPs to the bound state, as the peaks disappear. We have commented on this in the main text (p.8) as follows:

      “In the region from D285-E292 we observed an almost inverse correlation between the CSPs and the intensities. This suggests that in contrast to the preceding region, a faster local exchange rate allows us to follow the resonances from the bound state in this region, giving rise to the large CSPs.”

      • The data in Figure 2 are summarized as indicating the formation of extended structure in the ICD upon binding. It is not clear to me what data show an extended structure.

      The information on the extended structure comes from the analyses of the peptide Pep1 titrated with C8-PI(4,5)P2. The CD signature that develops in the bound state has a minimum ellipticity at 218 nm, which is a strong indicator of extended structure. We find this information adequately described in the main text (p.8), but have emphasized this further as follows:

      “In contrast, for Pep1, large spectral changes were seen, which were unrelated to helix formation. Subtracting the spectra in the presence and absence of C8-PI(4,5)P2, revealed a negative ellipticity minimum at 218 nm, a strong indication of β-strands, showing that when bound to C8-PI(4,5)P2, a distinct extended (strand-like structure) signature was seen (Figure 2G).”

      • No modelling or experiments were done with PIP3 despite conclusions and models which rely on the phosphorylation of PIP2 to PIP3. At the very least, these would be useful as negative controls.

      In previous work, we addressed the affinity for phosphoinositides using lipid dot blots, where we observed a preference for certain species, including PI(4,5)P2 (Haxholm et al., BJ, 2015). In that study, we also observed that there was no affinity for PI(3,4,5)P3, but we may not have highlighted this sufficiently in the introduction. This may have caused some confusion in understanding our choices. We have now described these data more explicitly in the introduction (p. 4), in the result section (p. 8) and later in the discussion (p. 21). We thank the reviewer for bringing this up.

      • Only R2 experiments were done when the authors mention investigating dynamics. R1 and -HetNOE dynamics would be useful for creating a complete picture.

      Our aim with recording the R2 values was not to map the detailed dynamics of the disordered regions, but to explain the changes in the peak intensities we see for the variants when adding C8-PI(4,5)P2. In this case, the R2 values supported our suggestion of internal contacts and, although we agree with the reviewer that R1s and HetNOEs would be important and relevant for a more in-depth and complete analysis of the dynamics, we find that in this case, the R2 values suffice.

      • Some of the exciting results are under-emphasized including Fig 3H and 3I.

      A new version of Figure 3 has been generated to consider the reviewers’ comments and suggestions. This figure has been restructured to further emphasize some of the major conclusions obtained from the simulations. We have moved the former Figure 3 A, B, C and D to the supplemental information to increase this focus.

      Reviewer #2 (Public Review):

      The authors combine NMR experiments, cell experiments, and molecular simulations to address the question of how lipid interactions of the prolactin receptor contribute to signalling. They assess the interactions of the disordered cytoplasmic tail of the receptor with phosphoinositides among others by chemical shift perturbations from NMR for different PIP2-containing membranes, by coarse-grained simulations, as well as site-directed mutagenesis and subsequent cell signalling experiments to monitor the activation of the mutants. A major result is that PIP2 interactions are functionally important, which so far has not been known for this receptor. Their results are likely relevant for other non-receptor tyrosine kinases.

      The hypothesis that the protein complex is regulated by IDR-membrane interactions is very novel. A major strength is the close connection of and feedback between state-of-the-art experiments and simulations.

      We thank the reviewer for the positive comments on our work and on its novelty and importance.

      This is where I see weaknesses:

      1) The motivation of focusing on LID1 is limited.

      We have now provided our rationale for focusing selectively on LID1 of the PRLR. The selection was made to address the conundrum of how structural disorder in the juxtamembrane regions would be able to transmit the information of extracellular hormone binding to the bound JAK2. This constitutes the first step of signaling on the intracellular side. Given the distance to the other two LIDs (LID2 and LID3) and their separation from the TMD by long disordered regions, these were disregarded, and we focused on LID1 in this work. We have emphasized this choice in the introduction and in more detail in the results section (p. 5-6).

      2) The data and analysis for the JAK2-PRLR complex appear somewhat superficial, and a connection between conformational states to their functional relevance is lacking. In fact, the majority of the simulation part of the paper is about suggesting different states of the PRLR-JAK2 complex but the states and their hypothesized functional relevance are not further taken up, e.g. by experiments, and yet presented as major results, e.g. in the abstract.

      In the original manuscript, we already provided a detailed analysis of the different states, highlighting accessible residues and lipid-interacting residues and comparing these across the states. From our experiments, including those performed in cellular assays, we cannot with certainty link the two major states to active and/or inactive states. We therefore have neither the intention nor the support from the data to claim this. However, what we do put forward as a major result is the presence of more than one major state, as also stated in the abstract and in the conclusion of the results section as follows:

      “Another key observation is the existence of different states in which different regions of both JAK2 FERM-SH2 domain and LID1 of PRLR are exposed to the solvent or hidden below the bilayer.”

      In the discussion, we do speculate as to which state may be the active and/or inactive dimer/monomer but make no firm claims. We have now made the major finding of multiple states clearer in the text, and further compare the two major states, the Y and the Flat state, to the recent cryo-EM structures of JAK1 bound to IFNAR1, which lend some support to our speculations. The abstract now reads:

      “We find that the co-structure exists in different states which we speculate could be relevant for turning signalling on and off.”

      Discerning the functional relevance of these states, if possible, will require experiments, also in cells, that by themselves would constitute a new study. We have, to the best of our ability, clarified that the functional relevance of the states has not been elucidated by the current work.

      3) The connection between simulations and mutational study is not very direct. An open question is if the mutants can distinguish between the effects of PRLR-PIP2 interaction or PRLR-JAK2 interaction, even though this conclusion is still drawn from the data.

      We have now explained in much more detail the rationale for selecting the different mutations (see also answer above) and which property of the co-structure they are most likely to engage in and affect, and we have emphasized that the separation of function by mutation may be complicated by the intimate structure formation among the three components of the co-structure. The conclusion has therefore also been softened.

      4) The conclusions drawn from the mutagenesis study (lines 547-555) are not directly supported by data. Only a partial correlation between PRLR membrane localisation and STAT5 activation is no reason to attribute the unexplained part of the STAT5 activation to PRLR-JAK2 interactions without further studies.

      We have now explained in much more detail the rationale for selecting the different mutations (see also answer above) and which property of the co-structure they are most likely to affect, and we have emphasized that the separation of function by mutation may be complicated by the intimate structure formation among the three components of the co-structure. The conclusion has therefore also been softened.

      5) PIP2 is identified as an important regulator, with very solid support from the presented data. PIP3 is part of the model but not discussed before or as part of the results. The analysis could be similarly applied, or the data may be directly relevant, to understanding whether PIP3 plays a similar role, as interactions are likely primarily electrostatically driven.

      In previous work, we addressed the affinity for phosphoinositides using lipid dot blots, where we observed a preference for certain species, including PI(4,5)P2 (Haxholm et al., BJ, 2015). In this study, we also observed that there was no affinity for PI(3,4,5)P3, but we agree that we did not highlight this sufficiently in the introduction. This may have caused some confusion about our choices. We have now described these data more explicitly in the introduction (p. 4), in the results section (p. 8) and later in the discussion (p. 21). We thank the reviewer for bringing this up.

      Reviewer #3 (Public Review):

      Araya-Secchi and coauthors present a very interesting study on the role of PIP2 lipids in the potential modulation of prolactin receptor signaling. The study is well-conducted and employs an integrated approach that combines NMR spectroscopy, modeling (primarily coarse-grain MD simulations), and cell biology. This combination of methods is crucial for gaining a deeper understanding of cell receptors, from their biophysical properties to their cellular functions.

      The modelling work is mainly based on both coarse grain forcefield versions Martini2.2 and Martini3. These two versions of the forcefield may produce different results. Therefore, depending on the system being modeled, the results presented here should be considered in light of the limitations inherent to each version of the forcefield.

      We thank the reviewer for the positive appraisal of our work and the approach we employed. It is true that one must be aware of the limitations of the tools and models employed in this type of work. We agree that perhaps we were not too explicit about limitations of our methods in the presentation of the results. However, we have addressed and discussed such limitations in the revised version of the manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This study demonstrates that Chinmo promotes larval development as part of the metamorphic gene network (MGN), in part by regulating Br-C expression in some tissues (exemplified in the wing disc) and in a Br-C independent manner in other tissues such as the salivary gland. I have included below the following comments on the submitted version of this manuscript:

      1) The authors have shown experimentally that Chinmo regulates Br-C expression in the wing disc but not the larval salivary gland. Based on this, they posit that Chinmo promotes larval development in a Br-C-dependent manner in imaginal tissues and a Br-C-independent manner in other larval tissues. This generalization of Chinmo's role in development would be more compelling if the relationship between Chinmo and Br-C were explored in other examples of imaginal/larval tissues.

      We agree with the referee that confirmation of our observations in other tissues might help to generalize Chinmo’s role. To this end, we have analyzed the role of chinmo in an additional larval tissue, the tracheal system, and an additional imaginal tissue, the eye disc. Consistent with the results reported in the manuscript, we found that the mode of action of Chinmo is conserved, as depletion of Br-C in the eye disc is able to rescue the lack of chinmo, whereas in the tracheal system it is not. We included this new information in the main text and in new SFigures 1 and 3.

      2) Chinmo, Br-C, and E93 have all been shown to be EcR-regulated in larval tissues, including the brain and wing disc (as in Zhou et al. 2006, Dev Cell; Narbonne-Reveau and Maurange 2019, PLOS Biology; Uyeharu et al. 2017, ). It would be interesting (and I believe relevant to this study) to know whether the roles of these factors in their respective developmental stages are EcR-dependent and whether their regulation by EcR (or lack thereof) depends on whether the tissue is larval or imaginal.

      Although the relevance of EcR to the regulation of the genes that make up the metamorphic gene network has already been established, whether these genes respond differently to EcR-mediated signalling in larval and imaginal tissues has not yet been properly addressed. Finding such a difference in the output of EcR signalling would be very interesting. However, we think that this is beyond the scope of this report, as the main aim of this study was to determine the main role of the temporal genes during development and their repressive interactions.

      3) In the chinmo qPCR analysis shown in Fig1A, whether animals were sex-matched or controlled was not indicated. Since Chinmo has a published role in regulating sexual identity (Ma et al. 2014, Dev Cell; Grmai et al. 2018, PLOS Genetics), and since growth/body size is known to be a sexually dimorphic trait (Rideout et al. 2015, PLOS Genetics), it seems important to establish whether the requirement of Chinmo for larval development and/or growth. I recommend either 1) controlling for sex by repeating qPCRs in Fig 1A in either males or females, or 2) reporting male/female chinmo levels at each stage side-by-side.

      As the referee pointed out, chinmo has been related to sexual identity, raising the possibility that chinmo affects growth differently in males and females during development. However, several observations argue against this possibility. First, the role of chinmo in sexual identity has only been reported in the adult testis, specifically in cyst stem cells. In fact, specific mutations that affect chinmo expression only in the testis do not affect testis formation but rather its maturation, suggesting a role of chinmo in sex determination specifically in the testis cyst stem cells (Ma et al. 2014, Dev Cell; Grmai et al. 2018, PLOS Genetics). Second, a sex-dependent growth rate has been described during larval development (Rideout et al. 2015, PLOS Genetics; Sawala A. and Gould AP, PLoS Biol, 2017). However, the main difference in growth rate between males and females is found in L3 larvae (Sawala A. and Gould AP, PLoS Biol, 2017), when the expression of chinmo strongly declines in both males and females, indicating that the impact of chinmo on sexual dimorphism during larval development is likely to be limited.

      Thus, considering that, based on our results, chinmo exerts its main role in larval tissue growth during the L1 and L2 stages and that body growth is practically identical in males and females during these stages (Sawala A. and Gould AP, PLoS Biol, 2017), we can assume that chinmo might not contribute to sexual body size dimorphism.

      Nevertheless, we would like to clarify that we have performed the measurements of chinmo expression always in females, when sex identification was possible, namely in L3 larvae. L1 and L2 larvae qPCRs were not sex-discriminated as sex identification was not possible in our conditions.

      4) In Fig2E, the authors show that salivary gland secretion (sgs) genes are repressed in salivary glands lacking chinmo. Sgs genes are expressed during late larval stages as the animal prepares to pupate. Thus, based on the proposed model where Chinmo promotes larval development and represses the larval-to-pupal transition, one might expect that larval salivary glands lacking chinmo would express higher than normal levels of sgs genes. This expectation directly opposes the observed result - it would be helpful to speculate on this in the interpretation of results.

      This is an interesting observation. As Sgs genes are regulated by Br-C (Duan et al. Cell Reports 2020), precocious expression of this transcription factor in chinmo-depleted animals might result in an early activation of those genes. Interestingly, we were not able to detect any Sgs gene expression in chinmo-depleted salivary glands. We think that this is because, in the absence of chinmo, this organ does not properly develop and mature, and is therefore unable to express Sgs genes. In support of this, the double knockdown of Br-C and chinmo shows the same dramatically low levels of those genes. Altogether, these results strongly suggest that SGs lacking chinmo expression are unable to grow and synthesise Sgs proteins, even in the premature presence of Br-C. We discuss this point in the main text of the revised manuscript. Please also see the response to referee 2.

      Reviewer #2 (Public Review):

      The evolution and control of the three-part life history of holometabolous insects have been controversial issues for over a century. While the functioning of broad as a master gene controlling the pupal stage and of E93 as a master gene for the adult stage has been known for about a decade or more, chinmo has only recently been proposed as being the master gene responsible for maintaining the larval stage (Truman & Riddiford, 2022). While the former paper focused on the embryonic and early larval function of Chinmo, this paper explores its metamorphic effects and defines the roles of Broad and E93 in the phenotypes produced by manipulations of Chinmo expression.

      Overall, the paper is well presented but in places, readers would be helped if the authors were more explicit about the logic and details of their manipulations. There are a couple of conceptual issues that the authors should address.

      The role of Broad in larval tissues:

      One intriguing issue relates to the relationship of Chinmo to Broad and E93 in larval versus imaginal tissues prior to metamorphosis. The knock-down of chinmo in imaginal discs results in severe suppression of growth and the lack of metamorphic patterning genes such as cut and wingless. Normal growth and patterning are reestablished though, if broad is also knocked-down, supporting the notion that the effects of the lack of Chinmo are mediated through the premature expression of Broad.

      In the salivary glands, by contrast, chinmo knock-down suppresses growth, and this growth suppression is not reversed by simultaneous broad knockdown. They properly conclude that the role of Chinmo in supporting the growth of larval tissues does not involve Broad, but their data on the expression of salivary gland proteins suggest that Broad still plays some role in Chinmo function in salivary glands. Fig. 5E shows the levels of various salivary glue proteins in the glands of Chinmo knock-down larvae. The levels are reduced, as expected by the lack of salivary gland growth, but a significant finding is that they are there at all! The Costantino et al. (2008) paper shows that these genes are only induced in the mid-L3. Ecdysone, acting through Broad isoforms, is necessary for their appearance and these SGS genes can be induced in the L1 and L2 stages by ectopic expression of some Broad isoforms. Their low levels in Fig 5, would be due to the small size of the gland, but the gland's premature expression of Broad likely causes their induction. In larval cells, then, Chinmo may feed into two parallel pathways, one that does not involve broad and regulates growth and the other, utilizing Broad, regulating premetamorphic changes.

      It would be useful to look at early larval salivary gland proteins such as ng-1 to -3 that are expressed in salivary glands before the critical weight. Also, it would be interesting if the appearance of the SGS proteins after chinmo knock-down (Fig 5E) is abolished by simultaneous knock-down of broad.

      This is an interesting observation. We think that the main problem derived from the way we presented the data. Our results showed that depletion of chinmo in the SGs dramatically impairs the induction of Sgs gene expression, even in the premature presence of Br-C, which has been shown to be responsible for Sgs expression (Duan et al. Cell Reports 2020). The confusion might come from the way we presented the expression levels of those genes. In fact, the levels of Sgs in both chinmoRNAi and chinmoRNAi/Br-CRNAi SGs were virtually undetectable, suggesting that chinmo in the SG is required not only for Br-C repression but also for proper development of the gland. We base this on the fact that the very low levels of Sgs gene expression in chinmo-depleted SGs are still detected in the double knockdown chinmoRNAi/Br-CRNAi. The dramatically reduced expression of the early larval SG genes ng1-3 in chinmoRNAi and in the double knockdown chinmoRNAi/Br-CRNAi supports this statement. Altogether, these results suggest that Br-C is necessary but not sufficient for the expression of those specific SG genes. We have changed the plots in Figures 2 and 3 to clarify this point and added the expression levels of ng1-3.

      Role of Chinmo and Broad in Hemimetabolous insects:

      In the conclusion of their comparative studies on the cockroach (line 342), the authors state that Broad exerts no role in the development of hemimetabolous insects. However, this conclusion is not consistent with the literature. The first study of broad knockdown in a hemimetabolous insect was in the milkweed bug Oncopeltus fasciatus by Erezyilmaz et al. (2006). Surprisingly to Erezyilmaz et al., broad knock-down in early-stage nymphs did not cause premature metamorphosis. However, Broad expression was essential for tissues of the wing pads and dorsal thorax to undergo morphogenetic growth (rather than simple isomorphic growth), and for stage-specific changes in coloration through the nymphal series (but not for the nymph to adult color change). A similar function for Broad on wing growth during the later nymphal stages was later shown in Blattella (Fernandez-Nicolas et al., 2022; Huang et al., 2013). The wing- and genital pads represent "imaginal" tissues in the nymph and the need for Broad in these tissues are the same as seen in imaginal discs as the latter shift from isomorphic growth to morphogenesis at the critical weight checkpoint in the L3. This would suggest that important roles for Broad and E93 are already established in the hemimetabolous insects with E93 controlling the shift from immature (nymphal) to adult phenotypes and Broad controlling the premetamorphic growth of imaginal tissues in early-stage nymphs. Chinmo might then be needed to keep both in check.

      We are sorry for not having dealt with these observations in the submitted manuscript. We have taken them into consideration in the new version to discuss the role of Br-C in the transition from hemimetabolous to holometabolous development.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors study single and pairs of MDCK cells adherent to an H-shaped geometry on a flat surface. In this pattern, the cells form strong peripheral stress fibers. To a lesser extent, these cells also exhibit stress fibers in the cell interior, which otherwise has a rather homogenous actin distribution. Using a combination of traction force microscopy, from which they infer the stress distribution by monolayer stress microscopy, and "contour analysis" the authors quantify the 'bulk' and the 'surface' stress in these cells. This analysis shows that single cells are mechanically polarized whereas pairs are not.

      The authors then go on to optogenetically activate the actomyosin contractility of either one half of a single cell or one cell of a pair. Combining their stress measurements in these situations and using a finite element mechanical model, the authors convincingly show that the mechanical response in the non-activated part is active. By varying the aspect ratio of the adhesion patterns, they also find that the efficacy of active stress propagation depends on the mechanical and structural polarity of the cell. Furthermore, they provide evidence that their results on cell pairs generalize to tissues.

      Strengths:

      This study uses a nice combination of physical tools to address an important question in tissue mechanics. The data is compelling and fully supports the authors' conclusions.

      Weaknesses:

      There are no major weaknesses.

      In summary, although the fact that mechanical stress propagation in tissues is an active process might not come as a surprise, the study makes substantial contributions to a quantitative understanding of this process. As such it is of fundamental significance in the field. It will be interesting to explore the consequences of this mechanism for mechanical stress propagation in the context of developmental processes. It will be also of great interest to study how this local process can be accounted for in large-scale theories.

      We thank reviewer #1 for this very positive assessment. We agree that in the future, our results should be used on the theory side to upscale them to the tissue level. One way to do this would be the discontinuous Galerkin method, but it will take time to work this out. We also note that we would have loved to experimentally study intermediate cases between two and many cells, but it turned out to be very difficult to position a few cells on a micropattern and to repeat the force propagation analysis that we present here for two cells and for small tissues. In fact, it might be more rewarding to use optogenetics early in a developmental process with clearly defined cell positioning. In the revised manuscript, we have now added a comment on the challenge of working with three or four cells with the micropatterning approach, which is why we turned to small monolayers.

    1. Author Response

      Reviewer #3 (Public Review):

      In this study, the authors probe the molecular changes that occur in a neural circuit for learned behavior that depends on sensory input to maintain stereotypy. Songbirds, such as the Bengalese finches used here, are premier systems in which to ask these questions because they produce a highly stereotyped song that emerges after sensory learning becomes integrated into the function of a sensorimotor neural circuit responsible for singing. By deafening a group of birds (who show a shift in their song structure) and comparing them to hearing birds, the authors obtain clues as to how plasticity in motor output may emerge from genomic changes that alter the function of cells within the various components of the neural circuit.

      There are multiple strengths of the paper:

      1) The results may have broad implications because the type of sensorimotor neural circuit (cortico-basal ganglia-thalamic-cortical loop) used for singing is generally necessary for learned behaviors.

      2) The methods and analyses are generally rigorous, including the parsing of song elements, and the type of detailed RNA sequencing and analysis that demonstrates the power of a genomic view of neural plasticity as it relates to behavioral plasticity.

      3) Because the authors assayed the pallial (cortical) areas, as well as the basal ganglia component, of the sensorimotor circuit they were able to creatively compare how different facets of the network contributed to a) unmodified brain properties, b) properties perturbed after the loss of the auditory input that is required to stabilize song structure. As a result, they have added to the known molecular profiles for each of these brain areas, the accounting of how they may be specialized in comparison to the surrounding non-song brain, and what changes occur after deafening. Utilizing some existing single-cell sequencing data, the authors present for the first time some insight into what cell types may be showing the most robust changes, and therefore which may be driving the shift in song structure. The analysis further pushes in new ways to suggest how the molecular properties of a given brain area may relate to those of directly-connected areas. Together, these findings provide valuable clues as to the specific cell types and signaling properties that may be central to the production of stabilized, learned behavior.

      4) One of the cortical brain areas, LMAN, was lesioned in a subset of the hearing subjects because it projects to the area that showed the greatest molecular difference between deafened and hearing birds (RA). The idea was to compare how this affected molecular properties with properties after the loss of auditory input; because RA is the output motor area for the song, its properties may be most directly tied to song structure. Using unilateral lesions was a strong choice of experimental design that allowed for rigorous analysis of this idea, and was interpretable because birds do not have a direct inter-hemispheric callosum.

      The foundation of the paper is solid, though the results shown raise several questions that are not fully addressed, and limit some of the power of the implications.

      The biggest questions arise from the finding that RA shows the largest number of molecular changes after deafening. The analysis and interpretations do not fully incorporate what we know of this circuit, at least from another well-studied songbird, the zebra finch, from which the authors derive other types of information. For example, it is not yet clear if RA is most changed because it is most directly involved in song output or because it receives projections from two areas within the sensorimotor circuit (LMAN and HVC). How do we consider the fact that by adulthood, LMAN and HVC cells project onto the same RA neurons? Are those the cell types being identified here? Would HVC lesions be expected to have the same effect as LMAN lesions? Are the cell types showing the greatest change those that are most involved in song output (e.g. are they projecting to nXIIts)? How do these results relate to the findings of changes in RA after HVC and LMAN lesions reported decades ago? How do these findings compare to an earlier study that also performed sequencing on areas from the sensorimotor circuit in deafened juveniles? Further, RA also receives information from the auditory processing regions of the brain, via the immediate structure RA-cup. It is not yet explicitly addressed how some effects may be from the loss of this more direct access to auditory information, rather than from information and projections originating within the sensorimotor circuit, and reinforces the question of whether or not the number of inputs to a particular brain area is a driving factor in the general pattern of changed RNAs after perturbation.

      We thank the reviewer for their review and for their excellent suggestions on how to improve the impact of the manuscript. The reviewer raises several important points, which we have expanded on in the Discussion of the revised manuscript, and will address here:

      First, there is the general consideration of how the structure of inputs to RA influences the interpretation of our results. There is the question about whether we can consider RA expression alterations as due to its direct projections to song motoneurons (‘output’) or the convergence of two important song nuclei, HVC and LMAN, onto RA (‘input’). This is a difficult question to untangle. We could interpret ‘output’ only effects as local perturbations that do not depend on song circuit afferent activity, such as hormonal fluctuations associated with the loss of hearing. ‘Input’ effects would occur through changes in afferent activity, such as those that elicit plasticity associated with song destabilization or more general alterations to the amount of afferent neural activity (a point addressed in the revised manuscript, lines 842-848). By focusing on a measure of song destabilization in our differential expression analysis, we are specifically seeking to identify gene expression responses that are associated with changes to behavioral output. Yet these behavioral changes are certainly driven by alterations in upstream regions or the manner in which they converge onto RA. The reviewer also notes inputs from RA-cup as a potential avenue through which the loss of auditory information could more directly influence expression in RA. It is certainly possible that the loss of auditory information itself could influence gene expression in different components of the song system, a point we note in the revised Discussion (lines 848-853). We also note there that future experiments leveraging different plasticity induction techniques (TS cut, delayed auditory feedback) will be important to resolve the influence of this input.

      Our lesion experiments aimed to characterize how input from LMAN influences expression in RA, due to LMAN’s important role in mediating song plasticity. We would expect HVC lesions to elicit different expression responses because of its distinct mode of transmission onto RA projection neurons (primarily AMPAR in contrast to primarily NMDAR for LMAN), the distinct activity patterns of HVC and LMAN, and likely distinct neuromodulatory signaling from the two afferents (e.g. LMAN acts as a source of BDNF). We discuss how HVC lesions would be useful to further disentangle the influence of afferent input on RA gene expression in the Discussion of the revised manuscript (lines 926-946). In the revised manuscript, we also cite previous work that examined the influence of HVC and LMAN on RA neural activity, morphology, and cell survival (lines 928-932).

      As to the cell types in RA that show expression changes following deafening, we show in Figure 5 that both glutamatergic projection neurons (‘RA_Glut’), i.e. the neurons that project to subcerebral structures such as nXIIts, as well as GABAergic interneurons (‘GABA’) show substantial expression alterations. In the Discussion, we highlight the functional roles of several genes that have enriched expression in each class (lines 864-873 and 887-893).

      In the revised manuscript, we have added a paragraph in the Discussion (lines 854-862) that references results from Mori, C. & Wada, K. Audition-independent vocal crystallization associated with intrinsic developmental gene expression dynamics. J. Neurosci. 35, 878–889 (2015). This work examined the influence of early deafening on gene expression in the song motor pathway and identified a strong developmental and audition-independent expression response. It identified an important separation between developmentally-driven and experience-dependent molecular responses in the song system. We note that the aims were distinct from the present study, which sought to identify gene expression responses to deafening-induced song plasticity.

      Importantly, since the LMAN lesions did not create significant changes in the song structure, it is difficult to know how to interpret the meaning of these molecular changes in RA, alone and in combination with the comparison to the RA profiles from deafened birds. Of importance is the question of whether or which molecular profiles are thus signatures of behavioral plasticity or not.

      The reviewer raises an important set of follow-up experiments that would assess the extent to which the transcriptional state of the song system tracks song plasticity state. Coupling LMAN lesions with deafening, a manipulation that prevents song degradation, would be a strong approach to identify genes whose expression is closely tied to song destabilization, a possibility that we now discuss (lines 936-946).

    1. Author Response

      Reviewer #1 (Public Review):

      1) There are two main 'weaknesses'. The first is the limited power that comes from only measuring the phenotype of 387 strains. Whether this is because of the expense/difficulty of InToxSa is not discussed, leaving open the question of how much this assay could be scaled up in the future.

      A previous study investigating the toxicity of S. aureus culture supernatants assessed 217 clinical strains (https://doi.org/10.1371/journal.pbio.1002229). That study had sufficient power to uncover important genetic determinants of S. aureus virulence. Here, we significantly increased the throughput to 387 clinical strains combined with a sophisticated cell toxicity assay that measures the kinetics of cytotoxicity caused by intracellular S. aureus. We have investigated the S. aureus genetic associations using this rich dataset (each of the 387 strains was assessed in 3 to 15 replicates, accruing 655,005 measurements corresponding to kinetic cytotoxicity assessments of intracellular S. aureus). This rich dataset enabled the accurate identification of genomic signatures that modulate cytotoxicity; genomic signatures that we then validated by reconstructing the mutations, thus demonstrating the power of our approach. The upscaling of this method (4-fold, with adequate technical adjustments) should be possible with the adoption of a 384-well plate format instead of a 96-well plate. We will continue to investigate additional clinical isolates and explore the use of 384-well plates, but the analysis we present of data from the 96-well format is already a substantial advance for the field.

      Across this study, and as presented in the current manuscript, the maximum throughput of the InToxSa assay was 7 × 96-well plates per week, corresponding to 98 distinct clinical strains testable per week (encompassing 6 individual replicates, each tested across 2 different days/plates). Following the reviewer's suggestion, we have added this information to the discussion (Lines: 406-409).
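      As a rough check on the throughput figures quoted above, the well budget can be worked through in a few lines. This is only a sketch: it assumes each strain occupies 6 wells per week (one per replicate) and that any wells left over are used for controls or blanks, neither of which is stated explicitly in the response.

```python
# Figures quoted in the response above.
PLATES_PER_WEEK = 7
WELLS_PER_PLATE_96 = 96
STRAINS_PER_WEEK = 98
REPLICATES_PER_STRAIN = 6  # assumption: one well per replicate

# Total wells available per week in the 96-well format.
total_wells = PLATES_PER_WEEK * WELLS_PER_PLATE_96

# Wells consumed by clinical strains under the one-well-per-replicate assumption.
strain_wells = STRAINS_PER_WEEK * REPLICATES_PER_STRAIN

# Assumption: the remainder is available for controls or blanks.
spare_wells = total_wells - strain_wells
spare_wells_per_plate = spare_wells // PLATES_PER_WEEK

# Nominal scale-up when moving from 96-well to 384-well plates.
scale_up = 384 / WELLS_PER_PLATE_96

print(total_wells, strain_wells, spare_wells_per_plate, scale_up)
```

      Under these assumptions the numbers are mutually consistent, and the 4-fold scale-up mentioned earlier is simply the ratio of wells per plate.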

      2) The second is that the main output of the assay is actually reduced intracellular toxicity (PI uptake AUC), which is inferred to be strongly linked to increased intracellular persistence. The linkage between the phenotypes comes primarily from microscopic studies on a limited number of strains. It may be true of all cases, but the possibility exists that for some of the strains, reduced cytotoxicity may be associated with intracellular elimination, which would presumably be a negative outcome for systemic infection.

      Whilst the reviewer’s comment is pertinent, we note that the low cytotoxicity of the least cytotoxic S. aureus isolates identified by the InToxSa assay did not result from bacterial clearance, intracellular bacterial growth defects or escape from their cellular niche. We assessed intracellular bacterial loads at 3 h and 24 h post-bacterial uptake, in experimental conditions using cell-impermeant antibiotics (which kill extracellular bacteria and prevent over-infection of non-infected bystander HeLa cells). The results, shown in Figures 5F and 5H and in Figure 5 Supplementary figure 5, highlight an inverse correlation between cytotoxicity and intracellular persistence.

      Reviewer #2 (Public Review):

      1) …Thus, my concerns are focused on further understanding the practical utility of the approach and whether or not the HeLa cell model recapitulates what happens in professional phagocytes.

      HeLa cells have proven a useful cellular model in infection and in pathogen biology to assess the ability of bacterial pathogens to invade, persist and replicate within host cells. Several studies have convincingly used HeLa cells to assess S. aureus phenotypes at the bacteria-host cell interface, as exemplified by the following recent research (DOIs: 10.1128/mBio.02250-20, 10.1371/journal.ppat.1009874, 10.1038/s41598-019-51894-3, and 10.1128/mSphere.00374-18). We also acknowledge the limitations of cell line models in the discussion (Lines 494-510).

      2) …it is not clear to me that this system has the statistical power to find novel, biologically relevant rare mutations without first being very mindful in selecting strains that are extremely genetically similar.

      As described, this is an S. aureus bacteraemia study, wherein the strains composing the collection are, by definition, closely related. We articulated this in the manuscript: “We used InToxSa to identify S. aureus pathoadaptive mutations, enriched in bacterial populations that are associated with human disease (e.g., upon transit from colonising to invasive)” and “We hypothesised that these mutations would support an intracellular persistence for S. aureus.” We see no reason why this type of study could not be replicated elsewhere.

      3) It is also not clear to me that the toxicity assay captures the important features of the intracellular persistence that occurs in vivo within professional phagocytic cells.

      Response: Indeed, it is possible that InToxSa using HeLa cells may not capture the features of intracellular S. aureus persistence within professional phagocytes. However, our data show that it remains possible to uncover genomic features related to intracellular cytotoxicity and persistence, both traits relevant to S. aureus-host cell biology. The cells forming physical barriers, such as epithelial and endothelial cells, play major roles in staphylococcal pathobiology. Whilst HeLa cells are a model cell line, their tractability makes them ideal for high-throughput studies tested over longer infection times.

    1. Author Response

      Reviewer #1 (Public Review):

      Mermithid nematodes are ecologically important parasitoids of arthropods, annelids and mollusks today. Their fossil record in amber reaches back into the Early Cretaceous, some 135 million years ago. Luo et al. more than triple this record by presenting, with ample illustrations, exceptionally well preserved new specimens from the beginning of the Late Cretaceous (99 Ma ago) of Myanmar. Their most important finding is that mermithids parasitized a number of insect clades in the Cretaceous that they are not known to infect today or in Cenozoic amber; further, the proportion of holometabolous insects among the hosts is found to be lower in the Cretaceous than in the Cenozoic. The strengths of the paper lie in the specimens, the illustrations of the specimens, and the documentation of when, where and how the specimens were acquired. Certain nomenclatural aspects of the paper require improvement. A potential weakness of the paper could be collection bias: it is not tested whether the collections used to show the shift toward holometabolous hosts from the mid-Cretaceous to the Cenozoic are representative of the fossil record as it is preserved and accessible today.

      Thank you very much for pointing out these issues. We have added a new Figure 10 and Table 1 to our paper. Indeed, collection bias is present in almost all amber biotas. However, we believe we have robust reasons to argue that the shift to holometabolous hosts does exist. Although Kachin amber has only been studied extensively in the last two decades (compared with centuries of study in Baltic amber or Dominican amber), it has become by far the most intensively studied amber biota since its Cretaceous age was appreciated, now comprising an exceptional 700 families (Ross, 2023). Also, the fossil record of holometabolous insects is clearly much better than that of heterometabolous insects in Kachin amber (1296 spp. vs 465 spp. respectively). But as shown in our paper, the nematodes we found in Kachin amber are mainly associated with heterometabolous insects. Therefore, even if collection bias might exist, such as the presence of some unreported nematode-Holometabola associations, we believe our conclusion about the shift is robust. We have also added some explanation to our paper.

      Reviewer #2 (Public Review):

      This manuscript reports on mermithid nematode fossils from amber which dates from the Cretaceous period. The specimens described in the manuscript consist of insects and associated nematodes which have been trapped in amber and fossilised. The nematodes have been identified as belonging to the Mermithidae family, a family of nematode worms that infect insects. The findings of this manuscript provide an insight into the evolutionary history of nematodes and parasitism. Despite the ubiquity of both nematodes and parasites in extant ecosystems, fossil records of both are very rare. This is because nematodes and many parasites are soft bodied, and many are located inside their hosts' bodies, thus they rarely become fossilised. Thus, most of what is known about the evolutionary history of nematodes, and the evolution of parasitism, is based on what could be inferred from extant examples.

      The specimens described in this manuscript provide a valuable contribution to our understanding of parasitism in the geological past. These amber specimens are a snapshot of parasite-host interactions - interactions which are commonly found in nature but are rarely captured in fossils. The identification of the specimens as mermithid nematodes is based on sound scientific reasoning. The worms' morphology and position in relation to the insects are consistent with what has been observed with extant mermithid nematodes.

      Additionally, one of the values of such parasite fossils is that they provide us with insight into parasite-host combinations or interactions which may have existed throughout the geological past, but no longer exist today or cannot be inferred from extant taxa. It helps fill in major gaps in our understanding of parasitism. This was the case with the amber fossil that contained a bristletail with its nematode parasite.

      We are very grateful for the positive and encouraging comments.

      Reviewer #3 (Public Review):

      The authors provide a timely description of new mermithid nematodes from Cretaceous amber and use it to argue an important shift in insect host exploitation. The descriptions are state-of-the-art and will become valid once the appropriate zoobank numbers are used after publication. The authors also compiled crucial and detailed new information on the host exploitation in amber nematodes in the supplementary material. This data is also depicted in pie diagrams and seems at first glance to support their interpretations of a shift in host exploitation in fossil amber deposits when analysed appropriately and statistically, but such a true analysis and depiction should be part of the main manuscript to do the compilation and interpretation justice. For the sake of reproducibility and the field, such fundamental statistical analysis as well as a statistical comparison with modern hosts would make this broad-sweeping claim of a major host shift and importance of amber deposits containing such nematode-insect interactions since the Cretaceous (even) more robust and fundamental.

      Thanks. We recognized this drawback and have now calculated 95% confidence intervals using the Agresti-Coull method of the “binom.confint” function from the binom R package (https://cran.r-project.org/package=binom) in R 4.2.2. We also added a new Figure 10 and Table 1 to our paper. However, since we compiled the “occurrence” of invertebrate–nematode associations from these amber localities, it is impossible to compare them with modern mermithids. For example, the parasite Cretacimermis chironomae occurs five times in Kachin amber, but an extant dipteran-parasitizing mermithid species can occur many times in just a single pond. Nevertheless, it is evident that mermithids and all invertebrate-parasitizing nematodes prefer to infect holometabolous insects rather than other invertebrates (Poinar, 1975; Poinar, personal observation). We have also added some explanation to our paper.
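      For readers unfamiliar with the Agresti-Coull interval named above, the following is a minimal Python sketch of the textbook formula. It is not the binom R package's code and is not guaranteed to reproduce its output digit for digit; the function name, the clipping of the bounds to [0, 1], and the example counts are illustrative assumptions rather than values from the paper.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull_interval(successes, trials, conf_level=0.95):
    """Return (lower, upper) Agresti-Coull bounds for a binomial proportion.

    Hypothetical helper written for illustration only.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    # Two-sided normal quantile, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Adjusted sample size and adjusted proportion.
    n_adj = trials + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clipping to [0, 1] is a choice made here, not part of the core formula.
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Illustrative counts only (not taken from the manuscript).
lower, upper = agresti_coull_interval(5, 10)
print(f"{lower:.3f} to {upper:.3f}")
```

      The adjustment effectively adds about two successes and two failures before computing a Wald-type interval, which keeps the interval better behaved when occurrence counts are small, as they are for amber inclusions.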

    1. Author Response

      Reviewer #2 (Public Review):

      Yamaguchi et al. studied the roles of two proteins, Calaxin and Armc4, in the assembly of the outer arm dynein (OAD) docking complex (DC). By combination of the improved cryo-ET analysis and gene knockout zebrafish lacking each of these proteins, they found that Armc4 plays a critical role in the docking of OAD and that Calaxin stabilizes the molecular interaction in the docking. They further showed evidence that Calaxin changes the conformation of another compartment of DC comprising CCDC151/114. This new information provides an important basis for understanding how the DC is assembled and regulates docking of OAD. The authors' conclusion is well supported by the data but some data presentation and discussion need to be completed.

      Gui et al. (2021) already reported on a cryo-EM observation in bovine tracheal cilia, with the conclusion similar to this paper in the structure of OAD/DC on DMT. Using knockout zebrafish strain, the authors present detailed interaction of calaxin with other DC components. They show that the binding of calaxin induces the changes of conformation in N-terminal region of CCDC151/114. The conformation further changes in the presence of Ca2+; specific conformation of N-terminal region of CCDC151/114 becomes undetectable, instead additional structure appears in the vicinity of calaxin.

      1) The authors conclude that the Ca2+-dependent conformational change of DC is subtle and not dynamic. This result is eventually valuable information but may be somewhat unexpected from the point of view that calaxin plays an important role in the regulation of flagellar motility in Ciona sperm. The authors found that calaxin changes the conformation of N-terminal CCDC151/114 region but the core dynein structure shows no dynamic change. What about the changes in the interaction between calaxin, core dynein, and DMT? Is this beyond the resolution of cryo-ET analysis?

      Since Mizuno et al., 2009 reported that Ciona Calaxin switches its interactor depending on Ca2+ concentration, it is highly expected that zebrafish Calaxin also changes its interactor in 1 mM Ca2+ buffer conditions. However, the resolution of our cryo-ET data was insufficient to detect the change of Calaxin interactors. More detailed structural analyses are required to understand the OAD structures in the Ca2+ buffer conditions. We discussed this point as follows:

      (line 389-395)

      Regarding the Calaxin conformation, a previous biochemical analysis reported that Ciona Calaxin switches its interactor depending on Ca2+: β-tubulin at lower Ca2+ concentration and OAD γ-HC at higher Ca2+ concentration (Mizuno et al., 2009). Moreover, a crystal structure analysis revealed the conformational transition of Ciona Calaxin toward the closed state by Ca2+-binding (Shojima et al., 2018). In this study, however, such conformation change of Calaxin was not detected, probably due to insufficient resolution of our cryo-ET analysis. More detailed structural analyses in the Ca2+ condition are required to understand the mechanism of the Ca2+-dependent OAD regulation.

      2) It would be very helpful if the authors could add the cryo-ET images of calaxin-/- axoneme in the presence of 1 mM EGTA in Figure 7. Although these images are thought to be similar or identical to Figure 4F, it would help to confirm that the conformational changes in CCDC151/114 and additional part of DC are induced in a Ca2+-dependent manner.

      We added the cryo-ET images of calaxin-/- OAD-DC (1 mM EGTA) in Figure 7D.

      3) To clarify the molecular interaction of calaxin with other components, it would also be helpful if the authors add the images rotated 80 degree to Figure 4F and G, in similar way in Figure 7.

      We added the images of OADs rotated 80 degrees in Figure 4F and G.

      4) Despite the molecular phylogenetic difference, there are several similarities between calaxin and Chlamydomonas DC3, not only in the in situ structure and configuration but in the phenotype of mutants; Chlamydomonas mutant lacking DC3 shows OAD loss in the distal part of a flagellum (Casey et al, MBC, 2003). It may be a good reference if the authors add the position of DC3 in Figure 4. A', B', and C.

      To answer this comment, we created Figure 4—figure supplement 1, which shows the cryo-ET structures and models of OAD-DCs in vertebrates and Chlamydomonas.

      5) There is a significant difference in sperm motility between WT and calaxin-/- or WT and armc4-/- (Figure 2E). However, it is not clear whether immotile sperm were included in the data for VAP (Figure 2F) or BCF (Figure 2G). For example, WT and calaxin-/- show similar VAP, although both are significantly different in the percent of motile sperm.

      In our CASA study, spermatozoa with velocities below 20 μm/s were considered immotile and excluded from the data for VAP (Figure 2F) and BCF (Figure 2G). To clarify this point, we revised the manuscript as follows:

      before

      Swimming velocity and beating frequency were calculated from the trajectories of the motile spermatozoa (Figure 2F-G; Figure 2—figure supplement 1; Video3).

      after (line 139-141)

      Swimming velocity (VAP) and beating frequency (BCF) were calculated from the trajectories of the motile spermatozoa, which have 20 μm/s or more velocities (Figure 2F-G; Figure 2—figure supplement 2; Video3).
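      The motility filter described in this response can be sketched as follows. This is a hypothetical illustration, not the CASA software's own procedure: it assumes the 20 μm/s cut-off is applied to VAP, and the record layout (dictionaries with 'vap' and 'bcf' keys) and the example values are invented for the sketch.

```python
from statistics import mean

MOTILITY_THRESHOLD_UM_PER_S = 20.0  # cut-off stated in the response


def summarize_motile(tracks, threshold=MOTILITY_THRESHOLD_UM_PER_S):
    """Summarize VAP and BCF over motile tracks only.

    tracks: list of dicts with hypothetical keys 'vap' (um/s) and 'bcf' (Hz).
    Tracks with velocity below the threshold count as immotile and are
    excluded from the VAP and BCF means, but still count toward the total.
    """
    motile = [t for t in tracks if t["vap"] >= threshold]
    percent_motile = 100.0 * len(motile) / len(tracks) if tracks else 0.0
    if not motile:
        return {"percent_motile": percent_motile, "mean_vap": None, "mean_bcf": None}
    return {
        "percent_motile": percent_motile,
        "mean_vap": mean(t["vap"] for t in motile),
        "mean_bcf": mean(t["bcf"] for t in motile),
    }


# Invented example data: one immotile and three motile spermatozoa.
example_tracks = [
    {"vap": 5.0, "bcf": 0.0},
    {"vap": 20.0, "bcf": 30.0},
    {"vap": 60.0, "bcf": 40.0},
    {"vap": 100.0, "bcf": 50.0},
]
summary = summarize_motile(example_tracks)
print(summary)
```

      Separating the motile fraction from the per-cell averages in this way is why two genotypes can differ in the percentage of motile sperm while showing similar VAP among the cells that do swim.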

      6) In calaxin-/- zebrafish, OAD was clearly detected from the base to two-thirds of a flagellum with unclear border (Figure 2A). Typical distribution of OAD+class and OAD-class are shown in Figure 5 in the ~3 micrometer tomograms. Were these taken from around this unclear border? Are proximal most region of a flagellum occupied with OAD+class only? The authors should clearly indicate the region of a flagellum where the tomograms in Figure 5C and D were selected.

      7) Line 229~: It is not clear what the authors meant by "probably reflecting the different distance from the sperm head". In relation to this and the comment 6, does the "proximal" in the sentence "OAD loss occurred even in the proximal part of the flagella" (line 232) indicate the region near the base of a flagellum?

      In general, axonemes are tangled on the cryo-TEM grids, which makes it difficult to identify the ends of all axonemes, especially for the long zebrafish sperm flagella. Thus, we could not determine which region of the flagellum the tomograms shown in Figure 5D were taken from.

      However, to answer comments (6) and (7), we created Figure 5—figure supplement 1. In this experiment, we newly generated cryo-TEM grids with sparse sperm axonemes and succeeded in finding two areas containing clear axonemal ends with suitable ice conditions for cryo-ET observations (Figure 5—figure supplement 1B). The polarity of the axonemes was judged from the 3D-reconstructed structures of the axonemes (Figure 5—figure supplement 1B, red dotted lines). By the structural classification of OAD+ class and OAD- class in the tomograms, we confirmed the OAD loss in calaxin-/- even in the proximal part of the flagella, which is near the base of a flagellum (Figure 5—figure supplement 1D, (a) and (c)). To clarify these points, we revised the manuscript as follows:

      before

      In calaxin-/-, the ratio of OAD+ class to OAD- class varied among tomograms (Figure 5D), probably reflecting the different distance from the sperm head. However, all calaxin-/- tomograms showed multiple clusters of OAD- class, indicating that the OAD loss occurred even in the proximal part of the flagella.

      after (line 236-239)

      In calaxin-/-, the ratio of OAD+ class to OAD- class varied among tomograms (Figure 5D), reflecting the different distances from the sperm head. Analysis of detailed OAD distributions along calaxin-/- axoneme revealed that OAD loss occurred even in the proximal part of the flagella (Figure 5—figure supplement 1D).

      8) In conjunction with comment 7, it would be appreciated if the authors could give their idea of why the distal region of flagella tends to lack calaxin, if they do not discuss it anywhere in the text.

      We discussed this point as follows:

      (line 316-323)

      calaxin-/- spermatozoa exhibited a unique OAD distribution, with OAD-missing clusters at various regions of the flagella. Interestingly, OADs decreased gradually toward the distal end, by which the mechanism is unclear. The axoneme is elongated by adding flagellar components to its distal end during ciliogenesis (Johnson & Rosenbaum, 1992). IFT88, a component of the IFT machinery, disappears as the spermatozoa mature (San Agustin et al., 2015). Thus, we speculate that the OAD supply at the distal sperm axoneme is insufficient to compensate for the OAD dissociation in the calaxin-/-. Consistent with this idea, distal OAD loss is the sperm-specific phenotype, as olfactory epithelial cells in calaxin-/- have Dnah8 along the entire length of the cilia (Figure 6B).

      9) Immunofluorescence in twister-/- epithelial cilia showed that the localization of calaxin is independent of OAD (line 271-274). Based on the authors' finding, the localization of calaxin requires Armc4, which is preassembled with calaxin in the cytoplasm. If this is true and the localization of calaxin is NOT resulting from diffusion, Armc4 must be localized with calaxin along the entire length of cilia in twister-/- epithelial cilia (Figure 6D). Although Armc4 is shown localized in cryo-ET images (e.g. Figure 1, Figure 7), authors may provide the immunofluorescence of Armc4 along the entire length of sperm flagella and epithelial cilia.

      To answer this comment, we obtained a commercially available anti-ARMC4 (human) antibody and checked the cross-reactivity of the antibody against zebrafish Armc4, but no signal was detected in our western blot analysis. Thus, we could not assess the localization of zebrafish Armc4 in twister-/- epithelial cilia.

      In our study, we found an ectopic accumulation of Calaxin at the ciliary base in armc4-/- cells (Figure 6C, white arrowheads). The small molecular weight of Calaxin (~25 kDa) suggests the possible diffusional entry of Calaxin into the ciliary compartment. However, in armc4-/- cells, Calaxin accumulated at the ciliary base, strongly suggesting that Calaxin requires Armc4 to be localized to cilia.

      Reviewer #3 (Public Review):

      ODA-DC anchors ODA, the main force generator of ciliary beating, onto the doublet microtubules. Vertebrate ODA-DC contains 5 proteins, including Calaxin and Armc4, whose mutations are associated with defective ciliary motility in animals and humans. By generating calaxin-/- and armc4-/- knockout zebrafish lines, this manuscript examined the Kupffer's vesicle cilia and spermatozoa. They showed that calaxin-/- and armc4-/- knockouts both affect ciliary motility but to different degrees. The authors conducted careful structural analyses using cryo-ET and subtomo averaging on both mutants, revealing a partial loss of ODA in calaxin-/- and a complete loss of ODA in armc4-/-. I really like the distribution analysis of calaxin-/- OADs (Figure 5), which emphasizes the strength of cryo-ET in uncovering the molecule distribution of distinct conformational states in situ. Fitting of the atomic models of ODA and ODA-DC into the cryo-ET density maps and Calaxin rescue experiments showed how Calaxin stabilizes ODA at a molecular detail. By using olfactory epithelium, the authors also presented the possible assembly mechanism of ODA-DC proteins, which is also a beautiful experiment. Finally, the authors also investigated how Ca2+ regulates the ODA-DC using cryo-ET.

      The thorough structural and functional analyses of Calaxin and Armc4 in WT and gene KO animals could serve as a reference for future study of the detailed function of other ciliary proteins. The experiments are overall well designed and conducted, but some aspects need to be clarified and improved.

      The authors interpret the vertebrate ODA-DC to include four linkers (line 193). However, the authors also said that loss of one linker (Calaxin) makes ODA attach to the DMT through two linkers (line 199 and 246). These descriptions are confusing. It would make more sense to interpret the vertebrate ODA-DC as containing three linkers (CCDC151/114, Armc4/TTC25, Calaxin).

      This comment is reasonable because vertebrate OAD is tethered to DMT through three linker structures (the distal CCDC151/114, Armc4/TTC25, and Calaxin). However, vertebrate DC is composed of four parts: (a) Calaxin, (b) the Armc4-TTC25 complex, (c) the proximal CCDC151/114, and (d) the distal CCDC151/114 (Figure 4E). Part (c) is embedded in the cleft between protofilaments A07 and A08. To clarify this point, we revised the manuscript as follows:

      before

      The bovine DC model shows that vertebrate DC is composed of four linker structures: (a) Calaxin, (b) the Armc4-TTC25 complex, (c) the proximal CCDC151/114, and (d) the distal CCDC151/114 (Figure 4E).

      after (line 196-200)

      The bovine DC model shows that vertebrate DC is composed of four parts: (a) Calaxin, (b) the Armc4-TTC25 complex, (c) the proximal CCDC151/114, and (d) the distal CCDC151/114 (Figure 4E). Among the four parts, three (a, b, and d) work as linkers between OAD and DMT, while (c) the proximal CCDC151/114 is embedded in the cleft between protofilaments of the DMT.

      To confirm whether Calaxin directly interacts with β-tubulin (line 213), a control experiment could be needed by incubating WT axoneme with mEGFP-Calaxin followed by IF imaging.

      In our manuscript, we wrote as follows:

      (line 218-224)

      To assess the specificity of Calaxin binding, we also performed a rescue experiment with mEGFP-Calaxin (Figure 4H-I; Figure 4—figure supplement 2). Ciona Calaxin was reported to interact with β-tubulin (Mizuno et al., 2009), suggesting the possible binding of Calaxin along the entire length of the axoneme. However, the rescued axonemes showed partial loss of EGFP signal (Figure 4H, white arrowheads). This pattern resembled the OAD localization of calaxin-/- in immunofluorescence microscopy, suggesting the preferential binding of Calaxin to the remaining OAD-DC. mEGFP alone showed no interaction with the axoneme (Figure 4H, asterisk).

      Therefore, our manuscript is NOT intended to support or deny the interaction between Calaxin and β-tubulin, which was reported by Mizuno et al., 2009. Instead, we focused on the interaction between Calaxin and OAD-DC, revealing that Calaxin binds to Calaxin-deficient OAD-DC (Figure 4G, H, and I). Thus, we assume this comment refers to the interaction between Calaxin and OAD-DC.

      To further discuss the interaction between Calaxin and OAD-DC, we created Figure 4—figure supplement 2. We tested Calaxin’s interaction by incubating recombinant mEGFP-Calaxin with sperm axonemes of calaxin-/-, armc4-/- (representing OAD-missing DMT), and WT (representing DMT with Calaxin and OAD). The localization of mEGFP-Calaxin was assessed by fluorescence microscopy of mEGFP signals. In calaxin-/-, mEGFP-Calaxin was bound to a limited region of the axoneme, with partial loss of EGFP signals (Figure 4—figure supplement 2A, white arrowheads), consistent with Figure 4H. On the other hand, mEGFP-Calaxin showed no significant interaction with armc4-/- axoneme (Figure 4—figure supplement 2B) or WT axoneme (Figure 4—figure supplement 2C). These data show the preferential binding of Calaxin to the Calaxin-deficient OAD-DC over OAD-missing DMT or WT OAD. Although Mizuno et al., 2009 reported the interaction between Calaxin and β-tubulin, our analysis could not detect signals for such an interaction, probably due to the different binding affinities of Calaxin for OAD-DC and β-tubulin.

      The Immunoblotting experiment should be improved in Figure 5E. Could the authors get the same results in repeating experiments? Why is the Dnah8 signal higher in 50 mM NaCl of the (+)Calaxin group compared to that in 0 NaCl? This makes me doubt if the difference between (-)Calaxin and (+)Calaxin groups are significant.

      This comment is reasonable because NaCl concentration-dependent detachment of OAD from DMT would predict the highest Dnah8 signal at 0 mM NaCl in the (+)Calaxin group. To address this point, we created Figure 5—figure supplement 2, which shows an experimental replication of the immunoblot analysis in Figure 5E. In this experiment, we used calaxin-/- sperm axonemes collected independently of the Figure 5E data.

      However, again, the Dnah8 signal was higher in 50 mM NaCl of the (+)Calaxin group than that in 0 mM NaCl, confirming the result in Figure 5E. One possible explanation for this result is that the NaCl concentration affects the rescue efficiency of the Calaxin protein. We speculate that the Calaxin protein requires NaCl for efficient binding to OAD-DC, which caused the lower amount of OAD in 0 mM NaCl of the (+)Calaxin group compared to that in 50 mM NaCl.

      The authors have covered several important points in the Discussion section. Now that the function of Calaxin in both mouse and zebrafish have been reported, the authors could discuss the similarity and difference of Calaxin function in different species and tissues.

      To discuss this point, we inserted the following paragraph:

      (line 324-333)

      In mouse Calaxin-/- mutant, motile cilia in various organs (sperm flagella, tracheal cilia, and brain cilia) showed abnormal motilities, although OADs in the mutant cilia/flagella seemed mostly intact when observed by conventional transmission electron microscopy (Sasaki et al., 2019). In our study, however, we revealed that mutation of zebrafish calaxin caused OAD-missing clusters at various regions of the flagella, by using detailed cryo-ET analysis and immunofluorescence microscopy. Thus, we speculate that the same OAD defects to zebrafish calaxin-/- caused abnormal ciliary motilities in mouse Calaxin-/- mutant. One exception is the mouse nodal cilia. In mouse Calaxin-/- mutant, the formation of nodal cilia was significantly disrupted (Sasaki et al., 2019). On the other hand, zebrafish calaxin-/- mutant showed the normal formation of Kupffer’s vesicle cilia (orthologous to the mouse nodal cilia), suggesting the tissue-specific function of Calaxin on the ciliary formation.

      Because of the limited resolution, the authors should be more careful when interpreting the small densities in the difference map, for example, in Figure 4F-G black arrows. Considering that the CCDC151/114 coiled coil is overall poorly resolved both in the WT and mutant cryo-ET maps, the different densities could be due to different map quality or data processing. This makes the following statement suspicious "This structure corresponds to the N-terminus region of CCDC151/114, suggesting that Calaxin affects the conformation of neighboring DC components".

      This comment is reasonable because the resolution of our cryo-ET data was insufficient to identify each molecule in the cryo-ET map. To be more careful about the interpretation of our cryo-ET structures, we revised the manuscript as follows:

      before

      However, the difference map also showed an additional missing structure adjacent to Calaxin (Figure 4F’, black arrowhead). This structure corresponds to the N-terminus region of CCDC151/114, suggesting that Calaxin affects the conformation of neighboring DC components.

      after (lines 207-210)

      However, the difference map also showed an additional missing structure adjacent to Calaxin (Figure 4F’, black arrowhead). When fitting the bovine DC model, this structure overlapped the N-terminus region of CCDC151/114, indicating that Calaxin can affect the conformation of neighboring DC components.

      To discuss the map quality and data processing of our cryo-ET analysis, we summarized the following points that can support the confidence of our data:

      (1) Two independent experiments showed the same results of OAD-DC structures, suggesting that the small changes in DC conformations were not due to different map quality or data processing:

      (a) For OAD structures in 1 mM EGTA condition, we analyzed the WT OAD (Figure 4D) and the calaxin-/- OAD rescued with recombinant Calaxin (Figure 4G). These samples were prepared in completely independent processes. However, in both cases, the small densities overlapping the N-terminus region of CCDC151/114 were visualized adjacent to Calaxin (Figure 4D and G, black arrowhead).

      (b) For OAD structures in 1 mM Ca2+ condition, we analyzed the WT OAD (Figure 7B) and the calaxin-/- OAD rescued with recombinant Calaxin (Figure 7C). These samples were prepared in completely independent processes. However, in both cases, the small densities overlapping the N-terminus region of CCDC151/114 were not observed. Instead, the additional densities appeared around DC (Figure 7B and C, white arrowheads).

      (2) We assessed the statistical significance of the changes in DC conformations. We applied Student’s t-test for WT and calaxin-/- OAD-DC structures and created Figure 7—figure supplement 1. p-values of each voxel were calculated as described in Oda & Kikkawa, 2013. The isosurface threshold of p-values corresponds to 0.05% probability in one-tailed test. p-value maps indicate not only Calaxin structures but also the adjacent small density (Figure 7—figure supplement 1A, black arrowhead) and the additional density around DC (Figure 7—figure supplement 1B, white arrowheads) as the statistically significant difference between WT and calaxin-/- OAD-DC.
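      As a minimal sketch of the voxel-wise comparison described in point (2), the Python snippet below computes a pooled-variance (Student's) t statistic at every voxel of two groups of maps. It is an illustration only, not the procedure of Oda & Kikkawa, 2013: the function name `voxelwise_t`, the flattened-list map layout, the toy values, and the threshold `t_crit` are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean, variance


def voxelwise_t(group_a, group_b):
    """Student's (pooled-variance) t statistic per voxel.

    group_a, group_b: lists of maps, each map a flat list of voxel values.
    The layout is a hypothetical simplification of a 3D density map.
    """
    n_a, n_b = len(group_a), len(group_b)
    t_map = []
    # zip(*group) regroups the data so each item holds one voxel across maps.
    for voxel_a, voxel_b in zip(zip(*group_a), zip(*group_b)):
        pooled = ((n_a - 1) * variance(voxel_a)
                  + (n_b - 1) * variance(voxel_b)) / (n_a + n_b - 2)
        se = sqrt(pooled * (1 / n_a + 1 / n_b))
        t_map.append((mean(voxel_a) - mean(voxel_b)) / se if se > 0 else 0.0)
    return t_map


# Toy data: 3 maps per group, 2 voxels per map (values are invented).
wt_maps = [[1.0, 5.0], [1.2, 5.1], [0.8, 4.9]]
mutant_maps = [[1.1, 2.0], [0.9, 2.1], [1.0, 1.9]]

t_map = voxelwise_t(wt_maps, mutant_maps)

# Hypothetical threshold: one-tailed 5% critical value for 4 degrees of
# freedom, chosen only for this toy example.
t_crit = 2.132
significant = [t > t_crit for t in t_map]
```

      In this toy case only the second voxel, where density is lost in the mutant, passes the threshold. Converting t statistics into p-values requires the t distribution with n_a + n_b - 2 degrees of freedom, which the sketch leaves out.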

    1. Author Response

      Reviewer #1 (Public Review):

      This project aimed to understand if decision making impairments commonly observed in older adults arise from working memory (WM) or reinforcement learning (RL) deficits. Evidence in the paper suggests it is the former; they observe poorer task accuracy in older adults that is accompanied by a faster memory decay in older adults using a novel hierarchical instantiation of a previously validated computational model. There were no similar changes in RL in this model. These results are extended using Magnetic Resonance Spectroscopy (MRS) to measure glutamate and GABA levels in striatum, prefrontal and parietal regions. They found that impairments in working memory were linked to reductions of glutamate in PFC, particularly in the older adult group.

      The task employed is elegant, has been studied extensively in different populations, and is well-validated (though here a hierarchical Bayesian extension is developed and validated). The results, however, may not be definitive in some respects; the paper did not replicate previously observed RL deficits. It therefore remains possible that this is due to the sensitivity of the task to this RL component in ageing, and future work is needed to fully bridge the gap in the literature.

      Thank you for the comment. If our understanding of the comment is correct, our results suggesting no impairments in the RL system conflict with previously observed RL deficits in older adults. In the introduction section, we discuss previous literature on RL deficits in older adults, which yields largely mixed conclusions: some experiments show RL impairments (Frank and Kong, 2008; Hämmerer et al., 2011; Samanez-Larkin et al., 2014) and some do not (Grogan et al., 2019; Radulescu et al., 2016). Placing our experiment in the context of these mixed results, we aimed to use a task that addresses these inconsistencies, reasoning that commonly used RL tasks and models do not account for additional processes that may contribute to learning (e.g. executive function/WM/attention), which would explain why the deficits are sometimes observed and sometimes not. We can also point to our model parameter recovery (Appendix 1 - Figure 9), where we show that RL model parameters (e.g. learning rate) are successfully recovered, indicating that our model is sensitive to RL variability in participants; nevertheless, we observe no differences between age groups.

      Although the study is well-executed, there is an obvious limitation in the use of a cross-sectional design to address this question. The authors acknowledge this limitation in the discussion but could go further to highlight the potential confound of cohort effects on gaming, RL and WM tasks more generally. Without within-person change data, the evidence can only be suggestive of potential age-related decline. For this reason, it may be more appropriate to use the terminology "age-related differences" rather than "age-related declines" given the study design.

      Thank you for the comment. We have attempted to address the cohort effects by administering RBANS to old and young participants. Age-normed total RBANS (Randolph et al., 1998) scores were similar in both age groups (described in the first paragraph of the results section), which we took to suggest that our cohorts reflected comparable samples of the population with respect to overall cognitive ability. In addition, we show that certain aspects of performance (e.g. accuracy) decline within the group of older adults, and not just between the two groups, which would constitute an argument against cohort-based effects. We now elaborate further on the point of cross-sectional design in the discussion section on lines 410-417. As suggested by the reviewer, we have also adjusted the language throughout the manuscript to imply age-related differences instead of age-related decline.

      Reviewer #2 (Public Review):

      In this study, Rmus and colleagues contribute to the important open question of whether reinforcement learning deficits observed in older adults are due to impairments in basic learning processes, or can be attributed to a decline in working memory function. The authors present cross-sectional behavioral data from a task designed to assess the role of working memory in reinforcement learning. And they use computational modeling in conjunction with MR spectroscopy to demonstrate a relationship between prefrontal glutamate and age-related impairments in learning specific to working memory decay. I found the overall story compelling, the data novel, and the analysis carefully executed. Below I outline some areas in which the claims of the manuscript could be strengthened.

      1) I may have missed this, but does glutamate correlate with other model parameters? Or did the authors only focus on the WM parameters because of the age difference? In support of the specificity argument, it would be important to show that glutamate only predicts WM related parameters regardless of whether there was an age difference or not.

      Thank you for your suggestion. In Appendix 1-figure 7, we show correlations between glutamate and all model parameters. If glutamate captured impairments in RL computational processes, we would expect to see a correlation between glutamate and the learning rate. Below we show that glutamate does positively correlate with RL learning rate. However, there are parameter correlations within the model itself, making the direct correlations hard to interpret. To better understand the relationships between learning rate, working memory, and glutamate, we ran a model predicting MFG glutamate using all parameters that significantly correlated with MFG glutamate (MFG glutamate ~ 1 + learning rate + decay + omega3 + negative learning rate), and found that only WM decay predicted MFG glutamate when controlling for other factors (learning rate: t = -0.42, β = -.03, p = .67; WM decay: t = -3.14, β = -0.30, p = .002; omega3: t = 1.84, β = .16, p = .07; negative learning rate: t = .56, β = .03, p = .57). Thus, while glutamate measures correlate with RL learning rate, these correlations seem to be driven by the fact that both glutamate and RL learning rate correlate with WM decay. Note that negative learning rate influences the updating of both RL and WM processes (see computational modeling section), and thus cannot help us make claims about the specificity of RL or WM mechanisms alone being related to glutamate.
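      The logic of this control analysis can be illustrated with a small, self-contained Python sketch on synthetic data. Everything in it is an assumption made for the illustration: the data are simulated, the effect sizes are invented, and only two predictors are included rather than the four in the model above. It shows how a predictor can correlate with the outcome on its own, yet have a near-zero coefficient once a shared correlate is entered into the regression.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan).

    Each row of X must already contain a leading 1.0 for the intercept.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                factor = A[idx][col] / A[col][col]
                A[idx] = [a - factor * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic participants: glutamate depends on WM decay only, while the
# learning rate is merely correlated with decay (all values invented).
rng = random.Random(0)
decay = [rng.gauss(0, 1) for _ in range(500)]
learning_rate = [-0.8 * d + 0.6 * rng.gauss(0, 1) for d in decay]
glutamate = [-1.0 * d + 0.1 * rng.gauss(0, 1) for d in decay]

r_simple = pearson_r(learning_rate, glutamate)
X = [[1.0, lr, d] for lr, d in zip(learning_rate, decay)]
intercept, b_learning_rate, b_decay = ols(X, glutamate)
```

      Here `r_simple` is clearly positive, while `b_learning_rate` is close to zero and `b_decay` stays strongly negative, mirroring the pattern reported above.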

      2) As it is somewhat common with these tasks, it seems like the model does not fully capture the performance deficit in OA (Fig. 2B), even when all the individual difference parameters in WM are allowed to vary. Can the authors say more about the discrepancy? This is an interesting datapoint which may give clues to mechanism.

      Thank you for your comment. We elaborated on this in detail in the Appendix 1 (Posterior predictive checks section). We have observed that in some blocks (particularly in ns=6 blocks), older adults only learned a correct response for a subset of the presented stimuli, and neglected to learn responses to other stimuli altogether. We have interpreted this as a possible strategy older adults used to reduce the difficulty of the ns=6 condition. This would explain the discrepancy between the data and the model predictions, as the model has no way of accounting for stimulus identity effects on learning (since the model predicts similar performance for all the stimuli). To test our reasoning, we have fit the model to a subset of data - excluding participants who have implemented this strategy, and predicted that this should reduce the model misfit. We found that this is indeed the case (Appendix 1 - Figure 4). This confirms that strategic prioritization of stimuli in some older adults negatively affected the fit of the model. While we believe that a better understanding of these contaminant response patterns in the RL-WM model is worthy of further investigation, we feel that it is beyond the scope of this paper, and might require task designs with even higher set sizes to elicit the strategic stimulus prioritization more robustly. We have now added a paragraph in the discussion to discuss this issue.

      3) Relatedly, it may not be possible with these data alone, but can the authors discuss what the WM decay parameter captures? In particular for OA, the distinction between generating and maintaining a "task set" has been extensively written about. Older adults tend to have difficulty internally generating and flexibly deploying task sets, but somewhat paradoxically can perform better than YA in certain decision situations (e.g. when reward is dependent on previous choices, see Worthy et al., 2011). The task in this study necessarily pushes OA into a regime in which relying on familiar decision strategies is sub-optimal, and task sets must be continuously generated. Is there a type of intervention that the authors expect would reverse the observed deficit in WM?

      In the RLWM model, WM stores stimulus-action-outcome weights. Using WM decay we can gradually reduce the stimulus-dependent weights on each trial where the stimulus is not observed (e.g. forgetting). These weights, therefore, get reduced with the rate of decay, by being pulled towards the uniform/uninformative values (1/nA, where nA is the number of actions) they were initialized to. It effectively captures forgetting of information with increased time delays (here time = number of intervening trials between successive stimulus presentations where the stimulus is not observed). It is possible that older adults might be prioritizing storage of different types of (irrelevant) task information (e.g. category of stimuli, or relationships between the stimuli), resulting in a tradeoff that might lead to faster decay in older adults, and that the younger adults neglect such information. This could also explain discrepancies between our model and older adults described above, as the model does not hold any assumptions about how stimulus identities might impact task performance strategy. If this was the case, if probed about such task-irrelevant prioritized information older adults could potentially perform better than younger adults (in a way that in the Worthy et al. (2011) paper the older adults perform better on a choice dependent task compared to younger adults). We are unable to test this idea in our dataset, but we believe that it could be a promising avenue for future research.
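      The decay mechanism described above can be sketched in a few lines of Python. This is a minimal illustration rather than the fitted model: the function name `decay_wm`, the dictionary layout, and the decay value of 0.3 are assumptions chosen for the example.

```python
def decay_wm(weights, observed_stimulus, decay, n_actions):
    """Pull WM weights of every stimulus NOT shown on this trial
    toward the uniform value 1 / n_actions.

    weights: dict mapping stimulus -> list of action weights
             (hypothetical layout used for this sketch).
    """
    uniform = 1.0 / n_actions
    return {
        stim: (list(w) if stim == observed_stimulus
               else [x + decay * (uniform - x) for x in w])
        for stim, w in weights.items()
    }


# Two stimuli, three actions; each stimulus starts with a perfectly
# stored correct action.
wm = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0]}

# One trial on which stimulus A is shown: only B's weights decay.
wm_after_one = decay_wm(wm, "A", decay=0.3, n_actions=3)

# Many consecutive trials without B: its weights approach 1/3.
wm_long = wm
for _ in range(50):
    wm_long = decay_wm(wm_long, "A", decay=0.3, n_actions=3)
```

      After a single intervening trial the stored weight for B's correct action drops from 1.0 to 0.8, and after many intervening trials all of B's weights sit near 1/3, so WM no longer favors any action for that stimulus.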

      4) There is a wealth of evidence suggesting striatal DA loss in older adults, which served as the basis for many of the original investigations and hypotheses regarding a simple RL deficit in OA (e.g. work by Shu-Chen Li and others). While the authors do not directly measure DA in this study, it would be helpful to place the results in the context of that literature.

      Thank you for pointing this out. In the introduction, we have discussed the mixed results from research on RL/dopamine deficits in older adults. Some of the literature suggests impairments in striatal dopamine in older adults (Samanez-Larkin et al., 2014; Bäckman et al., 2006), while some suggests an absence of impairments (Grogan et al., 2019). Furthermore, while DA is important for RL updating, it is also potentially important for WM updating (O’Reilly and Frank, 2006); a potential DA loss could therefore affect both RL and WM, and not RL exclusively. Prior research also suggests that, although a correlative relationship between DA and cognitive functions has been recorded, the extent of generality/specificity of the effects of DA on cognition in aging (Bäckman et al., 2006), compared to resulting noise that impairs cognition (Li et al., 2001), should be studied more extensively in the future. We have not focused on dopamine in the study, but have now added a paragraph in the discussion section to address this on lines 402-407.

      5) Finally, the main argument of the paper as I read it is that PFC glutamate mediates the performance deficits observed in RL because it reflects a compromised WM system. Sample size permitting, it would be helpful to see a formal test of this mediation relationship.

      As highlighted in the response to the mediation point in essential revisions, we observe that glutamate mediates the effect of WM on task performance, but this mediation approach might be difficult to justify because WM decay and task performance share signal and noise (since WM decay is estimated from task performance). We have now included the mediation analysis in Appendix 1 and provided a conservative interpretation of it in the results section.

      Reviewer #3 (Public Review):

      Aging impacts many cognitive functions, and how these changes affect performance in different tasks is an important question. By testing 42 older and 36 younger healthy adults with a novel learning task and MR spectroscopy, Rmus et al addressed the important question whether age-related declines in learning are driven by WM, or by deficiencies of the RL system. The task varied the role of working memory in learning by asking participants to learn about either 3 or 6 stimulus response associations from feedback (set sizes 3 and 6). The paper combines a detailed computational account of participants behaviour and striatal and prefrontal/parietal MR spectroscopy in order to assess individual glutamate and GABA levels.

      The authors report an effect of set-size on learning in both age groups, and show that participant age is associated with (1) worse accuracy, (2) a larger set size performance difference, and (3) a heightened sensitivity to reward. Computational modeling showed that working memory decay differed between age groups, but that reliance on WM to perform the task at hand was similar in both age groups (similarly differing between conditions in both groups). Turning to the MRS results, the paper shows that an aggregate measure of glutamate relates to aggregate task performance, that prefrontal glutamate specifically relates to WM decay observed in the task, and that age was negatively associated with glutamate levels.

      While the paper is well worth reading and offers many interesting data points, the title's suggestion that "Age-related decline in prefrontal glutamate predicts failure to efficiently deploy working memory in working memory" is, in my opinion, not fully supported by the evidence. First, the authors don't report clear evidence for any age-related differences in WM reliance in the task overall. Second, the authors find that MFG glutamate relates significantly only to WM decay, not the parameter that captures WM deployment. Third, correlations don't imply predictive relations.

      We apologize for the lack of clarity in our wording. We agree that the title of the paper implies that the reliance on WM parameter differentiates older and young adults, while the results show that the difference is mostly captured by the WM decay parameter. We meant to communicate that the age-difference seems to be particularly rooted in the WM, but have chosen misleading/confusing words. We have proposed changing the title of the manuscript to “Age-related differences in prefrontal glutamate are associated with increased working memory decay that gives appearance of learning deficits” to minimize confusion. With regards to your last point, as outlined in our response to essential revisions, we agree that we should modify the language used in our manuscript to be more consistent with the associative rather than predictive nature of our results.

      Another important open question relates to the relatively large age difference in the effect of set-size on performance. The authors write that working memory will contribute less to performance in higher set size conditions. Yet, age differences are largest in the set size 6 condition, suggesting that RL-dependent learning is most severely impaired in learning (set size 6 performance), rather than WM dependent learning (set size 3 performance). Finally, a statistically significant age difference in reward sensitivity seems to be hardly integrated into the authors' overall interpretation.

      Working memory does contribute less in higher set-size condition; however, given the higher number of items, the delays between successive presentations of the stimuli in the high set-size condition are on average longer - which makes the effect of WM forgetting more pronounced. Furthermore, a WM impairment can have an indirect effect in RL, in that frequent failure to select correct action through WM leads to reduced ability to train RL on encoding correct responses (especially earlier in training, when the incremental RL hasn’t ‘caught up’ yet), and thus worse performance overall. As such, a larger effect of set size could potentially be indicative of either or both WM or RL process deficits. This most clearly underscores the importance of modeling - these complex interactions are difficult to intuit, but modeling allows us to establish cleaner mechanistic explanations of observed behavioral patterns/group performance deficits (e.g. while on the surface impairment might look to be RL driven, it is actually better explained by a WM parameter, such as WM decay in older adults - this can). With regards to reward sensitivity, the same explanation applies - there are multiple mechanisms through which differences in reward sensitivity could occur (e.g. slower learning rate, or increased RL recruitment due to failure of WM), which further emphasizes the need for modeling.
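      The point about longer delays at the higher set size can be checked with a short simulation. The Python sketch below is an illustration under stated assumptions, not a reproduction of the task: the number of presentations per stimulus (12), the fully shuffled trial order, and the decay value are hypothetical.

```python
import random


def mean_intervening_trials(set_size, presentations_per_stimulus, rng):
    """Average number of trials between successive presentations of the
    same stimulus in one randomly ordered block."""
    order = [stim for stim in range(set_size)
             for _ in range(presentations_per_stimulus)]
    rng.shuffle(order)
    last_seen, delays = {}, []
    for trial, stim in enumerate(order):
        if stim in last_seen:
            delays.append(trial - last_seen[stim] - 1)
        last_seen[stim] = trial
    return sum(delays) / len(delays)


rng = random.Random(1)
delay_ns3 = mean_intervening_trials(3, 12, rng)
delay_ns6 = mean_intervening_trials(6, 12, rng)

# With a per-trial decay rate, the informative part of a WM weight shrinks
# by a factor (1 - decay) on each intervening trial. The value 0.3 is a
# hypothetical decay rate used only for illustration.
decay = 0.3
retained_ns3 = (1 - decay) ** delay_ns3
retained_ns6 = (1 - decay) ** delay_ns6
```

      Under these assumptions the mean delay is longer for set size 6 than for set size 3, so a given decay rate leaves less usable WM information by the time a stimulus reappears.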

      In short, in a complex task there are often multiple ways to explain the same qualitative feature, and here we have leaned on computational modeling to identify the computational elements that differed across groups. However, we have now also simulated data from our computational models using posterior predictive checks to show that they can reproduce core descriptive features of the original data, including those noted above, and to examine the degree to which different features can be mapped onto the working memory decay parameter (Appendix 1 Figure 5).

    1. Author Response

      Reviewer #1 (Public Review):

      This paper presents a thorough biochemical characterization of inferred ancestral versions of the Dicer helicase function. Probably the most significant finding is that the deepest ancestral protein reconstructed (AncD1D2) has significant double-stranded RNA-stimulated ATPase activity that was lost later, along the vertebrate lineage. These results strongly suggest that the previously known differences in ATPase activity between extant vertebrates and, for example, extant arthropods are due to loss of the ATPase activity over evolutionary time as opposed to gains in specific lineages. Based on their analysis, the authors also "restore" ATPase function in the vertebrate Dicer, but they did so by making many (over 40) mutations in the vertebrate protein, and it is not clear which of these many mutations are required for the restoration of the activity. Thus, it is difficult to discern how the results of this experiment relate to the evolutionary history.

      We completely agree with this reviewer's assessment of our paper. Our Michaelis-Menten analyses raised the intriguing idea that loss of ATPase activity in the helicase domain of the vertebrate ancestor may indicate loss of the ability to couple dsRNA binding to formation of the active conformation. Our rescue experiments support this idea, albeit in future studies we hope to create an active ancestor with fewer amino acid changes. While the rescue experiments validate what these analyses told us, as the reviewer suggests, they do not themselves inform on the evolutionary history.

      A criticism of the paper is the authors' tendency (probably unconscious) to ascribe a purposefulness to evolution. For example, in the introduction, "We speculate that the unique role of the RLR's in the interferon signaling pathway in vertebrates...created an incentive to jettison an active helicase in vertebrates." Although this sentence is clearly labelled as speculation and "incentive" is clearly a metaphor, the implication is that evolution somehow has forethought. (There are other instances of this notion in the paper, for example, in the last line of the abstract). The authors' statement also implies that the developing interferon system somehow caused the loss of active helicase, but it seems equally plausible that the helicase function was lost before the interferon system co-opted it.

      We agree with the stated critiques and have rephrased language that suggests that evolution is an active force. In addition to changing the last line of the abstract (page 2, line 35) and removing the quoted sentence from the Introduction, we have included a more nuanced discussion of the order of evolutionary events that may have preceded or followed the loss of helicase function in Dicer (page 18, lines 418-430).

      Reviewer #2 (Public Review):

      The manuscript by Aderounmu presents an interesting attempt to reconstruct evolution of the function of the helicase domain in ancestral Dicers, RNase III enzymes producing siRNAs from long double-stranded RNA and microRNAs from small hairpin precursors. The helicase has a role in long dsRNA recognition and processing and this function could have an antiviral role. Authors show on reconstructed ancestral Dicer variants that the helicase was losing dsRNA binding affinity and ATPase activity during evolution of the lineage leading to vertebrates while an early divergent Dicer-2 variant in Arthropods retained high activity and seemed better adapted for blunt ended long dsRNA, which would be consistent with antiviral function.

      The work is consistent with apparent adaptation of vertebrate Dicers for miRNA biogenesis and two known modes of substrate loading: "bottom up" dsRNA threading through the helicase domain, where the helicase domain recognizes the end of dsRNA and feeds it into the enzyme, and "top-down", where the substrate is first anchored in the PAZ domain before it locks into the enzyme. Some extant Dicer variants are known to be adapted for just one of these two modes, while Dicer in C. elegans exemplifies an "ambidextrous" variant. The reconstruction of the helicase domain complex enabled the authors to test how well ancestral helicases would support the "bottom up" feeding of long dsRNA and whether the helicase would distinguish blunt-end dsRNA from dsRNA with a 3' 2-nucleotide overhang. Although the reconstruction of an ancestral protein from highly divergent extant sequences yields just a hypothetical ancestor, which cannot be validated, the work provides remarkable data for interpreting the evolutionary history of the helicase domain and of RNA silencing more generally. While it is not surprising that the ancestral helicase was a functional ATPase stimulated by dsRNA, the data showing that the decline of the helicase function had already started at the level of the common deuterostome ancestor, and that the helicase was essentially dead in the vertebrate ancestor, are particularly new and interesting. It was reported two decades ago that human Dicer carries a helicase that has highly conserved critical residues in the ATPase domain but is non-functional (10.1093/emboj/cdf582). Recently published mouse mutants showed that these highly conserved residues are not important in vivo (10.1016/j.molcel.2022.10.010). Aderounmu et al. now suggest that Dicer carried this dead ATPase with conserved residues for over 500 million years of vertebrate evolution.

      I do not have any major comments on the biochemical analyses, and while I think that the ancestral protein reconstruction could yield hypothetical sequences that did not exist, I think they represent reasonable reconstructions, which yielded data worthy of interpretation. My major criticism of the work concerns clarity for the readership and interpretations of some results, where I wish the authors would clarify/revise the text. The following three examples are particularly significant:

      1) It should be explained to which common ancestor during metazoan evolution belongs the ancestral helicase AncD1D2 or at least what that sequence might represent in terms of common ancestry during metazoan evolution.

      We thank the reviewer for bringing this issue to our attention, and we have now included a brief discussion of the complexity in identifying AncD1D2’s exact position in metazoan evolution (page 6, lines 124-134). Our maximum likelihood phylogeny is constructed from Dicer’s helicase and DUF283 subdomains which evidently do not contain enough phylogenetic signal to resolve the finer details of early metazoan evolutionary events surrounding the divergence of non-bilaterians: Porifera, Ctenophora, Cnidaria and Placozoa. In our tree, Cnidaria even diverges later than the Nematode bilaterian branch reflecting the fact that our reported phylogeny does not match consensus species relationships, especially in the invertebrate clades. This means we cannot pinpoint AncD1D2’s exact position with certainty. While we do not intend to overinterpret the evolutionary trends from these hypothetical ancestral constructs, we believe the functional differences in biochemical activity are meaningful and correspond to big-picture changes over evolutionary time. AncD1D2 thus corresponds to some early metazoan ancestor that existed before the divergence of bilaterians from non-bilaterians. In support of this interpretation, when the phylogeny is constrained such that the bilaterian branches match the consensus species tree (Figure 1-figure supplement 2A) we observe that AncD1D2 is ancestral to the bilaterian ancestor, AncD1BILAT (now labeled on the figure), but retains 95% identity to the version of AncD1D2 constructed from the maximum likelihood phylogeny (Figure 1-figure supplement 3B).

      2) This is linked to the first point - the authors work with phylogenetic trees reconstructed from a single protein sequence, which are not well aligned with predicted early metazoan divergence (https://doi.org/10.1098/rstb.2015.0036). While their sequence-based trees show early branching of Dicer-2, as if the two Dicers existed in the common ancestor of almost all animals (except Placozoa), I do not think there is sufficient support for such a statement, especially since antiviral RNAi-dedicated Dicers evolve faster and Dicer-2 is restricted to a few distant taxonomic groups, which might be better explained by independent duplications of ambidextrous ancestral Dicers. I would appreciate it if the authors would discuss this issue in more detail and make readers more aware of the complexity of the problem.

      We agree with the reviewer that in our initial submission we did not properly address the incongruence between our maximum likelihood phylogeny and the consensus species tree of life. We have now addressed this by revisions that discuss the difficulty in using a single gene or protein to accurately date ancient evolutionary events, especially in the case of Dicer, a protein whose evolutionary history is littered with multiple duplication events (page 6, lines 124-147, beginning with “Importantly, we observed multiple instances…”; page 16, lines 365-371, sentence beginning with “Uncertainty in the single gene or protein phylogeny…”). Our assumption that an early gene duplication produced the arthropod Dicer-2 clade is consistent with previous Dicer phylogenies that have been constructed with maximum likelihood algorithms with different parameters (https://doi.org/10.1371/journal.pone.0095350, https://doi.org/10.1093/molbev/msx187, https://doi.org/10.1093/molbev/mss263) using full length Dicer sequences with different taxon sampling depths and tree construction parameters. Removing other fast evolving taxa with long branch lengths from the sequence alignment still resulted in arthropod Dicer-2 branching out early in metazoan phylogeny (https://doi.org/10.1093/molbev/mss263).

      In analyses not included in our manuscript, we also independently constructed trees using full-length metazoan Dicers and helicase and DUF-283 subdomains, using both RAXML-NG and MrBayes. We tried different taxon sampling depths, tried rooting the tree using either a non-bilaterian outgroup or a fungal outgroup, and also tried breaking up potential long-branch attraction with deep taxon sampling. In every iteration, the arthropod Dicer-2 clade diverged early in animal evolution at some point before or during non-bilaterian evolution. We recognize that all these efforts are still prone to long-branch attraction that may cause the rapidly evolving Dicer-2 clade to artificially cluster with distant outgroups, but so far, the only evidence to support an arthropod-specific duplication event is parsimony. This parsimony model is plausible, and one might expect a recently duplicated arthropod Dicer-2 to cluster closely with nematode Dicer-1, another antiviral Dicer that would have descended from a common ecdysozoan ancestor, but this is not the case. The nematode HEL-DUF clade does get attracted to the non-bilaterian Cnidaria clade in our ML tree, but unlike the arthropod Dicer-2 clade, this position varied depending on the parameters of phylogenetic analysis, and so we cannot conclude that arthropod Dicer-2’s position is due to long-branch attraction. More sophisticated phylogenetic and statistical tools are needed to answer this question definitively, so we decided to proceed with the highest scoring maximum-likelihood phylogeny generated by our analysis.

      While we have now included a short discussion on the nature of this uncertainty in the revised manuscript (page 6, line 124; page 16, lines 365-371), we have excluded these additional details (paragraph above) from the main text in an attempt to prioritize readability for the generalist reader, and we hope that more specialized readers will find this discussion in the public comments helpful.

      3) Authors should take more into the account existing literature and data when hypothesizing about sequences of events. Some decline of the helicase activity is apparent in AncD1DEUT suggesting that it initiated between AncD1D2 and AncD1DEUT. This implies that a) antiviral role of Dicer was becoming redundant with other cellular protein sensors by then and b) Dicer was already becoming adapted for miRNA biogenesis, which further progressed in the lineage leading to vertebrates to the unique top-down loading with the distinct pre-dicing state where the helicase forms a rigid arm. Authors even cite Qiao et al. (https://doi.org/10.1016/j.dci.2021.103997) who report primitive interferon-like system in molluscs - this places the ancestry of the interferon response upstream of AncD1DEUT and suggests that this ancestral protein-based system was taking over antiviral role of Dicer much earlier. In fact, a bit weaker performance of AncD1LOPH/DEUT combined with the aforementioned interferon-like system and massive miRNA expansion in extant molluscs (10.1126/sciadv.add9938) suggests that molluscs possibly followed a convergent path like mammals. While I am missing this kind of discussion in the manuscript, I think that the model where "interferon appears ..." in AncD1VERT (Fig. 6) is incorrect and misleading.

      This comment is similar to others, including point 3 of Essential revisions, and we have revised our model in Figure 6 accordingly. We agree with the reviewer that we did not sufficiently explore the significance of the decline in Dicer helicase function between AncD1D2 and AncD1DEUT. In addition to the changes noted in point 3 of Essential revisions, we have corrected this by adding or modifying sentences in the Results (page 9, sentence beginning on line 197 “This reduction in ATP hydrolysis efficiency prior to deuterostome divergence may have coincided with…”, and page 11, sentence beginning on line 247 “One possibility is that between AncD1D2 and the deuterostome ancestor…”).

      We did not intend to suggest that this loss of Dicer helicase function was unique to vertebrates, but we focused on the deuterostome-to-vertebrate transition for the following reasons:

      a) The mollusk clade in our analysis is incongruent with its expected species position as a protostome. In our tree it clusters with deuterostomes instead. On one hand, this is probably an artefact of incomplete lineage sorting or long branch attraction. On the other hand, it is possible that this clade’s position is an underlying signal of the convergent evolution proposed by the reviewer. In support of the latter, some extant mollusk Dicer helicases (ACCESSION: XP_014781474, ACCESSION: XP_022331683) show a loss of amino acid conservation in Dicer’s ATPase motifs implying that extant mollusks have also lost Dicer helicase function like vertebrates. However, this is in contrast to vertebrate Dicer helicase where loss of function exists, but ATPase motifs remain conserved. We do not discuss this in the paper because the evidence remains inconclusive until extant mollusk Dicers can be functionally characterized, similar to Human Dicer and Drosophila Dicer-1, to determine that they are truly specialized for miRNA processing to the detriment of helicase function.

      b) Caenorhabditis elegans Dicer is an example of an ambidextrous Dicer that processes both miRNAs, with the top-down mechanism, and viral dsRNAs, with the bottom-up mechanism. Recently published work suggests that C. elegans also possesses a protein-based innate immune defense mechanism, but instead of competing with the RNA interference mechanism, both mechanisms seem to work in concert and even share a protein in both pathways: DRH-1, a RIG-I-like receptor homolog (https://doi.org/10.1128/JVI.01173-19). Furthermore, a protein-based pathway has also been reported in Drosophila, and in this scenario Drosophila Dicer-2 is the dsRNA sensor that is common to both pathways (https://doi.org/10.1371/journal.pntd.0002823). This collaboration observed in ecdysozoan invertebrates is different from the competition that has been well established in vertebrates. More data are needed to understand whether a model of competition or collaboration exists in lophotrochozoan invertebrates like mollusks.

    1. Author Response

      Reviewer #1 (Public Review):

      VO2max is one of the most important gross criteria of peak performance ability and a plethora of studies focused on VO2max prediction. This manuscript provides huge and comprehensive data from male runners and male cyclists. The endurance-trained athletes performed cardiopulmonary exercise testing on a treadmill (n= 3330) or cycle ergometer (n=1094). In contrast to former studies, the authors used machine learning for algorithms and VO2max prediction. Models were derived and internally validated with multiple linear regression. The present study substantially expands current research.

      Sadly, the manuscript has an important and relevant main shortcoming as the limitations of the study had not been addressed properly:

      • The authors paid no attention to the fact that their results are strongly influenced by the exercise protocol used. It is obvious e.g. that maximal performance attainable in protocols with 2-minute exercise steps will be higher compared to an identical protocol with 3- or 4-minute steps.

      • The exercise intensity was kept constant for only 2 minutes before the workload was increased (by 1km/h treadmill or by 20-30 W cycle ergometer). Due to the kinetics of lactate, VO2, etc., it is evident that the short 2-min intervals aggravate the correct determination of aerobic and anaerobic threshold. It is well-known that longer-lasting constant exercise steps (e.g. 4 minutes) are better when the focus is centered on threshold determinations.

      The quality of this manuscript will be substantially improved when the authors could implement a comprehensive and blunt paragraph showing the limitations of their study.

      We have added a paragraph on the limitations of the study to our manuscript, as recommended. It is reasonable to suspect that the type of protocol used affects the cardiorespiratory indices obtained. Interestingly, according to available studies, this effect is more pronounced for the determination of cyclists' threshold power output or runners' treadmill running speed than for threshold and maximum cardiorespiratory indices such as VO2max or HRmax (Silva et al. 2021; Weston et al. 2002; Vucetić et al. 2014).

      In the regression models presented, the main explanatory variables with the largest effect on the prediction value are the AT/RCP threshold VO2 values (rVO2RCP; rVO2AT). The coefficients for the other explanatory variables are relatively low and differences in their values due to the use of potentially different protocols appear to be marginal. Nevertheless, we see the possibility of worsening the prediction when using less suitable testing protocols for athletes such as ramp tests or typically clinical tests such as the Bruce test.

    1. Author Response

      Reviewer #1 (Public Review):

      This study represents an important work in the field of (CAR)T-cell immunotherapy by analyzing the effect of different oxygen tension on the function and differentiation of T-cells (especially CD8+). Although it has been described that low oxygen levels can influence effector function/differentiation of T-cells, as nicely acknowledged by the authors in the introduction, a comprehensive analysis in the context of immunotherapy has been missing so far and this study adds significant findings that will be relevant for patient care in all fields applying (CAR)T-cell immunotherapy.

      The strength of the evidence is generally solid although there are some discrepancies between the different ways to induce HIF-1α (i.e. low O2, pharmacological inhibition, shRNA knockdown) that need to be clearly stated and/or discussed.

      1) The first section of the results determines the impact of low oxygen and pharmacological HIF-1α stabilization on CD8+ T-cell activation/differentiation. Low oxygen diminishes cell growth but induces T-cell activation and effector cytokines, while HIF-1a stabilization mimics the effects on activation without alterations in expansion. Unfortunately, it remains unclear why effects upon low O2 are more pronounced although pharmacological HIF-1a stabilization is more efficient.

      2) As a next step, in vitro conditioned T-cells are transferred into a subcutaneous B16-OVA model. Although only the low O2 levels increase T-cell numbers in vivo after the transfer, the initial tumor burden was nicely decreased by both low O2 and HIF-1a stabilization. However, only the latter significantly improved survival and it remains unclear and uncommented why.

      3) Next, the authors address whether pre-conditioning of human CART-cells to induce HIF-1α either by pharmacological stabilization or by silencing of VHL shows similar effects. Surprisingly, both ways of HIF-1a stabilization resulted in different effects concerning differential gene expression and cytotoxic capacity of CART-cells. Accordingly, pharmacologically pre-conditioned CART-cells did not have a significant impact on survival in an in vivo model, while the VHL-silenced ones did significantly improve animal survival. This discrepancy between the two modes of HIF-1a stabilization remains uncommented. Unfortunately, it also remains unclear why the pharmacological HIF-1a stabilization significantly improved the survival in animals of the B16-OVA model and not in the human CART-cell model.

      4) After this, the researchers determine how the timing of hypoxic conditioning affects the (CAR)T-cells. Here it is convincingly shown that already a short period of hypoxic conditioning (1 day) with a subsequent expansion phase (additional 6 days) is sufficient to induce HIF-1a mediated alterations (e.g. metabolic changes, calcium flux, intracellular signaling). Although this section is coherent in itself, the switch between different times of hypoxic conditioning, expansion, and analysis is difficult to follow and might lead to confusion. The expression pattern of e.g. HIF-1a on day 1 and day 7 together with the nuclear amounts of NFAT and c-Myc might be misunderstood, like the other presented data as well.

      5) Last, short-term hypoxic conditioning of CART cells is tested in a solid tumour mouse model. The previously identified conditioning protocol also increases CART-cell function against solid tumours (as shown by enhanced cytotoxicity, reduced tumour burden, and prolonged survival). Unfortunately, although both HER2-CART-cells and CD19-CART-cells are shown to have superior cytotoxicity in vitro after the pre-conditioning, only HER2-CART-cells are demonstrated to be superior upon low O2 conditioning in an in vivo adoptive transfer mouse model and CD19-CART-cells remain an open question.

      Generally spoken, the limitations of the manuscript are:

      1) The occurring discrepancies of determining effects caused by the different modes of Hif-1a stabilization which certainly are caused by the complex nature of Hif-1a regulatory network, and;

      We now extend our observations and discuss these concerns more extensively in the manuscript.

      2) The limitation of detected effects primarily on CD8+ T cells while CART-cells products usually are a mixture of CD4+ and CD8+ ones.

      Figure S6H now shows that the effects of shorter periods of low oxygen conditioning obtained with CAR-T cells generated from isolated CD8+ T cells are reproducible in CAR-T cells generated from PBMCs. We have found that a 24h incubation of PBMC-derived CAR-T cells in 1% O2 increases cytotoxicity against target cells and effector differentiation at day 7, when compared to cells cultured at 21% oxygen levels.

      Reviewer #3 (Public Review):

      In this study, Cunha et al. examined the role of different oxygen tensions (21%, 5%, and 1% O2) and HIF-1α stabilisation in regulating murine and human CD8+ T cell proliferation and function. The authors find that hypoxia (1% O2) and pharmacological PHD inhibition with FG-4592, enhance murine T cell activation but impair proliferation. Furthermore, adoptive cell transfer (ACT) therapy of CD8+ T cells from both conditions reduced tumour burden in a B16-OVA melanoma model. Short hypoxic conditioning (1% O2) of human CD8+ T cells for 1 day increased HIF-1α stabilisation, with increased activation, glycolysis, and mitochondrial function still observed following 6 days of normoxic cell culture. Short hypoxic conditioning of HER2 and CD19 CAR-T cells improved their activation and cytotoxicity in vitro, while HER2 CAR-T cell counts were increased in vivo, reducing tumour burden, and increasing survival when compared to 21% O2.

      Strengths:

      The paper convincingly demonstrates that short hypoxic conditioning in a defined window improves CAR-T cell function through in vitro cytotoxicity assays and following adoptive transfer in a preclinical HER2+-SKOV3+ positive tumour model. Thus, the major conclusion of the paper is mostly well supported by the data and could represent a novel strategy to improve CAR-T cell immunotherapy for solid tumours in the future.

      Weaknesses:

      The extent to which hypoxic conditioning-mediated improvement in CAR-T cell function is dependent on HIF-1-driven metabolic reprogramming is unclear and other potential mechanisms are not explored. FG-4592 and VHL silencing in HER2 CAR-T cells did not phenocopy each other faithfully. In addition, neither approach was as effective as short hypoxic conditioning with 1% O2 in improving CAR-T cell function in vitro or in vivo. Although the authors suggest the temporal dynamics of HIF-1α stabilisation is the key point, this is not convincingly proven, and no metabolic characterisation of these CAR-T cells was performed.

      The revised manuscript now includes live metabolic analyses in a Seahorse setup, using T cells following FG-4592 treatment or VHL silencing. We found that exposure of human CD8+ T cells to FG-4592 leads to a suppression of their oxygen consumption rates, both at basal and maximal levels. This can underpin the observed reduced expression of effector molecules (PMID: 33398183). Treatment of human T cells with FG-4592 resulted in a dose-dependent reduction of in vitro cytotoxicity, similar to that observed with exposure to low oxygen (e.g., 7-day OT-I expansion in 1% O2 impairs antitumour function [Figure supplement 6L]).

      Regarding VHL silencing, we did not observe metabolic differences compared to controls. This might arise from the fact that shVHL vectors only caused an overall 30% reduction in VHL protein expression, and that the silencing occurred after T cells had been activated. As we show, the moment of activation is key for T cell differentiation and function, and this could explain the lack of metabolic differences between shNCT and shVHL-expressing cells. These points are now added to the 5th paragraph of the Discussion section.

      It is unclear how changes elicited during short hypoxic conditioning are maintained following continued normoxic cell culture. Hypoxia is known to rapidly regulate histone methylation and chromatin structure in a HIF-independent manner (PMID: 30872525; PMID: 30872526). Are similar epigenetic changes observed in T cells, and if so, could these epigenetic changes underlie improved T cell activation?

      We thank the reviewer for the insightful comment on potential epigenetic changes observed in T cells cultured in hypoxia. We have now carried out an extensive analysis of histone methylation and acetylation (Figure 4H). Human CD8+ T cells cultured for 1 day in 1% O2 and 6 days in 21% O2 showed decreased acetylation of H3K9 and H3K27, reduced trimethylation of H3K4 and H3K27 and increased dimethylation of H3K9 (H3K9me2), as compared to the levels in cells continuously grown in ambient oxygen. These differences might underpin the altered differentiation and metabolic shifts of 1% O2-cultured T cells and further indicate that the oxygen tensions during the first 24 hours of activation elicit permanent alterations in T cells. Future work will be dedicated to understanding the link between the observed alteration in histone post-translational modifications and T cell function in response to hypoxia.

      Complications may also arise when comparing different oxygen tensions given recent data that suggests standard cell culture conditions can lead to local hypoxia through a combination (https://www.biorxiv.org/content/10.1101/2022.11.29.516437v1) of cellular respiration and poor O2 diffusion. Although it is unclear how this will impact suspension T cells it does beg the question as to whether HIF-1α stability following T cell activation is (at least in part) mediated by pericellular O2 limitations in cell culture over time, even in presumed hyperoxic (21% O2) conditions? Or if T cells subsequently cultured at 21% O2 following short hypoxic conditioning (1% O2) still experience local hypoxia during the 6-day culturing protocol? It would be important to assess this in future work and at least discuss these potential weaknesses.

      Upon analysing HIF-1α accumulation on day 7, we only found substantial HIF levels in cells that had been in low oxygen tensions for the last 3 days of culture (Figure S4G). This suggests that cells were not experiencing hypoxia at the time of analysis on day 7, given that we did not observe substantial HIF accumulation. We have additionally designed an experiment where 21% and 1% 1 day T cells were cultured for 7 days with a single media change on day 4 (standard) or with 5 media changes (each media change performed on separate days to minimize local hypoxia in ambient oxygen). Regardless of the number of media changes, 1% 1d cultures showed increased effector differentiation and expression of effector molecules, relative to 21% cells (Figure S4H). We also did not observe any differences between control cells cultured with 1 or 5 media changes. As hypoxia elicits changes in T cell differentiation, this suggests cells do not experience local hypoxia during the phase of ambient oxygen expansion. Nevertheless, we very much agree that it is important to accurately assess oxygen concentrations in cell culture media.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors provide evidence that chromatin, which in Drosophila muscle cells is peripherally localized in the nucleus, whereas the central region is depleted of chromatin, is organised such that RNA polymerase II (RNAp) surrounds dense regions of chromatin. The authors theoretically study the formation of these regions by describing chromatin as a multi-block copolymer, where the blocks correspond to active and inactive chromatin regions. These regions are assumed to phase separate and to have different solubility. The solubility of the active region is regulated by binding RNAp. The authors study the core-shell organization in a layered geometry by analyzing the various contributions to free energy. In this way, they in particular obtain the dependence of the shell-layer thickness, which is described as a polymer brush. From these results, they infer chromatin organization in spherical core-shell chromatin domains and compare these results to Brownian dynamics simulations.

      The work is well done and even though it uses standard methods for studying block copolymers and polymer brushes obtains interesting information about local chromatin organization. These findings should be of great interest to researchers in the field of chromatin organization and in general to everybody interested in understanding the physical principles of biological organization.

      The work has two main weaknesses: The experimental evidence for RNAp and chromatin microorganization is weak as only one example is shown. It remains unclear whether the observed organization pattern is common or not. Also, no data is shown concerning the dependence of the extensions of the active and inactive phases on parameters, for example, solvent properties or transcriptional activity. Second, some parts could prove difficult for biologists to assess. For example, the expression for the brush-free energy should be explained in more detail and notions like that of 'mushrooms' need to be introduced. As a second example, biologists might benefit from a better explanation of the concept of a theta solvent and its relevance.

      We thank Reviewer #1 for the positive review and critical feedback. Below we answer the points raised in the last paragraph of their review.

      In the original version of the manuscript we only showed a representative image of nuclei of muscle cells in an intact, live Drosophila larva. Notably, this organization is representative of many nuclei analyzed in muscle tissue. In the revised version we show that in a distinct tissue, the salivary gland epithelium of live Drosophila larvae, RNA Pol II distribution is similarly facing the nucleoplasm, although chromatin condensation differs due to higher DNA ploidy. The new images were added as Supplementary Information (Fig A1). Since these representative images are the main motivation behind our theoretical analysis, we think that including them will help the reader in understanding the relevance of our minimal model. Determining the effect of different biological perturbations, such as changes in the repressive marks, and how these change the core-shell structure requires extensive experiments that are outside the scope of the present paper. We also note that in live organisms (not just live cells) such as those studied here, one can only reliably use genetic perturbations; solvent quality is regulated by the organism and cannot be controlled as in synthetic polymer experiments. Our main focus in the present paper is to highlight an area that has been relatively unexplored by the chromatin organization community, which is how changes in the concentrations of binding partners of chromatin may have a strong effect on nuclear architecture.

      We have also improved the explanation of the physical concepts for biologists. We added a more thorough explanation of the concept of a polymer brush and explained more clearly the concept of a theta solvent in terms of the scaling properties of a polymer in solution. We quote these revisions below.

      Reviewer #2 (Public Review):

      This work formulates a detailed theoretical polymer physics model intended to explain the observed morphology of chromatin in the Drosophila cell nucleus. The model is examined in detail by both analytical calculation and computer simulation. The central premise of the suggested theory is that it is again based on equilibrium statistical mechanics. Within this paradigm, the authors explore the model that views the chromatin fiber as a block copolymer and, most importantly, describes the role of RNA polymerase as it interacts with one of the copolymer blocks and regulates its effective solvent quality. Blocks are assumed to be fixed on the time scale of interest by, e.g., different levels of acetylation or methylation. RNA polymerase is supposed to interact only with one of the chromatin blocks, called active, and the assumed interaction is quite peculiar. Namely, the RNA polymerase complex may adsorb on the chromatin fiber and, the model assumes, the fiber decorated with adsorbed RNA polymerase molecules is less sticky to itself, or more repulsive, than the fiber itself. This peculiar assumption allows the authors to make interesting predictions about how proteins can regulate the genome folding architecture.

      We thank the reviewer for the positive and critical feedback. We agree that our assumption of changes in the effective solvent stemming from protein complexes binding to chromatin is at the core of our analysis and we justify it further below.

      STRENGTH

      The work includes a rather detailed theoretical description of the model and its equilibrium statistical mechanics. As both analytical theory and accompanying simulation indicate, the assumptions put forward in formulating the model do indeed produce the desired morphology, with isolated regions ("micelles") of core inactive chromatin surrounded by the less dense shell region in which RNA polymerization may potentially take place. Having such a detailed theory is potentially beneficial for the field and opens up avenues for further exploration.

      We thank the referee for appreciating the potential benefit of our minimal theory of solvent-quality regulation by binding processes.

      WEAKNESS

      The underlying assumption about the interaction of RNA polymerase complex with the fiber, although important and organic for the model, does not seem easy to justify from a molecular standpoint, especially thinking of the charges and electrostatic interactions.

      We envision that the binding of RNA Pol II (mediated by different transcription factors) to chromatin is also associated with larger protein complexes that may contain hydrophobic and hydrophilic components, such as pre-initiation complexes. Some regions of these complexes might associate directly with chromatin due to positive charges on the surface of the Pol II complex, whereas the hydrophilic negative regions may be directed towards the solvent. Our theory is typical of the approach used in polymer physics where coarse-grained interactions are considered. While the origin of hydrophilic interactions lies in electrostatics, such interactions are highly screened in cells (typically 200 mM concentration of salts) and can be considered as short-ranged and competitive with hydrophobic interactions. Chromatin in solution is known to condense (see Gibson et al., Cell 2019 and Strickfaden et al., Cell 2020) and even phase separate from the nucleoplasm (see Amiad-Pavlov et al., Science Advances, 2021); this can arise either from hydrophobic interactions of the histone tails or from opposite charge attraction of the histones and linker DNA. In our model, this competes with the binding of protein complexes which then disrupt the self-attraction of chromatin. Previous work has shown that RNA Pol II associating with chromatin (in the absence of transcription) prevents the coarsening of dense chromatin domains (see Hilbert et al., Nat. Comm. 2021), which agrees with our modeling of protein complexes that bind to chromatin and interfere with its condensation; in addition, the binding of Pol II and all its binding partners that form the pre-initiation complex (see Hahn, Nat. Struct. & Mol. Biol. 2004, 11) will result in effective, steric repulsion between different active and Pol II-bound chromatin domains. Another interesting observation is that most of the surface of RNA Polymerase II is negatively charged, with a few positively charged patches, some of which interact specifically with DNA while others serve as exit paths for RNA (see Cramer et al., Science, 2001). We agree that a more thorough analysis of the molecular interactions between what we name protein complexes and chromatin is interesting, but it is out of the scope of our paper, which uses a coarse-grained, polymer physics approach. This approach also allows our model to be predictive as to the physical organization and growth of the domains, independent of those molecular details that are as yet unknown.

      Reviewer #3 (Public Review):

      This theoretical study provides a theoretical explanation for a puzzling question arising from recent experiments: How can chromosomes behave like polymers collapsed in a poor solvent but also contain "open" active chromatin sections? The authors propose that the binding of proteins (e.g. RNAP's) to the active sections can effectively change the solvent quality for these sections and thus open them. They suggest further that chromosomes show micellar structures with inactive blocks forming the cores of the micelles. Protein binding causes swelling of the micellar shells which affects the whole chromosome structure by changing the total number of micelles. This theory fits well to live imaging data of chromatin in Drosophila larvae, like the one shown in the striking Figure 1.

      The manuscript is written very clearly.

      My only suggestion is that the authors, in both the theory and simulation parts, are more explicit about how the interactions between the various components are modeled. From what I could see, in the theory part, one needs to look closely at Eq. 5 to understand how the influence of the binding of proteins affects the interaction between active monomers, and in the simulation part, one needs to go to the appendix to learn that interaction strengths between monomers within the active blocks and monomers within the inactive blocks have different values. The latter is crucial to understand the micellar structure shown at the top of Fig. 5A.

      We thank the reviewer for their positive response. We have explained Eq. 5 more carefully now and included other explanatory remarks throughout the text. We also explained more clearly the interactions considered in the simulations. Below we answer point by point and add quotes from the revised manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      Marchal-Duval et al studied the role of Prrx1 in lung fibroblasts. Prrx1 is a transcription factor expressed in lung fibroblasts but not in other cell types. The authors showed that Prrx1 gene expression was enhanced in IPF patients. Immunohistochemistry in IPF tissue suggested that Prrx1 was expressed in fibroblasts in fibroblastic foci. The authors then showed that Prrx1 expression was regulated by TGF-b1 stimulation or stiffness of substrate by in vitro experiments using primary human lung fibroblasts from either normal or IPF lungs. The authors also showed that Prrx1 regulated fibroblast proliferation and TGF-b signaling by regulating PPM1A and Tgfbr2 expression. Finally, the authors revealed that Prrx1 knockdown suppressed fibrosis in bleomycin-induced fibrosis or PCLS. This manuscript identified novel molecular roles of Prrx1 in fibroblast activation, which is expressed in not only lung fibroblasts but also in other injured or developing organs. To support the idea that Prrx1 plays a critical role in lung fibrosis, however, some discrepancies between in vitro and in vivo data need to be clarified.

      Comment #1. Although the authors showed that Prrx1 knockdown in primary fibroblasts reduced Smad2/3 phosphorylation, the reduction of Acta2 or Col1a1 after Prrx1 knockdown and TGF-b1 stimulation was not impressive (Fig. S6), suggesting that the inhibition of TGF-b signaling by Prrx1 knockdown is only partial. In contrast, Prrx1 knockdown by ASO in bleomycin-induced fibrosis showed remarkable fibrosis suppression (Fig. 6, 7). Admittedly there are differences in models and nucleotides used, but this discrepancy needs to be addressed.

      We agree with the reviewer that Prrx1 inhibition only partially affects the upregulation of ACTA2, but this effect was significant (around 50% inhibition at the protein level). As stated in the discussion (lines 569-572), our data show that key ECM proteins such as Collagen 1 and Fibronectin were still upregulated in TGF-β1 stimulated lung fibroblasts transfected with PRRX1 siRNA, whereas TNC and ELN mRNA expression levels were perturbed. These findings suggest that broader phenotypical changes are associated with Prrx1 knockdown. Notably, we also observed that Prrx1 inhibition impacted cell proliferation in vitro. We believe that the observed suppression of fibrosis in bleomycin treated mice following Prrx1 knockdown by ASO is the result of both the partial inhibition of TGF-β1 effect and the decrease in mesenchymal cell proliferation. Supporting this hypothesis, we observed a decrease in PDGFR-positive cell proliferation in Prrx1 ASO-treated animals (see comment #4 hereafter).

      Comment #2. Fig.6 and 7 lack control groups, where mice are treated with PBS instead of bleomycin and treated with either control ASO or Prrx1 ASO.

      As stated in the revised Materials and Methods (lines 683-686), the knockdown efficiency of the Prrx1 ASO and the lack of effect of the control ASO were first validated in naive mice treated with either Prrx1 ASO or control ASO and compared to PBS-treated mice (see Figure R2 in the answer to comment #11 of reviewer 2). Those groups were not repeated or included in the first set of bleomycin experiments, in order to comply with institutional regulations limiting animal usage. In the first set of experiments (Prrx1 ASO treatment between day 7 and day 13 after bleomycin insult), the saline + PBS group was used only to confirm fibrosis development, while the bleomycin + control ASO group was the proper control for the bleomycin + Prrx1 ASO group. In the new, second set of experiments (ASO treatment between day 21 and day 28, as suggested by reviewer #2), we were authorized by our local animal ethics committee to include a control ASO group among the saline-treated animals to confirm the lack of effect of the control ASO compared to the PBS group (see new Figure 7-figure supplement 1).

      Comment #3. In Fig. 6F, the hydroxyproline content is shown with ug collagen/ug protein. Total protein in the lung is influenced by infiltration of hematopoietic cells, which are the major population in injured lungs by cell count. Fibrosis should be ideally assessed as ug hydroxyproline/lung (or lobe).

      We completely agree with the reviewer that hydroxyproline content should ideally be assessed per lobe or per lung. As stated in the revised Materials and Methods (lines 882-885), hydroxyproline and protein contents were measured on paraffin lung sections (15 sections of 10 µm per sample) with the Quickzyme Biosciences hydroxyproline assay and total protein assay kits, because access to material was limited and we sought to refine its use in order to limit animal usage. Furthermore, the infiltration of hematopoietic cells would, if anything, lead to an underestimate of the effect of Prrx1 ASO (less fibrosis and inflammation), since the contribution of those cells would be higher in control ASO-treated bleomycin mice. Considering the reviewer’s concern, a complete lobe was used to measure hydroxyproline content in the new set of experiments generated during the revision of the manuscript (see new Figure 7-figure supplement 1).

      Comment #4. Major proliferating populations in bleomycin-treated lungs are not mesenchymal cells but epithelial/endothelial/hematopoietic cells. Mki67+ cells (Fig. 7D) need to be identified by co-staining with mesenchymal markers if the authors claim that Prrx1 knockdown suppresses fibroblast proliferation in vivo.

      We agree with the reviewer that epithelial/endothelial/hematopoietic cells are the main proliferating populations in bleomycin-treated animals at day 14. As suggested by the reviewer, we performed MKI67 / PDGFR co-staining to identify proliferating mesenchymal cells and confirmed a decrease in proliferation in these cells after Prrx1 knockdown in bleomycin-treated mice (see lines 448-451 and Figure 6-figure supplement 3).

      Comment #5. Bleomycin-injured lungs or IPF tissue are patchy and mixed with normal and abnormal areas. Therefore, how areas of interest are chosen for histological quantifications (Fig. 6C, S14D) needs to be described in the methods section.

      As now stated in the revised Materials and Methods (lines 864-866), areas of interest were chosen according to the presence of major alveolar thickening as well as fibrous changes and masses (confirmed by picrosirius staining on serial sections).

      Reviewer #2 (Public Review):

      The paper from Marchal-Duval et al reports for the first time the important role played by the transcription factor PRRX1, expressed specifically in the mesenchyme of the lung, in the context of fibrosis. The authors used a combination of human (Donor and IPF) and mouse lungs (saline and bleomycin treated) as well as associated fibroblasts and PCLS to test the functional role of PRRX1 in the context of proliferation and differentiation induced by TGFb1. The work is supported by an impressive amount of data (7 main figures and 14 supplementary figures).

      Comment #1: A main weakness in this work is the counterintuitive result that PRRX1 is downregulated in human lung fibroblasts (from both IPF and Donor) treated with TGFb1.

      We agree with the reviewer that PRRX1 downregulation upon TGFb1 treatment may appear counterintuitive. First, as stated in the manuscript, this inhibitory effect is partial. Second, as suggested by the reviewer, we performed additional experiments for the revised manuscript to better characterize the timing of PRRX1 downregulation in response to TGF-b1 in lung fibroblasts. Time course analysis of PRRX1 isoform expression levels showed that PRRX1 was downregulated only after 48h. This late downregulation of PRRX1 in response to TGF-b1 could be the signature of a negative feedback loop that limits cell responsiveness to TGF-b1 once lung fibroblasts are fully differentiated into myofibroblasts at 48h, as discussed in the revised manuscript (see lines 175-180 and lines 589-594).

      Comment #2: Another smaller weakness is the inactivation of Prrx1 in vivo using ASO starting at d7 post bleomycin treatment.

      In our study of Prrx1 inhibition in vivo, we followed a therapeutic/interventional protocol consistent with current literature on the bleomycin model of lung fibrosis (Moeller A. et al, Int J Biochem Cell Biol 2008 and Kolb M. et al., Eur Resp J. 2020), treating the animals with either control or Prrx1 ASO every other day between day 7 and day 14, during the active fibrotic phase. In the revised manuscript, we extended our investigation to assess the potential effect of Prrx1 inhibition during the late fibrosis phase, at day 28 after bleomycin treatment, treating the animals with either control or Prrx1 ASO every other day between day 21 and day 27. Interestingly, we found that the effects of Prrx1 inhibition during the late fibrosis phase were weaker than during the active fibrotic phase, although still present (see Figure 7-figure supplement 1).

    1. Author Response

      Reviewer #2 (Public Review):

      We thank the reviewer for their assessment that our work “supports the idea that epithelial-endothelial crosstalk is important for lung regeneration and proposes a potential candidate for this process” and their helpful suggestions for strengthening and clarifying our work.

      1) The scRNA-seq analysis is performed in two separate objects ("control lung" and "H1N1 infected lung 14dpi"). For these two sets of data to be comparable, the authors should have integrated the objects and analyzed them together. This is not only important for deciding the clusters' identities and making sure that the same clusters are compared between control and infected, but also necessary to compare gene expression.

      We have integrated the control and H1N1-infected scRNA-seq datasets and reanalyzed the integrated data. We then analyzed CAP1_A and CAP1_B populations, comparing their gene expression between control and influenza conditions. Unbiased clustering of the integrated dataset reveals the same clusters we identified in the individual datasets, with cells from control and flu contributing to each cluster (with the exception of proliferating endothelial cells, which are found only in the H1N1-infected lung). We have added a supplemental figure outlining these data (Figure 1 – Figure Supplement 3).

      2) ATF3 is not only present in Cap1_B; in the infected lung, Cap1_A seems to express less ATF3. The authors should comment on this difference.

      We have added violin plots to Figure 1, which we feel will better represent the greater Atf3 expression in CAP1_Bs relative to other endothelial cell subtypes. The reviewer is correct that Atf3-expressing cells are found in large vessels, but they are also numerous in the alveolar capillary space and increase with influenza in these regions. We have added lower-magnification, higher-resolution images of Atf3CreER; ROSA26tdTomato animals, both control and influenza-infected, to illustrate this expansion in a new Figure 2 – Figure Supplement 3. This increase is also quantified in Figure 2C. We have also clarified this in the text.

      3) It is unclear how the clusters Cap1_A and Cap1_B were decided. The manuscript would benefit from clarification.

      We have added text to the Materials and Methods section to clarify this.

      4) It would be beneficial to see via immunofluorescence the morphological and spatial differences between ATF3-expressing and non-expressing endothelial cells since this transcription factor is expressed in multiple endothelial cell types.

      We have added lower-magnification, higher-resolution images of Atf3CreER; ROSA26tdTomato animals, both control and influenza-infected, to illustrate the spatial distribution of Atf3-expressing endothelial cells. This data is now shown in the new Figure 2 – Figure Supplement 3. We have also added further data to the new Figure 5 – Figure Supplement 1 to include the cytoplasmic endothelial marker Endomucin-1 (EndoM1) in an analysis of the spatial distribution of endothelial cells in wild-type and Atf3-knockout animals at 21 dpi.

      5) The authors mention ATF3 is not endothelial-specific. Expression of ATF3 in other cell types should be evaluated via immunofluorescence.

      This data is present in Figure 2 – Figure Supplement 2.

      6) The authors should have shown evidence of the deletion in their Atf3EC-KO mouse and addressed whether they had residual ATF3. If there is no antibody available, RNAscope could be used, or Western Blot or RT-PCR on sorted endothelial cells.

      We agree that this is an important quantification to make. We have performed qRT-PCR for Atf3 in both the animals used to perform the RNA sequencing experiment as well as a new cohort of animals to confirm Atf3 deletion. We have added these results to a new supplemental figure accompanying Figure 4 (Figure 4 – Figure Supplement 1).

      7) The authors only show the epithelium as evidence that the alveolar region is altered in their mutant after infection. The endothelium should have also been investigated, especially since their mutant is an endothelial-specific deletion. Within this, the different endothelial cells should have been assessed by a method other than RNAscope such as immunofluorescence, given that this method is unable to show morphology and there are antibodies available.

      This data is present in Figure 5. We have also added additional data to the new Figure 5 – Figure Supplement 1 to extend our analysis to 21 dpi and to incorporate a cytoplasmic marker of endothelial cells, Endomucin (EndoM1).

      8) Bulk RNA-seq from endothelial cells is used in the manuscript. However, because ATF3 is not specific to Cap1_B cells or even capillaries alone, the downstream gene expression analysis of bulk RNA should be placed into the context of lung endothelial heterogeneity.

      We have added qRT-PCR analysis of several downstream genes to address the comments of Reviewer #3, point #3. To place this into the context of endothelial heterogeneity, we have added dot plots to show the expression of selected genes from the RNA-seq analysis in each endothelial subtype from the H1N1 scRNA-seq dataset. These data can be found in the new Figure 4 – Figure Supplement 1. However, because of the relatively low sequencing depth of scRNA-seq compared to bulk RNA-seq, many of the transcripts examined were only present in a small percentage of endothelial cells in the scRNA-seq dataset, so the differences seen are more striking in the RNA-seq data.

      9) Although the authors mentioned that the infection with H1N1 influenza can have regional differences, they do not show how they picked regions for their analysis and quantification, and whether ATF3 upregulation was found in more severely affected regions. Furthermore, since they quantified via FACS, this heterogeneity in the infection itself could have affected their observations.

      We agree that it is essential both to define the extent of H1N1-mediated inflammation in Atf3 wild-type and knockout mice and to compare this factor between genotypes. We have therefore used a previously published method for quantifying regions of severe, damaged, and normal tissue structure (Liberti et al., Cell Reports 2021) in both Atf3 wild-type and knockout animals. Our results show that Atf3 wild-type and knockout mice have similar levels of tissue damage, and we have added a supplemental figure demonstrating these data (new Figure 3 – Figure Supplement 2). We have also clarified how regions were selected for quantification of alveolar area.

      H1N1 influenza injury in mice is heterogeneous, with regions of severe alveolar destruction marked by densely packed immune cells, adjacent regions of damaged tissue, and regions of tissue that appear to have normal tissue structure, as we and others have previously described (Zacharias, Frank et al., Nature 2018; Liberti et al., Cell Reports 2021; Niethamer et al., eLife 2020). However, it has become increasingly apparent that these regions where tissue structure appears normal are actually regions of active regeneration, and endothelial cell proliferation is increased in these regions (Niethamer et al., eLife 2020). We therefore selected 20X fields in these areas to use for quantifying alveolar area, as these are actively regenerating regions where alveolar structures are present for quantification. Because of the changes to tissue structure seen in damaged or destroyed tissue areas, we did not select these regions for quantification, although they were present at similar frequency in Atf3 wild-type and knockout animals.

    1. Author Response

      Joint Public Review:

      These RNAs come from a screen which is not well described and the descriptions of the sequence analyses are unclear, so it is difficult to know exactly what they are analyzing in the manuscript.

      We apologize for not including the required details in the manuscript. The cell cycle lncRNA screen in which we identified the initial SNUL-1 probe was published in an earlier paper 6. By performing RNA-seq in cell cycle synchronized samples, we identified several hundred lncRNAs that were differentially expressed at a particular stage of the cell cycle. We performed a large-scale RNA-FISH-based screen to characterize the localization of these cell cycle-regulated lncRNAs. One of the probes in this screen hybridized to SNUL-1 RNA in the nucleolus. The original double-stranded DNA probe that detected the SNUL-1 RNA cloud(s) was mapped to the hg38-Chr17: 39549507-39550130 genomic region, which encodes a lncRNA. However, other unique non-overlapping probes generated from the Chr17-encoded lncRNA failed to detect the SNUL-1 RNA cloud. Furthermore, BLAST-based analyses failed to align the SNUL-1 hybridized sequence to any other genomic loci. Since a large proportion of the p-arms of nucleolus-associated NOR-containing acrocentric chromosomes is not yet annotated, we speculated that SNUL-1 could be transcribed from an unannotated genomic region of the acrocentric p-arms.

      We have now provided the information in the revised manuscript. Specifically, we have provided the details of the PacBio iso-seq, nanopore seq analyses as well as the bioinformatic approaches that were conducted to determine the identity of the full-length SNUL-1 ncRNA.

      If these are RNAs with reasonable abundance, then they should be findable without the extensive PCR amplification they appear to have done for the PacBio sequencing (the methods section is not clear on exactly how many rounds of PCR were performed).

      We apologize for not providing the essential details. In the PacBio-iso-seq analyses, we used the standard protocol (recommended by the scientists from PacBio, who are authors of the manuscript), which included 13 PCR cycles. However, as described in the manuscript, in parallel to PacBio-seq, we also performed nanopore sequencing of the nucleolus-enriched RNA without any amplification. The SNUL-1 full-length candidate sequence (CS) that we described in the manuscript is the ncRNA that showed 100% sequence similarity in the independent PacBio Iso-seq and nanopore seq analyses. We argue that if the SNUL-1 candidate transcripts were an artifact of PCR amplification in PacBio-seq, we would not have obtained the full-length sequence with a 100% match in the nanopore-seq reads. We have now included the detailed bioinformatic analyses in the methods section of the manuscript.

      Moreover, given the acknowledged sequence similarities of the SNULs with other RNAs, the possibility of chimaera formation during PCR amplification is high. They are clearly detecting RNAs associated with nucleoli but exactly what they are examining is unclear.

      Please see our response above (public Reviewer comment 2). In addition, we performed detailed bioinformatic analyses to confirm that the SNUL-1 full-length sequence obtained by PacBio-seq is not an artifact of PCR amplification. This analysis is described in detail in the methods section under the sub-title “sequencing analyses”.

      It is possible that a clear determination of the genomic origin of these RNAs will be complicated by the repetitive sequences in the regions of the genome where they reside.

      We thank the reviewer for acknowledging the technical limitation of mapping the genomic locus of SNUL1 genes. We have pointed this out as a limitation of the present manuscript. Mapping the SNUL-1 genomic locus and characterizing the regulatory sequence elements and factors that control the monoallelic expression of SNULs will be part of future research plans.

      Note also that the idea of monoallelic expression from rRNA encoding loci is interesting, but was established in 2009. Title: Allelic inactivation of rDNA loci. Genes Dev. 2009 Oct 15;23(20):2437-47. doi: 10.1101/gad.544509.

      We thank the reviewer for pointing out the study from the Cedar lab published in 2009. To test the idea that SNULs contribute to allele-specific expression of rRNA, which was previously reported by the Cedar lab in their 2009 G&D paper, we performed the same set of experiments described in their paper in three different cell lines in the presence or absence of SNULs (please see the response to Editorial comment-2). However, we could not reproduce any of the data presented in the G&D manuscript. We are also not aware of any follow-up study in which monoallelic expression of rDNA genes was observed. Currently, no concrete data supports monoallelic expression of rRNA 5. We therefore argue that our current study is the first to demonstrate the monoallelic association of a ncRNA from the p-arm containing the rDNA cluster.

    1. Author Response

      Reviewer #1 (Public Review):

      The shift from outcrossing to selfing is one of the most prevalent evolutionary events in flowering plants. The ecological and genetic backgrounds of these transitions have been of major interest for decades, and one of the key questions was the dating of this transition. Timing of pseudogenization of the self-incompatibility (SI) genes has been used as a proxy for this transition because loss-of-function mutations of SI genes are often responsible for the evolution of predominant selfing. However, SI genes are identified only in a limited number of taxa, and in some cases, the evolution of selfing is not necessarily associated with loss of SI. Therefore, an independent time estimate of the evolution of selfing by genome-wide polymorphism data has been considered important in this field.

      This study provides two statistical methods: SMC-based and ABC-based methods. Both methods intend to detect the genome-wide signatures of the outcrossing-to-selfing transition that alters the ratio of population recombination rate and mutation rate. Authors validated these methods by using the simulated data, confirming that both methods can generally infer the timing of the outcrossing-to-selfing transition jointly with population size changes, although its precision depends on several population history settings.

      This study would be an important contribution to the field of mating system evolution. By applying the proposed methods to many other selfing organisms, we may be able to see a general picture of the timescale of the outcrossing-to-selfing transition combined with population size dynamics. At the same time, this is one of the extensions of the SMC method, which has already been well utilized for various inferences, including population size and recombination rate heterogeneity.

      We thank the reviewer for the positive comments and for acknowledging the novelty and relevance of our study for the field.

      I do not find a major weakness in the methodologies of this study, but I have a few comments on their applications to the data of Arabidopsis thaliana. It is important that these estimates largely depend on what input data is used, especially the mutation rate and recombination rate. While the authors claim that their estimate is older than Bechsgaard's estimate (<413 kyrs), these two studies used different mutation rates: the authors used Ossowski's mutation rate, and Bechsgaard used Koch's mutation rate (Koch et al. MBE 2010). To compare these two estimates, it is important to use the same mutation rate. Shimizu & Tsuchimatsu (2015; Ann Rev Eco Evo Syst) in detail discussed this point and showed that Bechsgaard's estimate becomes <1.48 myrs when Ossowski's mutation rate was used (see Figure 4). Then it happens to overlap with the estimate of this study.

      Thank you very much for identifying this important problem. It is indeed critical to rescale Bechsgaard’s age of the transition using the same mutation rate as in our analysis (Ossowski et al 2010). We now use the rescaled estimate published in your review (Shimizu and Tsuchimatsu 2015, figure 4). We note that Bechsgaard et al did not publish a measure of uncertainty around their estimate of the transition, which makes it difficult to compare with our posterior distributions. However, Bechsgaard’s estimate is not contained within the credibility intervals of our posteriors for t_sigma, and we therefore consider the two results significantly different. We have modified the text accordingly, at p. 4 l. 8-10 and p. 12 l. 27 to p. 13 l. 5.
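
      The rescaling discussed here is a proportional adjustment: for a fixed amount of sequence divergence, an age estimate scales inversely with the assumed mutation rate. The following is a minimal sketch under that assumption only. The function name and the derived rate ratio are illustrative; the two ages (413 kyr and 1.48 Myr) are the values quoted by the reviewer, and the sketch does not reproduce how the rescaled value was derived in the cited review.

```python
def rescale_age(age_years, mu_original, mu_new):
    """Rescale an age estimate to a different mutation rate (age is proportional to 1/mu)."""
    if mu_original <= 0 or mu_new <= 0:
        raise ValueError("mutation rates must be positive")
    return age_years * mu_original / mu_new


# Upper bounds quoted in the review (years).
age_with_original_rate = 413_000     # bound reported with the original mutation rate
age_with_rescaled_rate = 1_480_000   # bound after rescaling to the other mutation rate

# Ratio of the two mutation rates implied by the two bounds (original / new).
# This is derived from the quoted ages, not taken from either publication.
implied_rate_ratio = age_with_rescaled_rate / age_with_original_rate

# Applying the implied ratio recovers the rescaled bound.
rescaled = rescale_age(age_with_original_rate, implied_rate_ratio, 1.0)
```

      The implied ratio is roughly 3.6, which illustrates why age estimates obtained with different mutation rates cannot be compared directly.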

      I am also concerned about the genomic regions of Arabidopsis thaliana used for this study. Authors chose specific five regions based on homogeneity of recombination rates and diversity, but how does the estimated change when randomly chosen genomic regions are used? If it is important to choose "preferable" regions according to the homogeneity of recombination rates and diversity, it may be useful to describe how these regions should be chosen for future applications of this method to other organisms.

      The genomic intervals used for the application to A. thaliana are indeed not random. They were defined so as to avoid, on each chromosome, the increased diversity observed at and around pericentromeric regions. This effect has already been described by Clark et al (2007, Science); however, no explanation for this pattern has been published yet. We have updated the text, including a recommendation for future application to other species, at p. 13 l. 8-15 and p. 18 l. 25-30, and in Figure S15. We have also replicated our analysis of the A. thaliana data using a different set of genomic intervals located outside pericentromeric regions (Figure S15 and S16).

      Reviewer #2 (Public Review):

      This submission seeks to detect changes in the rate of selfing through pairwise comparison of haplotypes sampled from a population. It begins, as did a previous paper by a subset of the authors (Sellinger et al. 2020), with the well-known theoretical finding that partial selfing increases the rate of coalescence and decreases the rate of crossing-over events in genealogical histories.

      I am supportive of pitching this contribution as primarily theoretical, with the very short discussion of the Arabidopsis data provided as a worked example. This perspective increases my enthusiasm, compared to an initial reading. My comments are intended to encourage development.

      Some thematic characteristics reduce the impact of the submission. Among these are:

      (1) a rather less than scholarly perspective on previous literature;

      (2) tendency to avoid theoretical development in favor of computation;

      (3) little interpretation of results of their only analysis of real data.

      We have now revised the manuscript along the lines suggested by reviewer 2. We provide more references when needed, have emphasized in the abstract and in the theoretical part of the manuscript that it is primarily a new theoretical/methodological development with an application to A. thaliana data, and have improved the interpretation of the A. thaliana data (see reply to reviewer 1).

    1. Author Response

      Reviewer #1 (Public Review):

      The authors of this study sought to test whether the optogenetic induction of context-related freezing behavior could be enhanced by synchronizing light pulses to the ongoing hippocampal theta rhythm. Theta is a hippocampus-wide oscillation that strongly modulates almost every cell in this structure, which suggests that causal interventions locked to theta could have a more pronounced impact than open-loop ones. Indeed, the authors found that activating engram-associated dentate gyrus (DG) neurons at the trough of theta resulted in an increase in freezing relative to baseline when averaging across all stimulation epochs. In contrast, open-loop stimulation and peak-locked stimulation had weaker effects. Analysis of local field potentials showed that only the theta-locked stimulation facilitated coupling between theta and mid-gamma, indicating that this manipulation likely enhances the flow of activity from DG to CA1 via CA3 (as opposed to promoting transmission from entorhinal cortex to CA1). Previous results from mice, rats, and humans support the hypothesis that memory encoding and recall occur at distinct phases of theta. This work further strengthens the case for phase-specific segregation of memory-related functions and opens up a path toward more precise clinical interventions that take advantage of intrinsic theta rhythm.

      Strengths:

      This study recognizes that, when artificially reactivating a context-specific memory, the brain's internal context matters. In contrast to previous attempts at optogenetically inducing recall, this work adds an additional layer of precision by synchronizing the light stimulus to the ongoing theta rhythm. This approach is more challenging, because, in addition to viral expression and bilateral optical fibers, it also requires a recording electrode and real-time signal processing. The results indicate that this additional effort is worth it, as it results in a more effective intervention.

      The findings on theta-gamma cross-frequency coupling suggest a possible mechanism underlying the observed behavioral effects: trough stimulation enhances DG to CA1 interactions via CA3. LFP recordings showed that stimulation increases the coupling between theta and mid-gamma (though not in all mice), and the percentage of freezing during reactivation is correlated with the gamma modulation index.
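
      For readers unfamiliar with the measure, a modulation index quantifies how unevenly gamma amplitude is distributed across theta phase. The text does not state which definition the authors used, so the sketch below assumes a widely used Kullback-Leibler-based form (normalized divergence of the phase-binned amplitude distribution from a uniform distribution). The function name, the bin count, and the example values are hypothetical.

```python
import math


def modulation_index(mean_amplitude_per_phase_bin):
    """KL-based modulation index from mean gamma amplitude in each theta-phase bin.

    Returns 0 when amplitude is identical in every bin (no coupling) and 1 when
    all amplitude falls in a single bin.
    """
    amps = list(mean_amplitude_per_phase_bin)
    n_bins = len(amps)
    total = sum(amps)
    if n_bins < 2 or total <= 0 or any(a < 0 for a in amps):
        raise ValueError("need at least two bins of non-negative amplitudes with a positive sum")
    # Normalize amplitudes into a probability-like distribution over phase bins.
    p = [a / total for a in amps]
    # Shannon entropy of that distribution (empty bins contribute nothing).
    entropy = -sum(pj * math.log(pj) for pj in p if pj > 0)
    # Divergence from the uniform distribution, normalized by its maximum, log(n_bins).
    return (math.log(n_bins) - entropy) / math.log(n_bins)


# Hypothetical examples with 18 phase bins (20 degrees each); not data from the study.
flat_profile = [1.0] * 18             # gamma amplitude independent of theta phase
peaked_profile = [1.0] * 17 + [5.0]   # gamma amplitude elevated in one phase bin
mi_flat = modulation_index(flat_profile)
mi_peaked = modulation_index(peaked_profile)
```

      Under this definition, stronger phase-locking of gamma amplitude to a particular theta phase gives a larger index, which is the sense in which the coupling is said to increase with stimulation.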

      Weaknesses:

      Given the precision of the intervention being performed, one might expect to see a stronger behavioral impact. Instead, the overall effect is subtle, and quite variable across mice. Looking at individual data points, the biggest overall increase in freezing actually occurred in 2 mice during the 6 Hz stimulation condition. Furthermore, trough stimulation decreased freezing in 3 mice. This is not a weakness in itself; rather, the weakness lies in the lack of an attempt to make sense of this variability. There are a number of factors that could explain these differences, such as viral expression levels, electrode/fiber placement, and behavior during baseline. There is of course a risk of over-interpreting results from a few mice, but there is also a chance that the results will appear more consistent after accounting for these additional sources of variation.

      Although two mice showed negative light-induced freezing for trough stimulation, the other 15 mice showed a positive result. Stringent inclusion criteria were used to ensure that mice had adequate viral expression levels and baseline behavior. Mice without at least 5% light-induced freezing in at least two of the four epochs were not included in the study. The negative behavior of some mice is further explained by the correlation between MI and light-induced freezing (Figure 5D). 6 Hz stimulation showed mixed results across the different behavioral measures quantified. Additionally, 6 Hz stimulation did not show the physiological hallmarks of memory reactivation in the theta-gamma modulation index, so an increased number of negative light-induced freezing samples is expected.
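
      The inclusion rule stated in this response can be written as a simple check. This is a sketch that assumes each mouse is summarized by one light-induced freezing value (in percent) per stimulation epoch; the function name, argument names, and example values are hypothetical and are not taken from the study.

```python
def meets_inclusion_criterion(light_induced_freezing, threshold=5.0, min_epochs=2):
    """Return True if enough epochs reach the light-induced freezing threshold.

    light_induced_freezing: per-epoch light-induced freezing values, in percent.
    threshold: minimum value for an epoch to count (5% in the stated rule).
    min_epochs: number of epochs that must reach the threshold (two of four).
    """
    n_passing = sum(1 for value in light_induced_freezing if value >= threshold)
    return n_passing >= min_epochs


# Hypothetical per-epoch values (percent) for two mice; not data from the study.
example_included = [7.5, 2.0, 6.1, -1.0]   # two epochs at or above 5%
example_excluded = [8.0, 1.0, 0.5, -3.0]   # only one epoch at or above 5%
```

      Because the rule counts epochs rather than averaging them, a mouse can be included while still showing negative light-induced freezing in individual epochs.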

      Finally, the elevated baseline freezing rate relative to previous literature could have masked some of the behavioral effect.

      In the revised manuscript, we discuss the effects of exclusion criteria more clearly.

      While trough-locked optogenetic stimulation significantly increases freezing, the effects are much weaker than placing the mouse in the actual fear-conditioned context (average time freezing of 15% vs. 50%). The discussion would benefit from additional treatment of ways to further increase the specificity and effectiveness of artificial memory reactivation.

      We have included content on future directions for artificial memory reactivation that could bring the induced response closer to the behavioral response of natural recall. We believe that delivering time-varying stimulation to different cells or parts of the hippocampus could improve induced recall, as all current methods stimulate the entire sub-region simultaneously.

      Using an open-source platform (RTXI) for real-time signal processing is commendable; however, more work could be done to make it easier to adopt these methods and make them compatible with other tools. The RTXI plugin used for closed-loop stimulation should be fully documented and publicly available, to allow others to replicate these results.

      The RTXI plugin can be found here: (https://github.com/ndlBU/phase_specific_stim). The URL has been added in the description of Figure 1.

    1. Author Response

      Reviewer #1 (Public Review):

      The screening effort has revealed a number of interesting and novel suggestions of new modulators of nuclear appearance that are exciting and have the potential to be of value to the field.

      We appreciate the reviewer’s view that identification of new modulators of nuclear morphology is exciting and of value to the field.

      Major Points:

      1) The discussion of the screen hits and prior knowledge key to their interpretation is lacking. For example, the authors only report on the purported localization of the hits without an unbiased analysis of their function(s). As a sole example, multiple members of the condensin complex are hits in Fig.1 while multiple members of the cohesin complex are hits in Fig. 2 - but there are many more factors worthy of further discussion. Moreover, the authors need to provide more information on the data used to assign the localization of the hits and how rigorous these assignments may be. For example, multiple CHMP proteins (ESCRTs) are listed - indeed CHMP4B is the highest scoring hit in Fig.1 - but this protein does not reside at the nuclear envelope at steady-state; rather, it is specifically recruited at mitotic exit to drive nuclear envelope sealing. Moreover, there are many hits for which there is prior published evidence of a connection to nuclear shape or size that are ignored: examples include BANF1, CHMP7, Nup155 (and likely far more that I am not aware of). This is a missed opportunity to put the findings into context and to provide a more mechanistic interpretation of the type of effects that lead to the observed changes in nuclear appearance. For example, there are already hints as to whether the effects occur as a mitotic exit defect versus an interphase defect, but conceptually this is not addressed.

      We appreciate this important point. We find that one of the major challenges in presentation of screening results is to provide detailed information on all interesting hits within the length limits of a manuscript! To provide a more comprehensive picture, we have now performed pathway analysis using STRING to display protein interaction networks to more comprehensively classify hits and groups of hits (Figures S7 and S8). We find highly connected regions in the network corresponding to condensin and histone modifiers in fibroblast hits altering nuclear shape. In contrast, MCF10AT hits showed increased connectivity with nucleoporin proteins. Fibroblast hits displaying an increase in nuclear size identified multiple nucleoporins and MCF10AT hit analysis identified components of DNA replication. We have added these findings to Supplementary Figures 7 and 8 and discuss them on page 16. Also, as requested, we added more than 20 new references and additional information on previously identified functions of some hits discussed in the text on p. 22-24.

      2) Validation of the screen is lacking. There appears to be no evidence that the authors validated the initial screen hits by additional siRNA experiments in which the levels of the knock-down could be assessed. As an example: do nucleoporin hits decrease in their abundance at the nuclear envelope in these conditions? This validation is absolutely essential.

      As requested, we now include in Tables S6A-C, data from independent validation experiments in which we selected the primary hits and validated them using an independent set of siRNAs with distinct chemistry and target sequences. Additionally, we demonstrate efficient knockdown capabilities for 8 targets in Supplementary Figure 9 with knockdown levels for most siRNAs of at least 60%. We find no strong relationship between knockdown efficiency and the extent of the observed phenotype (compare Figure S9 and Figure S10).

      3) Differences in cell type - the authors' interpretation that a lack of overlap in the hits across cell types reveals that there are fundamentally cell type-specific mechanisms at play is a stretch. This could also reflect a lack of robustness in the screen, which should be addressed by directly testing the knock-down of the hits from one cell line in the other. Even if this approach reinforces the cell type specificity, it may be the biology beyond the nucleus itself that differs rather than the nuclear response: an obvious example is the mechanical state of the cell (organization of the cytoskeleton, adhesions, etc.), which influences the forces exerted on the nucleus. These caveats need to be explicitly acknowledged.

      As requested, we have now performed side by side experiments between both cell lines to directly compare a subset of nuclear morphology hits in parallel. They are shown in Supplementary Figure 10. We find a number of hits display strong nuclear shape abnormalities in either fibroblasts or MCF10AT cells but not both, with the exception of LMNA, which confirms our screen data. In addition, we compared the hits from our screen with previously published reports of other factors which regulate nuclear morphology to further strengthen our findings. We mention these results on p. 16. Despite these results, we have now toned down our statements regarding cell-type specificity of individual hits considering the small number of cell lines analyzed and the possible cellular factors which could contribute to cell-type specific differences.

      4) There are major issues with the interpretation of the presented biochemistry. For example, the basis for the supposed effect of monomer/dimer state of lamin is confusing and likely misinterpreted. It is well established that GST imposes dimerization on proteins expressed as GST fusions independent of cysteines. Any effect of DTT would have to manifest through some other mechanism (disulfides between the lamin domains - assumedly what the authors are thinking). Further, GST will impose dimerization of lamin A and lamin C in the co-incubation experiments. It is therefore entirely expected that if lamin A binds H3 and lamin C does not that the mixed dimers will bind H3 with lower affinity. Critically, this does not, however, address how full-length lamin C influences binding of lamin A to H3 in vivo. Last, how an effect of lamin C on lamin A would manifest through a disulfide bond in the nucleus, which has a reducing environment, is entirely unclear.

      We directly tested the possibility that GST causes artifactual dimerization of lamins by mutating cysteines to alanine in GST-lamin and assessing their effect on histone binding experiments. We show the results in Supplementary Figure 14E. If the observed binding were artifactually due to GST-mediated dimerization, we should not expect an effect of the cysteine mutants on histone binding. We find, however, that the C522A mutation in lamin A results in increased binding of H3 in the presence of lamin C, demonstrating that the observed effects are not due to GST dimerization. We discuss these results on p. 18 and p. 19.

      We agree with the referee that it will be exceptionally challenging to determine the in-vivo relevance of disulfide bonds, not knowing what the precise environment of the nucleus is. Given these caveats, we have now toned down this point and discuss the limitations of these findings in more detail on p. 19, 23, 24, and 25.

      5) It is important for the authors to address the concept of nuclear size changes versus changes in the nuclear to cell volume ratio – biologically these could be quite different conditions, but obviously these cannot be distinguished by measuring nuclear volume alone. Addressing this experimentally would be best (to provide more depth to the size measurements).

      This is an important point. As requested, we now clearly indicate on p. 23 that we are measuring nuclear area using nuclear cross-sections as a proxy for nuclear size rather than nuclear to cell volume ratio. We have found in our imaging studies over the past two decades that measuring cell volumes is exquisitely challenging and often highly inaccurate. A major challenge in these approaches is the correct identification of cell boundaries. This is particularly difficult in a high-throughput setting, since cell volume measurements require z-stacks, which greatly complicate the imaging and quantitative analysis of the millions of cells analyzed in a screen. Ultimately, measurements of cell volume for adherent cells will only be estimates (see for example PMID 28622449). We now clearly indicate this limitation of our approach and discuss on p. 15 and 23 previous studies measuring nuclear size and cell volume ratio measurements and how it compares to measuring nuclear area alone. We have also added several references on this topic on p. 15 and 23.

      6) There are important caveats to the approach of using the nuclear area as proxy measurement for nuclear size, most prominently that it is highly responsive to changes in nuclear height that can occur for a multitude of reasons (increased height = small radius and decreased height = larger radius), particularly given the different cell types. This needs to be acknowledged directly.

      Along the lines of point 5 and as requested, we now more clearly acknowledge on p. 23 these caveats due to our screening method of measuring nuclear area as a proxy for nuclear size. Nuclear cross-sectional area has been experimentally shown to be a good proxy for nuclear size in many systems (see PMID 31085625). For this reason, and because quantifying nuclear size from z-stacks would have greatly complicated the imaging and quantitative analysis, we chose to use nuclear cross-sectional area as our metric for nuclear size. In looking through our data, we did not find any significant differences in nuclear height between the two cell lines used or amongst hits and non-hits. With respect to the issue of different cell types, our analysis focused on RNAi knockdowns that altered nuclear morphology in a given cell line and we did not compare cell lines against each other. Separate analyses were performed for each cell line, so possible differences in nuclear height between the different cell lines used should not affect our analysis. We now discuss these issues on p. 23.
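
      The height caveat raised in point 6 can be illustrated with a minimal sketch. It assumes an idealized nucleus of fixed volume with a uniform cross-section (a cylinder), which is a simplification introduced here and not a model used by the authors. The function name and the numbers are hypothetical.

```python
def cross_section_area(volume, height):
    """Cross-sectional area of an idealized cylindrical nucleus.

    Assumes a uniform cross-section, so area = volume / height.
    This is an illustrative simplification, not the authors' method.
    """
    if height <= 0:
        raise ValueError("height must be positive")
    return volume / height


# Hypothetical values in cubic micrometres and micrometres
volume = 600.0
flat_area = cross_section_area(volume, 4.0)  # flatter nucleus
tall_area = cross_section_area(volume, 6.0)  # taller nucleus, same volume
```

      Under this assumption, two nuclei of identical volume report different areas purely because of height, which is why a height comparison between conditions matters when area is used as the size metric.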

      7) What is the evidence that the H3 effects manifest through lamins rather than directly?

      We apologize for not being clear. We did not intend to state that H3 acts via lamins. We do find that H3 physically interacts with lamins and that H3.3 mutants (K9M, K27M, and K36M) result in nuclear morphology defects. We now also show in the new Figure S17 that H3.3 mutants slightly affect lamin levels. However, as pointed out by the reviewer, these observations do not categorically rule out non-lamin related mechanisms and we now make it clear in our discussion on p. 20 that the effect of H3 may either be mediated via lamins or independently.

      8) Context is needed for the "methyl-methyl" histone states described as being the highest binders in the peptide array experiments. Are these states commonly found? Where in the genome? Does this match any DamID data? Again - more depth of investigation is required.

      This is a good point. Unfortunately, to our knowledge there is currently no ChIP-seq human genome map of di-methyl modifications on histone tails available. We were unable to generate or procure the individual dually methylated peptides, and methyl-methyl H3 antibodies are not available, so we are not able to perform quantitative binding assays. However, to begin to address this issue, we now provide in a new Supplementary Table 8 quantitative data of binding intensities. Given these limitations, we have now toned down the claims regarding the methyl-binding sites.

      9) That oncohistones induce changes in nuclear shape or size does not mean that this is related to the mechanism in cancer. Also - how over-expression of H3 without its obligate partner H4 could disrupt the cell or an assessment of the extent of the oncohistone incorporation into chromatin achieved in these experiments makes it challenging to interpret.

      We agree and did not intend to imply that the oncogenic function of the histone mutants involves changes in nuclear morphology. We now clearly state so on p. 25 and we also mention the caveat of the overexpression experiment.

      10) Throughout the manuscript it would be helpful to the reader if the author would provide at minimum a brief statement on the previously identified functions of the hits that are explicitly discussed beyond their localization (membrane versus chromatin). References would also be helpful (for example, again - what is the evidence that SLC27A3 resides at the nuclear envelope?).

      As requested, we added more than 20 new references and now provide additional information and previously identified functions of many of the hits mentioned in the text.

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Huang et al. examines the potential "self-policing" of Bacillus cells within a biofilm. The authors first discover the co-regulation of lethal extracellular toxins (BAs) and the self-immunity mechanisms; the global regulator Spo0A controls both. The authors further show that a subpopulation of cells co-express these genes and speculate that these cells engage in preferential cooperation for biofilm formation (over cells that produce neither). Based on previous literature, the authors then evaluate the relative fitness of the wild-type strain compared to mutants locked into either constantly exporting the toxins or permanently immune to these poisons. The wild-type exhibited increased fitness (compared to the mutants) for the tested biofilm conditions. The manuscript raises interesting ideas and provides a potential model to probe questions of cooperativity in Bacillus biofilms.

      Strengths:

      • The authors use fluorescence-producing reporter strains to discern the spatial expression patterns within biofilms. This real-time imaging provides striking confirmation of their conclusions about shared co-regulation.

      • The authors also nicely deploy genetic constructs in microbiological assays to show how toxin production and immunity can influence biofilm phenotypes, including resilience to stress.

      Thank you very much for your positive comments. Our detailed responses to your comments and suggestions are as follows.

      Concerns:

      • My biggest concern is that the claim of policing on a single-cell level needs more quantitative microscopy, particularly of the xylose-induced strain. The data support a more tempered consideration of self-policing via BAs and self-resistance in this Bacillus species. It seems sufficient that this manuscript opens the door for a novel and readily examinable system for examining potential cooperation and its molecular controls (without making broader claims).

      Thank you very much for your comments. We demonstrated the policing system on a single-cell level by re-filming the progression of individual nonproducers from living to dead, and even to disappearance, in a biofilm population (please see the pictures in Figure 2 and the statistical data in Figure 2-figure supplement 1 of the revised manuscript, as well as revised Figure 2-video 1-4). The xylose-induced strain (SQR9-Pxyl-accDA), by contrast, was constructed to assess the involvement of AccDA expression (controlled by Spo0A~P in the wild-type but induced by exogenous xylose here) in regulating BAs synthesis and immunity. With xylose addition, the expression of AccDA is likely to be homogeneous in the colony, rather than heterogeneous as in the wild-type population.

      • The discussion is more speculative than the presented data warrants. For example, the speculation in lines 289 - 310 is not anchored in the results. It is hard for this reviewer to imagine how one would use the genetic framework and tools developed in this manuscript to address the ideas proposed in lines 289 - 310.

      Thank you for your comments. We have revised the discussion so that it is anchored more closely in the data and relies less on speculation. The hypothesis about the evolution of this system (Lines 289-310 in the original version) was included as a complement to the molecular mechanism of the policing system, to suggest how the system may have arisen; it is based on ecological theories of division of labour and kin selection (refs. 4-6). We have shortened this discussion in the revised manuscript.

      • Some conclusions (in the results section) are more decisive than the data supports. For example, the microscopy of the PI staining (as presented in Figure 2 and the supplemental movies) does not prove that only non-expressing cells die. Yet the conclusion in line 143 states that "ECM and BAs producers selectively punish the nonproducing siblings." Also, the presented data shows many non-labeled cells without PI; why do some nearby non-gfp-expressing cells remain alive?

      Thank you for your constructive comments. Following the reviewer's suggestion, we carried out an observation covering a more complete span of biofilm formation, together with a more convincing statistical analysis. We repeated the microscope observation for 3 h during biofilm formation and assessed the source and location of dead cells. The results showed that all dead cells originated from the subpopulation that did not express gfp (the nonproducers), and the number of dead cells adjacent to the producers was significantly higher than the number close to the non-producers (please see the pictures in revised Figure 2 and Figure 2-figure supplement 1).

      In addition, regarding the survival of some non-gfp-expressing cells near the producers, based on several relevant reports (refs. 1-3) and the observations in the present study, we speculate that the coordination system for optimizing the division of labor is relatively mild, so that only a part of the nonproducers (relatively sensitive cells, or those facing higher concentrations of the toxin) are eliminated. We think this reflects a balance between restraining the cheater-like subpopulation and retaining the advantages of cell differentiation.

    1. Author Response

      Reviewer #1 (Public Review):

      The work in this study builds on previous studies by some of the same authors and aims to test whether the heartbeat evoked response was modulated by the local/global auditory regularities and whether this differed in post-comatose patients with different consciousness diagnoses. The authors report that during the global effect there were differences between the MCS and UWS patients.

      The study is well constructed and analysed and has data from 148 participants (although the maximum in any one group was 59). The reporting of the results is excellent and the conclusions are supported by the results presented. This study and the results presented are discussed as evidence that EEG-based techniques may be a low-cost diagnostic tool for consciousness in post-comatose patients, although it should be stressed that here no classification of diagnostics was performed on the EEG data.

      One potential weakness was the relationship between the design of the experiment and the analysis pathway for the results. If I have understood the experimental design correctly, the auditory regularity changed depending on whether the local/global regularity was standard/deviant. In the analysis, the local or global regularity was compared between the standard and deviant trials across all conditions. This difference was then compared between MCS and UWS patient groups. For these analyses the results for the healthy and emerging MCS groups were not included. If this is correct it would be interesting to understand the motivation for this. Relatedly, it would be good to clarify if the effects reported were corrected for the multiple planned contrasts and if not why they should not be corrected.

      Thank you for the appreciation of our work and the constructive comments. The misdiagnosis of MCS/UWS patients in clinical practice occurs because covert consciousness goes undetected in the absence of overt behavioral signs of consciousness. Therefore, the main motivation of our study is to contribute to a better distinction between those two patient groups.

      We have modified the introduction to clarify that the objective of the paper is to show in greater detail the group differences between MCS and UWS patients:

      "In this study, we analyze HERs following the presentation of auditory irregularities, with special regard on distinguishing UWS (n=40) and MCS (n=46) patients. Note that the automated classification of this cohort was previously performed in another study (Raimondo et al., 2017). Therefore, our aim is to characterize the group-wise differences between UWS and MCS patients that may allow a multi-dimensional cognitive evaluation to infer the presence of consciousness (Sergent et al., 2017), but also complement the bedside diagnosis performed with neuroimaging methods that capture neural correlates of covert consciousness (Sanz et al., 2021)."

      Reviewer #2 (Public Review):

      The goal of this study was to determine whether heartbeat-evoked responses measured at the scalp level with EEG, which followed regularity violations, could potentially help inform the diagnosis of patients with altered states of consciousness.

      The authors use high density EEG and an oddball paradigm that probes violations of both local and global regularities. Four groups were considered including unresponsive wakefulness syndrome patients, minimally conscious patients, emerging minimally conscious patients and healthy controls. A difference was found between unresponsive and minimally conscious patients in the amplitude of the heartbeat evoked responses measured with EEG following a sound that violated a global regularity. Similarly, differences were found between the variance of these responses between the two above mentioned groups (N=58 and N=59), but no differences were found in relation to the healthy control group, which appear to be "in between" the two other groups (at least for global effect of HER). I thought this was a little counterintuitive and raises some questions about what this neural signature can tell us about the state of consciousness. Having said that, the healthy control sample was very small, more than 5 times smaller (only N=11).

      We thank the reviewer for their comments. As described above, distinguishing between MCS/UWS patients is one of the main challenges in clinical practice. We have modified the manuscript to show the differences between these two patient groups. Further data on EMCS and healthy participants is not included in this revision because of the new inclusion criteria.

      In general, I thought the Discussion section was a little light on the implications of the findings, what they tell us about the brain mechanisms of consciousness and their different levels/states. A question is raised about whether it is necessary to lock EEG to heartbeats to find differences between patients. The data appeared to say that this is not the case but the discussion does not appear to reflect that very clearly.

      We have enriched the discussion to comment on the relation of HERs to perception:

      "Our results contribute to the extensive experimental evidence showing that brain-heart interactions, as measured with HERs, are related to perceptual awareness (Azzalini et al., 2019; Skora et al., 2022). For instance, neural responses to heartbeats correlate with perception in a visual detection task (Park et al., 2014). Further evidence exists on somatosensory perception, where a higher detection of somatosensory stimuli occurs when the cardiac cycle is in diastole and it is reflected in HERs (Al et al., 2020). Evidence on heart transplanted patients shows that the ability of heartbeats sensation is reduced after surgery and recovered after one year, with the evolution of the heartbeats sensation recovery reflected in the neural responses to heartbeats as well (Salamone et al., 2020). The responses to heartbeats also covary with self-perception: bodily-self-identification of the full body (Park et al., 2016), and face (Sel et al., 2017), and the self-relatedness of spontaneous thoughts (Babo-Rebelo et al., 2016) and imagination (Babo-Rebelo et al., 2019). Moreover, brain-heart interactions measured from heart rate variability correlate with conscious auditory perception as well (Banellis and Cruse, 2020; Pérez et al., 2021; Pfeiffer and Lucia, 2017)."

      Reviewer #3 (Public Review):

      I found the results very interesting but wondered why the ERP results for the global vs. local effects are not reported. This analysis is mentioned in the methods section, but I do not find it in the results. Is this what is shown in the mid row in panel D? If yes, it should be made clearer. Is there a significant local and global deviant response in each patient group?

      We thank the reviewer for their appreciation of our work and their comments.

      We have reported the new results showing clustered effects in both ERPs and HERs.

      Additionally, eyeballing Figure 1, there are a few potential issues that may be affecting the conclusion re HER:

      (1) Panel D top: it seems that the orange trace (MCS) is largely the same in both the "Local" and "global" condition. But the blue trace (UWS) shows a larger negative going deflection in the "global" case. Put differently, the UWS, but not MCS patients appear to generate a different response to the Global effect relative to the local effect. Is this the case?

      We have separated Figure 1 into 3 new figures to clarify the results, and we also provide a more detailed description of our results.

      In brief, our results show that MCS may have a distinctive response to global and local effects. We have included a new correlation analysis in which we show that the responses to global and local effects are uncorrelated (Table 2).

      With respect to the “negative” responses in UWS, note that the measured effect corresponds to a linear combination of evoked potentials, e.g.: global effect = mean(global deviants) - mean(global standard). Therefore, the negative group-wise response may imply that global standard responses are larger than global deviants. We have included in Table 1 the statistical tests to show whether the responses to local and global effects are different from zero.

      (2) There are some MCS subjects that appear to show a global effect that is larger than that observed in EMCS and healthy controls. How do you interpret these data?

      We have included in the discussion a paragraph in which we discuss the outliers:

      "Note that outliers are expected in disorders of consciousness and exact physiological characterization of the different levels of consciousness remains challenging. First, the standard assessment of consciousness based on behavioral measures has shown a high rate of misdiagnosis in MCS and UWS (Stender et al., 2014). The cause of the misdiagnosis of consciousness arises because consciousness does not necessarily translate into overt behavior (Hermann et al., 2021). Unresponsive and minimally conscious patients, namely non-behavioral MCS (Thibaut et al., 2021), represent the main diagnostic challenge in clinical practice. Second, some of these patients suffer from conditions that may translate to no response to stimuli, even in presence of consciousness. For instance, when they suffer from constant pain, fluctuations in arousal levels, or sensory impairments caused by brain damage (Chennu et al., 2013). Third, these patients were recorded in clinical setups, which may lead to a lower signal-to-noise ratio, and lead to biased measurements in evoked potentials (Clayson et al., 2013)."

      (3) How do you interpret the negative average HER data shown by many UWS patients?

      As mentioned above, the negative HER is a result of a linear combination of different HER-based markers (deviants minus standard).
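
      The deviant-minus-standard contrast described in this response can be sketched as follows. This is a minimal illustration, assuming epochs are stored as trials-by-samples arrays; the function name and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np


def contrast_effect(deviant_trials, standard_trials):
    """Return mean(deviants) - mean(standards) at each time sample.

    Each input is an array of shape (n_trials, n_samples) holding
    heartbeat-evoked epochs for one participant and one channel.
    """
    deviant_trials = np.asarray(deviant_trials, dtype=float)
    standard_trials = np.asarray(standard_trials, dtype=float)
    return deviant_trials.mean(axis=0) - standard_trials.mean(axis=0)


# Hypothetical toy epochs (microvolts): standards are larger than deviants
standards = np.array([[1.0, 2.0, 3.0], [1.0, 2.0, 5.0]])
deviants = np.array([[0.5, 2.0, 2.0], [0.5, 1.0, 3.0]])
effect = contrast_effect(deviants, standards)
```

      With these toy values the contrast is negative at every sample, simply because the standard responses are the larger ones. A negative effect therefore does not by itself indicate a response of inverted polarity.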

    1. Author Response

      Reviewer #3 (Public Review):

      1) While the data are generally very convincing, the authors overstated the conclusions in several instances. For example, the authors state that EPAC and PKCε are "required" or "essential" for vesicle docking and release. However, the author's own data show that both vesicle docking and release are clearly present (though reduced) in the absence of EPAC and PKCε, demonstrating they are not absolutely required. The language could be toned down without diminishing the impact of the excellent work.

      We thank you for these important comments. We have double-checked the manuscript and modified the language of our statements. In particular, we have changed the unnecessary words “required” and “essential” to “regulate” or “important”.

      2) The authors used analysis of cumulative EPSCs to estimate release probability (Pr) and the readily releasable pool (RRP) size. Unfortunately, this approach is likely not suited for low release probability synapses such as parallel fibers (the authors estimate Pr to be 0.04-0.06). Thanawala and Regehr (2016) extensively investigated the validity of cumulative EPSC analysis under a variety of conditions. They found that this analysis produces large errors in Pr and RRP at synapses with a Pr below ~0.2. In addition, 20 Hz EPSC stimulation (as was used here) produces much larger errors compared to the more commonly used 100 Hz stimulation. Between the low Pr at parallel fiber synapses and the low stimulus frequency used, it is likely that the cumulative EPSC analysis provides a poor estimate of Pr and RRP in this case.

      Thanks for the very insightful comment. In the previous experiments, we measured RRP and Pr based on parameters taken from work in hippocampal CA1 neurons (He et al., 2019), which, in our opinion, are similar to PF-PC synapses with respect to low release probability. We have carefully read the Thanawala and Regehr (2016) paper and compared the different methods. While the performance of the EQ method is in general more reliable for estimating small RRP and low Pr, it relies on p being constant throughout a stimulus train (Thanawala and Regehr, 2016). Although p may be constant for the calyx of Held synapses they studied, this cannot be the case for PF-PC synapses. Therefore, we decided to redo the estimations of RRP and Pr using a 100-Hz train (previously a 20-Hz train). This method does not require constant p and allows us to obtain a better estimate of RRP and Pr at PF-PC synapses (Thanawala and Regehr, 2016).

      The new results have been presented in new Fig. 2E and 2F. The PF-PC synapses were stimulated at a frequency of 100 Hz, the artifacts were truncated, and the EPSCs were aligned (Fig. 2E and 2F). Note that the aim of this experiment was to investigate whether there is a difference between control and cKO mice. Indeed, we found that the amplitudes of both EPSC0 and follow-up EPSCs were smaller in cKO mice, indicating that both the initial release and the replenishment are reduced by the conditional knockout of EPACs or PKCε. Compared to the 20-Hz train, the 100-Hz train brought EPSCs into steady state faster. We created a linear fit from normalized steady-state EPSCs and back-extrapolated the curve to the y-axis to calculate Pr. Indeed, we found that the Pr value estimated from the 100-Hz train stimulus was significantly larger than that from the 20-Hz train, showing 0.17 (Math1-cre) and 0.19 (PKCεf/f) with 100-Hz, but 0.07 (Math1-cre) and 0.08 (PKCεf/f) in the previous submission. This result was similar to Thanawala and Regehr (2016), in which they claimed that the accuracy of estimation from a 100-Hz train is about three times that from a 20-Hz train. Moreover, we found that the conditional knockout of either EPACs or PKCε produced a significant decrease in Pr (Math1-cre 0.17 vs Math1-cre;EPAC1cKOEPAC2cKO 0.11; PKCεf/f 0.19 vs PKCεcKO 0.12). These results have been added in the text and figure legend (Fig. 2E and 2F), and corresponding methods have also been updated.
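
      The back-extrapolation step described above can be sketched in a few lines. This is a minimal illustration of a cumulative-EPSC train analysis, assuming stimuli are numbered from 1, that the last `n_steady` responses are at steady state, and that Pr is taken as the first EPSC divided by the extrapolated intercept. The function name, the `n_steady` parameter and the amplitudes are hypothetical and do not reproduce the authors' analysis code or data.

```python
import numpy as np


def train_method(epsc_amplitudes, n_steady=10):
    """Estimate RRP and Pr from a stimulus train by back-extrapolation.

    A line is fitted to the last `n_steady` points of the cumulative
    EPSC curve and extrapolated to stimulus number 0. The y-intercept
    is the RRP estimate (same units as the EPSCs), and Pr is the first
    EPSC divided by that estimate.
    """
    amps = np.asarray(epsc_amplitudes, dtype=float)
    if amps.size < n_steady or n_steady < 2:
        raise ValueError("need at least n_steady >= 2 responses")
    cumulative = np.cumsum(amps)
    stim_number = np.arange(1, amps.size + 1)
    slope, intercept = np.polyfit(
        stim_number[-n_steady:], cumulative[-n_steady:], 1
    )
    rrp = intercept
    pr = amps[0] / rrp
    return rrp, pr


# Hypothetical depressing train (arbitrary units), 20 stimuli
amplitudes = [100, 90, 80, 70, 60] + [50] * 15
rrp, pr = train_method(amplitudes, n_steady=10)
```

      The estimate depends on how many points are treated as steady state and on how quickly the train reaches it, which is the reason the choice of stimulation frequency matters for low-Pr synapses.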

      3) Using a combination of genetic knockouts and pharmacology, this paper convincingly shows that presynaptic EPAC/PKCε are necessary for presynaptic LTP, but do not alter postsynaptic LTP/LTD. However, given the experimental conditions in the slice experiments, it is difficult to extrapolate from the slice data to in vivo plasticity during motor learning. Synaptic plasticity in the cerebellar cortex is quite complex and can depend significantly on age, temperature, location, and ionic conditions. Unfortunately, these were not well matched between slice and in vivo experiments. Slice experiments used P21 mice, while in vivo experiments were performed at P60. Slice experiments were performed in the vermis, while VOR expression/adaptation generally requires the vestibulo-cerebellum/flocculus. Slice experiments were performed at room temperature, not physiological temperature. Lastly, slice experiments used 2 mM Ca2+ in the ACSF, somewhat high compared to the physiological extracellular fluid. Each of these factors can significantly affect the induction and expression of plasticity. These differences leave one wondering how well the slice data translate into understanding plasticity in the in vivo context.

      This is a great question. To date, almost all published recordings of PC plasticity were made in young adult mice (< 1 month) and at room temperature, and most behavioral experiments were conducted around 2-3 months of age. To better answer the reviewer’s comment, we tried our best to redo the LTP experiments under the requested, alternative conditions (in 2-month-old mice, low Ca2+ or high recording temperature). Our new data show that, under these conditions, EPACs and PKCε are still needed for the induction of presynaptic PC-LTP (Figure 3–figure supplement 2-4). In addition, we have tried to record PC EPSCs in the flocculus. Unfortunately, we found that PC EPSCs there were quite unstable, which might be due to the more complex orientation of PCs and their innervations. We have discussed the reviewer’s comment in the revised manuscript: “Second, presynaptic PF-PC LTP was performed in the cerebellar vermis in the present work, whereas VOR learning generally requires PC activity in the flocculus. Unfortunately, we found that PC-EPSCs in the flocculus were not suitable to record PC plasticity because they were unstable” (Line 557).

      4) Many experiments use synaptosomal preparations. The authors identify excitatory synapses by VGLUT labelling, but it is unclear how, or if, the authors distinguish between parallel fiber, climbing fiber, and mossy fiber synaptosomes. These synapses likely have very different properties and molecular composition; some quantification or estimation of how many synaptosomes are derived from each type of synapse would be helpful.

      We have performed synaptosome staining for vGluT1/vGluT2, EAAT4, and bassoon to identify PF-PC synapses (vGluT1+EAAT4+) and CF-PC synapses (vGluT2+EAAT4+). Our staining results showed that PF-PC synapses accounted for 88.8% of the total and CF-PC synapses for 7.5% of the total. Thus, we estimated the number of mossy fiber synapses to be less than 3.7%, which would not affect our conclusion. These results have been presented in Figure 1–figure supplement 1.
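
      The 3.7% bound follows from simple subtraction, which can be sketched as below. This is only an illustration of the arithmetic; the function name is hypothetical, and it assumes that every synaptosome not classified as PF-PC or CF-PC is counted toward the upper bound for mossy fiber synapses.

```python
def residual_percent(pf_pc: float, cf_pc: float) -> float:
    """Upper bound (in percent) on synaptosomes that are neither PF-PC nor CF-PC.

    Assumption: the two classified fractions are percentages of the same total.
    """
    return round(100.0 - pf_pc - cf_pc, 1)


# Percentages quoted in the response above.
print(residual_percent(88.8, 7.5))
```

      Because the remainder can also contain unclassified or non-mossy-fiber particles, it is an upper bound rather than a direct estimate.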

      5) The math1-cre mouse line is used to selectively knock out EPAC or PKCε expression in cerebellar granule cells. This line also expresses Cre in unipolar brush cells (UBCs) of the cerebellum (Wang et al., 2021). This is likely not a factor in the molecular/slice studies of EPAC/PKC signaling, but UBC dysfunction could play a role in motor/learning deficits observed in vivo. This possibility is not considered in the text.

      There is indeed evidence that UBCs are involved in cerebellar ataxias (Kreko-Pierce et al., 2020). How UBCs precisely participate in motor learning or VOR learning is unclear, but they are suggested to be involved in motor performance (Mugnaini et al., 2011; Guo et al., 2021). So, we agree with the reviewer that this option cannot be excluded. Therefore, we have revised the discussion about the potential role of UBCs: “Two caveats should be considered in the present studies. First, Math1-Cre-induced deletion of EPAC or PKCε might affect the function of unipolar brush cells (UBCs), which are involved in cerebellar ataxias (Kreko-Pierce et al., 2020). However, we believe that the EPAC-PKCε module regulates VOR learning through presynaptic plasticity mechanism at PF-PC synapses rather than UBCs, in line with the observations in other granule cell-specific mutations (Galliano et al., 2013; Schonewille et al., 2021).” (Line 552).

      References:

      Mugnaini E, Sekerková G, Martina M. The unipolar brush cell: a remarkable neuron finally receiving deserved attention. Brain Res Rev. 2011;66(1-2):220-45.

      Guo C, Rudolph S, Neuwirth ME, Regehr WG. Purkinje cell outputs selectively inhibit a subset of unipolar brush cells in the input layer of the cerebellar cortex. Elife. 2021;10:e68802.

      Kreko-Pierce T, Boiko N, Harbidge DG, Marcus DC, Stockand JD, Pugh JR. Cerebellar ataxia caused by Type II unipolar brush cell dysfunction in the Asic5 knockout mouse. Sci Rep. 2020;10:2168.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper uses light field microscopy to measure calcium signals across the fly brain while it is walking and turning, and also while the fly is externally driven to walk and turn, using a treadmill. The authors drive calcium indicator expression using pan-neuronal drivers, as well as drivers specific to individual neurotransmitters and neuromodulators. From their experiments, the authors show that inhibitory and excitatory neurons in the brain are activated in similar patterns by walking and that neurons expressing machinery for different neuromodulatory amines tend to show differentially strong calcium signals during walking. By examining spontaneous and forced walking and turning, the authors identify brain regions that activate before spontaneous turning and that activate asymmetrically in concert with spontaneous or forced turning.

      Strengths: Overall, the strength of this paper is in its careful descriptions and analyses of whole brain activation patterns that correlate with spontaneous and forced behaviors. Showing how the pattern of activity relates to broad classes of cells is also useful for understanding brain activation. Especially in brain regions identified as preceding spontaneous walking and in being asymmetrically involved in spontaneous and forced turning, it provides a wealth of potential hypotheses for new experiments. Overall, it contributes to a coarse-grained understanding of broad changes in brain activity during behavior.

      Weaknesses: The primary weakness of this paper is that it presents some speculative interpretations and conclusions too strongly. Most importantly, average activity in a neuropil can represent the calcium activity of hundreds or thousands of neurons, and it is hard to know what fraction is active, for instance, or how expression pattern differences might play into calcium signals. Calcium signals also do not reliably indicate hyperpolarization, so a net increase in the average Ca++ indicator signal does not necessarily reflect that the average neuron is becoming more active, just that some labeled neurons are becoming more active, while others may be inactive or hyperpolarized. The conclusions about regions triggering walk (rather than just preceding it) are too strong for the manipulations in this paper, as are some of the links with individual neuron types. Thus, substantial caveats need to be presented alongside the conclusions drawn from the data presented here.

      We thank the reviewer for their assessment and the positive comments on our manuscript. We have made these caveats clear throughout the manuscript by adding text and removing overly strong conclusions and speculations.

      Reviewer #2 (Public Review):

      Aimon et al. used fast whole-brain imaging to investigate the relationship between walking and neural activity in adult fruit flies. They find that increases in brain-wide activity are tightly correlated with walking behavior, and not with grooming or flailing, and are independent of visual input. They reveal that excitatory, inhibitory, and neuromodulatory neurons all contribute to brain-wide increases in neural activity during walk. Aimon et al. extend their observations of brain-wide activity to reveal that activity in some inferior brain regions is more correlated with walk than in other brain regions. The authors further analyzed their imaging dataset to identify candidate brain regions and cell types that may be important for walking behavior, which will be useful in hypothesis generation in future studies. Finally, the authors show that brain-wide activity is similar between spontaneous and forced walk and that severing the connection between the ventral nerve cord and central brain abolishes walk-related increases in brain activity. These results suggest that increases in brain-wide activity during walking may be largely attributed to sensory and proprioceptive feedback ascending to the central brain from the ventral nerve cord rather than to top-down executive and motor control programs. The observations presented in this study suggest hypotheses that may be tested in future studies.

      Strengths: This paper presents a rich imaging dataset that is well-analyzed and cataloged, which will be valuable for researchers who use this paper for future hypothesis generation. The comparison of many different reagents, imaging speeds, and behavioral conditions suggests that the observed increases in brain-wide activity during walking are quite robust to imaging methods in adult fruit flies.

      Weaknesses: This study is largely observational, and the few experimental manipulations presented are insufficient to support the authors' broad claims about the generation of brain-wide neural activity.

      We thank the reviewer for their assessment and have toned down claims throughout the paper accordingly.

      Notably, the authors suggest that their image analysis can reveal individual cell types that are important for walking by matching their morphologies to registered components from whole-brain imaging experiments. While these predictions are a useful starting point for future experiments, they have not convincingly shown that their method can identify individual cell types in genetic reagents with more restricted expression patterns. Adding further validation to show that genetically subtracting the candidate neurons from the overall expression pattern of the calcium indicator abolishes that component from the response would strengthen this claim. Furthermore, imaging the matched candidate neuronal cell type to show that it recapitulates the activity dynamics of the proposed component would add additional evidence.

      We agree that the correspondence to specific neuron types is often very speculative. We have clarified this throughout the manuscript. There are a few exceptions where the neurons we discuss are the only known neurons in a specific GAL4 expression pattern in a given region, and where we find the exact anatomical pattern matching these neurons’ anatomy. Together, this makes us quite confident that the activity results indeed from these neurons. However, the experiments proposed by the reviewer would be interesting complementary approaches. We believe, however, that abolishing activity in one neuron will be difficult to interpret regarding the neuron type as it would affect the activity of other neurons in the network (which is, in our opinion, an interesting point and research direction). Nevertheless, we plan to perform such experiments and experiments looking at the activity in more restricted drivers in the future.

      In addition, increases in neural activity prior to walk onset in specific brain regions are intriguing but insufficient to demonstrate the neurons in these regions trigger walking. This claim should await further studies that employ targeted and acute manipulation of neural activity, as noted by the authors. Furthermore, that activity in these brain regions is significantly increased prior to walk onset awaits more rigorous statistical testing, as do the authors' claims that spontaneous versus forced walking alters these dynamics. The suggestion that walking increases brain-wide activity via feedback from the ventral nerve cord is an interesting possibility and would also benefit from additional experimental validation. Activating and silencing neurons that provide proprioceptive feedback from the legs and determining the effect of this manipulation on brain-wide neural activity would be a good starting point.

      We have removed claims of causality in the Results section. We have also added a statistical test for activation before walk onset. Activating and silencing proprioceptive neurons from the legs would be interesting follow-up experiments, although these manipulations are likely to affect walking. Nevertheless, we are planning to carry out such experiments in the future. We have added this point in the discussion.

      Reviewer #3 (Public Review):

      Aimon and colleagues investigated brain activity in flies during spontaneous and forced walking. They used light-field microscopy to image calcium activity in the brain at high temporal resolution as the animal walked on a ball and they used the statistical inference methods PCA and ICA to tease out subregions of the brain that had distinct patterns of activity. They then sought to relate those patterns to walking. Most interesting are the experiments they performed comparing forced walking to spontaneous walking because this provides a framework to generate hypotheses about which aspects of neural activity are reporting the animal's movements versus generating those movements. The authors identify subregions and neuron types that may be involved in generating vs reporting walking. Their analysis is reasonable but could be further strengthened with a more powerful statistical framework that explicitly considered the multiple hypotheses being tested. More broadly, the work serves as a starting point to investigate the role of different regions in the brain and should spur follow-up investigations that involve more perturbative approaches in addition to the correlative approaches presented here.

      We thank the reviewer for their overall positive assessment of our work and fully agree with the conclusion of its current limitations.
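
      One common way to account explicitly for multiple hypotheses, for example one test per brain region, is to control the false discovery rate with the Benjamini-Hochberg procedure. The sketch below is illustrative only: it is not the analysis used in the manuscript, the function name is hypothetical, and it assumes the per-region p-values are independent or positively dependent.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Standard step-up rule: find the largest rank k (1-based, ascending p)
    with p_(k) <= alpha * k / m, then reject every hypothesis of rank <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four regions (not data from the study).
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

      With these made-up values only the first region survives correction, although three of the four p-values are below 0.05 when taken individually.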

    1. Author Response:

      Reviewer #1 (Public Review):

      Tomasi et al. performed a combination of bioinformatic analyses and next-generation tRNA sequencing experiments to predict the set of tRNA modifications and their corresponding genes in the tRNAs of the pathogenic bacterium Mycobacterium tuberculosis. Long known to be important for translation accuracy and efficiency, tRNA modifications are now emerging as having regulatory roles. However, the basic knowledge of the position and nature of the modifications present in a given organism is very sparse beyond a handful of model organisms. Studies that can generate the tRNA modification maps in different organisms along the tree of life are good starting points for further studies. The focus here on a major human pathogen that is studied by a large community raises the general interest of the study. Finally, deletion of the gene mnmA, responsible for the insertion of s2U at position 34, revealed defects in growth in macrophages but not in test tubes, suggesting regulatory roles that will warrant further studies. The conclusions of the paper are mostly supported by the data, but the partial nature of the bioinformatic analysis and the absence of mass spectrometry data make it incomplete. The authors do not take advantage of the mass spectrometry data published for Mycobacterium bovis (PMID: 27834374) to discuss what they find.

      Important points to be considered:

      1) The authors say they took a list of proteins involved in tRNA modifications from Modomics and manually added a few, but we do not know the exact set of proteins that were used to search the M. tuberculosis genome.

      Thank you for pointing out this issue. We will add the complete list of proteins used for the BLAST query.

      2) The absence of mnmGE genes in TB suggested that the xcm5U derivatives are absent. These are present in M. bovis (PMID: 27834374). Are the mnmEG genes found in M. bovis? If yes, then the authors should perform a phylogenetic distribution analysis in the Mycobacterial clade to see when they disappeared. If they are not present in M. bovis, then maybe a non-orthologous set of enzymes performs the same reaction, and then the authors really do not know which modification is present or absent at U34 without LC-MS. The exact same argument can be made for the xmo5U derivatives, which are also found in M. bovis but not predicted by the authors in M. tuberculosis.

      The reviewer raises a valid point. In M. bovis, mnm5U and cmo5U derivatives were observed in LC-MS analysis. However, we did not identify candidate genes known to be involved in the biogenesis of mnm5U and cmo5U in the Mycobacteriaceae, including M. bovis and Mtb, suggesting that if these modifications are indeed present, they are not synthesized through canonical biogenesis pathways in this family. There are several examples where the same modification is generated by distinct modification enzymes (Kimura, 2021). These observations raise the interesting possibility that in the Mycobacteriaceae and most species in Actinomycetota (except for Bifidobacterium, Corynebacterium and Rhodococcus species), major wobble modifications are generated by biosynthesis pathways that are distinct from those employed by well-characterized organisms. Future studies will examine this hypothesis.

      3) Why is the Psi32 that the authors predict from the presence of the Rv3300c/Psu9 gene not detected by CMC-treated tRNA-seq while the other Psi residues are? Members of this family can modify both rRNA and tRNA, so the presence of the gene does not guarantee the presence of the modification in tRNAs.

      Thank you very much for the careful read. We did not include RluA in the list of query proteins because it is not classified as a tRNA modification enzyme in Modomics. Additionally, the CMC-coupled tRNA-seq is imperfect for detection of all pseudouridylated positions. Due to this limitation, we only assigned modifications that are supported both by the presence of putative biosynthetic enzymes and by RT-derived signatures. As the reviewer points out, we cannot rule out that this homolog targets only rRNAs. We will clarify this possibility in the revised manuscript. Also, RluA will be added to the query and the name of Rv3300c will be changed to RluA in the text and related figures.

      4) Why are tsaBED not essential but tsaC (called sua5 by the authors) essential?

      Thank you for pointing out this interesting observation. We are also curious about differences in the essentiality among t6A biogenesis genes. We speculate that TsaC potentially has critical roles in cell viability other than t6A synthesis. TsaC synthesizes a compound, threonylcarbamoyl-AMP, as an intermediate for t6A biogenesis. Thus, it is possible that this intermediate has a role in other essential cellular activities besides t6A biogenesis. Further study of these factors in Mtb could reveal interesting crosstalk between modification synthesis and other cellular activities.

      Reviewer #2 (Public Review):

      In this study, Tomasi et al identify a series of tRNA modifying enzymes from Mtb, show their function in the relevant tRNA modifications and by using at least one deleted strain for MnmA, they show the relevance of tRNA modification in intra-host survival and postulate their potential role in pathogenesis.

      Conceptually it is a wonderful study, given that tRNA modifications are so fundamental to all life forms, showing their role in Mtb growth in the host is significant. However, the authors have not thoroughly analyzed the phenotype. The growth defect aspect or impact on pathogenesis needs to be adequately addressed.

      - The authors show that ΔmnmA grows equally well in the in vitro cultures as the WT. However, they show attenuated growth in the macrophages. Is it because Glu1_TTC and Gln1_TTG tRNAs are not the preferred tRNAs for incorporation of Glu and Gln, respectively? And for some reason, they get preferred over the alternate tRNAs during infection? What dictates this selectivity?

      Thank you very much for raising this excellent point. As the reviewer suggests, the attenuation of ΔmnmA Mtb growth inside of macrophages could be caused by disparate codon usage between genes required for in vitro growth and intracellular growth. Among multiple codons encoding Glu, Gln, or Lys, s2U modification-dependent codons might be preferentially distributed in genes associated with intracellular growth. For example, Mtb has two tRNA isoacceptors, Glu1_TTC and Glu2_CTC, to decipher two Glu codons, i.e., GAA and GAG. According to the wobble pairing rule, GAA is only decoded by Glu1_TTC, whereas GAG is decoded by both Glu1_TTC and Glu2_CTC; i.e., GAG can be deciphered by an s2U-independent tRNA. Thus, genes required for intracellular growth might be enriched with GAA, an s2U-dependent codon. The same thing can happen to other Gln and Lys codons deciphered by s2U-containing tRNAs. In the revised manuscript, we will include the perspective of codon usage for explaining the intracellular fitness defect of the ΔmnmA Mtb mutant.
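
      A codon-usage comparison of this kind could be set up along the following lines. This is a minimal sketch rather than the analysis planned for the revision: the function names and the toy sequence are hypothetical, and it assumes in-frame coding sequences written in the DNA alphabet.

```python
from collections import Counter


def count_codons(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence (5'->3', DNA alphabet)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def gaa_fraction(cds: str) -> float:
    """Fraction of Glu codons that are GAA.

    Per the response above, GAA is read only by the s2U-bearing Glu1_TTC
    tRNA, whereas GAG can also be read by Glu2_CTC.
    """
    counts = count_codons(cds)
    glu_total = counts["GAA"] + counts["GAG"]
    return counts["GAA"] / glu_total if glu_total else float("nan")


# Hypothetical toy sequence, not an Mtb gene: ATG GAA GAG GAA GAA TAA
toy_gene = "ATGGAAGAGGAAGAATAA"
print(gaa_fraction(toy_gene))
```

      Comparing this fraction between gene sets required for intracellular growth and for in vitro growth would indicate whether s2U-dependent codons are enriched in the former; the same counting applies to the Gln and Lys codon pairs.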

      - As such the growth defect shown in macrophages would be more convincing if the authors also show the phenotype of complementation with WT mnmA.

      The reviewer raises a valid point. We note, however, that Rv3023c, a putative transposase, is downstream of MnmA and, unlike MnmA, appears to be dispensable for in vivo growth according to the Tn-seq database. Therefore, it is likely that the intracellular growth defect is caused by loss of mnmA.

      An important consideration here is the universal nature of these modifications across the life forms. Any strategy to utilize these enzymes as the potential therapeutic candidate would have to factor in this important aspect.

      This is a valid point. Targeting a pathogen-specific system enables avoidance of the adverse side effects caused by many therapeutic reagents. There are a couple of Mtb modification enzymes that are specific to bacteria and critical for Mtb fitness (e.g., TilS). These enzymes represent ideal potential therapeutic targets to suppress Mtb intracellular growth.

      Reviewer #3 (Public Review):

      The work presented in the manuscript tries to identify tRNA modifications present in Mycobacterium tuberculosis (Mtb) using reverse transcription-derived error signatures with tRNA-seq. The study identified enzyme homologs and correlates them with the presence of the respective tRNA modifications in Mtb. The study used several chemical treatments (IAA and alkali treatment) to further enhance the reverse transcription signals and confirm the presence of modifications in the bases. tRNA modifications by two enzymes, TruB and MnmA, were established by tRNA-seq of the respective deletion mutants. Ultimately, the authors show that MnmA-dependent tRNA modification is important for intracellular growth of Mtb. Overall, this report identifies multiple tRNA modifications and discusses their implications in Mtb infection.

      Important points to be considered:

      - The presence of tRNA-based modifications is well characterised across life forms including genus Mycobacterium (Mycobacterium tuberculosis: Varshney et al, NAR, 2004; Mycobacterium bovis: Chionh et al, Nat Commun, 2016; Mycobacterium abscessus: Thomas et al, NAR, 2020). These modifications are shown to be essential for pathogenesis of multiple organisms. A comparison of tRNA modification and their respective enzymes with host organism as well as other mycobacterium strains is required. This can be discussed in detail to understand the role of common as well as specific tRNA modifications implicated in pathogenesis.

      The reviewer raises a fair point. However, with the exception of Chionh et al., the other studies cited here are not genome-wide characterizations of tRNA modification. We will add a discussion of the distribution of tRNA modification enzymes across multiple mycobacterium species and the implications of this distribution for pathogenesis to the revised manuscript.

      - Authors state in line 293 "Several strong signatures were detected in Mtb tRNAs but not in E. coli". Authors can elaborate more on the unique features identified and their relevance in Mtb infection in the discussion or result section.

      Thank you for the suggestion. We will lengthen the discussion of the RT-derived signatures observed in Mtb but not in E. coli, but the relevance of these modifications for Mtb pathogenicity remains speculative at this point.

      - MnmA has been shown to be essential for E. coli growth under oxidative stress (Zhao et al, NAR, 2021). Along similar lines, MnmA-deleted Mtb struggles to grow in macrophages. Is oxidative stress in macrophages responsible for the slow Mtb growth?

      This is an excellent hypothesis which we will raise in the revised manuscript.

      - Authors state in line 311-312 "Mtb does not contain apparent homologs of the tRNA modifying enzymes that introduce the additional modifications to s2U". This can be characterised further to rule out the possibility of other enzyme specifically employed by Mtb to introduce additional modification.

      The reviewer raises a valid point. As discussed above (Reviewer #1, pt 2), Mtb may employ distinct enzymes to generate certain tRNA modifications. Future mass spec-based analyses of Mtb tRNAs will be carried out to identify the precise chemical structure of the sulfurated uridine, and subsequent studies will attempt to determine the enzymes that account for the biogenesis of these modifications.

    1. Author Response

      Reviewer #1 (Public Review):

      This refinement of their model, coupled with the demonstration that the Sis1 J protein chaperone does not appear to play a direct role in the inactivation phase of the HSR, provides a significant advance over their earlier work.

      We are pleased that the reviewer is satisfied that our new results represent a significant advance.

      A main weakness is that while the evidence that Sis1 is important for fitness of heat-stressed yeast cells is reasonable, exactly how Sis1 achieves this is not clear. In a single sentence the authors suggest that Sis1 might be an orphan ribosome chaperone, partly based on its nucleolar localization, but provide no evidence for this. If this were true, then one might expect a reduction in ribosome content under stress conditions (because there are more oRPs to take care of because of translation stalling?) and a decreased rate of protein synthesis (yes, this happens; how much of this is due to overall translation suppression versus there being fewer ribosomes to translate things is unknown and hard to test), which could be tested. Some further insights into this more general role of Sis1 would strengthen the authors' conclusions.

      We would like to make a distinction between the important biochemical roles for Sis1 in the cellular response to heat shock – which we explore elsewhere – and the role we are investigating here for the regulation of Sis1 expression by Hsf1. For new insights into the functional role of Sis1 as a chaperone for orphan ribosomal proteins, please see our recent preprint (Ali et al., https://www.biorxiv.org/content/10.1101/2022.11.09.515856v1). Here, we have focused on how Sis1 transcriptional regulation promotes fitness. Please see above for the description of the new mechanistic insight we have into the role of Sis1 expression tuning in controlling stress granules.

      Moreover, whether Sis1 plays a general role in the fitness of cells under stress has not been firmly established, i.e., is its mechanistic role the same in heat shock conditions and under nutrient stress conditions? Without knowing the mechanistic basis for how Sis1 maintains the fitness of heat-stressed cells, it is not possible to conclude that the same mechanism is at play in cells grown on a non-preferred carbon source.

      As described above, we have now provided evidence that the inability to properly tune Sis1 expression levels in the 2xSUP35-SIS1 strain results in disrupted stress granule homeostasis, linking a known function of Sis1 to a known process driven by nutrient stress.

      Figure 4: This is an ingenious experiment to study the subcellular localization of newly synthesized Sis1 in response to heat shock, compared to that of the heat-shock inducible Hsp70 Ssa1. However, based on the images presented in panel B it is hard to know how discrete the subnuclear distributions of Sis1 and Ssa1 really are, and ideally what is needed is to be able to analyze their localizations when both tagged proteins are expressed in the same cell, although this would obviously not be possible using the halo-tagged protein system. In addition, one would like to know the localization of Hsf1 in the cell at the same time. As it stands, these data seem overinterpreted, and it remains possible that some other event such as an inactivating post-translational modification of Sis1 under heat shock conditions might be involved in inactivating its function.

      To address this concern, we constructed two new imaging strains expressing Hsf1-mVenus/Halo-Sis1 and Hsf1-mVenus/Halo-Ssa1 (Hsp70) and used pulse-labeling followed by live lattice light sheet 3D imaging to resolve the subcellular localization of newly synthesized Sis1 and Hsp70 with respect to Hsf1 over a heat shock time course. Unfortunately, we cannot monitor newly induced Sis1 and newly induced Hsp70 simultaneously in the same cells with the HaloTag pulse labeling system. We found that a significantly greater fraction of newly synthesized Hsp70 colocalizes with Hsf1 than new Sis1. Thus, while we cannot directly image new Sis1 and Hsp70 in the same cell, we clearly observe a differential localization pattern with respect to Hsf1. These data are included in the revised Figure 4.

      One way to establish whether Sis1 nucleolar sequestration prevents it from acting on Hsf1 during the inactivation phase of the HSR would be to selectively disrupt its nucleolar localization signal while retaining its nuclear localization and determine how expression of such a mutant perturbs the inactivation kinetics of the HSR.

      Unfortunately, there is no known Sis1 nucleolar localization signal that we could use in the experiment you propose. In the preprint described above, we show that direct interactions with oRPs recruit Sis1 to the nucleolar periphery, but we do not yet know whether binding to oRPs is competitive with binding to Hsf1.

      Reviewer #2 (Public Review):

      This study aims to provide a needed update and validation of a previously outlined mathematical model that describes HSR/Hsf1 regulation. The purpose of the update is to incorporate the impact of newly translated proteins as negative regulators of Hsf1 following heat shock. A requirement for ongoing translation to mount the HSR and activate Hsf1 has been described in several recent studies. Moreover, the study addresses the role of the Hsp70 cochaperone Sis1 in HSR regulation, including its potential function in negative feedback regulation following heat-shock.

      The main strength of the study is that it combines quantitative modeling with a well-defined experimental system to generate data. Overall, the model appears to accurately reflect the behavior of the HSR under the employed experimental conditions and provides an elegant example of a formalized model for this simple regulatory circuit. Another strength of the study is that it addresses the functional involvement of Sis1 in HSR/Hsf1 regulatory mechanisms and rules out Sis1 involvement in negative feedback regulation of Hsf1 following heat shock. This finding is of importance in light of the complexity of Sis1 involvement in HSR/Hsf1 regulation suggested by the literature. The authors also document a need for endogenous SIS1 promoter regulation during growth on non-fermentable carbon sources.

      The study is important for the advancement of Hsf1 research and it may provide inspiration for the study of other chaperone-titrated transcriptional mechanisms such as the UPR or bacterial stress sigma factors.

      We thank the reviewer for the generous evaluation.

      Reviewer #3 (Public Review):

      This paper follows other excellent work from the Pincus laboratory detailing the molecular mechanisms of Hsf1 regulation and extending experimental observations into predictive mathematical models. Overall, the work is top-quality, however, the findings are incremental in nature with respect to our understanding of the HSR and refine existing models rather than break new experimental or conceptual ground. Additionally, the relevance of the non-fermentable carbon source growth phenotype for the 2XSUP35pr-SIS1 strain is unclear with respect to HSR regulation.

      We thank the reviewer for this fair assessment of the work.

    1. Author Response

      Reviewer #1 (Public Review):

      Pelentritou and colleagues investigated the brain’s ability to infer temporal regularities in sleep. To do so, they measured the effect of omitting an expected sound on brain and cardiac activity. Participants were presented with three different categories of sounds: fixed sound-to-sound intervals (isochronous), fixed heartbeat-to-sound intervals (synchronous), and a control condition without any regularity (asynchronous). When omitting a sound, they observed a difference in the isochronous and synchronous conditions compared to the control condition, in both wakefulness and sleep (NREM stage 2). Furthermore, in the synchronous condition, sounds were temporally associated with sleep slow waves, suggesting that temporal predictions could influence ongoing brain dynamics in sleep. Finally, at the level of cardiac activity, the synchronous condition was associated with a deceleration of cardiac frequency across vigilance states. Overall, this work suggests that the sleeping brain can learn temporal expectations and responds to their violation.

      We thank the reviewer for the very useful and informed comments, to which we carefully reply below.

      Major strengths and weaknesses:

      The paradigm is elegant and robust. It represents a clever way to investigate an important question: whether the sleeping brain can form and maintain predictions during sleep. Previous studies have so far highlighted the lack of evidence for predictive processes during sleep (e.g. (Makov et al., 2017; Strauss et al., 2015; Wilf et al., 2016)). This work shows that at least a certain type of prediction still takes place during sleep.

      However, there are some important aspects of the methodology and interpretations that appear problematic.

      (1) The methodology and how it compares to previous articles would need to be clarified. For example, the Methods section indicates that the authors used a right earlobe electrode as a reference. This is quite different from the nose reference used by SanMiguel et al. (2013) or in Dercksen et al. (2022). This could affect the polarity and topographies of the OEP or AEP and thus represents a very significant difference. Likewise, SOs are typically detected in a montage reference to the mastoids. Perhaps the left/right asymmetries present in many plots (e.g. Figure 3) could be due to the right earlobe reference used.

      We thank the reviewer for raising this important point which has prompted us to clarify the reference choice in the manuscript both for completing the information about data recordings in our experiment and for emphasizing the influence of the reference on the EEG results and how they compare to previous reports.

      First, we would like to clarify that although EEG data is referenced to the right earlobe online, electrophysiological data from both earlobes were acquired and offline re-referencing to paired earlobes was performed. This is now clarified in the Methods section on page 26, lines 648-651 as follows:

      ‘Continuous EEG (g.HIamp, g.tec medical engineering, Graz, Austria) was acquired at 1200 Hz from 63 active ring electrodes (g.LADYbird, g.tec medical engineering) arranged according to the international 10–10 system and referenced online to the right earlobe and offline to the left and right ear lobes.’

      Additionally, after preprocessing, we performed common average re-referencing, as is common practice and recommended in the literature (see e.g. Niso et al., 2022), and hence the initial online referencing is no longer of relevance. Nonetheless, we agree with the reviewer that different online and offline referencing schemes could explain why some results in the literature are not optimally reproducible. We have clarified this point in the discussion on page 17, lines 408-411 as follows:

      ‘Finally, while we used largely similar pre-processing (i.e. filters) and experiment implementation (i.e. online and offline reference) as in Chennu et al. (2016), this was not the case for other studies with which direct comparisons are unwarranted.’
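
      The argument that the online reference stops mattering after common average re-referencing can be checked with a minimal sketch. It is an illustration only, not the authors' pipeline: the function names, the toy potentials and the two hypothetical reference signals are all assumptions, and the invariance holds only when the same set of channels enters the average.

```python
def rereference(data, ref):
    """Subtract a reference time course from every channel.

    data: list of channels, each a list of samples; ref: one list of samples.
    """
    return [[v - r for v, r in zip(channel, ref)] for channel in data]


def common_average(data):
    """Re-reference each sample to the mean across all channels."""
    n_channels = len(data)
    average = [sum(column) / n_channels for column in zip(*data)]
    return rereference(data, average)


# Toy "true" potentials: 3 channels x 4 samples (made-up numbers).
potentials = [
    [1.0, 2.0, 0.5, -1.0],
    [0.0, 1.5, 2.5, 3.0],
    [-2.0, 0.0, 1.0, 0.5],
]

# Two hypothetical online references (e.g. different earlobe signals).
reference_a = [0.3, -0.2, 0.1, 0.4]
reference_b = [-1.0, 0.7, 0.0, 2.0]

recorded_a = rereference(potentials, reference_a)
recorded_b = rereference(potentials, reference_b)

# After common average re-referencing both recordings coincide,
# because the reference term cancels when the channel mean is subtracted.
average_a = common_average(recorded_a)
average_b = common_average(recorded_b)
```

      The cancellation follows from (v_i - r) - mean_j(v_j - r) = v_i - mean_j(v_j). It says nothing about analyses run before the average reference is applied, where the reference choice can still matter.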

      Regarding the reference chosen for the SO analysis (linked earlobes online and common average offline in our case), we acknowledge that, as the reviewer mentioned, many groups indeed employ mastoid re-referencing for SO detection (e.g. Siclari et al., 2018; Schneider et al., 2020; Ameen et al., 2022). However, to the best of our knowledge, this is not a standard choice, as many other groups choose a linked earlobe reference for online SO detection and the mastoids only for offline SO detection (Ngo et al., 2013; Besedovsky et al., 2017; Ngo and Staresina, 2022). In addition, other recent studies used linked earlobe referencing (Bouchard et al., 2021) or common average re-referencing (Züst et al., 2019) for offline SO detection. In our study we opted for using the same average reference for SO detection and evoked potential analysis in order to be able to relate the results of the omission evoked response comparison to that of the SO analysis.

      Also, the authors did not use the same filters in wakefulness and sleep, which could introduce an important bias when comparing sleep and wake results or sleep results with previous wake papers.

      We fully agree with the reviewer and thank him/her for this suggestion. We have now re-analysed the wakefulness data using a bandpass filter of 0.5-30 Hz as used for the sleep data. The chosen filtering range is commonly used in sleep research. Moreover, Chennu et al. (2016) employed a very similar filtering range (0.5-25 Hz) in an omission EEG study, whose results are similar to ours (Chennu et al., 2016). This new preprocessing resulted in a higher number of valid trials (average trial number: before N=245, now N=286) in wakefulness. Hence, the data from more participants could be used (before N=21, now N=23) and the statistical power of observed differences in our comparisons was improved. The Methods section has been updated accordingly on page 31, lines 763-764 as follows:

      ‘Continuous raw EEG data were band-pass filtered using second-order Butterworth filters between 0.5 and 30 Hz for the wakefulness and sleep session.’

      (2) The ERP to sound omission shows significant differences between the isochronous and asynchronous conditions in wakefulness (Figure 3A and Supp. Fig.) but this difference is very different from previous reports in wakefulness. Topographies are also markedly different, which questions whether the same phenomenon is observed. For example, SanMiguel and colleagues observed an N1 in response to omitted but expected sounds. The authors argue that they observe a similar phenomenon in the iso vs baseline contrast, but the timing and topography of their effect are very different from the typical N1. The authors also mention that, within their study, wake and N2 OEPs were "largely similar" but they differ in terms of latencies and topographies (Figure 3A-B). It would be better to have a more objective way to explore differences and similarities across the different analyses of the paper or with the literature.

      We concur with the reviewer and reviewing editor, who both pointed out that the way we previously analysed (see our reply to the reviewer’s previous comment) and reported our data was sub-optimal. The new analysis of the wake data reveals more similarities with the MMN and to some extent with the omission literature (Figure 4). As requested, we also improved the description of the comparison of our results to those from the literature, in the Discussion section (pages 17-19, lines 391-458).

      (3) The authors applied a cluster permutation to identify clusters of significant time points. However, some aspects of this analysis are puzzling. Indeed, the authors restricted the cluster permutation to a temporal window of 0 to 350ms in wake (vs. -100 to 500ms in sleep). This can be misleading since the graphs show a larger temporal window (-100 to 500ms). Consequently, portions of this time window could show no cluster not because the analysis revealed an absence of significant clusters but because the cluster permutation was not applied there. Besides, some of the reported clusters are extremely brief (e.g. l. 195, cluster's duration: 62ms), which could question their physiological relevance or raise the possibility that some of these clusters could be false positives (there was no correction for multiple comparisons across the many cluster permutations performed). Finally, there seems to be a duplication of the bar graphs showing the number of significant electrodes in the positive and first negative cluster for Figure 2 Supp. Fig. 1.

      We thank the reviewer for raising this point. We have now performed cluster permutation statistical analysis over the entire -100 to 500 ms window in wakefulness, thus matching the temporal window used for the sleep data (Methods, page 34, lines 843-846). Please note that this modified temporal window was applied to the wake data for which the pre-processing had also been modified (see our reply to comment #1 above). With matching analysis for wakefulness and sleep, we now identify clusters of higher or similar significance compared to our earlier results (Cohen’s d for isoch vs asynch = 0.92 now and 0.67 before; for synch vs asynch = 0.91 now and 1.06 before). In addition, for the isoch vs asynch omission response comparisons, overlapping cluster periods are identified in wakefulness (114-159 ms) and sleep (85-223 ms). The relevant results are thoroughly described on pages 9-10, lines 202-210; page 11, lines 238-251, pages 38-39, lines 970-985.

      We would like to also mention that while multiple comparisons correction is performed across channels and electrodes in the EEG using cluster permutation statistics, it is true that we do not perform multiple comparisons correction across the many comparisons. We now explicitly mention the lack of this correction for multiple comparisons in the Methods section page 34, lines 840-843 as follows:

      ‘Of note, the cluster permutation based multiple comparisons correction only applied across channels and latencies when comparing two experimental conditions, however no multiple comparisons correction was applied across the number of comparisons made in this study.’
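
      For readers unfamiliar with the procedure discussed above, the following is a minimal one-dimensional sketch of a cluster-based permutation test on paired condition differences. It is not the analysis used in the study, which operated across channels and latencies with dedicated EEG software; the function names, the threshold of 2.2, the sign-flip scheme and the default parameters are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def t_values(diffs):
    """One-sample t statistic per time point; diffs[subject][time]."""
    n_subjects = len(diffs)
    result = []
    for column in zip(*diffs):
        sd = stdev(column)
        result.append(mean(column) / (sd / sqrt(n_subjects)) if sd > 0 else 0.0)
    return result


def find_clusters(tvals, thresh):
    """Return (start, end, mass) for each run of same-sign samples with |t| > thresh."""
    found, start, sign = [], None, 0
    for i, t in enumerate(list(tvals) + [0.0]):  # sentinel closes a trailing run
        s = 1 if t > thresh else -1 if t < -thresh else 0
        if s != sign:
            if sign != 0:
                found.append((start, i - 1, sum(tvals[start:i])))
            start, sign = i, s
    return found


def cluster_permutation_test(diffs, thresh=2.2, n_perm=200, seed=0):
    """Return (start, end, mass, p) per observed cluster.

    The null distribution is the largest absolute cluster mass obtained after
    randomly flipping the sign of each subject's difference wave, which
    corrects across time points but not across separate comparisons.
    """
    rng = random.Random(seed)
    observed = find_clusters(t_values(diffs), thresh)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for subject in diffs:
            flip = rng.choice((-1, 1))
            flipped.append([flip * value for value in subject])
        masses = [abs(m) for _, _, m in find_clusters(t_values(flipped), thresh)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, end, mass in observed:
        exceed = sum(m >= abs(mass) for m in null_max)
        results.append((start, end, mass, (1 + exceed) / (n_perm + 1)))
    return results
```

      Each call controls the error rate within one comparison only. Running it for several condition pairs, as in the study, leaves the family of comparisons uncorrected unless an additional step (for example a Bonferroni adjustment of the cluster p-values) is applied.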

      (4) More generally, regarding statistics, the absence of exact p-values can render the interpretation of statistical outputs difficult. For example, the authors report a significant modulation of the sound-to-SO latency across conditions (p<0.05) but no significant effect of heartbeat peak-to-SO latency (p>0.05). They interpret this pattern of results rather strongly as evidence that the "readjustment of SOs was specific to auditory regularities and not to cardiac input". Yet, examining the reported chi-square values shows very close values between the two analyses (7.9 vs. 7.4). It seems thus difficult to argue for a real dissociation between the two effects. Providing exact p-values for all statistical tests could help avoid this pitfall.

      To assist the interpretation of statistical analysis results, we have now included exact p-values.

      Specifically, for SOs, we agree with the reviewer on the highly similar chi-squared values for the two analyses of Sound onset to SO peak and R peak onset to SO peak and have now included a comment in the discussion to reflect this on page 20, lines 478-480 as follows:

      ‘However, it should be noted that although not significant, we observed a trend of lower R peak to SO peak latencies during cardio-audio regularity compared to the other auditory conditions, possibly driven by the fixed relationship between heartbeat and sound in the synch condition.’
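
      The reviewer's point about the two chi-square values can be made concrete by converting them to exact p-values. The sketch below is an illustration under a stated assumption: the degrees of freedom are not given in this response, so df = 3 is a guess made here purely for demonstration, and the function name is hypothetical. It uses the closed-form upper tail of the chi-square distribution for integer degrees of freedom.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        return exp(-half) * sum(half ** j / gamma(j + 1) for j in range(df // 2))
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j - 1/2) / Gamma(j + 1/2)
    tail = sum(half ** (j - 0.5) / gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return erfc(sqrt(half)) + exp(-half) * tail


ASSUMED_DF = 3  # assumption for illustration; the source does not state the df

p_sound_to_so = chi2_sf(7.9, ASSUMED_DF)   # sound onset to SO peak
p_rpeak_to_so = chi2_sf(7.4, ASSUMED_DF)   # R peak to SO peak
```

      Under this assumed df the two p-values come out at roughly 0.048 and 0.060, that is, on either side of 0.05 while differing only slightly, which is why reporting them exactly is more informative than reporting p<0.05 versus p>0.05.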

      Reviewer #2 (Public Review):

      This study was designed to study the cortical response to violations in auditory temporal sequences during wakefulness and sleep. To this end, the study had three levels of temporal sequence, a regular temporal sequence, an auditory tone that was yoked to the cardiac signal, and an irregular tone. The authors show significant EEG differences to an omitted tone when the auditory tone was predictable both during wakefulness and sleep.

      The authors analyze the ERP to the omitted tone as well as when aligned to the R-peak of the HEP. The analysis was comprehensive and the effects reported align with the interpretation given. Of particular interest was the fact that a deceleration of the heart rate was present for omissions when the auditory tone was yoked to the R-peak (synch) in all stages of wakefulness and sleep.

      We thank the reviewer for his/her positive judgment.

      However, one weakness was the rationale for the current study and how the results link to current theoretical frameworks for the role of interoception in perception and cognition. This was in contrast to the clear background and explanation to study the response to omissions for a predictable auditory sequence in wakefulness and sleep. It was unclear why the authors selected the cardiac signal to yoke their auditory stimuli. What is the specific motivation for the cardiac signal rather than the respiratory signal? This was not clear.

      In the revised Introduction section, we improved our description of these aspects, including the interaction between interoception and external stimulus processing. We hypothesized that cardiac signals would be more relevant than respiratory signals in coordinating temporal expectation because of existing prior experimental evidence thereof, as well as data showing a modulation of the neural response to heartbeat by levels of vigilance/consciousness, and the sharp cardiac R peak offering an ideal candidate for online temporal locking to administered sounds (see our detailed reply to the reviewer’s comment #2 below). However, we cannot exclude that respiratory signals could also be used by the brain to assist the detection of temporal regularities.

      Future studies may test for this possibility.

    1. Author Response

      Reviewer #1 (Public Review):

      Kozol et al adapt an important tool, in the form of the atlas, to the Astyanax research community. While broadly the atlas appears to correctly identify large brain regions, it is unclear what is the significance of the finer divisions. The external confirmations are restricted to just a few large brain regions (by independent human observer: e.g., optic tectum, hypothalamus. By molecular marker: hypothalamus only.). As such, interpretations of results from as many as 180 small subregions should be interpreted sceptically.

      The authors also suggest that some brain regions have increased in size during cavefish evolution (e.g., hypothalamus, subpallium). The analysis of progeny from a genetic cross of cave and surface morphs suggests a complex genetic program has evolved to control this variant set of brain structures. With the development of genetic manipulation tools in this species, an exciting series of experiments may link causal variants with brain development differences.

      MAJOR ISSUES

      Line 85+. Segmentation accuracy is not well established by the authors. For example, Figure S2 states that the pixel correlation is high between Astyanax populations. But the details of how this cross-correlation was done are sparse. Is the Y- axis here showing the fraction of pixels that are shared in the morphs? While the annotation appears to function similarly across morphs, the 80% machine:human correlation is difficult to put into context. On the one hand, this seems low. For what values should one strive? Are there common "mistakes" or differences in human & machine annotations that lead to certain regions being excluded? A discussion of these is warranted and will be useful to others who wish to use this approach.

      Line 87. "such as" is misleading since these were the only two antibodies used to confirm molecular definitions of regions.

      But more to the point, additional markers should be used to confirm more than just the ISL+ hypothalamic divisions.

      This is particularly warranted, as Fig 1d is not convincing. I believe that the yellow label is ISL; this is difficult to see in the figures. ISL is not ideal since this is widespread in the hypothalamus. There are no ISL-negative regions depicted, which would be necessary to demonstrate that the resolution of this subregion labeling tool is high. A complementary approach would be to find molecular markers that are more restricted than ISL which label only subsets of hypothalamic regions.

      Finally, do the mid/hindbrain ISL labeled regions correspond to known ISL+ subregions?

      We agree with the reviewer that the Islet1/2 assessment was insufficient for demonstrating automated segmentation accuracy and that the labeling was difficult to visualize in the previous version of the figure. We have addressed this reviewer's concern by adding new molecular markers for verification of segment accuracy and through a modified presentation of the original data. The first, and in our opinion most convincing, is the addition of more markers of known neuroanatomical regions. This required not only adding extra antibody stains to our brain atlas, but also optimizing a Hybridization Chain Reaction (HCR) in situ protocol that could be coupled with immunohistochemistry, permitting automated segmentation via total ERK registration and brain atlas inverse registration. This novel protocol showed corresponding localization of markers, such as 5-hydroxytryptamine (5-HT), gastrulation brain homeobox 1 (gbx1), and oxytocin (oxt), in the expected neuroanatomical areas. It should be noted that these markers included both large neuroanatomical areas as well as small, well-defined areas such as the superior, and also labeled disparate neuroanatomical loci throughout the brain. We also modified our original figure to better illustrate the regions that islet+ staining labeled. These markers show that islet1/2 labels precise regions of the hypothalamus which correspond to known expression patterns. The updated methodology can be found in lines 422-440, while the results can be found in lines 105-118 of the text, Figure 1 and Figure 1 – Figure Supplement 1a.

      We believe these two changes address the reviewer's concerns and suggest that the neuroanatomical labels generated in this study faithfully label the Astyanax brain.

      The molecular and human-observed confirmations of brain regions suggest that the annotated borders of gross anatomical regions are correctly identified by the algorithm. However, data is not presented that indicates whether the smaller regions correspond to biologically meaningful compartments.

      We agree with the reviewer that our assessment of regional accuracy for automated segmentation necessitated additional markers, which labeled smaller, more refined compartments. To address this, we developed an HCR in situ hybridization strategy that was compatible with our brain atlas, and used several markers that label smaller regions, such as the 5-HT positive neurons of the dorsal raphe and oxytocin positive neurons of the medial preoptic region. Together, these results were consistent with our previous finding that anatomical regions confirmed by human observation and molecular staining did faithfully label the correct regions of the brain. These findings can be found in lines 105-118 in the text, along with Figure 1 and Figure 1 – Figure Supplement 1a-d. Together, we hope this shows that not only large neuroanatomical areas, but also finer areas are correctly labeled by CobraZ.

      Parameters used in CobraZ to perform the segmentation are not defined. More transparency is required here for others to replicate.

      We agree with the reviewer that the parameters used for CobraZ and Advanced Normalization Tools (ANTs) are necessary for reproducibility of our results. We have since added sentences to clarify that we did not change the original ANTs or CobraZ parameters from Gupta et al. 2018 (lines 474-475) and have added the CobraZ parameter file and ANTs bash scripts to our Dryad repository.

      Reviewer #3 (Public Review):

      In this manuscript the authors use novel techniques and analytical methods on an up and coming animal model for brain evolution. The manuscript utilizes the cavefish Astyanax mexicanus, which can provide future important insights into the field of neurobiology and in evolution in general.

      The authors however, only argue that Astyanax is a powerful system for functionally determining basic principles of brain evolution (which clearly it will be), but fail to actually describe what brain evolution insights Astyanax gives. The data is in the paper, but the interpretation needs refinement. This would be a much more valuable paper with a thorough evolutionary context based on the already existing, extensive literature. I believe this manuscript has the potential to be extremely impactful.

      We thank the reviewer for her positive critique of our manuscript, and more broadly for the thoughtful comments, the challenge to re-evaluate the way we have thought about our own data, and for pointing us in a scientific direction that is more impactful. We have spent a lot of time re-thinking this work to address this reviewer's critique, and believe that it is a far better study for it.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors of this manuscript aimed to systematically evaluate the pleiotropic effects of MCR-1-mediated colistin resistance. They evaluated the effect of MCR-1 and MCR-3 carried on different plasmids on antimicrobial peptides (AMPs) and assessed their ultimate effect on virulence. The authors find that MCR-1-mediated colistin resistance correlates with increased resistance against some host AMPs, but also increased sensitivity to others. The authors also find that MCR-1 alone is associated with resistance to human serum and to elements of the complement system. This highlights a potential selective advantage for MCR-1-mediated resistance to host immune factors and a potential for enhanced virulence.

      The methods have been well established before and adequately support their main findings. While determining the role of MCR-1 in a single genetic background is important to better understand its potential pleiotropic effects against a diversity of AMPs and in a variety of scenarios, the impact and significance of the results are partially ameliorated because different genetic backgrounds, particularly those most relevant to a clinical (or agricultural) context were not considered. The results depicted here are still a necessary and important step towards a more comprehensive understanding of the pleiotropic effects of MCR-1. But, interactions between plasmids and host genomes and their co-evolution can have important effects more generally. The authors do mention this in the discussion and suggest it to be an important avenue for future work. However, given the objective of the study and the clinical and agricultural context in which the authors have framed their work, it seems more relevant to include those distinct genetic backgrounds already here.

      The conclusions stemming from the results found in Figure 3, and Figures 4c and d seem overreaching to me. The associated resistance to AMPs from pigs seems to be only strong enough against one of the five tested AMPs and hence concluding that these impose a strong selective pressure in the pig's gut seems unsubstantiated. Similarly, the difference in survival probability within their in vivo system, though statistically significant, seems to be very mild between their MCR-1 and empty vector control.

      Thank you for the comment. We agree on the effect of MCR-MOR on AMP susceptibility and have edited the paragraph by removing the lines on strong selective pressure in the pig gut. As regards the 4c and 4d results (4e and 4f in the revised version), it is interesting and statistically convincing that MCR increases bacterial virulence despite the cost of MCR expression. And importantly, this effect is even stronger in the case of LPS treatment where the immune system is stimulated, expressing diverse host AMPs (PMID: 19897755). This shows MCR-mediated advantages to bacteria in the complex host environment.

      Reviewer #2 (Public Review):

      Jangir et al test the hypothesis that resistance to the antimicrobial peptide (AMP) colistin can simultaneously increase resistance to other AMPS with related modes of action. Because AMPS comprise part of innate immunity, their central concern is that colistin resistance may compromise host defenses and thereby increase bacterial virulence. Their results show that MCR-1, whether expressed from naturally circulating or synthetic plasmids, can increase the MIC to AMPS from humans, pigs, and chickens, and impart fitness benefits at sub-MIC concentrations. In addition, they find that MCR-1-containing strains have increased survival in human plasma and are more lethal in an insect infection model.

      The conclusions of the paper are generally well supported by the results, but some aspects could be clearer and better defended with a few small additional experiments.

      Strengths:

      Using both synthetic and natural plasmids makes it possible to cleanly separate the effects of MCR-1 from the effects of other plasmid-borne genes or plasmid copy numbers. This helps confirm the causal role of MCR-1 on altered AMP susceptibility.

      Testing the survival of transformed isolates in human serum and in insects points to relevance in the more immunologically complex host environment where cells are exposed to a suite of factors that reduce bacterial survival.

      Thank you!

      Weaknesses/suggestions:

      Although increases in MIC are evident for different AMPS, the effects are generally modest. To address this, it might be helpful to use pairwise competition assays, as in Figure 1, to establish that even small changes to MIC are associated with clear selective benefits.

      Thank you for the suggestion. We agree that in some cases the change in MIC is modest; however, we would like to highlight that small-level changes in resistance have important clinical implications. For example, resistance mutations conferring a small change in MIC can ensure the survival of pathogenic bacteria in antibiotic-treated hosts (PMID: 30131514). Additionally, a comparison between competition assays (Fig 1) and MICs (Fig 2) clearly shows that small changes in MIC are associated with substantial fitness benefits. For example, for pSEVA:MCR-1, the fold change in MIC of CATH2 (chicken), PMAP23 (pig), and LL37 (human) ranges between 1.05 and 1.5, yet the competitive fitness ranges from 10% to 17%. This issue is discussed in the revised manuscript (lines 306-317, page 13).

      ….This would be especially helpful in assays with human serum and in Galleria where the concentrations of AMPS or other immune components are unknown.

      It is clear that MCR-1 increases resistance to serum and virulence (Figure 4). However, we agree with the reviewer that the selective benefits of MCR-1 in complex host environments are not known (i.e., serum or Galleria). We have revised the final paragraph of the discussion to reflect this limitation of our study (lines 370-382, page 15).

      Assays using human serum are interesting but challenging to interpret given the diverse causes of bacterial killing, including complement. Although this was partly addressed in Supplementary Figure 6, I found the predictions of these experiments unclear. First, I think these experiments are too central to be relegated to the supplemental materials; they belong in the main text. Secondly, it is important to explicitly spell out the expectations of using heat-killed serum (which will degrade any heat-labile components) or complement-deficient serum. It should be clearer under which conditions MCR-1-containing strains are predicted to do better or worse than controls.

      We have addressed this in the revised version. We have moved Supplementary Fig 6 to the main text, and have edited the text, clarifying the model prediction (lines 245-257, page 10).

      Galleria is a useful infection model for virulence, but it is unclear what drives differences between strains. First, bacterial numbers aren't measured in this assay, so it isn't known if increased virulence is due to increased bacterial growth or decreased bacterial clearance. As above, I think these assays would be stronger using the competition-based approach in Figure 1. This would indicate bacterial numbers through time and directly show the selective benefit associated with MCR-1. Second, it would be useful to elaborate on why MCR-1 increases virulence, especially any known similarities between Galleria AMPS and those tested in Figures 1 and 2. Overall, it would help if Galleria were less of a black box.

      We agree that the mechanism underlying increased virulence remains to be explored and thus, we have already discussed this in the discussion as a limitation (lines 370-382, page 15). However, elucidating the mechanisms by which MCR-1 increases virulence would clearly be an interesting line of research moving forward.

    1. Author Response

      Reviewer #1 (Public Review):

      The adhesion of Leishmania promastigotes to the stomodeal valve in the anterior region of the sandfly vector midgut is thought to be important to facilitate the transmission of the parasites by bite. The promastigote form found in attachment is termed a 'haptomonad', although its adhesion mechanism and role in facilitating transmission have not been well studied. Using 3D EM techniques, the paper provides detailed new information pertaining to the adhesion mechanism. Electron tomography was especially useful to reveal the ultrastructure of the attachment plaque and the extensive remodelling of the flagellum that occurs. A few of the attached haptomonads were found to be in division, which is a novel observation. The attachment of cultured promastigotes to plastic and glass surfaces in vitro was found to involve a similar remodeling of the flagellum and was exploited to image the sequential steps in attachment, flagellar remodeling, and haptomonad differentiation. The in vitro attachment was found to be Ca2+-dependent. Based mainly on the in vitro observations, a sound model of the haptomonad attachment plaque and differentiation process is provided.

      We thank the reviewer for highlighting the significant progress we have made in dissecting the adhesion mechanism and flagellum restructuring in the Leishmania haptomonad.

      Reviewer #2 (Public Review):

      The study by Yanase et al. investigated the details of the 3D architecture of Leishmania haptomonad promastigote's adhesion to the midgut of the insect vector. The authors generated a dataset of images that reveal intricate details of the formed adhesion plaque and expanded the study with in vitro alternatives for exploring how the strong, hemidesmosome-mediated adhesion of Leishmania promastigotes to surfaces can arise and be maintained. They show with unprecedented detail the ultrastructure of the attachment plaque. The in vitro dataset of the paper adds to the specific literature important details on how to explore micro/nanostructures involved in an important attachment step for this eukaryotic parasite. However, the in vitro data should be reconsidered in its discussion and conclusions as it does not support direct comparison with in vivo Leishmania forms as pictured by the authors. In general, the dataset presented in this manuscript adds valuable data and resources for the study of Leishmania promastigote adhesion to surfaces, especially to the thoracic midgut parts of its insect vector.

      The dataset of this paper is well-collected and robust, but some aspects of image analysis need to be clarified and extended. Also, the in vitro data from the manuscript will benefit from an extensive adjustment in its discussion. Points to focus on:

      We thank the reviewer for recognising the ultrastructural detail we have now provided of this cryptic parasite life cycle stage. Below we address each of your points in detail.

      1) The haptomonad promastigote is indeed a possible critical form for transmission, but it lacks formal demonstration still in all literature available. This should not be claimed without proper formal demonstration.

      We agree with the reviewer that any relationship between transmission and the haptomonad form has yet to be formally demonstrated. Hence, we revised the descriptions referring to the relationship between transmission and the haptomonad form (Line 22-23, 31 and 113-114).

      2) Literature available and cited in this manuscript regarding in vitro adhesion of culture Leishmania promastigotes does not provide direct evidence for haptomonad differentiation. Haptomonads are still a largely unknown promastigote form with no defined ontogeny. With that, to propose an in vitro haptomonad differentiation protocol, more detailed direct evidence of in vivo haptomonads will be necessary. The in vitro experiments available show how cultured promastigotes attach to surfaces. Detailed studies in vivo will be needed still to attribute the findings in vitro to haptomonads.

      We would like to highlight that promastigotes and haptomonads have morphological definitions within the literature and our cells are definitely more like haptomonads than promastigotes. As the reviewer highlights, the haptomonad-like cells we generate in vitro have an almost identical morphology and attachment plaque structure to those haptomonads we observed attached to the stomodeal valve. In addition, we have been able to watch individual cells that had a promastigote morphology acquire a haptomonad morphology and we believe this will provide future insights to the ontogeny of these forms. However, as there are currently no published molecular markers for haptomonads we have not been able to provide direct evidence other than the morphology and ultrastructure that in vitro attachment replicates in vivo haptomonad differentiation. Therefore, we have revised our nomenclature and now refer to the in vitro haptomonad-like cell. In the discussion, we have been careful to highlight that certain aspects of our model rely on in vitro data and therefore may not accurately reflect the situation in the sand fly.

      3) This manuscript will benefit by having a detailed description of how to analyze and get to the 3D models presented. This has a strong potential for usage beyond the Leishmania/sand fly field. Statistics should be made available with ease across the manuscript and with a dedicated section on methods.

      We added a detailed description of how to analyse the 3D models (Line 756-763), and added videos showing a rotated view of each 3D model (Figure 1—video 3 and 4, Figure 2—video 2, and Figure 3—video 2 and 4). We have deposited the SBF-SEM and tomography data on the Electron Microscopy Public Image Archive (EMPIAR; https://www.ebi.ac.uk/empiar/), enabling access to the raw data (Line 763-766). We have added a statistics section into the Materials and Methods (Line 864-868).

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Sampaio et al. tackle the role of fluid flow during left-right axis symmetry breaking. The left-right axis is broken in the left-right organiser (LRO) where cilia motility generates a directional flow that permit to dictate the left from the right embryonic side. By manipulating the fluid moved by cilia in zebrafish, the authors conclude that key symmetry breaking event occurs within 1 hour through a mechanosensory process.

      Overall, while the study undeniably represents a huge amount of work, the conclusions are not sufficiently backed up by the experiments. Furthermore, the results provided present a limited advance to the field: the transient activity of the LRO is well established, and narrowing down this activity to 1 hour (even though unclear from the presented data that it is a valid conclusion) does not help to understand better the mechanism of symmetry breaking.

      We thank Reviewer 1 for acknowledging the demanding experimental setup. However, we must argue that knowing the exact timing that is most sensitive to fluid flow manipulations is a very important advance we provide here. The reason is that this type of experiment gives us the physiological timing in a WT embryo. It is one thing to know the system can respond to optical tweezers earlier than 5 ss and later than 5 ss, as the Yuan lab did recently, but quite another to constrain the physiological timing at which the process occurs in an unperturbed manner (as much as possible). Our aim was the latter. Our rationale is that knowing the physiological time is important to provide clues; for example, we had these types of questions at the time: Is the physiological time before or after cell rearrangements occur? Is it falling in a directional or non-directional flow regime? Is it governed by a mild flow or a stronger one? Is it before or after dand5 becomes asymmetric? Some of these questions, which we think we all know the answers to, could be challenged by our experiments, so it is indeed very important not to assume we know the answer, and to ask the question again in an unbiased way with every new technique available. We wanted to be unbiased, and we think that is the beauty of our time-window experiment. Indeed, it shows the physiological time-window peaks at 5 ss, which is later than the Yuan lab’s calcium transient recording and before dand5 asymmetric expression. In our opinion this is compatible and makes perfect sense because, although the system already shows calcium transients before and can respond to lack of Pkd2 or optical tweezer cilia manipulations at 1 ss – 3 ss, it is from 4 to 6 ss, peaking at 5 ss, that it is most responsive physiologically to the fluid extraction and therefore to both mechanical and chemical perturbations.

      We have performed additional experiments using smFISH on WT embryos to detect dand5 expression with cellular resolution, and we have quantified asymmetries in the number of dand5 transcripts as early as 6 ss (new Figure 7 and new author: Catarina Bota), which further support our time-window claim. Degradation of dand5 mRNA has been suggested as the mechanism underlying asymmetric dand5 expression, and mRNA degradation is usually a very fast process. This new piece of evidence supports the view that the physiological breaking of symmetry is stronger around 5 ss (see the new discussion on this subject on page 27).

      Regarding symmetry breaking, the fact that anterior angular velocity was the major difference between embryos that recovered without LR defects and those that did not reveals that angular velocity must be tightly regulated by cilia motility and CFTR activity to bring back fluid and flow directionality, which together confer the robustness of flow. This is now better explained in the manuscript. We agree that the novelty regarding angular velocity may seem incremental compared to our work from 2014, where we only analyzed speed (Sampaio et al, 2014). However, here we provide more resolution and detailed parameters of angular velocity per section of the LRO, as well as tangential and radial velocities, the components of angular velocity. The radial component shows a trend towards left anterior that is now discussed in the text as evidence for a left difference. The present work shows that anterior angular velocity has a major role in the successful recovery of the symmetry breaking process, which was not claimed before. Here we challenged the embryo to bring to light the most important parameters.

      Importantly, the authors do not provide any convincing experiments to back up the mechanosensory hypothesis because the fluid extraction experiments affect both the chemical and physical features of the LRO, so it is impossible to disentangle the two with this approach.

      We agree that the first extraction experiment (Figures 1-3 and Table 1) affects both mechanisms and does not disentangle them; that was, in fact, our goal for the first experiment: finding the exact time-window for symmetry breaking. However, in the second part of the work (Figures 4-5 and Table 2) we provide a 20,000-fold dilution experiment, which is very different from the extraction one. We apologize if this was not clear and hope to have made it clear this time.

      We must agree with the reviewer that chemosensing is not excluded, in fact we had provided a paragraph in the discussion about EV secretion rates to tone down our claim and did acknowledge that secretion could still overcome the dilution we are causing. We think we had already addressed this problem in the previous eLife manuscript but now we have discussed the possibilities and the experimental evidence that supports each of them (see page 28, last paragraph). The key experiment that does not fit with secretion is pointed out in the end, and we ask the reviewer to read it in the context of wildtype animals. We agree both scenarios must be discussed and leave space for future data on mmp21 and CIROP. However, so far, in zebrafish we cannot favor chemosensing as much as mechanosensing, we can only wait for more discoveries and be open.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Lee and colleagues address the participation of NBR1 in chloroplast clearance after treatment with high light intensity. Authors use NBR1 fused to reporter proteins (GFP, mCherry), with the aid of nbr1, atg7, and nbr1-atg7 mutants, in combination with immunogold labelling to show localization of NBR1 to surface and interior of photodamaged chloroplasts, which follows with their engulfment in the vacuole, a process which is independent of ATG7. The combined use of ATG8 fused to GFP further shows that NBR1 and ATG8 are recruited independently to photodamaged chloroplasts. In addition, the use of mutant versions of NBR1 in combination with mutants lacking E3 ligases PUB4 and SP1 and mutant toc132-2 and tic40-4 lacking members of the TIC-TOC complex of protein translocation to the chloroplast, authors show that chloroplast localization of NBR1 requires the ubiquitin ligase domain (UBA2) of the protein, whereas, the PB1 domain exerts a negative effect on NBR1 chloroplast association, yet neither the PUB4 and SP1 E3 ligases nor the TOC-TIC are required for NBR1 association to photodamaged chloroplasts. All these approaches are well described and strongly support the authors' conclusions that the loss of chloroplast envelope integrity allows the entrance of cytosolic ubiquitin ligases and the participation of NBR1 in photodamaged chloroplast clearance by a process of microautophagy. All these findings add valuable information to our knowledge of chloroplast homeostasis in response to light stress.

      To further support these conclusions, authors perform a chloroplast proteomic analysis of the WT, nbr1, atg7, and nbr1-atg7 mutants. However, in contrast with the above results, the description of the proteomic data is rather confusing. The paragraph on Page 17 (lines 393-406) is hard to follow. The term "over-representation of less abundant chloroplast protein" is also quite confusing, like the data in Fig. 6 and supplementary to this figure (what does show the PCA analysis in Fig. 6-suppl. 1?). I wonder whether it would be possible to show all these data as supplementary and try to present the data supporting the major conclusion of these analyses (if I understood correctly, that nbr1, atg7, and the double mutant have lower contents of chloroplast proteins), in a more simple and clear format.

      Following the reviewer’s comments, we have rewritten the Results section describing the proteomic data to make it clearer and more concise. We have also modified Figure 6 to make it more concise and generated new graphs for Figure 6 supplemental figures 1 and 2.

      Reviewer #2 (Public Review):

      The authors conducted a wide-ranging series of experiments which lead to the conclusion that NBR1 is involved in the clearance of photodamaged chloroplasts. It is a novel finding because the role of NBR1 in this process was never documented. Notably, the NBR1-mediated clearance is only one of the several possible mechanisms responsible for chloroplast turnover. It is not surprising, considering that the nbr1 mutants are viable. The work is arranged very well. The rationale of the subsequent experiments is logically justified and the outcomes are followed by clear conclusions. In consequence, the authors managed not only to observe the association of NBR1 with the chloroplasts but they threw some light on the corresponding mechanisms. The manuscript contains numerous high-quality images from a confocal microscope and from a transmission electron microscope. All images are accompanied by statistical analysis of the respective microscopic observations, which greatly improves the credibility of the conclusions. Shortly, the authors demonstrated that NBR1 decorates not only the exterior but also the interior of damaged chloroplasts in an ATG7-independent way. Next, they establish that NBR1 and ATG8 are recruited to different populations of damaged chloroplasts, and they document differences in chloroplasts turnover, differences in chlorophyll abundance and chlorophyll photochemical properties, as well as differences in the total proteome of the nbr1 mutant in comparison to the wild type and atg7 mutant in two light regimes (low light and high light). Finally, they exclude the requirement for the known E3 ligases PUB4 and SP1 for NBR1-mediated degradation and show that the NBR1 internalization relies rather on the chloroplastic membrane rupture than on the TIC-TOC-dependent processes. In summary, the authors postulate that NBR1-mediated chloroplast clearance is a novel, not yet described mechanism and summarize it in a clear diagram.

      The work is interesting, the figures are convincing and the conclusions are justified by the results. It provides novel data on the function of selective autophagy receptors NBR1 in plant cells, however, it also leaves the reader with some unanswered questions. The most important is the relative contribution of each of the chloroplast's degradation routes to the turnover of these organelles in different stresses, light regimes, plant growth stages, etc. This is a difficult problem because the mutations in relevant genes have pleiotropic effects and it is difficult to separate the functions of the individual turnover routes. For example, the defects in core autophagy genes (like the atg7 mutant used in this study) result in an increased level of NBR1. These issues are not sufficiently addressed in the discussion.

      The reviewer is correct and indeed, we also detected higher levels of NBR1 in the atg7 mutant (Fig 2G). This could be, for example, the underlying reason why there are more chloroplasts decorated with NBR1 in the atg7 mutants than in complemented nbr1 plants, 24h after high light treatment (Fig 1F). However, the higher frequency of photodamaged chloroplasts observed in atg7 (Fig 2D) supports a different scenario: the higher number of photodamaged chloroplasts that are not successfully repaired or degraded by canonical autophagy in atg7 become substrates of NBR1. The increased levels of NBR1 in the atg7 mutant and how this could influence the effects seen in the mutants studied in this manuscript are now discussed in lines 670-673.

      Reviewer #3 (Public Review):

      The authors use an impressive array of techniques to determine the role of the NBR1 autophagy receptor protein specifically in the clearing of photodamaged chloroplasts. The authors describe the mechanism(s) by which this receptor operates in this context and demonstrate that this NBR1-mediated process occurs independently of SP1 and PUB4 (whose own roles in other aspects of chloroplast autophagy have previously been shown). The authors further dissect the functional domains of NBR1 to identify which are important in this process.

      The major strength of this work is the myriad techniques used to approach the problem. The data are of high quality, and on the whole, well replicated and statistically analysed. In the main, these data substantiate the findings of the authors, although some findings are quite correlative/descriptive. However, the authors show good circumspection in their conclusions and discussion. One potential weakness is that the genetic data (use of mutants) rely on single mutant alleles, therefore whilst genetic linkage to the mutations is assumed, it cannot strictly be guaranteed. The authors performed effective genetic complementation to analyse the domain structure of NBR1 shown in Figure 7. It would have been good if complementation of nbr1 and atg1 mutants and/or alternative mutant alleles had been used for experiments described in Figures 1 to 6. Without this, I think even more circumspection regarding the data obtained from these single-allele mutants would be advised.

      We agree with the reviewer that more mutant alleles would have provided stronger support for our conclusions, but we would also like to highlight that the atg7-2 (Chung et al 2010), nbr1-2, and atg7-2 nbr1-2 mutants (Jung et al 2020) have been well characterized previously, and the nbr1-2 mutant has been shown to be rescued by the expression of fluorescently tagged NBR1 (Jung et al 2020). We are confident about the results on the localization of NBR1 in chloroplasts, not only because the fluorescently tagged NBR1 proteins are functional but also because we were able to corroborate the localization of NBR1 by using antibodies against the native proteins (Fig 2). That said, the reviewer does raise an important point and therefore, we have acknowledged more explicitly the limitation of our conclusions based on the analysis of single mutant alleles in lines 630-631 of the discussion.

    1. Author Response

      Reviewer #1 (Public Review):

      The model put forward by the authors in this manuscript is a simple and exciting one, explaining the function of AGS3 as a negative regulator of LGN, acting as a 'dominant-negative' version of LGN. Overall, the results support the model very well, and the results shown in Fig 6, which clearly reveal the functional relevance of AGS3, add strength to the paper.

      We thank the reviewer for their enthusiasm regarding our finding that AGS3 acts as an endogenous dominant-negative to inhibit LGN. We appreciate their assertion that the results support the model and that the functional relevance to epidermal stratification is a strength.

      In Figures 3A and B, the authors claim that AGS3 overexpression leads to depolarization of LGN in epidermal stem cells. However, in the example provided in Figure 3A, the LGN signal appears to be stronger than the control, with more LGN still on the apical side (many would categorize this as 'apically polarized'). In the scoring shown in Figure 3B, I am not sure if 'eyeballing' is the right way to decide whether it is polarized/depolarized/absent. The authors should come up with a bit more quantitative method to quantify the localization/amount of LGN and explain the method well in the manuscript. A similar concern regarding the determination of the LGN localization pattern applies to the rest of figure 3 as well.

      We agree with this important critique about the methodology used to assess LGN expression patterns. While we have historically included categorical analyses like those used in Fig. 3A,B in past publications (Williams et al, NCB 2014; Lough et al eLife, 2019), we have also now performed additional, unbiased, quantitative measures of LGN fluorescent intensity, as described in greater detail above. We added these new data in Fig. 4C-J, while the data previously in Fig. 3A,B have now been redistributed between Fig. 3E,F (overexpression) and Fig. 4A,B (knockdown).

      Reviewer #2 (Public Review):

      To date, only a handful of studies have addressed the importance of AGS3, a paralog of the relatively well-characterized spindle orientation factor LGN. The authors now show that AGS3 acts as a negative regulator of LGN and propose that this activity could work through competition for binding partner(s). Remarkably, regulation is temporally restricted in such a way that the conserved role played by LGN in metaphase spindle orientation is unaffected. Instead, AGS3 regulates a post-metaphase function for LGN, namely Telophase Correction. The article is well-written, the experiments are performed at a high level, and the claims are generally supported by the data. Two main points of confusion are raised in the current version. 1) The authors show that AGS3 regulates cortical localization of LGN, but would need to clarify how LGN is being affected. 2) The authors propose in the discussion that AGS3 might exert its regulatory effect through competition for NuMA, an important binding partner for LGN, but would need to clarify how and why NuMA would be involved in Telophase Correction.

      We thank the reviewer for appreciating the novelty of our findings regarding the understudied LGN/pins paralog AGS3. With regard to the first point, as described earlier, we have added additional quantitative analyses of how AGS3 affects cortical LGN fluorescent intensity in Fig. 4C-J. We now show that AGS3 loss leads to broader and higher expression levels throughout mitosis, and therefore we have amended our model to soften the claim that AGS3 primarily operates during telophase correction. This renders the second point somewhat moot, but we nonetheless have expanded our Discussion to note that NuMA can be cortically recruited to the anaphase cortex independent of LGN (lines 531-542). We also contextualize our findings with the Reviewer’s own recent study which proposes a “threshold model” of cortical Insc as a determinant of spindle orientation (Neville et al, 2023), and speculate that a similar model could apply in our system, perhaps with AGS3 binding and sequestering Insc rather than NuMA (lines 543-556).

      Reviewer #3 (Public Review):

      This paper examines the mechanisms that control division orientation in the basal layers of the epidermis. Previous work established LGN as a key promoter of divisions where one of the siblings populates the differentiated layers (perpendicular). This work addresses two important, related issues - the mechanisms that determine whether a particular division is planar vs perpendicular, and the function of AGS3, an LGN paralog that has been enigmatic. A central finding is that AGS3 is required for the normal distribution of planar and perpendicular divisions (roughly equal) such that in its absence the distribution is skewed towards the perpendicular. Interestingly, however, the authors find that AGS3 has no detectable effect on orientation if the orientation is measured at anaphase. This timing aspect builds upon previous work from this group demonstrating a phenomenon they term "telophase correction" in which the orientation changes at the latest phases of division (and possibly post division?). Thus AGS3 seems to exert its effect using these later mechanisms and this is supported by further analysis by the authors. Importantly, the authors show that AGS3 acts through LGN, based on localization data and an epistasis analysis. The function of AGS3 has been highly enigmatic so resolving this issue while providing a useful step towards understanding how the division orientation decision is made, makes for exciting progress towards an important problem. I found the overall narrative and presentation to be quite good and especially appreciated the thoughtful discussion section that did an excellent job of putting the results in context and speculating how unknown aspects of the mechanism might work based on current clues. With that said, I think there are some important issues that should be resolved.

      We thank the Reviewer for this excellent summary of our findings and appreciation of the significance of the issues that our study addresses.

      Regarding the orientation measurements, the authors should specify how the midbody marker was used to mark sibling cells, especially given the midbody can move following division. For example, how can the authors be confident that the siblings in the middle panel of 1A are correct and not an adjacent cell? Regarding quantification, it would be useful for the authors to comment on how the following would influence their measurements: 1) movements along the z-axis, and 2) movement of the nucleus within the cell

      We have used this methodology for over a decade, and while it is not flawless, we have included several safeguards to ensure that sibling cells are correctly identified. We have added additional details to the Methods section (lines 867-869, 873-879).

      A similar question is how much telophase correction really happens in telophase. How confident are the authors that the process actually occurs during division and not subsequent to it? What is drawn in their previous paper and in Figure 7A implies that post-division movements may be important. It would be useful for the authors to comment on whether they can make the distinction and whether or not it might be important.

      Our intent in coining the term “telophase correction” was to imply that this process initiates, rather than completes, during telophase. We apologize for this confusion and have clarified this in the text (lines 80-82). Since most mammalian cells complete M phase in ~1h, with the longest time spent in prophase, in the absence of direct evidence to the contrary, it may be prudent to assume that telophase, like metaphase and anaphase, is relatively short, on the order of minutes. Since we cannot directly observe reformation of the nuclear membrane in our movies, we cannot be sure when telophase ends. Likewise, we do not currently have a suitable marker of the spindle midbody for live-imaging, so cannot be sure when cytokinesis completes. That said, we feel confident that most of the reorientation is occurring prior to cytokinesis, because we have previously reported that the greatest changes in daughter cell positioning occur within the first 10-15 minutes of anaphase onset, when a gap in membrane-GFP/TdTomato is still visible (Lough et al, eLife, 2019). However, while we feel that there are many interesting questions that our work raises about the timing or reorientation relative to specific mitotic stages—e.g. is the midbody asymmetrically positioned, inherited, or ejected?—these questions are beyond the scope of the present study.

      Does the division angle in the AGS3 OE experiment (Figure 1D) correlate with AGS3 levels within the cell?

      This is an interesting question, and indeed, our hypothesis would predict that it would. However, it is not straightforward to quantify AGS3 or mRFP1 levels, and as we explain in a new section of the Results (lines 212-237), we have some concerns that N-terminally tagged AGS3 may not be fully functional. We have added new data with C-terminally tagged AGS3-mKate2, which we feel provides even stronger evidence that mKate2+ cells show a planar shift compared to mKate2- cells (Fig. 3C,D). In the future, we could test this hypothesis at the population level by comparing division orientation profiles for AGS3-mKate2+ cells carrying either a non-targeting scramble or Gpsm11147 shRNA. We would predict that knocking down endogenous AGS3 while overexpressing AGS3-mKate2 should give an intermediate phenotype.

      I found the localization data to be the weakest part of the paper and feel that some reconsideration and reanalysis are warranted. First, the quantifications in Figures 2C, 3B, and 3F are unnecessarily vague scoring-based metrics. In 2C, "Localization pattern" should be replaced with membrane/cytoplasm ratio or an equivalent quantification. In 3B "LGN localization" should be replaced with apical/cytoplasmic and apical/basal ratios or equivalents. In 3F, "Polarized LGN frequency" should be replaced with apical/basal ratio or equivalent. It seems to me that non-AI processed data would be most appropriate for these quantifications unless such processing can be justified.

      This issue was raised by the previous two Reviewers and has been addressed by new data added to Figure 4.

      Second, it is important to note that the cytoplasmic localization of AGS3 does not allow one to conclude that AGS3 is not on the membrane. Unfortunately, high cytoplasmic signal can preclude the determination of membrane-bound signal.

      We agree with the Reviewer and have softened our language throughout the text.

      Finally, I had difficulty reconciling the images of LGN shown in Figure 3 with the conclusions made by the authors.

      We have added additional, representative images of LGN expression in control and AGS3 KD cells in Figure 4C-E.

      The challenge of the localization data is troubling because an important conclusion of the paper is that AGS3 acts via LGN. The localization data provided one leg of support for this conclusion and the other is provided by an epistasis analysis. Unfortunately, this data seems to be right on the edge because it is based on the difference between the solid and dashed blue lines in Figure 5B not being significant. However, we can see how close this is by comparing the solid and dashed red lines in the adjacent 5C, which are significantly different. Between the localization data, which doesn't seem clear cut, and the epistasis experiment, which is on the razor's edge, I'm concerned that the conclusion that AGS3 acts through LGN may be going beyond what the data allows.

      We appreciate the Reviewer’s comments about the importance of these two lines of experimentation: 1) AGS3’s effect on LGN localization, and 2) epistasis experiments between AGS3/Gpsm1 and LGN/Gpsm2. We feel we have significantly strengthened this first pillar with the additional data presented in Fig. 4C-J. Regarding the second point, we would like to emphasize that we present three lines of evidence for the existence of an epistatic relationship between LGN and AGS3: 1) the static division orientation data comparing LGN single KOs to both LGN KO + AGS3 KD and AGS3+LGN dKOs (Fig. 6B); 2) live imaging division orientation/telophase correction comparing LGN KOs to AGS3+LGN dKOs (Fig. 6C-E); 3) lineage tracing data comparing LGN KOs to AGS3+LGN dKOs (Fig. 7H,I). Further, we think the reviewer may have misconstrued the data presented in Fig. 5C (now Fig. 6C). The dashed lines indicate orientation at anaphase and solid lines 1h after anaphase, so the shift between dashed and solid lines indicates telophase correction, which occurs to similar (and statistically significant) degrees in both LGN single mutants and AGS3+LGN dKOs. Comparisons between the single and double mutant would be between red and magenta solid lines or red and magenta dashed lines, and neither of these is statistically significant. We realize that our use of dashed lines in Fig. 5B (now Fig. 6B), which we normally only use to refer to anaphase entry in live imaging data, may have caused this confusion. Therefore, we have changed all plots to solid lines in Fig. 6B, and use light and dark magenta, respectively, to differentiate between LGN KO + AGS3 KD and AGS3+LGN dKOs.

    1. Author Response

      Reviewer #3 (Public Review):

      The authors took a comprehensive set of analyses to examine the relationship between pupil diameter / derivative and BOLD-signal during rest in the ascending arousal system nuclei in 72 young participants. Focus is on the locus coeruleus, ventral tegmental area, substantia nigra, dorsal and median raphe nuclei and the basal forebrain. Analyses were performed using various processing pipelines: canonical versus custom hemodynamic response functions, with/without smoothing, time to peak analyses and cross spectral power density analyses to define the time lag between both measurements. The authors could not replicate previous correlations between locus coeruleus BOLD and pupil measurements using standard analytic approaches, and also found no relationship between locus coeruleus BOLD and pupil measurements when using custom hemodynamic response functions. When using time to peak and cross-correlation analyses, the authors found that coupling between pupil size and AAS BOLD patterns increases with decreasing time to peak, when the two signals were close in time. The authors conclude that these findings suggest that pupil size could be used as a noninvasive readout of AAS activity under passive conditions.

      These authors did a thorough assessment, and described the methods and results well and in a balanced manner.

      Outstanding questions:

      • the reliability of these observations? would we see the same findings in a different cohort or using a different sequence/field strength?

      • What is the independent association of each assessed nucleus with pupil dilation? That could be informative to understand their shared or unique role.

      We are grateful to the reviewer for their expert advice in helping us strengthen our manuscript. We agree with the reviewer that these two outstanding questions are important and we have done our best to answer these questions below. We believe that our manuscript has greatly improved, thanks to the reviewer’s suggestions for running these additional analyses.

    1. Author Response

      Reviewer #2 (Public Review):

      The availability of large collections of Mycobacterium tuberculosis (Mtb) isolates has enabled many important studies looking to identify mycobacterial genetic polymorphisms associated with anti-tuberculosis (TB) drug resistance, including both classical "resistance-conferring" mutations and novel "resistance-enabling" mutations. Importantly, these studies have expanded our understanding of mycobacterial genetic adaptations undermining chemotherapy, in many cases allowing for improved diagnostic tests and predictions of treatment failure. In this submission, Gao and colleagues adopt a different approach to the problem: although also applying a GWAS-type analysis, they instead attempt to elucidate polymorphisms implicated in poor outcomes of TB patients undergoing treatment for the drug-susceptible disease. Starting with a large dataset comprising 3496 samples with corresponding clinical (host) metadata, the authors generate Mtb whole-genome sequence data for 91 samples obtained from patients with "poor" outcomes and 3105 patients with "good" outcomes. These are used to identify 14 fixed and >230 unfixed mutations that might be associated with "poor" treatment outcomes, a conclusion which they argue is plausible given transcriptional evidence implicating many of the identified genes in the mycobacterial response in vitro to first-line drug exposure and/or hypoxia, both of which are considered relevant to clinical disease. Notably, they also identify a tendency for a greater proportion of "ROS mutational signatures" in unfixed mutations from "poor" outcome samples. Finally, incorporating these observations in a prediction model, the authors observe that the mycobacterial factors aren't adequate on their own but, when combined with key host factors - including patient age, sex, and duration of diagnostic delay (which have stronger predictive value) - they enhance predictive capacity. 
      In summary, this paper reports a novel approach yielding observations that offer tantalizing insight into the mycobacterial factors which might influence TB treatment outcomes independent of drug resistance. However, the following must be considered:

      (i) The manuscript provides little to no detail about how the samples were obtained, other than the fact that they comprise "pre-treatment" samples: are they all sputum samples? Were they induced? Similarly, no information is provided about sample propagation: were the samples cultured to achieve sufficient biomass for whole-genome sequencing? If so, in what growth media, for how long, and how many passages? Were all samples treated identically? And were they plated to single colonies - or are the "isolates" referred to throughout the manuscript actually heterogenous populations of potentially different Mtb clones obtained - and propagated - as a mixed sample? This information is critical given the potential that the identified polymorphisms - both fixed and (perhaps even more so) unfixed - might have arisen as a consequence of in vitro (laboratory) manipulation under standard aerobic conditions.

      Thanks for your encouraging comments. The requested information about sample propagation has been added to the methods section in the new version. For details, please see our response, above, to the essential revisions (Q1).

      (ii) A key question that arises from this study (and others like it) is whether causation has been adequately established. Ideally, the Mtb genotypes contained within samples obtained pre-treatment should be compared with samples obtained from the same patients following treatment - that is, when the "poor" outcome was manifest. The expectation is that the polymorphisms identified prior to initiation of therapy - especially the 14 fixed mutations - should be evident (even dominant) at the later stage when therapy failed (or at the subsequent presentation in cases of relapse). Recognizing that this is not easily accomplished, though, it seems fair to suggest that the perceived relevance of the identified mutations would be strengthened if the authors were able to provide any other evidence - perhaps from studies of drug-resistant Mtb isolates - supporting their inferred role in undermining frontline treatment.

      Thank you for these insightful questions. We sequenced the isolates obtained at the time of relapse for all 47 relapse cases and found that the 14 GWAS-identified fixed mutations were only detected in relapse isolates from the 13 patients whose first samples also contained the GWAS-identified mutations. None of the 14 mutations we identified were found in isolates from the other relapsed patients. We also searched for the presence or absence of these 14 mutations in published studies seeking noncanonical mutations associated with drug-resistant Mtb isolates [5-7]. None of the 14 mutations we identified were reported in any of these studies, but two of the genes (ctpB & metA) in which our mutations were found had been previously identified as potentially associated with first-line drug resistance.

      (iii) Related to the above, the authors make the valid point that their intention here was different from other studies which have deliberately utilized drug-resistant Mtb isolates to identify resistance-conferring and resistance-enabling mutations (such as in the study they cite by Hicks et al). It would be interesting to know, however, if any of the mutations identified in those other studies were also picked up in this work - and, if not, why that might be the case.

      As mentioned in our response to the previous question, none of our mutations were mentioned in prior studies. Our inference is that the 14 fixed mutations we identified had only limited effects on outcomes, which would explain why: they were not identified in previous studies; isolates from only 24.2% (22/91) of patients carried any of these 14 mutations; and none of the mutations were shared amongst all 22 patients.

      (iv) Finally, the analyses presented in this study are heavily dependent on the use of appropriate statistical methods to identify potentially rare genetic polymorphisms. However, as noted for sample processing (see my earlier comment above), there is very little detail provided about the methodology applied. This omission detracts from the interpretation, especially given that the predominance of lineage 2 (which contributes >75% of the isolates, with sublineage 2.3 constituting >50%) risks a lineage-specific association, rather than a more generalizable pathogenicity phenotype. Similarly, the heavy skew in the numbers of "good" (3105 samples) versus "poor" (91 samples) collections (approximately 34x difference in sample size) raises the possibility that mutations identified in the "poor" category might be artificially over-represented. More clarity in detailing the statistical methods is required to allay any concerns about the identification of candidate polymorphisms.

      Thank you for pointing this out. We have added details of our statistical methods to the methods section, and in the results section we have indicated the specific statistical methods used and the meaning of the statistical metrics.
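
      The response does not name the test that was applied, so the following is only a hedged illustration of the reviewer's concern about the 91 versus 3105 imbalance. It sketches a one-sided Fisher's exact test, one commonly used option for sparse contingency tables, written with the Python standard library. The function name and the mutation counts in the usage note are hypothetical and are not taken from the manuscript; only the group sizes (91 "poor", 3105 "good") come from the text above.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in the first row.

    Table layout (all counts are supplied by the caller):
        a = "poor" outcome, mutation present
        b = "poor" outcome, mutation absent
        c = "good" outcome, mutation present
        d = "good" outcome, mutation absent
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    n_poor = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_poor)
    upper = min(n_poor, n_carriers)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    numer = sum(
        comb(n_carriers, k) * comb(n_total - n_carriers, n_poor - k)
        for k in range(a, upper + 1)
    )
    return numer / denom
```

      As a usage example under purely hypothetical counts, `fisher_one_sided(3, 88, 5, 3100)` would test whether a mutation seen in 3 of 91 "poor" and 5 of 3105 "good" isolates is enriched in the "poor" group. Because the test conditions on the row totals, the unequal group sizes enter the null distribution directly rather than being ignored.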

    1. Author Response

      Reviewer #1 (Public Review):

      Lammer et al. examined the effects of social loneliness, and longitudinal change in social loneliness, on cognitive and brain aging. In a large sample longitudinal dataset, the authors found that both baseline loneliness and an increase in loneliness at follow-up were significantly associated with smaller hippocampal volume, reduced cortical thickness, and worse cognition in healthy older adults. In addition, those older adults with high loneliness at baseline showed even smaller hippocampal volume at follow-up. These results are interesting in identifying the importance of social support to cognitive and brain health in old age. With a longitudinal design, they were able to show that increased loneliness was related to reduced brain structural measures. Such results could help guide clinicians and policymakers in designing social support systems that would benefit the growing aging population.

      The strength of the current study lies in the large sample size and longitudinal follow-up design. The multilevel models used to separate within and between subject effects are well constructed. Combining neuroimaging data with behavioral changes provided further evidence that social loneliness may be related to accelerated brain aging. Stringent FDR correction, Bayes factor comparison, and the additional analyses for sensitivity showed the robustness and credibility of the results.

      Thank you for a thorough and overall positive evaluation of our manuscript and the constructive feedback. We considered all of your comments valuable, please see point-by-point responses below for more details.

      Weaknesses of the study were related to the interpretation and discussion of their findings.

      1a) Social loneliness is a relatively little-studied factor in cognitive ageing, and the authors should consider expanding the discussion, with some additional analyses, as to how their results could be used by clinicians and older adults to monitor social behaviors.

      We agree with the reviewer and are thankful for these suggestions. We have run additional analyses following the clinical cut-off of the questionnaire on social isolation and added those and their interpretations to the results and discussion section. For details of how we implemented the reviewer’s advice, please see our responses below to questions 2a) and 3a), as well as to this reviewer’s questions in section b).

      2a) The authors examined the interaction between baseline and age change to see if higher baseline loneliness was associated with accelerated decline. The interaction was significant, but the authors did not further explore the interaction effect, which may have clinical significance. The authors should consider identifying a cut-off point in LSNS that suggests persons scoring less than this score on the LSNS may be at greater risk of accelerated brain decline than others. Such a cut-off point is important for clinicians, as well as for future researchers to compare their results.

      2a) Thanks to your recommendation, we decided to explore differences between handling LSNS as a categorical (using the standard threshold of 12) and continuous variable and recalculated all LMEs on HCV and cognitive functions with LSNS coded dichotomously. We found the results to be similarly good in detecting adverse effects of social isolation (see new Tables S16-18). The interaction of categorical LSNS with change in age on HCV tends towards showing an effect but does not reach significance even before FDR-correction.

      As cut-off points are central to clinical work, we are convinced that this expansion improved our study greatly, contributed to its benefit to our readers and we are thus very grateful for this valuable question.

      Our analyses indicate that the cut-off can be employed in clinical settings to detect social isolation that might harm patients’ brain health.

      However, this does not answer another important question, namely which public health strategy is most suitable to target social isolation for preventive purposes. Should it focus on the most isolated individuals (i.e. those categorised as socially isolated) or pursue a population strategy (Rose et al., 2008)? This actually is the topic of ongoing research in our group and we hope to answer it in future work. For now, we ran additional models testing an interaction effect of dichotomous LSNS with continuous LSNS. Finding evidence for such an interaction effect would suggest that having less social contact has stronger negative effects for those that are categorised as socially isolated. Roughly speaking, is it worse to have one instead of two reliable friends than it is to have four instead of five? If this were the case, this would point public health towards a high-risk rather than population strategy. We did not find any evidence for such an interaction effect and thus cannot say that we have found that more social contact ceases to be beneficial beyond the threshold score of 12. In addition to the new results, we have expanded on this in the discussion section where it now reads: “We showed that the established LSNS cut-off can be employed by clinicians to identify subjects likely to suffer adverse effects due to social isolation. However, the absence of evidence for more pronounced negative effects of less social contact amongst those that are deemed socially isolated by the cut-off renders a public health strategy focused on high-risk individuals questionable.”
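
      The coding behind this interaction analysis can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the only value taken from the text is the threshold of 12, and the direction of the cut-off (scores below 12 count as socially isolated, following the reviewer's phrasing in 2a) is an assumption that would need to be flipped if the sum score is coded so that higher values mean more isolation.

```python
LSNS_CUTOFF = 12  # standard threshold named in the response


def is_isolated(lsns_score, cutoff=LSNS_CUTOFF):
    """Dichotomise an LSNS sum score.

    Assumption: scores strictly below the cut-off count as socially
    isolated. Reverse the comparison for an inverted scoring scheme.
    """
    return lsns_score < cutoff


def interaction_terms(lsns_score):
    """Build the predictors for a model with a continuous LSNS term,
    a dichotomous LSNS term, and their product (the interaction)."""
    isolated = 1 if is_isolated(lsns_score) else 0
    return {
        "lsns": lsns_score,
        "isolated": isolated,
        # Non-zero only for isolated participants, so its coefficient
        # captures any extra slope within the isolated group.
        "lsns_x_isolated": lsns_score * isolated,
    }
```

      In a mixed model these three columns would enter alongside the covariates and a per-participant random effect. A coefficient on `lsns_x_isolated` indistinguishable from zero corresponds to the reported absence of a steeper effect among those categorised as isolated.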

      3a) Although it was not directly tested in the paper, LSNS scores did not seem to change with increasing age (Table 1). This general stability of LSNS scores in older adults should be discussed further. The authors should consider how their relatively healthy and high SES sample may be less vulnerable to loss of family or friends in old age, making this sample sub-optimal for the question they have. The significance of the subject effect suggests that some individuals still experience a loss of social connectedness. The authors may want to elaborate on this and give some explanations of such subject differences in the ageing effect on social loneliness. Although stress was not a significant mediating factor, is it related to baseline loneliness or changes in loneliness in the current sample?

      Concerning the link between change in age and LSNS, we indeed found a statistically significant effect of age change on higher social isolation in an ancillary LME. However, as the reviewer noticed, the per-year effect is very small: a participant would need to age more than 20 years to score one point higher on the LSNS sum score (see new Table S2, see also answer below to questions 4a and 3b). We therefore tend to agree that in our sample, higher age does not affect social isolation substantially.
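
      As a rough check on this arithmetic, write β for the per-year change in the LSNS sum score. The symbol is introduced here for illustration only and is not taken from Table S2. Assuming β > 0, the statement that more than 20 years are needed for a one-point increase is equivalent to a bound on the slope:

```latex
\[
  \Delta t_{1\,\text{point}} = \frac{1}{\beta} > 20\ \text{years}
  \quad\Longleftrightarrow\quad
  \beta < \frac{1}{20} = 0.05\ \text{points per year}.
\]
```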

      Furthermore, we very much appreciated your recommendation to further discuss how our relatively high SES-sample might be less vulnerable to loss of social contact during the aging process. As a foundation for this discussion, we investigated the link between SES and LSNS using an LME and found the association to be highly significant (see new Table S2). Furthermore, we added a table showing which percentage of our participants fell into the SES quintiles that would be observed in a fully representative German sample to help our readers to interpret our findings (see new Table S3). Following your advice, we have added a comment highlighting how the relatively high SES of our sample might have contributed to this in the limitations section: “As we found higher SES to be associated with lower LSNS scores, this relatively high SES sample might have led to underestimation of the detrimental effects of social isolation and increases in social isolation in the aging process.”

      Regarding the importance of chronic stress to social isolation, we found neither a mediating effect of stress nor a significant simple association between TICS and LSNS scores (see new Table S2). We are hesitant to attribute this finding to the incorrectness of the stress-buffering hypothesis as the missingness in stress data makes all interpretations of analyses involving TICS scores problematic. We have expanded on this in the discussion section and added emphasis to the importance of also pursuing other mechanistic theories in our discussion, where it reads: “we could not find evidence that social isolation affected hippocampal volume through higher chronic stress measured with questionnaires, a hypothesis put forward by the stress buffering theory (Kawachi & Berkman, 2001). These latter analyses suffered from small sample sizes and a limited number of timepoints. Nonetheless, the lack of any significant link between chronic stress and social isolation (see Table S2) is hard to align with the stress-buffering hypothesis in spite of the missingness in the TICS.”

      4a) The presentation of longitudinal data (Figure 1) lacks dimensionality. The scatter plots presented here are more suitable for cross-sectional studies and could cause confusion regarding the interpretation of the results. The authors should consider individual growth curves or spaghetti plots in visualizing change within subjects.

      We are grateful for your advice to visualise individual developments in social isolation and outcome measures over time in spaghetti plots and have done so to give our readers insight into these developments (see new Fig. S1). As you had assumed, there is no unequivocal pattern of increasing social isolation over time (see also answer to 3a). In addition, we decided to stick with presenting the results of the linear mixed effects models using scatterplots in Figure 1, as this is regarded as the most appropriate visualization of the tested effects. Please see also response to 5b.

      Reviewer #2 (Public Review):

      The paper by Laurenz Lammer and colleagues used cohort data to investigate the cross-sectional and longitudinal association between loneliness and brain structure and cognitive function. The main finding was that baseline social isolation and change in social isolation were associated with smaller hippocampus volumes, reduced cortical thickness, and poorer cognitive function. Given that more and more people feel lonely nowadays (e.g., due to the pandemic), the study by Lammer and colleagues addresses a highly relevant health concern of our time.

      Significant strengths of the study:

      • large cohort;

      • the cross-sectional and longitudinal analyses confirmed the findings;

      • the study was preregistered;

      • the study included men and women;

      • analyses were sound and controlled for essential confounders.

      Thank you for your time to thoroughly review the manuscript and for the encouraging comments. Please see below how we implemented your advice.

      The major weaknesses of the study:

      1a) it is unclear whether loneliness causally contributes to brain structure and cognitive function;

      Indeed, based on structural equation analyses of the available data from this cohort, we could not find strong evidence for either causality (social isolation causes brain/cognitive decline) or reverse causality (brain/cognitive decline causes social isolation). This could be due to a lack of power to detect such effects due to the drop in sample size for these analyses. Overall, regarding these two competing hypotheses, we see some minor indication of support for causality of social isolation in our data due to the presence of robust and significant associations in our very healthy sample, the absence of clear increases in effect size when including cognitively less healthy participants and the absence of clear decreases in effect sizes when only including participants with high MMST scores. Accordingly, we added this concluding synopsis to our paragraph on causality in our discussion: “Still, overall these results only add a modicum of corroboration to the case for a causal role of social isolation.” and pointed towards the key role of RCTs in understanding causality in this regard: “Intervention studies will be the gold standard to provide evidence with regards to the causal role and effect size of social isolation.”

      2a) the factors that may cause loneliness are unclear.

      Thank you very much for encouraging us to shed some light on participant characteristics of potential relevance to social isolation. Starting from the impulse to look into marital status and employment, we also investigated links to socioeconomic status, migration background, age at baseline, change in age, gender, living alone and the number of persons living in the participants’ dwelling. We found all of these factors except for gender and migration background to be significantly linked to social isolation. Results are presented in Table S2 and briefly referred to in the results section: “In our sample, social isolation was positively correlated with not living alone, being married, the number of persons living in the participants’ dwelling, being gainfully employed, younger baseline age and less change in age, but not to gender or having a migration background. See Tables S1-2 for descriptive statistics and details of the associations. To contextualise the observed link to SES, a comparison of SES category frequencies in LIFE-Adult and a fully representative sample (Lampert et al., 2013) is provided in Table S3.” We also added to the discussion: “Existing and future research on reasons for and the role of social isolation in health and disease should provide guidance for the urgently needed development and evaluation of tailored strategies against social isolation and its detrimental effects.”

    1. Author Response

      Reviewer #1 (Public Review):

      Weakness of the study include:

      1) There are no data supporting a role for insulin regulation of microtubule-dependent GLUT4-containing vesicle movement. The data in Fig.2B do not support a difference in the number of "moving" GLUT4 vesicles between basal and insulin-stimulated fibers. I find the statement on line 103 that they "observed a ~16% but insignificant increase" confusing. These data do not support an effect of insulin on the number of moving GLUT4 vesicles that can be detected in an individual experiment. There is also no effect of insulin on GLUT4 vesicles in the data reported in Fig.S2D, Fig.S5B, and Fig.S5F. However, the data in Fig. 2C suggest there was a consistent increase in "moving" vesicles in insulin-stimulated conditions in 4 independent experiments (how are these data normalized?). Because the basis of insulin-regulation of glucose uptake is the control of GLUT4 translocation to the plasma membrane, the authors need to clarify their thinking on why they do not detect robust insulin effects on GLUT4 dynamics in the individual experiments. Is it that they are not measuring the correct parameter? That the assay is not sensitive to the changes?

      The small (or no effect) of insulin distracts a bit from the findings that there is microtubule-dependent GLUT4 movement in basal and stimulated muscle fibers, and that disruption of this movement by depolymerization of microtubules or Kif5b knockdown blunts GLUT4 translocation. As noted above, the data strongly support microtubule-dependent GLUT4 dynamics as permissive for insulin-stimulated GLUT4 translocation even if this dynamics might not be a target of insulin action.

      In light of the reviewer's comment and to avoid confusing/distracting readers we have removed figure 2C showing the effect of insulin based on pooled data across all our independent experiments. We discuss several possibilities for the lack of significant insulin effect on GLUT4 movement in individual experiments in the discussion section (lines 342 to 361 in TC version of MS). The discussion has been updated to reflect the points raised by the reviewer. More sensitive techniques than currently available in our lab are required to firmly conclude whether microtubule-based GLUT4 trafficking is directly regulated by insulin.

      2) The analyses of GLUT4-containing structures are not particularly informative. Co-localization with other markers (beyond syntaxin6) are needed to understand these structures. Defining structures as small, medium or large is incomplete. In particular, it is important to probe the microtubule nucleation site clusters for other membrane markers. Transferrin receptor? IRAP?

      While our analysis based on structure-segmentation clearly demonstrates a microtubule-dependent effect on GLUT4 localization, we completely agree that additional work including co-labelling of GLUT4 and various compartment markers is required to fully understand the localization changes observed for GLUT4-containing structures upon microtubule disruption. However, for practical reasons, it is not currently feasible for us to complete these analyses within a reasonable time-frame so we will reserve this for future studies.

      3) The Kinesore data do not support the authors hypothesis. The data show that Kinesore increases the amount of GLUT4 in the plasma membrane of basal cells and that insulin further increases plasma membrane GLUT4 to the same extent as it does in control cells. How does that provide insight into the role microtubules (or kif5b) in GLUT4 biology? Why does Kinesore increase plasma membrane GLUT4? Is it an effect of Kinesin 1 on GLUT4 vesicles? Kinesore is reported to remodel the microtubule cytoskeleton by a mechanism dependent on Kinesin 1. Is that the reason for the change in GLUT4?

      To better understand the effect of kinesore on GLUT4-dependent glucose uptake, we have now incubated EDL and Soleus muscles ± kinesore and ± insulin and measured 2-DG uptake (GLUT4 translocation and glucose transport is considered the rate-limiting step for 2-DG uptake in incubated muscles due to the lack of muscle perfusion in this model) and proximal insulin signaling. In contrast to the enhancing effect on membrane GLUT4 observed following kinesore treatment in basal and insulin stimulated L6 cells, kinesore did not stimulate basal 2-DG uptake in EDL and Soleus. Furthermore, kinesore markedly impaired insulin-stimulated 2-DG uptake (figure 4B). We also tested the effect of 2h kinesore treatment in differentiated primary human myotubes. In this model, kinesore reduced basal glucose uptake and blocked the insulin effect (figure 4C). Together, this suggests that kinesore inhibits GLUT4-dependent glucose uptake in adult muscle and primary human muscle cells, presumably by inhibiting the binding of GLUT4 containing cargo, despite kinesore also having an activating effect on Kinesin-1 motor function. This possibility is discussed in the current version of the manuscript (line 177-180, 203-211). These data are consistent with the KIF5B knockdown data in L6 and support a necessary role of this motor protein in skeletal muscle GLUT4 trafficking.

      To better understand why kinesore led to increased rather than decreased GLUT4 translocation in L6 cells, we also disrupted the microtubule network using nocodazole and colchicine prior to kinesore stimulation. Surprisingly, kinesore stimulation enhanced membrane GLUT4 even in microtubule-disrupted L6 cells, indicating that the effect of kinesore on GLUT4 translocation is microtubule-independent in L6 cells. With three of four data sets supporting a necessary role of Kinesin-1 motor proteins in GLUT4 trafficking, including the adult muscle data, we end up concluding:

      …our shRNA data in L6 myoblasts and kinesore data in adult muscle support the requirement of KIF5B-containing Kinesin-1 motor proteins in insulin-stimulated GLUT4-dependent glucose uptake in skeletal muscle.

      However, we would also like to include the discrepant effect of Kinesore in L6 myoblasts as this may be useful information to others using this compound and/or studying GLUT4 in cultured cells.

      4) The analysis of Kif5b is a bit cursory. Depolymerization of microtubules in muscle fibers essentially blocks all GLUT4 movement (only the insulin condition is shown in Fig.2B but I assume basal would be equally inhibited), and fully inhibits insulin-stimulated glucose uptake in muscle fibers. What are the effects of nocodazole in L6 cells (the cells used for the kif5b studies), and are they similar in magnitude to kif5b knockdown? Those data would identify whether there are non-Kif5b microtubule-dependent effects.

      To address the magnitude of reduced insulin-stimulated GLUT4 translocation in microtubule-disrupted L6 cells, we investigated the effect of nocodazole (13 µM) and colchicine (25 µM) on GLUT4 translocation in L6 cells.

      Insulin stimulated GLUT4 translocation was reduced but not blocked by either nocodazole or colchicine. This is in accordance with previous in vitro studies in 3T3 adipocytes and muscle cells (PMID: 11085918, PMID: 11145966, PMID: 24705014). Overall, these data still support that Kif5b is a major microtubule motor protein regulating GLUT4 translocation across cell-types.

      5) The authors need to show that the fibers isolated from the HFD mice remain insulin-resistant ex vivo by measuring glucose uptake. It is possible that once removed from the mice they "revert" to normal insulin-sensitivity, which might contribute to the differences reported in Fig5.

      This is an important point. In figure 5 figure supplement 1E, we show that the fibers isolated from the diet-induced obese mice display impaired insulin-induced p-Akt Thr308 and p-TBC1D4 Thr642 after isolation and in vitro culture. This shows that the insulin resistance is present at the muscular level and is preserved after isolation and in vitro culturing.

      6) Although it is interesting that the authors have included the insulin-resistance models/experiments, they are not well developed and therefore the conclusions are not particularly strong.

      In this study, we induced insulin resistance by two different means (C2 ceramide treatment and diet-induced obesity) and demonstrated at the level of p-Akt and p-TBC1D4 in cultured muscle fibers that we successfully achieved insulin resistance in our models. In particular the high fat diet model is arguably the most common in vivo model of obesity-linked insulin resistance. Thus, we were able to study GLUT4 trafficking on microtubules in normal vs. insulin-resistant muscle fibers and found this to be impaired in insulin-resistant muscle. Although one could always have done more, we believe that our data on adult muscle GLUT4 movement in insulin-resistance are robust, novel and do support our conclusions and title.

      7) The data do not support the title.

      We respectfully disagree. See our reply to comment 6 above.

    1. Author Response

      Reviewer #1 (Public Review):

      1) The study finds Lyn to be degraded more efficiently via the proteasome and to be more tightly controlled by phosphatases when compared to Lck. However, rather than interpreting the findings as distinct kinase-intrinsic properties, one could attribute the slower degradation and stricter PTP control of Lyn to the fact that Lyn is the principal and predominant SFK in B cells and thus a "standard target" of the B-lymphoid molecular machinery, to which it is better adapted.

      We respectfully disagree with the reviewer’s comment that our interpretation is limited to “kinase-intrinsic properties”. In many points within the manuscript we refer to the “B-lymphoid molecular machinery”. More specifically:

      • Lines 62-64 in the original submission (lines 60-61 in the revised manuscript): “….enzymatic promiscuity of SFKs can be buffered by their differential susceptibility to regulatory control mechanisms designed for keeping global SFK activity levels under strict control….”

      • Lines 113-114 in the original submission (lines 137-138 in the revised manuscript): “Lck and Lyn differ in the efficiency for signal ignition and in their susceptibility to regulatory mechanisms in B-cells”

      • Lines 135-136 in the original submission (lines 159-160 in the revised manuscript): “Thus, the proteasomal degradation machinery constrains the abundance of Lyn, but not Lck, within B-cells.”

      • Lines 162-163 in the original submission (lines 185-186 in the revised manuscript): “Collectively these data show that the BCR signaling machinery is more responsive to the action of Lyn, at the same time imposing stricter regulation on its expression and activity levels.”

      • Lines 475-477 in the original submission (lines 527-528 in the revised manuscript): “…identified specialized control mechanisms designed to keep Lyn, but not Lck, activity levels under strict control.”

      However, we cannot rule out, as a mutually inclusive scenario, that intrinsic SFK features contribute to their differential regulation by cellular mechanisms, a possibility that we also refer to in the manuscript. More specifically:

      • Lines 335-337 in the original submission (modified text in the revised version, lines 372-374): “On one hand there is the total amount of SFK activity within the cell, and on the other the individuality of SFK family members, dictated by intrinsic molecular features.”

      • Lines 477-478 in the original submission (lines 528-529 in the revised manuscript): “These data may signify that SFKs have been evolutionarily diversified to best suit the needs of the cellular environment they are expressed in…”

      Based on the reviewer’s comment, and to clarify further, we have modified the revised version of the manuscript (lines 372-374) as follows:

      “On one hand there is the total amount of SFK activity within the cell, and on the other the individuality of SFK family members, dictated by intrinsic molecular features and/or adaptation to cell-specific regulatory mechanisms.”

      We hope that our clarifications satisfy the reviewer.

      2) Venn diagram depicting differentially regulated transcripts between Lck- and Lyn-expressing cells, it does not seem like Lck is able to regulate pathways which are not "canonically" regulated by Lyn.

      and

      As a distinct functional difference between Lck and Lyn is not established in this work, said SFKs' largely exclusive expression in T and B cells remains enigmatic.

      We thank the reviewer for the comment. We address this issue in the Discussion section of the revised manuscript (lines 514-519).

      3) There is also the persisting problem of Lck being expressed to a much higher extent and the effect of the endogenously expressed Lyn since the model systems are not based on a Lyn-deficient cell line.

      For the purpose of the analysis, we tried to circumvent the discrepancies between Lck and Lyn expression levels by our equal GFP gating strategy (explained in Figure 1-figure supplement 3E/Fig.S3E in the original submission). Nevertheless, as shown in Figure 1C there is a physiological reason for the two SFKs not being equally expressed, and we refer to the biological implications of these individualities in the Discussion.

      The effect of endogenously expressed Lyn is represented by the phenotype of -Dox cells which we use as background in all our studies, especially since we show that there are no alterations on Lyn or any other SFK activation status resulting from Lck overexpression (Figure 1-figure supplement 2B/ Fig.S2B in the original submission), so we do not believe this is a problem. Additionally, a Lyn-deficient environment would also not be perfect, since very plausibly it could have undergone further signaling and survival adaptations that we could not account for.

      4) Lastly, the authors follow up their finding of deregulated transcripts belonging to the ER/UPR ontology cluster. Flow cytometric analysis indeed shows an influence of Lck and Lyn expression on ER homeostasis, which can be reverted with SFK inhibitors. Alas, additional follow-up experiments to functionally investigate the deregulated pathways suggested by the RNAseq analysis are not included in this study.

      We thank the reviewer for the comment, and we agree. However, it is beyond our capabilities and manpower, and beyond the scope of the present work, to perform numerous functional or semi-functional studies for every GO analysis pathway that emerged from the transcriptomics studies. Although follow-up work from our group will focus on comprehensive and meticulous analyses of gene expression profiles, such an effort would currently require long-lasting studies that would significantly extend the size of the manuscript and also shift the focus away from the effects we wish to pinpoint with the present work, i.e. the unique adaptation of SFKs within the lymphocyte environment and gene expression profile tendencies exclusively controlled by SFK-generated signals.

      In an effort to satisfy the reviewer, we performed focused follow-up studies specifically on the ER effect of SFK-transduced signals, since it appears to be a so-far unknown aspect of their function. The new data are presented in the revised version of Figure 4 (panels C and D) and Supplementary Figure 4-figure supplement 1. Corresponding text can be found in lines 323-345 of the revised manuscript (results section) and lines 499-512 and line 531 of the discussion. In brief, we show an SFK kinase-activity dependent activation of the ER-phagy receptor FAM134B, which is not accompanied by recruitment of LC3B, as dictated by the currently known canonical ER-phagy pathway. This is the first report of SFKs’ involvement in the ER-phagy process and the first time FAM134B activation is described in B-cells. Since this field is relatively new, and the role and regulation of ER-phagy is almost unexplored in B-cells, we hope that the reviewers will appreciate the novelty of the finding and its sufficiency for the current manuscript. We do realize that these initial data call for more detailed mechanistic investigation, which we are pursuing in the form of a more complete and comprehensive future study.

      Reviewer #2 (Public Review):

      1) Studies reveal no qualitative functional differences in Lck and Lyn that are likely to explain the unique ectopic expression of Lck in CLL

      and

      If Lck promotes pathophysiology by transduction of a qualitatively unique signal, one would expect that transcriptome analysis should reveal this difference.

      We thank the reviewer for the comment. We address this issue in the Discussion section of the revised manuscript (lines 514-519).

      2) It is unclear from the material and methods whether the overexpressed Lyn is LynA or Lyn B. It appears in the text (lines 130-133) that they overexpress LynB specifically. A recent paper from Tania Freedman (Sci Adv 2022 PMID:35452291) suggests that LynA is more activating whereas LynB is more balanced with an inhibitory bias. The point is that it is important to discuss this because they may not be making a relevant comparison.

      We thank the reviewer for the comment. To clarify this, we now state in the Materials and Methods section of the revised manuscript (under “Cloning and Plasmids”) that Lyn isoform B was used.

      We initially attempted to produce BJAB lines overexpressing LynA; however, expression levels of this isoform were particularly low and we could not proceed with further analyses. We therefore cannot comment on how LynA might behave in an overexpression model in B-cells, especially given the absence of relevant information in the existing literature.

      The recent Sci Adv 2022 PMID:35452291 study deals with germline LynA and LynB isoform-specific knockouts and their propensity towards autoimmunity in mice. The authors compared the single isoform (LynA or LynB) and total Lyn knockouts by performing systemic phenotypic analyses of autoimmunity features (splenomegaly, myeloid cell profiles, proinflammatory markers on myeloid cells, B cell development, expansion of activated and autoimmunity-associated B cell subsets, autoimmunity scores). Differences they pinpoint between LynA and LynB are summarized as follows:

      1. “It was found that LynB has the dominant regulatory role in mice of both sexes, but that LynA expression is uniquely required to prevent autoimmunity in female mice”. The etiology of which is unclear.

      2. “LynB generally appears to be the dominant immunosuppressive isoform, with LynB deletion causing severe autoimmune disease in male and female mice. For some indicators (splenomegaly, glomerular IgG and C3 deposition, and kidney fibrosis), LynBKO and total LynKO mice developed equally severe phenotypes. In other cases (serum IgM and BAFF, glomerular immune infiltration, myeloid cell polarization, and monocyte/granulocyte expansion), LynBKO mice had less severe phenotypes than total LynKO mice, suggesting an additive effect with LynA”.

      3. “LynA and LynB seemed equally capable of promoting B cell development, regulating myeloid cell polarization and restraining myeloid-driven inflammation. Given the increased number of activated/inflammatory B cell types in LynAKO and LynBKO mice, future studies will be aimed at determining whether the single-isoform knockouts have a more B cell–initiated than myeloid cell–initiated form of autoimmune disease”.

      After careful reading of the manuscript, we could not find any functional analyses of the activation status of the distinct isoforms, or of the signaling events they elicit. Furthermore, the authors do not report any conclusion that LynA is more activating at the molecular level. Based on the above, we cannot connect the data published in the PMID:35452291 paper with our results to discuss “LynA being more activating” and the implications this might have for our studies.

      To comply with the reviewer’s suggestion, in our revised manuscript we cite this study (ref number 29) in the following sentence appearing in lines 380-383:

      “Lyn exists as two alternatively spliced variants LynA and LynB. Distinct biological functions between the two isoforms still remain poorly understood. A recent study (29) documented that LynB provides an advantage in protecting against autoimmunity compared to LynA; however, the underlying mechanisms for this phenotype are unclear.”

    1. Author Response

      Reviewer #2 (Public Review):

      The authors use data from 3 cross-sectional age-stratified serosurveys on Enterovirus D68 from England between 2006 and 2017 to examine the transmission dynamics of this pathogen in this setting. A key public health challenge on EV-D68 has been its implication in outbreaks of acute flaccid myelitis over the past decade, and past circulation patterns and population immunity to this pathogen are not yet well-understood. Towards this end, the authors develop and compare a suite of catalytic models as fitted to this dataset and incorporate different assumptions on how the force of infection varies over time and age. They find high overall EV-D68 seroprevalence as measured by neutralizing antibodies, and detect increased transmission during this time period as measured by the annual probability of infection and basic reproduction number. Interestingly, their data indicate very high seroprevalence in the youngest children (1 year-olds), and to accommodate this observation, the authors separate the force of infection in this age class from the other groups. They then reconstruct the historical patterns of EV-D68 circulation using their models and conclude that, while the serologic data suggest that transmissibility has increased between serosurvey rounds, additional factors not accounted for here (e.g., changes in pathogenicity) are likely necessary to explain the recent emergence of AFM outbreaks, particularly given the broader age-profile of reported AFM cases. The Discussion mentions important current unknowns on the biological interpretation of EV-D68 neutralizing antibody titers for protection against infection and disease. The analysis is rigorous and the conclusions are well-supported, but a few aspects of the work need to be clarified and extended, detailed below:

      1) Due to the lack of a clear single cut-point for seropositivity on this assay, the authors sensibly present results for two cut-points in the main text (1:16 and 1:64). While some differences that stem from using different cut-points are fully expected (i.e., seroprevalence being higher using the less stringent cut-point), differences that are less expected should be further discussed. For instance, it was not clear in Figure 2 why the annual probability of infection decreased after 2010 using the 1:64 cut-point, while it continued to increase using the 1:16 cut-point. It would also be helpful to explain why overall seroprevalence and R0 continue to increase over this time period using the 1:64 cut-point. Lastly, it would be useful to see the x-axis in Figure 4 extended to the start of the time period that FOI is estimated, with accompanying credible intervals.

      For the discussion on differences between the two cut-offs, please see response to essential comment 1.

      Extending the x-axis before 2006 in Figure 4 is not possible. Estimates of the overall seroprevalence at a year y require FOI estimates up until y-40. This implies the first estimates we can provide are for 2006.
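
      The dependence of overall seroprevalence on past FOI can be illustrated with a minimal sketch. It assumes a simple catalytic model with no maternal antibodies and no seroreversion, a maximum age of 40 years (inferred from the "y-40" statement above) and equal weight for every age; the function names and the FOI values in the usage note are hypothetical and are not taken from the manuscript.

```python
import math


def seroprevalence_by_age(foi_by_year, year, max_age=40):
    """Proportion seropositive at each age in `year` (simple catalytic model).

    foi_by_year maps calendar year -> annual force of infection (assumed values).
    A person aged `a` in `year` has been exposed during years year-a .. year-1.
    """
    result = {}
    for age in range(1, max_age + 1):
        # Cumulative hazard over every year this cohort has lived through.
        cumulative = sum(foi_by_year[y] for y in range(year - age, year))
        result[age] = 1.0 - math.exp(-cumulative)
    return result


def overall_seroprevalence(foi_by_year, year, max_age=40):
    """Unweighted mean over ages 1..max_age (assumes a flat age distribution)."""
    by_age = seroprevalence_by_age(foi_by_year, year, max_age)
    return sum(by_age.values()) / len(by_age)
```

      With FOI values supplied for 1966 to 2005, `overall_seroprevalence(foi, 2006)` can be evaluated, whereas any earlier target year raises a `KeyError` because the oldest cohort would need FOI values that were never estimated.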

      Credible intervals have been added to Figure 4.

      2) Additional context of EV-D68 in the study setting of England would be useful. While the Introduction does mention AFM cases "in the UK and elsewhere in Europe" (line 53), a summary of reported data on EV-D68/AFM in England prior to this study would provide important context. The Methods refers to "whether transmission had increased over time (before the first reported big outbreak of EV-D68 in the US in 2014)" (lines 133-134), rather than in this setting. It would be useful to summarize the viral genomic data from the region for additional context - particularly since the emergence of a viral clade is highlighted as a co-occurrence with the increased transmissibility detected in this analysis.

      We have added a figure (new Figure 1 – figure supplement 1) showing the annual number of EV-D68 detections reported by Public Health England from 2004 to 2020.

      We have also added the following text to the introduction: “Similarly, in the UK, reported EV-D68 virus detections also show a biennial pattern between 2014 and 2018 (Figure 1 – figure supplement 1).”

      We have also amended the sentence in the Methods.

      Finally, below is a screenshot of the Nextstrain tree for EV-D68 based on the VP1 region and with tips representing sequences from the UK (light blue) and European countries in colour. There is a lot of mixing between sequences from different regions, indicating widespread transmission and small regional clustering. We have added the following text to the Discussion: “Reported EV-D68 outbreaks in 2014 and 2016 were due to clade B viruses, while the 2018 outbreaks were reported to be linked to both B3 and A2 clade viruses in the UK (10), France (32) and elsewhere.”

      Reviewer #3 (Public Review):

      In the proposed manuscript, the authors use cross-sectional seroprevalence data from blood samples that were tested for evidence of antibodies against D68 for the UK. Samples were collected at 3 time points from individuals of all ages. The authors then fit a suite of serocatalytic models to explain the changing level of seropositivity by age. From each model they estimate the force of infection and assess whether there have been changes in transmissibility over the study period. D68 is an important pathogen, especially due to its links with acute flaccid myelitis, and its transmission intensity remains poorly understood.

      Serocatalytic models appear to be appropriate here. I have a few comments.

      The biggest challenge to this project is the difficulty in assigning individuals as seronegative or seropositive. There is no clear bimodal distribution in titers that would allow obvious discrimination and apparently no good validation data with controls with known serostatus. The authors tackle this problem by presenting results to four different cut-points (1:16 to 1:128) - resulting in seropositivity ranging from around 50% to around 80%. They then run the serocatalytic models with two of these (1:16 and 1:64) - leading to a range of FoI values of 0.25-0.90 for the 1 year olds and 0.05-0.25 for older age groups (depending on model and cutpoint). This represents a substantial amount of variability. While I certainly see the benefit of attacking this uncertainty head on, it does ultimately limit the inferences that can be made about the underlying risk of infection in UK communities, except that it's very uncertain and possibly quite high.

      I find the force of infection in 1 year olds very high (with a suggestion that up to 75% get infected within a year) and difficult to believe, especially as the force of infection is assumed much lower for all other ages.

      The authors exclude all <1s due to maternal antibodies, which seems sensible, however, does this mean that it is impossible for <1s to become infected in the model? We know for other pathogens (e.g., dengue virus) with protection from maternal antibodies that the protection from infection is gone after a few months. Maybe allowing for infections in the first year of life too would reduce the very large, and difficult to believe, difference in risk between 1 year olds and older age groups. I suspect you wouldn't need to rely on <1 serodata - just allow for infections in this time period.

      Relatedly, would it be possible to break the age data into months rather than years in these infants to help tease apart what happens in the critical early stages of life.

      Yes. We have added two figures (new Figures 1C and 1D) showing the prevalence of antibodies in children <1 yo. We show these data for the three serosurveys combined, because the number of individuals per month of age is very small.

      One of the major findings of the paper is that there is a steadily increasing R0. This again is difficult to understand. It would suggest there are either year on year increases in inherent transmissibility of the virus through fitness changes, or year on year increases in the mixing of the population. It would be useful for the authors to discuss potential explanations for an inferred gradual increase in R0.

      We have removed the estimates of R0 from the manuscript.

      On a similar note, I struggle to reconcile evidence of a stable or even small drop in FoI in the 1:64 models 4 and 5 from 2010/11 (Figure 3) with steadily increasing R0 in this period (Figure 4). Is this due to changes in the susceptibility proportion. It would be good to understand if there are important assumptions in the Farrington approach that may also contribute to this discrepancy.

      We have removed the estimates of R0 from the manuscript and only present the reconstruction of the annual number of new infections per age class and year (new Figure 5). We think this measure is better suited to the discussion of the results.

      In addition, when using the classical expression R0(t) = 1/(1 - S(t)), where S(t) is the annual proportion seropositive, the high seroprevalence estimates (new Figure 4) result in extremely high estimates of the basic reproduction number (median ranges: 11.6 – 29.7 for 1:16 and 3.3 – 7.6 for 1:64 during the period 2006 to 2017).
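
      As a hedged illustration of how steeply this expression grows as seroprevalence approaches 1, the relationship and its inverse can be sketched as follows. The function names are hypothetical, and the example inputs in the usage note are chosen for illustration rather than taken from the fitted models.

```python
def r0_from_seroprevalence(seropositive):
    """Classical estimate: R0 = 1 / (1 - S), with S the proportion seropositive."""
    if not 0.0 <= seropositive < 1.0:
        raise ValueError("proportion seropositive must be in [0, 1)")
    return 1.0 / (1.0 - seropositive)


def seroprevalence_from_r0(r0):
    """Inverse relationship: S = 1 - 1 / R0, defined for R0 >= 1."""
    if r0 < 1.0:
        raise ValueError("R0 must be at least 1")
    return 1.0 - 1.0 / r0
```

      For example, `r0_from_seroprevalence(0.5)` gives 2.0, while `seroprevalence_from_r0(11.6)` shows that an R0 of 11.6 corresponds to a seroprevalence above 0.9, which is why high seroprevalence estimates translate into very large reproduction numbers under this expression.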

      We had previously used the Farrington approach because it is suited to cases in which the force of infection differs between age classes.

      The R0 estimates (Figure 4) should also be presented with uncertainty.

      R0 is no longer presented, but estimates of overall seroprevalence are now presented with uncertainty.

      Finally, given the substantial uncertainty in the assay, it seems optimistic to attempt to fit annual force of infections in the 30 year period prior to the start of the sampling periods. I would be tempted to include a constant lambda prior to the dates of the first study across the models considered.

      We thank the reviewers for the suggestion.

      We implemented this change (constant FOI before 2006) in the previous models without maternal antibodies and the result for the random-walk-based models was that the variance of the random walk was estimated over a very short period, thus resulting in a rather non-smoothed FOI.

      Implementing this change with the new models with maternal antibodies and random-walk on the FOI was technically a bit complex. We therefore kept the simple random-walk over the whole period and added the following paragraph to the Discussion:

      “It is important to interpret well the results for the estimates of the FOI over time from our analysis under the assumptions of the models. First, as the best model uses a random walk on the FOI, the change in transmission that we infer happens continuously over several years. In reality, this may have occurred differently (e.g. in a shorter period of time). Our ability to recover more complex changes in transmission is limited by the data available. It would not be surprising if EV-D68 has exhibited biennial (or longer) cycles of transmission in England over the last few years, as it has been shown in the US (7) and is common for other enteroviruses (30). However, it is difficult to recover changes at this finer time scale with serology data unless sampling is very frequent (at least annual). Therefore, our study can only reveal broader long-term secular changes. Second, interpretation of the results before 2006 must be avoided for two reasons. On the one hand, as we go backwards in time, there is more uncertainty about the time of seroconversion of the individuals informing the estimates of the FOI. On the other hand, because age and time are confounded in cross-sectional seroprevalence measurements, the random walk on time may account for possible differences in the FOI through age (possibly higher in the youngest age classes, and lowest in the oldest), which are not explicitly accounted for here. This may explain the decline in FOI when going backwards in time before the first cross-sectional study in 2006.”

    1. Author Response

      Reviewer #3 (Public Review):

      A large body of work in the literature has established that the diversity in cells of identical genetic background occurs due to two components: 1) intrinsic noise - such as stochastic fluctuations in gene expression - as well as 2) extrinsic noise - variability that arises from sources that are external to the biochemical process of gene expression, such as abundances of ribosomes or stage in the cell cycle. Note that this widely-accepted definition does not separate intrinsic and extrinsic from intracellular and extracellular. The authors cite a few of these seminal papers (which focus on noise introduced to gene expression) but then define their interpretation of intrinsic noise much more broadly "... intrinsic noise as phenotype(s) fluctuations across isogenic cell populations cultured under the same conditions. Measurement noise in some cases can also be thought of as intrinsic noise. Fluctuations in cellular phenotype(s) driven by the global environment will be referred to as extrinsic noise." This misuse of widely accepted terminology creates significant confusion in the interpretation of the results.

      A point of contention with redefining noise as the authors have done is that they are lumping all processes unique to the cell as intrinsic and all environmental factors as extrinsic. Thus, when statements are made such as "external factors that contribute to noise are principally manifest through convection" (line 40-41, page 2) the veracity of these assumptions must be established. For example, when a ligand binds and unbinds from a receptor due to thermal energy, that "noise" in cellular stimulation is not convection-based, yet an example of how extrinsic noise can influence cellular responses. The definition is important because the underlying premise for the pipeline presented is that "While intrinsic cell variability can be significant, we believe that it is the extrinsic factor(s) that drive sample variability in most experimental cellular systems" (lines 42-43, page 4).

      We thank the referee for this very important critical comment. The referee correctly points out that the terminology (intrinsic vs. extrinsic noise) used in the cited papers has to be adapted and more clearly stated.

      We wish to point out that the autonomous system in Michael Elowitz and colleagues’ original paper was a single protein within a single cell. The noise that was measured in these experiments was driven by temporal fluctuations. An example of extrinsic noise for this system is, indeed, as pointed out by the referee, ligand binding and unbinding from a receptor.

      By contrast, our autonomous system is an ensemble of cells isolated from other samples but still subject to fluctuations in the external environment. We did not continuously measure temporal fluctuations in individual cells, but recorded snapshot(s) of cellular phenotype(s) within a single sample. The source of noise in these measurements is variability between individual cells, and we referred to this type of noise as intrinsic because it is driven by the processes within the sample. We denoted as extrinsic noise that which is driven by factors external to this autonomous system (a particular sample), such as variability between different samples due to temperature, humidity, etc.

      All of these external factors (to the best of our knowledge) are related to movement and gradient formation of fluid or gas and, hence, from a physicochemical perspective, driven by convection process(es). The initial cell seeding that eventually leads to unique microenvironment formation can also be thought of as an example of extrinsic noise using this terminology. The process of cell sedimentation and attachment is driven by advection, as the referee correctly points out. We have, therefore, adjusted the text accordingly.

      We hope that clarifying the intrinsic/extrinsic terminology in the "Introduction" section of the manuscript (line 37) should be sufficient to avoid the confusion the referee discusses. We are open (very reluctantly) to switching to the terms internal and external noise.

      Throughout, figures lack labels and sufficient explanation for interpretation, as well as the number of experiments used to generate the data that is processed through the pipeline for each condition. For a study designed to eliminate replicate culture conditions, the onus is on the authors to show that replicates are in fact fully recapitulated in the population variance after statistical binning/processing.

      To address this comment, we modified the figure legends and labels of most of the figures.

      We wish to emphasize that each point-injection experiment we performed is unique due to randomness in the local delivery method. This is due to the variability in the manual micro-injection release rate and direction of the initial flow. Several experiments (3+) were performed to improve the width of the label(s) distribution(s) and their mixing condition, and the results of the better optimized local delivery were selected as representative for the manuscript. Sample selection was independent of the outcome of drug action and based on initial label distribution only. An experimental improvement of our method, similar to initialization of the pseudo-random number generator in numerical experiments, is required to achieve systematic reproducibility of drug(s) distribution(s). One way to do so is robotically, but certainly the best is to design a system that utilizes a predictably constant drug gradient within a sample that contains large enough cells, a topic that will be the subject of future experiments.

      Ultimately, when the paper presents results such as Figure 9 as the culmination of the pipeline as applied to cell viability studies, it is unclear how useful insight is extracted from this methodology. Four drugs are applied in combination to adherent HeLa cells and time-dependent local cell density is provided as a proxy for cell viability. While it is stated that "The absolute drug concentration can be determined using the homogeneous delivery method discussed above" (line 421-422, page 19), this analysis is not performed, and I am left unsure of whether extrinsic factors are truly driving sample variability under this context. It is unclear to the reader how the point injections were administered, and no discussion of how the confounding factors of synergy or antagonism will be addressed through this methodology.

      We attempted to explain that data shown in Figure 9 were not meant to be the climactic point of the entire pipeline (rather, the data shown in Figure 6 represent our key achievement). In this four-drug experiment, we exhausted the fluorescent spectrum bandwidth necessary to distinguish drug labels (i.e., using commonly available microscopy tools). In order to estimate local cell density, we had to rely on bright field imaging data which is not the most accurate possible implementation (see further response to your comment below). More importantly, we had to wash samples between the measurements to remove detached (dead) cells and cell debris. This step can (and usually does) influence local cell density in a non-uniform fashion, since both media removal and deposition are performed locally by pipetting (cells in the vicinity of aspiration/media deposit sites can be washed off regardless of the drug treatment.)

      To clarify how point injections were administered, we added a detailed description in the Methods section. Please see section Drug labeling and delivery, pages 11-12.

      In this manuscript, we wished to establish possible applications of our method and avoid in-depth analysis or biological interpretation of a specific drug combination that is dependent on the cell line or on a particular experimental condition. We added a paragraph in the "Discussion" section suggesting the necessity of future research dedicated to methodology and analytical interpretation of high-dimensional context-dependent drug interaction data.

    1. Author Response

      Reviewer #2 (Public Review):

      The authors unexpectedly found that the protein Grb2, an adaptor protein that mediates the recruitment of the Ras guanine-nucleotide exchange factor, SOS, to the EGF receptor, can be recruited to membranes by the immune cell tyrosine kinase Btk. The authors show, using total internal reflection fluorescence (TIRF) microscopy that the interaction with Grb2 is reversible, dependent on the proline-rich region of Btk, and independent of PIP3. These experiments are well performed and unambiguous.

      The authors next asked whether Grb2 binding to Btk influences its kinase activity, by evaluating (i) Btk autophosphorylation and (ii) the phosphorylation of a peptide from the endogenous substrate PLCy1. The readout relies on non-specific antibody-mediated detection of phosphotyrosine but nevertheless reveals a concentration-dependent increase in both Btk autophosphorylation and PLCy1 phosphorylation. The experiments, however, have only been performed in duplicate and, particularly in the case of PLCy1 phosphorylation, exhibit enormous variability which is not reflected in the example blot the authors have chosen to display in Figure 3C. Comparison of the same, duplicate experiment presented in Figure 3 Supplement 2 paints a very different picture.

      We added an experiment wherein we measure phosphorylation of the PLC𝛾2-peptide fusion by Btk in the presence of different concentrations of Grb2, and we have carried out LC-MS/MS to probe which Tyr are phosphorylated in these experiments. We have also modified our presentation of the Western blot data to allow readers to view each replicate separately. We believe this makes it easier to evaluate the trends observed in each replicate, and because the intensity measured here is only semi-quantitative, due to limitations of the technique, we believe this is a more accurate way to present our results. Both Tyr of the PLC𝛾2-peptide are phosphorylated, as well as one Tyr at the very C-terminus of GFP (Figure 3 – Supplements 3-5).

      The authors next sought to determine which domains of Grb2 are required for activation of Btk. Again, these experiments were only performed in duplicates, and the authors’ claims that Grb2 can moderately stimulate the SH3-SH2-kinase module of Grb2 are not well supported by their data (Figure 4C-D).

      We have opted to remove the data for the activation of the SH3-SH2-kinase construct (Src module) from the revised manuscript. Upon further inspection, we agree that these experiments only showed a weak trend and believe that much more experimentation is needed to draw firm conclusions regarding this construct. We do still speculate that SH2 linker displacement may contribute to our observations of enhanced catalytic activity of Btk in the presence of Grb2; however, this speculation is based solely on previous work with Btk and other kinases (Aryal et al., 2022; Moarefi et al., 1997).

      The authors next asked whether Grb2 stimulates Btk by promoting its dimerization and trans-autophosphorylation. The authors measured the diffusion coefficient of Btk on PIP3-containing supported lipid bilayers in the presence and absence of Grb2. They noted that the diffusion coefficient of individual Btk particles decreases with increasing unlabeled Btk, which they interpret as Btk dimerization. Grb2 does not appear to influence the diffusion of Btk on the membrane (Figure 5A). Presumably, the diffusion coefficient reported here is the average of a number of single-molecule tracks, which should result in error bars. It is unclear why these have not been reported. Next, the authors assessed the ability of Grb2 to stimulate a mutant of Btk that is impaired in its ability to dimerize on PIP3-containing membranes. In contrast to wild-type Btk, autophosphorylation of dimerization-deficient Btk is not enhanced by Grb2. Whilst the data are consistent with this conclusion, again, the experiment has only been repeated once and the western blot presented in Figure 5 Supplement 2 is unreadable. It is also puzzling why Grb2 gets phosphorylated in this experiment, but not in the same experiment reported in Figure 3 Supplement 2.

      The diffusion coefficient reported here is determined from a large number of single molecule tracks. We have expanded our explanation of how this is done in the Materials and Methods, as well as providing an example of the data and fits from one of the conditions in Figure 4 – Supplement 3. We are now including standard deviation for each diffusion coefficient determined from the fit of the step size distribution.
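
      As an illustration only (this is not the analysis code used in the manuscript), a diffusion coefficient and its uncertainty can be estimated from a step size distribution roughly as sketched below. The sketch assumes a single freely diffusing population in two dimensions, so that step lengths over a frame interval tau follow a Rayleigh distribution with mean squared step 4*D*tau. The function names, the frame interval, and the simulated data are hypothetical.

```python
import math
import random


def simulate_steps(d_true, tau, n, seed=0):
    """Simulate n two-dimensional Brownian step lengths (hypothetical test data)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * d_true * tau)  # per-axis displacement standard deviation
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]


def fit_diffusion_coefficient(steps, tau):
    """Maximum-likelihood fit of D for a single-component Rayleigh step distribution.

    For free 2D diffusion, r^2 is exponentially distributed with mean 4*D*tau,
    so the estimate is mean(r^2) / (4*tau) and its asymptotic standard error
    is D_hat / sqrt(n).
    """
    n = len(steps)
    mean_sq = sum(r * r for r in steps) / n
    d_hat = mean_sq / (4.0 * tau)
    std_err = d_hat / math.sqrt(n)
    return d_hat, std_err


if __name__ == "__main__":
    # Assumed values: D = 1.0 um^2/s, 20 ms frame interval, 20000 pooled steps.
    steps = simulate_steps(d_true=1.0, tau=0.02, n=20000, seed=1)
    d_hat, std_err = fit_diffusion_coefficient(steps, tau=0.02)
    print(f"D = {d_hat:.3f} +/- {std_err:.3f} um^2/s")
```

      A mixture of populations (for example, monomer and dimer) would require a multi-component fit instead of this single-component estimate.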

      We have opted to remove the data involving the dimerization-deficient Btk construct. We agree that these results are difficult to interpret.

      We have investigated the Grb2 phosphorylation signal and concluded that this is an off-target effect of the antibody. MS/MS cannot detect any phosphorylation on Grb2. We now comment on this in the figure legend of Figure 3 – Supplement 2.

      Finally, the authors argue that Grb2 facilitates the recruitment of Btk to molecular condensates of adaptor and scaffold proteins immobilized on a supported lipid bilayer (SLB) (Figure 6). This is a highly complex series of experiments in which various components are added to supported lipid bilayers and the diffusion of labelled Btk is measured. When Btk is added to SLBs containing the LAT adaptor protein (phosphorylated in situ by Hck immobilized on the membrane via its His tag), it exhibits similar mobility to LAT alone, and its mobility is decreased by the addition of Grb2. The addition of the proline-rich region (PRR) of SOS further decreases this mobility. In this final condition, the authors incubate the reactions for 1 h until LAT undergoes a phase transition, forming gel-like, protein-rich domains on the membrane, shown in Figure 6B. The authors’ conclusion that Btk is recruited into these phase-separated domains based on a slow-down in its diffusion is not well supported by the data, which rather indicates that Btk is excluded from these domains (Figure 6B – Btk punctae (green) are almost exclusively found in between the LAT condensates (red)). As such, the restricted mobility of Btk that the authors report may simply reflect the influence of barriers to diffusion on the membrane that result from LAT condensation into phase-separated domains. The authors also present data in Figure 6 Supplement 1 indicating that Grb2 recruitment to Btk is out-competed by SOS-PRR and that Btk does not support the co-recruitment of Grb2 and SOS-PRR to the membrane. These data would appear to suggest that the authors’ interpretation of the decreased mobility of Btk on the membrane may not be correct.

      We have now included an example of one of the single molecule videos, overlaid with the surrounding LAT phase, to more directly display the data that was recorded for this experiment. In this video, it is possible to see that the LAT dense phase occupies only some of the observed window, and although it is possible that these dense “islands” function as barriers to Btk diffusion, Btk would be expected to diffuse freely outside of the LAT dense areas of the bilayer. This property can be clearly seen in the video we have now included. This is reminiscent of what was observed previously during the LAT phase transition for tracking of LAT itself (Sun et al., 2022). Given the extensive previous analysis of LAT diffusion on supported lipid bilayers (Lin et al., 2022; Sun et al., 2022), we believe the necessary controls have been included to support our conclusions. However, we agree there is much to be learned about this interaction and we hope that future studies will further investigate the relationship between cytoplasmic kinases and plasma membrane associated signaling clusters.

      Reviewer #3 (Public Review):

      The study of Nocka and colleagues examines the role of membrane scaffolding in Btk kinase activation by the Grb2 adaptor protein. The studies appear to make a case for a reinterpretation of the "Saraste dimer" of Btk as a signaling entity and assigns roles to the component domains in the Src module in Btk activation. The point of distinction from earlier studies is that this work ascribes a function to an adaptor protein as promoting the kinase activation, rather than vice versa, and also illustrates why Btk can be activated via modes distinct from its close relative, such as Itk. Importantly, these studies address these key questions through membrane tethering of Btk, which is a successful, reductionist way to mimic cellular scenarios. The writing could be improved and can absolutely be more economical in word choice and use; currently, there is a good deal of background to each section that is not always comprehensive or crucial to contextualise the findings, while key information is often omitted. The results are currently not described in a detailed manner so there is an imbalance between the findings, which should be the focus, relative to background and interpretations or models.

      We have assessed the manuscript and made many improvements to shift the focus to the findings, while providing only the necessary background for readers unfamiliar with the specifics of Btk and Grb2 signaling and structure.

    1. Author Response

      Reviewer #1 (Public Review):

      Ge et al. examined sodium-glucose cotransporter-2 inhibitors (SGLT2i) in Alport syndrome (AS), and demonstrate that they were beneficial in AS, with reduced lipotoxicity in podocytes as a key mechanism of action. The SGLT2i empagliflozin has been previously shown to have positive effects on hyperglycemia control, as well as on cardiovascular and renal outcomes of type II diabetes mellitus through tubuloglomerular feedback, but its effect on glomerular diseases such as AS is unknown to date. The authors have previously identified that cholesterol efflux in podocytes plays a critical pathogenic role in a diabetic kidney disease setting. The authors' hypothesis in a disease of non-metabolic origin such as AS was supported, as the SGLT2i was effective in reducing the deleterious effects of lipotoxicity in podocytes, ameliorated glomerular injury and proteinuria, and extended the life span of Col4a3 knockout mice. They further show that empagliflozin treatment protected AS podocytes from cell death through apoptosis, but did not impact the cell's cytotoxicity. These results support the notion that empagliflozin affects the regulation of an important metabolic switch in mouse kidneys, perhaps through decreasing lipid accumulation in podocytes.

      However, the authors solely rely on one IHC staining image of a human biopsy to demonstrate SGLT2 expression in podocytes in vivo. Although the authors have done several experiments which greatly increase the confidence in their findings that empagliflozin is beneficial in AS and would have clinical significance, their data does not rule out the possibility that empagliflozin has beneficial effects through the other glomerular cells in AS, or limited to impacting lipids in podocytes in AS.

      We thank the reviewer for recognizing the significance of our findings and for pointing out some additional concerns with our study. In this revised version, we have added experiments that focus on investigating the specific effect of empagliflozin on AS podocytes. We added immunofluorescence staining of AS mouse kidney sections, which supports the idea that SGLT2 is expressed in podocytes. We investigated the effect of SGLT2 knockdown in AS podocytes using siRNA and compared the anti-lipotoxic effects of siSGLT2 to those of SGLT2i.

      Reviewer #3 (Public Review):

      Using cultured human podocytes the expression of SGLT2 is established using immunostaining and western blotting. An analysis of podocyte RNA wasn't performed, but the expression in cultured podocytes was comparable to that seen in human cultured proximal tubular cells. This work then paved the way for treatment of immortalized cells obtained from an Alport syndrome mouse model (Col4A3-/-), representing an autosomal recessive form of Alport syndrome. Podocytes from Alport syndrome mice showed a lipid droplet accumulation which was reduced to some extent by SGLT2 inhibition. In a series of metabolic experiments, it was shown that SGLT2 inhibition reduced the formation of pyruvate as a metabolic substrate in Alport podocytes. In vivo experiments showed an improvement in survival of Col4a3-/- mice treated with SGLT2 inhibition. When compared to ace inhibitor, SGLT2 inhibition has a similar effect on renal function and no additive effect was seen with SGLT2 inhibitor plus ace inhibitor. Like the cell assays, the in vivo treatment seemed to prevent the podocyte lipid accumulation in Alport syndrome mice.

      This data in cells and animals generally supports the findings in SGLT2 inhibitor human studies, where Alport syndrome patients with proteinuria and progressive CKD seem to benefit. The work paves the way for a dedicated trial of SGLT2i in Alport patients and a reassessment of the human podocyte disease phenotype in this condition, before and after treatment. There are patients with mutations in SGLT2 with familial renal glycosuria - it would be interesting to test via urine derived podocytes whether a similar metabolic switch was occurring and its consequences to pave the way for long term treatment regimes.

      We thank the reviewer for recognizing the significance of our findings. We appreciate the reviewer’s concern that podocyte SGLT2 RNA levels should be studied. In this revised version, we added the results of SGLT2 mRNA expression analysis in immortalized podocytes and tubular cells. These results were added in Figure 1E. We agree with the insightful suggestions to study the metabolic switch in familial renal glucosuria in patients with SGLT2 mutations, as well as to evaluate a Col4a5 AS model. We have included these insights in our discussion.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors address the origin of the macrophage increase in sensory ganglia after peripheral nerve injury, showing that there is no major influx by blood-derived monocytes into ganglia after injury and that resident macrophages proliferate, which is dependent on CX3CR1 signaling.

      • Interesting and relevant question, mainly addressed with adequate experimental approaches.

      • Most conclusions are supported by the data, however, some important controls and experiments are missing.

      • The authors should demarcate their results from the study of Iwai et al, 2021 which addresses similar questions.

      Thank you for the positive comments. We hope that our point-by-point responses below and the important changes/inclusions in the MS satisfactorily address your concerns. We agree that some important controls were missing, and we have included additional data in the revised manuscript. The Iwai et al. paper is in line with our hypothesis. In fact, they suggest that in trigeminal ganglia (TG), resident macrophages proliferate after peripheral injury, although they detected a few blood monocytes infiltrating the TG. Besides confirming the results of Iwai et al. using different and complementary approaches that are more specific than BM transfer in irradiated mice, our paper also advances the understanding of the mechanism by which these cells proliferate (CX3CR1 signalling) and of the impact of this proliferation on neuropathic pain development. We discussed these points in the new version of the MS. Please see page 4 lines 88-93.

      Reviewer #2 (Public Review):

      The investigators looked at mφs in lumbar DRG after a spared nerve injury in which two of the three branches of the sciatic nerve are transected and the third left intact. This is a classical preparation for studying neuropathic pain. This paper demonstrates that the increase of mφs is an increase in the number of CX3CR1+ (resident) mφs and not CCR2+ (infiltrating mφs) by using CX3CR1 and CCR2 individual reporter mice. Using a CX3CR1 conditional knockout (KO) mouse, they found that this receptor must be present on the mφs for the increase in number to occur. Next, they did a parabiosis experiment with GFP+ mice and found that neither of these mφ subtypes infiltrated into the DRG. To examine proliferation, they injected animals with Ki67 and found this label, which is an indication of proliferation, was present in the CX3CR1+ mφs (but not the CCR2+ mφs). Finally, they identified the CX3CR1 mφs to be the cells that express TNFα and IL-1β but not IL-6.

      An experiment that would be useful would be to determine if there is an increase or a decrease in the availability to mφs of the ligand CX3CL1 after the spared nerve injury. The authors state from the work of others that membrane-bound CX3CL1 is constitutively expressed and that it is decreased after nerve injury. They hypothesize that this indicates a release of the chemokine, but such a decrease could also indicate a decrease in expression. A few sentences on what is known in other systems on the importance and mode of action of membrane-bound and non-membrane-bound CX3CL1 would be useful.

      Thanks to the reviewer for a great summary of our manuscript. We have now performed a time course of Cx3cl1 expression in the DRG after the spared nerve injury and it was included in figure 7A. We also apologise for the lack of information regarding the importance and mode of action of membrane-bound and non-membrane-bound CX3CL1, which is now included in the discussion section (Page 16).

      The main weakness of the manuscript is that many highly relevant previous findings, in some cases reporting nearly identical experiments sometimes with the same and sometimes with somewhat different results, are not mentioned. Kalinski et al. (which is cited but not in this context) reported a very similar parabiosis experiment. While they did not identify subtypes of mφs, they found an increase in infiltration of mφs, which was small (though statistically significant) compared to the larger increase that occurred in the distal nerve. In 2013 and 2018, Niemi et al. and Lindborg et al. (J Neurosci and J Neuroinflammation respectively) reported that mφs in the DRG are somewhat decreased in a CCR2 KO mouse, suggesting again that there is some infiltration of mφs into the DRG after axotomy. They also showed that the mφ chemokine CCL2 increases in the DRG after sciatic nerve injury. With regard to proliferation, Yu et al. in 2020 (which again is cited but not in this context) also used a spared nerve paradigm, stained DRGs for CX3CR1+ mφs and found an increase. They then stained DRG sections for Ki67 and demonstrated proliferation in this population. An earlier reference by Krishnan et al. in 2018 published in J Neuropathol Exp Neurol is entitled "An Intimate Role for Adult Dorsal Root Ganglia Resident Cycling Cells in the Generation of Local Macrophages and Satellite Glial Cells". With regard to cytokine expression, in 1995, Murphy et al. published a paper in J Neurosci demonstrating induction of interleukin-6 in axotomized sensory neurons.

      Thank you for the comment. The papers you have indicated are the main reason we conceived our MS: there is controversy regarding the possible contribution of peripheral blood monocyte infiltration to the increase in the number of macrophages in the sensory ganglia after peripheral nerve injury. Furthermore, some of the papers you indicated came out during the execution of this work, and they also raised controversies or left some points unexplored. Therefore, we believe that our work, by using different and complementary approaches, strongly supports the hypothesis that after peripheral nerve injury, peripheral blood monocytes do not infiltrate the DRGs significantly, and that the increase in the macrophage population is instead due to the proliferation of resident macrophages. Furthermore, we provide novel mechanistic evidence of the role of CX3CR1 signalling in the proliferation of these cells (figures 7 and S6). In addition, the new experiments suggested by the referees and editor indicate that CX3CR1-dependent proliferation of DRG macrophages is involved in the development of neuropathic pain (Figures 6D and 7E). We will make these points clear in the new version of the MS. Please see pages 11, 12, 14 and 17 (discussion and introduction section).

      Reviewer #3 (Public Review):

      This paper addresses the mechanism underlying a well-documented finding whereby the numbers of resident macrophages increase in dorsal root ganglia following peripheral nerve injury. It delineates the relative contribution of monocyte recruitment via circulation and local proliferation. The paper is clearly structured and written, and the data overall support the main conclusion that the increase in nerve-associated macrophages is primarily driven by proliferation, not monocyte recruitment. Its main weakness is that the question that is being asked is rather restricted, so the additional insight gained for the field will be incremental. It would be particularly interesting in the future to address whether the existence of a protective barrier indeed is the reason peripheral cells are not recruited to the nerve injury lesion and to assess e.g. whether forced breaching of this barrier results in monocyte influx and altered injury response.

      We appreciate your comments and suggestions. In the new version of the MS, we present a series of novel experiments that confirm and support our initial hypothesis. Furthermore, the novel experiments also explore the importance of this phenomenon in the context of neuropathic pain development. Regarding your suggestion about the next steps, we are now working to understand why these cells are not able to infiltrate the DRGs after injury. Interestingly, one paper that came out during the revision of this work showed that CD8+ T cells, which are not able to infiltrate the DRGs after nerve injury in adult mice, start to infiltrate the DRGs of old mice (Zhou et al. 2022), indicating that the ageing process may promote changes in this protective barrier. In addition, we have recently published a paper indicating that immune cells infiltrate the dorsal root leptomeninges after SNI (Maganin et al. 2022). We included these references and discussed these points in the new version of our MS. Please see page 15 lines 366 and 370.

      References:

      Zhou, L., G. Kong, I. Palmisano, M. T. Cencioni, M. Danzi, F. De Virgiliis, J. S. Chadwick, G. Crawford, Z. Yu, F. De Winter, V. Lemmon, J. Bixby, R. Puttagunta, J. Verhaagen, C. Pospori, C. Lo Celso, J. Strid, M. Botto, and S. Di Giovanni. 2022. "Reversible CD8 T cell-neuron cross-talk causes aging-dependent neuronal regenerative decline." Science 376 (6594): eabd5926. https://doi.org/10.1126/science.abd5926.

      Maganin, A. G., G. R. Souza, M. D. Fonseca, A. H. Lopes, R. M. Guimarães, A. Dagostin, N. T. Cecilio, A. S. Mendes, W. A. Gonçalves, C. E. Silva, F. I. Fernandes Gomes, L. M. Mauriz Marques, R. L. Silva, L. M. Arruda, D. A. Santana, H. Lemos, L. Huang, M. Davoli-Ferreira, D. Santana-Coelho, M. B. Sant'Anna, R. Kusuda, J. Talbot, G. Pacholczyk, G. A. Buqui, N. P. Lopes, J. C. Alves-Filho, R. M. Leão, J. C. O'Connor, F. Q. Cunha, A. Mellor, and T. M. Cunha. 2022. "Meningeal dendritic cells drive neuropathic pain through elevation of the kynurenine metabolic pathway in mice." J Clin Invest 132 (23). https://doi.org/10.1172/JCI153805.

  4. Mar 2023
    1. Author Response

      Reviewer #1 (Public Review):

      This study focuses on the role of polo like kinase 1 (PLK-1) during oocyte meiosis. In mammalian oocytes, Plk1 localizes to chromosomes and spindle poles, and there is evidence that it is required for nuclear envelope breakdown, spindle formation, chromosome segregation, and polar body extrusion. However, how Plk1 is targeted to its various locations and how it performs these functions is not well understood. This study uses C. elegans oocytes as a model to explore PLK-1 function during meiosis. They take advantage of an analogue-sensitive allele of plk-1, which enabled them to bypass nuclear envelope breakdown defects that occur following PLK-1 RNAi. This allowed them to dissect later roles of PLK-1 in oocytes, demonstrating that depletion causes defects in spindle organization, chromosome congression, segregation, and polar body extrusion. Moreover, the authors defined mechanisms by which PLK-1 is targeted to chromosomes, showing that CENP-C (HCP-4) is required for localization to chromosome arms and that BUB-1 is required for targeting to the midbivalent region. Finally, they demonstrate that upon removal of PLK-1 from both domains, there are severe meiotic defects. These findings are interesting. However, there is a need for additional analysis to better support some of their conclusions, and to aid in interpretation of particular phenotypes. Specific comments are below.

      • For many important claims of the paper, a single representative image is shown but the n is not noted. This is an issue throughout the paper for much of the localization analysis (e.g. Figure 1B, 1C, 1D, 2A, 2B, 3A, 3B, 3C, etc.); in cases like this, numbers should be included to increase the rigor of the presented data. How many images or movies were analyzed that looked like the one shown? For linescans, were they done only on one image? How many independent experiments were done, etc?

      We had initially chosen a representative image. Localisation was the same in all images that allowed ‘proper’ assessment of PLK-1 localisation. In our case, this means that we can only analyse bivalents that are perpendicular to the light path to distinguish between bivalent, chromosome arms, and kinetochore. We now report the number of oocytes (N) and bivalents (n) analysed for each condition. The line scans were done in one representative image.

      • In the abstract, it is stated that PLK-1 plays a role in spindle assembly/stability (this is also stated elsewhere, e.g. line 101). This phrasing implies that the authors have demonstrated roles in both spindle assembly and stability. However, to distinguish between these roles, they would have to show that removal of PLK-1 before spindle assembly causes defects, and also that removal of PLK-1 from pre-formed spindles causes collapse. I don't think it is necessary to do this, as the spindle roles of PLK-1 are not a focus of the paper. However, the language should be altered so that it does not imply that the paper has demonstrated roles in both. A good place to do this would be in the section from lines 144-147, where they first discuss the spindle defects. It would be straightforward to explain that their approach does not distinguish between spindle assembly and stability, and that PLK-1 could have a role in either or both.

      We fully agree with this comment. We cannot distinguish between spindle assembly and stability, and it is also not the focus of our current work. We have changed the text accordingly.

      • It is stated that there is kinetochore localization of PLK-1 (and I do see some dim cup-like localization in images after PLK-1 is removed from the chromosome arms via HCP-4 RNAi). However, this cup-like localization is not clear in most wild-type images (e.g. Figure 1B, 1D, 2A, 3A, etc.). Although I recognize that the chromatin staining might be obscuring kinetochore localization, if PLK-1 was truly a kinetochore protein I would also expect it to localize to filaments within the spindle (as many other kinetochore proteins do), especially since the authors state that BUB-1 targets PLK-1 to the kinetochore (and BUB-1 is in the filaments). In fact, the only images where it looks like PLK-1 may be localized to filaments are in Figure 4C and 6A, when HCP-4 has been depleted (though I don't know if this generally true across all HCP-4 RNAi images). For me, this calls into question the conclusion that PLK-1 truly is on the kinetochore in wild type conditions - could it be that PLK-1 only localizes to the kinetochore (and to the filaments) when HCP-4 is depleted? The authors need to resolve this issue and provide better evidence that PLK-1 normally localizes to the kinetochore, if they want to make this claim. Additionally, the observation that PLK-1 is not on the kinetochore filaments (in wild type conditions) should be addressed in the text somewhere - do the authors think that this is a special type of kinetochore protein that does not localize to the filaments?

      While our initial claim of PLK-1 kinetochore localisation was based on its cup-like localisation, we have now performed additional analysis and experiments to confirm this claim. First, we corroborated that the PLK-1 cup-like pattern co-localises with the Mis12 complex component KNL-3 (New Figure 5-figure supplement 1). Second, we show that PLK-1 is present in the so-called ‘linear elements’ (filaments) both within the spindle and in the cortex. Since PLK-1 presence in these filaments is seen in wild type as well as hcp-4 mutant oocytes, we conclude that PLK-1 likely localises to the kinetochore under normal conditions.

      • The authors should provide a control experiment, treating wild-type worms with 10uM 3-IB-PP1. This would be important to ensure that the spindle defects seen at this concentration in the plk-1as strain are not non-specific effects of the inhibitor. There is a control in Figure 1 - figure supplement 3 using 1uM 3-IB-PP1 but I didn't see a control for 10uM (the concentration at which spindle defects are observed).

      This control has now been included in Figure 1-figure supplement 3.

      • In Figure 2F, the gels for BUB-1+PLK-1 look different in the presence and absence of phosphorylation by Cdk1 - for these data, I agree with the authors that it looks as if the complex elutes at a higher volume if BUB-1 is not phosphorylated (lines 200-204). However, Figure 2G has a repeat of the condition with phosphorylated BUB-1, and in this panel, the complex appears to elute at a higher volume than it did on the gel in panel F. The gel in panel G looks much more similar to the unphosphorylated condition in panel F. The authors need to explain this discrepancy (i.e., Is there a reason why the gels cannot be compared between panels? How reproducible are these data?). Ideally, the authors would include a repeat of the unphosphorylated BUB-1 + PLK-1 condition in panel G, done at the same time as the conditions shown in that panel, to avoid the impression that their results may not be reproducible.

      The specific elution volume cannot be compared in different experiments as the column has proven to “drift” over time – with proteins eluting at a later volume than they did previously despite extensive washing. What is reproducible under the experimental conditions is that the unphosphorylated wild type proteins, or the phosphorylated T527A/T163A mutant proteins A) elute at a later volume than the phosphorylated wild type proteins and B) bind to a lower proportion of the MBP-PLK1PBD (as you can see in the relative absorbance profiles and Coomassie gels).

      • The authors would need to provide convincing evidence that co-depletion of BUB-1 and HCP-4 delocalizes PLK-1 from the chromosomes entirely, and that this co-depletion condition is more severe than either single depletion alone.

      We now provide a quantitation of the total PLK-1 levels to accompany the images (New Figure 8-figure supplement 1).

      Additionally, the bub-1T527A and hcp-4T163A alleles are nice tools to, in theory, more specifically delocalize PLK-1 from the midbivalent and chromosome arms, respectively, to explore the functions of chromosome-associated PLK-1. However, I think the authors cannot rule out the possibility that other proteins are also being depleted from the midbivalent and/or chromosome arms in their conditions, and that this delocalization may contribute to the phenotypes observed. For example, hcp-4 depletion was recently shown to delocalize KLP-19 from the chromosome arms (Horton et al. 2022), so in the experiment shown in Figure 6E (HCP-4 RNAi in the bub-1 mutant), PLK-1 was likely not the only protein missing from the chromosome arms. Therefore, understanding if other proteins are absent from these domains (in the bub-1T527A and hcp-4T163A mutants) would help the reader understand and interpret the presented phenotypes (and how specific they are to PLK-1 loss). Consequently, I think that to better understand the co-depletion analysis presented in Figure 6 (and Figure 6 supplement 1), the authors should analyze other midbivalent and chromosome arm proteins, to determine if any are also delocalized (e.g. SUMO, KLP-19, MCAK, etc.).

      As stated above, this paper focuses on identifying the specific meiotic events PLK-1 plays a role in and characterising its targeting mechanism. We are following on this work to understand what proteins are regulated by PLK-1 in different chromosome domains and how this relates to the observed phenotypes.

      For the current work, we should emphasise that mutating a single Thr residue within an STP motif in a largely disordered region is far more specific than depleting HCP-4 or BUB-1, making it likely that the observed effects are mediated through PLK-1 targeting. It should be noted that the finding presented in Horton et al. 2022 is in contradiction with another study in which hcp-4 depletion did not impact KLP-19 localisation (Hattersley et al. 2022).

      Additionally, instead of performing a combination of mutant and RNAi analysis (i.e. HCP-4 RNAi in the bub-1 mutant (Figure 6) and BUB-1 RNAi in the hcp-4 mutant (Figure 6 figure supplement 1)), it would be more powerful to generate a double mutant - this has a higher chance of being a more specific depletion condition.

      We have performed these experiments, which are now presented in Figure 9.

    1. Author Response

      Reviewer #1 (Public Review):

      Sorkac et al. devised a genetically encoded retrograde synaptic tracing method they call retro-Tango based on their previously developed anterograde synaptic tracing method trans-Tango. The development of genetically encoded trans-synaptic tracers has long been a difficult stumbling block in the field, and the development of trans-Tango a few years back was a breakthrough that was immediately, widely, and successfully applied. The recent development of the retrograde tracer method BAcTrace was also exciting for the field, but it requires lexA driver lines and, by design, tests candidate presynaptic neurons rather than providing an unbiased test for connectivity.

      Retro-Tango now provides an unbiased retrograde tracer. They cleverly used the same reporter system as for trans-Tango by reversing the signaling modules to be placed in pre-synaptic neurons instead of post-synaptic neurons. Therefore, synaptic tracing leads to the labeling of pre-synaptic neurons under the regulation of the QUAS system. Using visual, olfactory, as well as sexually dimorphic circuits, the authors provided examples of the specificity, efficiency, and usefulness of the retro-Tango method. The authors successfully demonstrated that many of the known pre-synaptic neurons can be successfully and specifically labelled using the retro-Tango method.

      Most importantly, because it is based on the most used, very well tested and widely adopted trans-Tango method, retro-Tango promises to not just be a clever development, but a really widely and well-used technique as well. This is an outstanding contribution.

      We would like to thank Dr. Hiesinger for his very kind words and for the overall appreciation of the contribution of the development of retro-Tango to the field. We are also grateful for the suggestions below aimed at improving the clarity of our manuscript. We individually address the points raised by Dr. Hiesinger below.

      Reviewer #2 (Public Review):

      Tools that enable labeling and genetic manipulations of synaptic partners are important to reveal the structure and function of neural circuits. In a previous study, Barnea and colleagues developed an anterograde tracing method in Drosophila, trans-TANGO, which targets a synthetic ligand to presynaptic terminals to activate a postsynaptic receptor and trigger nuclear translocation of a transcription factor. This allows the labeling and genetic manipulation of cells postsynaptic to the ligand-expressing starter cells. Here, the same group modified trans-TANGO by targeting the ligand to the dendrites of starter cells to genetically access pre-synaptic partners of the starter cells; they call this method retro-TANGO. The authors applied retro-TANGO to various neural circuits, including those involved in escape response, navigation, and sensory circuits for sex peptides and odorants. They also compared their retro-TANGO data with synaptic connectivity obtained from serial electron microscopy (EM) reconstruction and concluded that retro-TANGO can allow trans-synaptic labeling of presynaptic neurons that make ~ 17 synapses or more with the starter cells.

      Overall, this study has generated and characterized a valuable retrograde transsynaptic tracing tool in Drosophila. It's simpler to use than the recently described BAcTrace (Cachero et al., 2020) and can also be adapted to other species. However, the manuscript can be substantially strengthened by providing more quantitative data and more evidence supporting retrograde specificity.

      We thank Dr. Luo for his kind words and his assessment of the value of retro-Tango as a new tool in the transsynaptic labeling toolkit in Drosophila. We followed the suggestions of Dr. Luo for providing more quantitative data and addressing the specificity and directionality of retro-Tango. We strongly believe that the implementation of his suggestions did enhance the quality of our manuscript.

      Reviewer #3 (Public Review):

      This is a valuable addition to the currently available arsenal of methods to study the Drosophila brain.

      There are many positives to the present manuscript as it is:

      (i) The introduction makes a clear and fair comparison with other available tracing methods.

      (ii) The authors do a systematic analysis of the factors that influence the labeling by retro-tango (age, temperature, male versus female, etc...)

      (iii) The authors acknowledge that there are some limitations to retro-TANGO. For example, the fact that retro-TANGO does not label all the expected neurons as indicated by the EM connectome. This is fine because no technique is perfect, and it is very laudable that the authors did a serious study of what one should expect from retro-tango (for example, a threshold determined by the number of synapses between the connected neurons).

      We would like to thank the reviewer for the kind words and the positive assessment of our manuscript. In addition, we would like to acknowledge the reviewer for the recommendations below, which we followed and we think made our manuscript stronger.

    1. Author Response

      Reviewer #1 (Public Review):

      Bustion and colleagues outline the creation and testing of an in silico method to query gut microbiome databases for genes encoding enzymes predicted to catalyze a reaction of interest, which is provided by the user. Strengths of the tool include attempts to examine nearly 9,000 MetaCyc reactions in a pre-calculated fashion and to rank order enzymes based on their likelihood of catalyzing a reaction. Substrates, products, and even cofactors, if known, are employed to strengthen the power of the search algorithm, which also employs a hidden Markov model to improve the selection of putative hit enzymes. The authors outline high success rates with examples presented and compare those results with other extant methods, which are reported to perform in a less robust manner. Weaknesses include lack of evidence of success on a more difficult "real world" example. However, the tool outlined is a clear advance over existing methods and will be useful to explore the diversity of chemical transformation performed by commensal microbiota.

      We thank Reviewer 1 for their positive feedback and constructive summary. We agree that a real-world example would add confidence to our findings. We previously demonstrated SIMMER’s utility using published datasets. To expand upon these findings, we added another evaluation on an external dataset (Artacho et al., 2020) and performed new experiments to test SIMMER predictions for methotrexate metabolism into DAMPA and glutamate, a reaction known to be performed by the human microbiome but for which human gut strains and specific gut enzymes were not previously known. Both the new external dataset and our experimental findings validate SIMMER’s predictions of bacteria capable of metabolizing methotrexate, the mainline therapeutic for rheumatoid arthritis patients.

      Reviewer #2 (Public Review):

      This work provides a new computational tool for the systematic characterization of biotransformation reactions in the human gut microbiome: given a biotransformation reaction of interest, it predicts a list of candidate bacterial species, enzymes, and EC identifiers putatively capable of performing the queried reaction. The method is innovative and clearly presented.

      The pipeline that relies on both chemical and protein similarity algorithms, is in principle applicable to any biotransformation reaction that can be formulated as linked substrates and products (possibly including co-factors). This contrasts with other approaches that, for example, only rely on smaller databases and solely rely on substrates and chemical similarity. Moreover, SIMMER outperformed two other recently developed methods, against which it was benchmarked for its prediction accuracy when tested on a control test set derived from literature.

      The work interestingly focuses on predicting bacterial enzymes responsible for drug biotransformation, therefore showcasing its potential as a hypothesis generator for characterizing and validating novel bacterial enzymes in vitro.

      The authors correctly describe the relevance of an accurate input (in terms of reaction completeness, including cofactors and reaction products) as paramount for the quality of the prediction.

      The conclusions of this paper are mostly well supported by data, but some aspects of performance evaluation and its generality might benefit from additional elaborations and clarifications.

      1) Great emphasis has been dedicated to the prediction performance of SIMMER over a positive control set derived from the available literature. However, a more extensive description and analysis of false positive results are needed to better understand the possible impact of the (potentially many) false positive predictions listed for each reaction.

      We agree that our analysis would benefit from an assessment of false positives. Unfortunately, current literature usually reports which reactions an enzyme is capable, rather than incapable, of performing. For this reason, we took a conservative approach and decided to define all reactions preceding that which yielded a positive control enzyme sequence as false positives. This is now described above in Essential Revisions Response 1.3.
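
      The conservative false-positive definition described above can be illustrated with a minimal sketch. It assumes a ranked list of predicted reactions (best first) and a single known positive control; the function name and the reaction labels are hypothetical placeholders and are not taken from SIMMER's code.

```python
def count_false_positives(ranked_predictions, positive_control):
    """Count predictions ranked ahead of the first positive-control hit.

    Every prediction that precedes the positive control is treated as a
    false positive (the conservative assumption described in the text).
    Returns None if the positive control never appears in the ranking.
    """
    for rank, prediction in enumerate(ranked_predictions):
        if prediction == positive_control:
            return rank  # number of entries ranked above the hit
    return None


# Hypothetical ranking for one query; the labels are placeholders.
ranking = ["rxn_A", "rxn_B", "rxn_C", "rxn_D"]
false_positives = count_false_positives(ranking, "rxn_C")
print(false_positives)
```

      Because some of the higher-ranked predictions may be true but unreported activities, a count obtained this way is best read as an upper bound on the number of false positives.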

      2) The authors imply that the current method is superior to two other methods based on accuracy. However, a more extensive description of the benchmarking results would strengthen these benchmarking efforts.

      We have addressed this concern in Essential Revisions Response 3.

      3) The authors only showcase SIMMER in the context of drug metabolism but claim its applicability to be general enough to also describe other biotransformation in the human gut microbiota. Although in principle believable, the authors could improve the credibility and generalizability of their method by demonstrating another use case, e.g., food compounds, for which extensive metagenomic and metabolomic data are already available from previous gut microbiome studies.

      We agree that assessments of SIMMER’s predictions on food metabolism would improve the generalizability of the method. We have edited the text to focus on drug metabolism, as we believe SIMMER’s application to food metabolism merits a more thorough, future investigation.

      4) Showcasing experimental in vitro validation of SIMMER predicted enzyme(s) could greatly strengthen the relevance of this work.

      We have addressed this in Essential Revisions Response 2.

      5) Throughout the text and the title, a more careful and precise phrasing of the tool's scope (characterization of microbiome-encoded enzymatic reactions and not the identification of novel chemical transformations) would improve the reader's understanding of the work.

      We agree, and have reworded many key phrases in the text, including the title.

      Reviewer #3 (Public Review):

      This manuscript presents a new tool, SIMMER, to predict bacterial enzyme-mediated transformations of compounds, an important and incompletely understood aspect of microbiome drug metabolism. The authors compare their resource to existing resources that allow users to generate hypotheses related to compound toxicity and putative routes of compound metabolism. The authors identify the key innovations of their resource as including full chemical representations of reactions and a novel method to predict an enzyme's EC number (a description of function) from its reaction.

      Strengths

      Generating user-friendly tools to explore existing knowledge of bacterial enzymes and their reactions is important.

      SIMMER is a novel resource where the user provides the substrates and products as input and receives a list of potential microbiome enzymes as output.

      SIMMER includes a novel EC predictor based on reaction rather than based on sequence.

      Weaknesses

      Validation claims are not well supported by the results.

      We have extensively edited the manuscript to better describe our previous computational validations, and we have added new analyses to further evaluate SIMMER. We added an additional validation on an external dataset, an in vitro experimental assessment of SIMMER’s predictions for methotrexate metabolism, two new reactions to the positive control analysis, a false positive rate, and additional comparisons to the two competing methods.

      Need for the user to know both the substrate and the product for a reaction of interest limits the utility of the resource.

      We agree that this is a limitation for the user, but as we show in our Results, relying on substrates alone does not yield appropriate representations of reactions and therefore does not allow for accurate predictions of responsible species/strains and enzymes (i.e., finding True Positives, and confirming associations from previously collected data). We agree that tools requiring only substrates are convenient, but our results show that they are less helpful in finding appropriate metabolism and enzyme predictions. Many studies of biotransformation in the human gut identify the product information or product structure via HPLC, LC-MS, and NMR techniques. In cases where such data was not gathered, or not gathered with enough structural resolution, researchers can use tools such as Biotransformer to make product template predictions before inputting a query to SIMMER. This recommendation is included in the present manuscript’s lines 376–391:

      In instances when DrugBug and MicrobeFDT did make predictions, they suffered from low accuracy (Table 1), which we hypothesized was due to both methods’ reliance on substrate rather than reaction chemistry. Biotransformations involve the relationship between substrate(s), cofactor(s), and an enzyme to yield a particular product(s). As one substrate can exhibit affinity for multiple enzymes, resulting in multiple unique products, sole employment of substrates in a chemical fingerprint does not achieve the resolution necessary to make relevant predictions. To test if SIMMER’s better performance could be attributed to including cofactors and products, we modified our code to run with a chemical representation that includes only the substrate of each positive control reaction. Enzyme prediction accuracy dropped from 88% down to 33%, and EC prediction accuracy dropped from 93% down to 48% (Table 1—source data), supporting the hypothesis that SIMMER’s better performance when compared to DrugBug and MicrobeFDT is due in large part to our using chemical representations that include the full reaction. These results are in line with our previous demonstration that SIMMER clusters enzymatic reaction chemistry only when a full reaction is employed (Figure 2, Figure 2—figure supplement 4).
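
      The argument that a substrate-only representation cannot separate reactions sharing a substrate can be sketched with a toy example. This is not SIMMER's fingerprinting scheme: the feature sets below are hypothetical stand-ins for chemical fingerprint bits, and the set-based Tanimoto function is a generic similarity measure chosen for illustration.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two sets of features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def full_fingerprint(reaction):
    """Pool all features of a reaction, tagged by role (substrate/product)."""
    return {(role, feat) for role, feats in reaction.items() for feat in feats}


# Two hypothetical reactions: same substrate, different products.
reaction_1 = {"substrate": {"s1", "s2", "s3"}, "product": {"p1", "p2"}}
reaction_2 = {"substrate": {"s1", "s2", "s3"}, "product": {"p3", "p4"}}

# Substrate-only view: the two reactions are indistinguishable.
substrate_only = tanimoto(reaction_1["substrate"], reaction_2["substrate"])

# Full-reaction view: the differing products lower the similarity.
full_reaction = tanimoto(full_fingerprint(reaction_1),
                         full_fingerprint(reaction_2))

print(substrate_only, full_reaction)
```

      In this toy case the substrate-only similarity is maximal even though the reactions differ, whereas the full-reaction similarity drops, which is the qualitative behaviour the quoted passage attributes to including products and cofactors.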

      Reliance on homology transfer annotation to predict enzyme function; this approach has important, microbiome-relevant, limitations.

      Please refer to our separate Common_Questions.pdf document, Common question 1: Are EC codes sufficient to select enzyme orthologs within an overall class?

    1. Author Response:

      The authors would like to thank the Editors and reviewers for their careful consideration of our article and we express our appreciation for the work required by both Editors and reviewers to study and produce the detailed reviewer reports. We are pleased at the general consensus that our paper is of interest and highlights an important region of the channel for drug-protein interaction. We are also cognizant that the reviewer reports highlight areas where important revisions need to be made to our work before it can be considered fully complete. We will revise the paper according to the comments of the reviewers and submit a new version in the near future which we hope will become the version of record.

    1. Author Response

      We thank the editors and reviewers for their support of our work, as well as their constructive feedback and useful suggestions, which have improved the readability and presentation of the manuscript for a broader audience.

    1. Author Response

      Reviewer 1 (Public Review):

      Fox, Birman, and Gardner use a previously proposed convolutional neural network of the ventral visual pathway to test the behavioral and physiological impact of an attentional gain spotlight operating on the inputs to the network. They show that a gain modulation that matches the behavioral benefit of attentional cueing in a matching behavioral task, induces changes in the receptive fields (RFs) of the model units, which are consistent with previous neurophysiological reports: RF scaling, RF shift towards the attentional focus, and RF shrinkage around the focus of attention. Ingenious simulations then allow them to isolate the specific impact of these RF modulations in achieving performance improvements. The simulations show that RF scaling is primarily responsible for the improvement in performance in this computational model, whereas RF shift does not induce any significant change in decoding performance. This is significant because many previous studies have hypothesized a leading role of RF shifts in attentional selection. With their elegant approach, the authors show in this manuscript that this is questionable and argue that changes in the shape of RFs are epiphenomena of the truly relevant modulation, which is the multiplicative scaling of neural responses.
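
      The link between a multiplicative gain and apparent receptive field changes can be illustrated with a one-dimensional toy calculation. This is only a sketch under stated assumptions, not the authors' CNN: the RF and the attentional spotlight are both assumed to be Gaussian, the 1.25 peak gain is borrowed from the value mentioned in this review, and all function names and parameter values are hypothetical.

```python
import math


def gaussian(x, center, sigma):
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))


def effective_rf(xs, rf_center, rf_sigma, attn_center, attn_sigma, gain):
    """RF profile after multiplying its input by a spatial gain field.

    The gain field equals 1 far from the attended location and rises to
    `gain` at its center (an assumed Gaussian spotlight).
    """
    return [
        gaussian(x, rf_center, rf_sigma)
        * (1 + (gain - 1) * gaussian(x, attn_center, attn_sigma))
        for x in xs
    ]


def center_of_mass(xs, profile):
    return sum(x * p for x, p in zip(xs, profile)) / sum(profile)


xs = [i * 0.01 for i in range(-1000, 1001)]  # positions from -10 to 10
baseline = effective_rf(xs, 0.0, 1.0, 1.5, 1.0, gain=1.0)
attended = effective_rf(xs, 0.0, 1.0, 1.5, 1.0, gain=1.25)

# Positive shift means the RF center moved toward the attended location.
shift = center_of_mass(xs, attended) - center_of_mass(xs, baseline)
# Ratio of summed responses: the overall response scaling.
scaling = sum(attended) / sum(baseline)
print(shift, scaling)
```

      In this sketch the gain alone both scales the summed response and moves the RF center of mass toward the attended location, consistent with the idea that a shift can arise as a by-product of gain rather than as a separate mechanism.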

      Strengths:

      The use of a multi-layer network that accomplishes visual processing, with an approximate correspondence with the visual system, is a strength of this manuscript that allows it to address in a principled way the behavioral advantage contributed by various attentional neural modulations.

      The simulations designed to isolate the contributions of the various RF modulations are very ingenious and convincingly demonstrate a superior role of gain modulation over RF shifts in improving detection performance in the model.

      We thank the reviewer for these supportive comments.

      Weaknesses:

      There is no mention of a possible specificity of the manuscript conclusions in relation to the type of task to be performed. It is conceivable that mechanisms that are not important for detection tasks are instead crucial for a reproduction task, as in Vo et al. (2017).

      We agree that other behavioral tasks may rely on different attentional mechanisms than the ones we have studied here for detection and discrimination and now specifically point this out in the discussion [379-395].

      The manuscript puts emphasis on the biological plausibility of the model, and some quantitative agreements. But at some important points these comparisons do not appear very consistent:

      1) It is unclear what output of the model at each cortical area is to be compared with neurophysiological data. On the one hand, the manuscript argues that a 1.25 attentional factor is consistent with single-neuron results, but here this factor is applied to the inputs into V1 units. When this modulation goes through normalization in area V1, the output of V1 has a 2x gain. Intuitively, one would think that recordings in V1 neurons would correspond to layer V1 outputs in the model, but this is not the approach taken in the manuscript. This needs clarification. Also, note that the 20-40% gain reported in line 287 corresponds to high-order visual areas (V4 or MT), but not to V1, in the cited references. The quantitative correspondence between gain factors at various processing steps in the model and in the data is confusing and should be clearer.

      We agree that making a one-to-one mapping of gain effects measured in neurophysiology and different layers of the CNN is problematic. We therefore have clarified that the introduction of gain at the earliest stages of processing is meant to study how gain propagates through a complex CNN and has downstream effects [49-52 and 410-447] and we have also clarified the various uncertainties in making one-to-one mapping from the CNN to neurophysiological measurements of gain [410-447].

      2) The model assumes a gain modulation in the inputs to V1. This would correspond to an attentional gain modulation in LGN unit outputs. There is little evidence of such strong modulation of LGN activity by attention. Also in V1 attentional modulation is small. As stated in Discussion (line 295), there is no reason to favor the current model as opposed to a model where the attentional gain is imposed later on in the visual hierarchy (for example V4). If anything, neurophysiology would be more consistent with this last scenario, given the evidence for direct V4 gain control from frontal eye fields (Moore and Armstrong, Nature 2003). The rationale for focusing on a model that incorporates the attentional spotlight on the inputs to V1 should be disclosed.

      We agree that measurements of gain changes with attention appear larger in later stages of visual processing and do not wish to explicitly link the gain changes imposed at the earliest stages of processing in our CNN observer model with changes in input from LGN as we agree this would be unrealistic. Instead, our goal was to examine how gain changes can propagate through complex neural networks and cause downstream effects on spatial tuning properties and the efficacy of readout. We have substantially re-written the manuscript, in particular the introduction [24-38, 49-52] and discussion [441-447] to better describe this rationale. We also now explicitly discuss how our propagated gain test demonstrates exactly the reviewer’s point - that gain can be injected late in the system, rather than at the earliest stages [274-276, 441-447].

      3) The model chosen is the CORnet-z model, but this model does not include recurrent dynamics within each layer. Recurrent dynamics is a prominent feature in the cortex, and there is evidence indicating that attentional modulations operate differently in feedforward and in recurrent architectures (Compte and Wang, Cerebral Cortex 2006). A specific feature of recurrent models is that the attentional spotlight need not be a multiplicative factor (which is biologically complicated) but an additive term before the ReLU non-linearity, which achieves the expected RF modulations (Compte and Wang, 2006). A model with recurrence thus represents another architecture that links gain and shift in a way that has not been explored in this manuscript, and this may limit the generalization of the conclusions (line 205).

      We appreciate the reviewer pointing us toward the Compte paper and we’ve added a discussion of recurrence as an alternate model [410-423].

      Reviewer 2 (Public Review):

      This manuscript by Fox, Birman, and Gardner combines human behavioral experiments with spatial attention manipulation and computational modeling (image-computable convolutional neural network models) to investigate the computational mechanisms that may underlie improvements in behavioral performance when deploying spatial attention.

      Strengths:

      • The manuscript is clear and the analyses, modeling, and exposition are executed well.

      • The behavioral experiments are carefully conducted and of high quality.

      • The manuscript takes a creative approach to constructing a “neural network observer model”, that is, coupling an image-computable model to a potential readout mechanism that specifies how the representations might be used for the purposes of behavior. The focused analyses of the model innards (architecture, parameters) provide insight into how different model components lead to the final behavior of the model.

      Thank you for these supportive comments.

      Weaknesses:

      • The overall conclusions and insights gained seem heavily dependent on particular choices and design decisions made in this specific model. In particular, the readout mechanism lacks some critical descriptive details, and it is not clear whether the readout mechanism (512-dimensional representation that reflects summing over visual space) is a reasonable choice. As such, while the computational analyses and results may be correct for this model, it is not clear whether the strong general conclusions are justified. Thus, the results in their current form feel more like exploratory work showing proof of concept of how the issue of attention and underlying computational mechanisms can be studied in a rigorous and concrete computational modeling context, rather than definitive results concerning how attention operates in the visual system.

      Please see below for our response to the issue with readout and conclusions.

      Overall, the work is solidly constructed, but the overall generality and strength of the conclusions require substantial dampening.

    1. Author Response:

      We would like to thank the reviewers for their time, insights, and constructive feedback. We appreciate the recognition by the reviewers of the value and importance of our study. The reviewers also highlighted: the importance of carefully using and interpreting data from small molecule inhibitors due to possible off-target effects, considering inter-study differences in the cardiomyocyte cell trajectories, examining a possible role of PI3K signaling in proliferation and the intriguing yet not fully elucidated role of membrane protrusions in cardiac fusion. We agree with this important feedback. We plan to address these comments and others directly, in detail.

    1. Author Response:

      We thank the reviewers and editors for their careful reading and reviews of our work. We are grateful that they appreciate the value in our experimental approach and results. We acknowledge what we interpret as the major criticism, that in our original manuscript we focused too heavily on the hypothesized role of GABAergic neurons in driving habituation. This hypothesis will remain only indirectly supported until we can identify a GABAergic population of neurons that drives habituation. Therefore, we will revise our manuscript, decreasing the focus on GABA, and rather emphasizing the following three points:

      1. By performing the first Ca2+ imaging experiments during dark flash habituation, we identify multiple distinct functional classes of neurons which have different adaptation profiles, including non-adapting and potentiating classes. These neurons are spread throughout the brain, indicating that habituation is a complex and distributed process. 

      2. By performing a pharmacological screen for dark flash habituation modifiers, we confirm habituation behaviour manifests from multiple distinct molecular mechanisms that independently modulate different behavioural outputs. We also implicate multiple novel pathways in habituation plasticity, some of which we have validated through dose-response studies.

      3. By combining pharmacology and Ca2+ imaging, we did not observe a simple relationship between the behavioural effects of a drug treatment and functional alterations in neurons. This observation further supports our model that habituation is a multidimensional process, for which a simple circuit model will be insufficient. 

      We would like to point out that, in our opinion, there appears to be a factual error in the final sentence of the eLife assessment: “However, the data presented are incomplete and do not show a convincing causative link between pharmacological manipulations, neural activity patterns, and behavioral outcomes”. We believe that a “convincing causative link” between pharmacological manipulations and behavioural outcomes has been clearly demonstrated for PTX, Melatonin, Estradiol and Hexestrol through our dose-response experiments. Similarly, a link between pharmacology and neural activity patterns has also been directly demonstrated. As mentioned in (3), we acknowledge that our data linking neural activity and behaviour is more tenuous, as will be more explicitly reflected in our revised manuscript. Nevertheless, we maintain that one of the primary strengths of our study is our attempt to integrate analyses that span the behavioural, pharmacological, and neural activity levels.

    1. Author Response

      Reviewer #1 (Public Review):

      Rosas et al. studied the mechanism(s) that enabled carbapenem resistance of a Klebsiella isolate, FK688, which was isolated from an infected patient. To identify and characterize this mechanism, they used a combination of multiple methods. They started by sequencing the genome of this strain by a combination of short and long read sequencing. They show that Klebsiella FK688 does not encode a carbapenemase, and thus looked for other mechanisms that can explain this resistance. They discover that both DHA-1 (located on the mega-plasmid) and an inactivation of the porin OmpK36 are required for carbapenem resistance in this strain. By using experimental evolution, it was shown that resistance is lost rapidly in the absence of antibiotic selection, by a deletion in pNAR1 that removed blaDHA-1. Moreover, their results suggested that it is likely that exposure to other antibiotics selected for the acquisition of the mega-plasmid that carries DHA-1, which then enabled this strain to gain resistance to carbapenems by a single deletion.

      The major strength of this study is the use of various approaches, to tackle an important and interesting problem.

      The conclusions of this paper are mostly well supported by data, but one aspect is not clear enough. The description of the evolutionary experiment is not clear. I could not find a clear description of the names of the evolved populations. However, the authors describe strains B3 and A2, but their source is not clear. The legends of the relevant figure (Figure 5) are confusing. For example, the text describing panel B is not related to the image shown in this panel. Moreover, it is shown in panel C (and written in the main text) that the OmpK36+ evolved populations had only translucent colonies, so what is the source of B3(o)?

      We appreciate the point and in response have added a panel to Figure 5 (in the revised paper this is now Fig. 5A) to illustrate the evolutionary experiment and specify that there are two lineages (A and B) with 20 replicates each that, after 200 generations of evolution, give rise to populations of which A2 and B3 are the exemplars characterized.

      We have corrected the legends in Figure 5.

      We now explain (sentence starting on Line 197) that B3(o) is the single isolate of an opaque colony from lineage B3; it was the only opaque colony among the 595 colonies observed in the B3 population. B3(o) was sequenced and analysed as a comparator and has some value in that regard, despite being an anomaly.

      Reviewer #2 (Public Review):

      The authors sequenced a clinical pathogen, Klebsiella FK688, and definitively establish the genetic basis of the carbapenem-resistance phenotype of this strain. They also show that the causal mutations confer reduced fitness under laboratory conditions, and that carbapenem sensitivity readily re-evolves in the lab due to the fitness costs associated with the resistance mutations in the clinical isolate. They also establish that subinhibitory concentrations of ceftazidime select for the otherwise deleterious blaDHA-1 gene. Based on this finding the authors speculate that prior beta-lactam selection faced by the ancestors of Klebsiella FK688 potentiated the evolution of the carbapenem-resistance phenotype of this strain. If this hypothesis is true, then prior history of beta-lactam exposure may generally potentiate the evolution of carbapenem resistance.

      Strengths:

      From a technical perspective, the findings in this paper are solid. In addition, the authors establish a simple genetic basis for carbapenem resistance in a clinical strain, which is a valuable and non-trivial finding (i.e. they show that the CRE phenotype in this strain is not an omnigenic trait distributed over hundreds of loci).

      Weaknesses:

      The main weakness of this paper is that the authors draw overly broad conclusions of a conceptual nature from narrow experimental findings. This could be addressed by drawing more modest and narrow implications from the findings.

      1) The title of this paper is "Treatment history shapes the evolution of complex carbapenem-resistant phenotypes in Klebsiella spp." But they provide no data on the treatment history of the patient from whom this strain was isolated. Therefore, the authors have no evidence to support their central claim. Indeed, it is completely possible that this strain never faced beta-lactam selection in the past, or that the patient's hypothetical history of betalactamase was irrelevant for the evolution of FK688. First, it is completely possible that this is a hospital-acquired infection, such that the history of this strain is due to selection in other contexts in the hospital that have little to do with the patient's treatment history. Second, it is completely possible that this strain (the chromosome anyway) has no prior history of beta-lactamase selection, and that it acquired the megaplasmid containing blaDHA-1 via conjugation from some other strain. In this second hypothetical scenario, it is possible that the fitness cost of the blaDHA-1 gene is not particularly high in a different source strain, but that it has some cost in the FK688 strain that it was isolated from. And of course, fitness costs in the human host could be very different than fitness costs in the laboratory, where strains are evolving under strong selection for fast growth. And given the benefit of resistance, it's clear that this strain has a strong fitness advantage over faster-growing sensitive strains in the context of the source patient under antibiotic treatment.

      My general point here is that the broad claims made about patient history or prior history shaping the evolution of this strain are largely indefensible because there is no data here to make solid inferences about how prior history shaped the evolution of this strain.

      We appreciate the point and have changed our title and scaled back the strength of our conclusions regarding patient treatment history.

      2) Historical contingency. The authors claim that their work shows how historical contingency shapes the evolution of resistance. One problem with this claim is that it is trivial- this is only a significant claim if the reader believes that prior history is not important in the evolution of antibiotic resistance, which is a straw-man null hypothesis, to mix a couple metaphors. To be more concrete, clearly strain background (prior history) matters-eliminating the plasmid with the resistance gene eliminates resistance. But that is not particularly surprising, given the past 50 years of evolutionary microbiology literature on plasmids and resistance. By contrast to this work, the major contribution of papers that examine the role of historical contingency in evolution (i.e. various Lenski papers) is that those works quantitatively measure the role of history in comparison to other factors (chance, adaptation). Since this work is a deep dive into a single clinical isolate, the data presented here do not and cannot shed light on the role of historical contingency in the emergence of this strain. The authors' claims about the prior history that led to the CRE phenotype are reasonable- but are fundamentally speculative. I have nothing against speculation, as long as it is clear what claims are speculative, and what are concrete implications. But the authors frame these speculative claims as concrete implications of their findings.

      This is a fair point. We have reframed the study to not focus on historical contingency.

      As the reviewer points out, any discussion about historical contingency in the context of evolution is trivial in one sense. One of the reasons that the studies of Lenski and Blount provide new insights into the role of history in evolution is that they knew the history of their populations (at least for the number of generations since the LTEE began), and had a high degree of control and understanding of the growth conditions where the trait evolved. As such, they could go back to time points before the trait evolved, and then repeat the evolution experiment many times, in the exact same environment where the trait originally evolved, and then count how often they observed the evolution of that trait.

      Here we study a clinical isolate, and have less understanding of the evolutionary history of our strain. While we cannot re-evolve carbapenem resistance in the exact same environment experienced by the FK688 strain, we did test the capacity for the wild type, and two possible intermediate genotypes, to evolve carbapenem resistance in growth media with carbapenem.

      Altogether, we have comprehensive evidence for the genetic cause of carbapenem resistance: the BLA1 plasmid + OmpK36. We showed, by experiment, that it is much more likely for carbapenem resistance to evolve in an FK688 strain that carries the BLA1 plasmid than in an FK688 strain that did not carry the plasmid, even if it had acquired the OmpK36 mutation. We think this is not trivial because a significant proportion of all of the carbapenem-resistant Klebsiella that have been isolated are non-carbapenemase CRE. Our reconstruction provides a plausible explanation for why non-carbapenemase CRE evolve: they are evolving from strains that have already been treated with a non-carbapenem beta-lactam drug and have thereby selected for the presence of a beta-lactamase (that is not a carbapenemase).

      So, while we have scaled back the strength of our claims, we do think that our results can provide some insight into how the evolutionary history of a pathogen can shape the molecular path to antibiotic resistance.

      3) The authors claim that "[This work] suggests that the strategic combinations of antibiotics could direct the evolution of low-fitness, drug-resistant genotypes". I suppose this is true, but I also think this is a stretch of an implication given these findings. To be blunt, while I suppose it's better to have costly resistance variants that re-evolve sensitivity than to have low-cost high-resistance strains circulating, I think the patient's family would probably disagree that the evolution of a low-fitness drug-resistant genotype was good or strategic in the clinical context, even if better from a public health perspective. Low-fitness drug-resistant strains are just as lethal under clinical antibiotic concentrations!

      Thank you for the comment. We see how this sentence could be read as too strong a conclusion and have rewritten the last sentence of the DISCUSSION (line 351):

      “These results show how an individual’s treatment history might shape the evolution of AMR, and should be taken into consideration in order to explain the evolution of non-carbapenemase CRE”

      The authors do show the plausibility of their hypothesis/model that prior beta-lactam selection is sufficient to potentiate the evolution of carbapenem-resistance (by the additional ompK loss-of-function mutation). I think those findings are very nice. But the authors undermine their results by extrapolating too far from their data. Hence, I think narrowing the scope of the implications would improve this paper.

      In addition to narrowing the scope of the implications as written, I also would like to add that there may be other ways of framing this paper (other than historical contingency) that may make the significance of this work more apparent to a broader audience. This may be worth considering during the revision process.

      We have taken these suggestions on board and have re-framed the final sentences of the ABSTRACT, INTRODUCTION and DISCUSSION accordingly. Specifically, we have removed reference to historical contingency and instead have reframed our experiments as providing a genetic and evolutionary explanation for an interesting and concerning cause of antibiotic resistance – non-carbapenemase CRE.

    1. Author Response

      Reviewer #1 (Public Review):

      During the height of the Covid19-pandemic, there was great and widely spread concern about the lowered protection the screening programs within the cancer area could offer. Not only were programs halted for some periods because of a lack of staff or concern about the spreading of SARS CoV2. When screening activities were upheld, participation decreased, and follow-up of positive test results was delayed. Mariam El-Zein and coworkers have addressed this concern in the context of cervical screening in Canada, one of the rather few countries in the world with well organized, population-based, although regionalized, cervical screening program.

      Comment 1: Despite the existence of screening registries, they choose to do this in form of a survey on the internet, to different professional groups within the chain of care in cervical screening and colposcopy. The reason for taking this "soft data" approach is somewhat diffuse.

      We are happy to provide a counterargument to the reviewer’s concern about the “soft data” approach. Our unit – McGill’s Division of Cancer Epidemiology – is a major stakeholder in policymaking and cervical screening guideline development in Canada. It is one of the components in a McGill Task Force on COVID-19 and Cancer that has been widely engaged in assessing the pandemic’s impact on the entire spectrum of cancer control and care (examples: PMID: 33669102, PMID: 34843106). Canada is a country of continental size, and during the pandemic even travel between provinces was interrupted. It is only via a web-based survey that one could have captured the required information. We took advantage of our unit’s credibility and stature to secure a substantial response to the survey, which elicited a high level of detail.

      The survey questionnaire instrument was thoughtfully developed with input from Canadian experts who are active in the field of cervical cancer prevention and involved in clinical care to comprehensively formulate informative questions (and practical, reasonable responses) underpinning each of the themes covered. Of note, some of these coinvestigators, having executive roles in relevant clinical professional bodies, advised our team on the logistics of circulating the survey to members. The administration of the survey was coordinated with the pertinent societies. Our aim was to provide an overall portrait across Canada of the extent of the harms to cervical cancer screening and treatment processes at the beginning of the COVID-19 pandemic (specifically a snapshot from mid-March to mid-August 2020), as perceived by professional groups in multiple health disciplines.

      Indeed, as the reviewer mentioned, there are fully (i.e., for Saskatchewan) and partially (i.e., for British Columbia, Alberta, Manitoba, Ontario) organized cervical cancer screening programs in Canada in addition to opportunistic programs (i.e., for Northwest Territories, Yukon, Nunavut, Quebec). The Canadian Partnership Against Cancer also collects information on cervical cancer screening programs and/or strategies across Canada. Using data from these different sources would enable a quantitative assessment of the impact of the pandemic on cervical cancer screening, but this was not the research methodology used; the survey approach was our research strategy as we attempted to collect responses from all provinces and territories, regardless of the different screening programs and modalities implemented across the country, and including regions that do not have an official screening program.

      Since the effects of the COVID-19 pandemic will stay with us for years to come, our research team is also examining – using a “hard data” approach via administrative healthcare datasets – the long-term effects that will accrue on cervical cancer morbidity and mortality from the interruptions and delays in screening processes and other activities in the process of care. A discussion of this is, however, beyond the scope and objectives of our manuscript.

      No modifications were made in the manuscript to address this comment.

      Comment 2: The authors claim they want to "capture modifications". However, the suggestions that come from this study are limited and are submitted for publication 2 years after the survey when the height of the pandemic has passed long since, and its burden on the screening program has largely disappeared. The value of the study had been larger if either the conclusions had been communicated almost directly, or if the survey had been done later, to sum up the total effect of the pandemic on the Canadian cervical screening program.

      We appreciate this comment. As part of our commitment to transparency, we now plainly acknowledge that considerable time (1.5 years) has elapsed between the time the survey data were available (March 2021) and manuscript submission (September 2022) for publication in the special issue, curated by eLife, on the impact of the COVID-19 pandemic on cancer prevention, control, care and survivorship. However, we also argue that this lag time is reasonable given the undertaking of data management, analysis, and reporting of a large amount of data, including the synthesis of replies to open-ended questions. We also took this opportunity to expose two graduate students to the research process.

      Changes made: Page 15, Lines 437-440.

      In terms of assessing the total effect of the pandemic on the Canadian cervical screening program, this work is in progress, but not within the current manuscript. The PubMed references mentioned above show examples of directions we are taking. Also, as mentioned in our response to Comment 1, we will use data from administrative healthcare datasets (medical and drug claims, hospitalization data, death registry data) and hospital cancer registries (clinical characteristics such as cancer stage, grade, and biomarkers) on cancer patients diagnosed in Quebec between 2010 and 2026. Using these datasets, we intend to compare the pre- and post-pandemic eras in order to analyze changes in patterns of cancer care, cancer prognosis, and survival, including shifts in stage at diagnosis.

      Comment 3: Another major problem with this study is the coverage. The results of persistent activities to get a large uptake is somewhat depressing although this is not expressed by the authors. 510 professionals filled out the survey partially or in total. 10 professions were targeted. The authors make no attempt to assess the coverage or the validity of the sample. They state the method used does not make that possible. But the number of family practicians, colposcopists, cytotechnicians, etc. involved in the program should roughly be known and the proportion of those who answered the survey could have been calculated. My guess is that it is far below 10%.

      There were no extensive additional efforts to increase the participation rate, apart from follow-up reminder emails to complete the survey, which is standard practice followed by the societies that administered the survey to their constituents. We respectfully disagree with the reviewer concerning coverage being a major limitation, particularly in view of the general difficulty of securing a high response rate in a survey such as ours, at a time like the middle of the pandemic. Although a classic non-response rate appears easy to compute, information on the “population of interest” (i.e., the number of professionals approached, in addition to those reached by the advertisement of the survey on social media platforms) is not available to estimate the extent of non-response. Even if the response rate is below 10% as suggested by the Reviewer, our survey and findings should be considered on their merits; the target population was involved in the survey design to ensure the validity of coverage of the questions along the continuum of care in cervical cancer screening and treatment. In addition, we followed the Checklist for Reporting Results of Internet E-surveys to inform the design, conduct, and reporting of our survey research.

      Changes made: Page 14, Lines 421-425.

      Comment 4: The national distribution seems skewed despite the authors boasting its pan-Canadian character. I am just faintly familiar with the Canadian regions, but, as an example, only 2 replies from Quebec must question the national validity of this survey.

      We apologize for this typographical error in Table 1; many cells were accidentally shifted down (the last couple of provinces had the wrong numbers). There were actually 21 survey respondents from the province of Quebec. This has now been corrected.

      Changes made: Page 19.

      Comment 5: The result section is dominated by quantitative data from the responses to the 61 questions. All questions and their answers are tabulated. As there is no way to assess the selection bias of the answers these quantitative results have no real value from an epidemiological standpoint.

      Indeed, we opted to provide the reader with descriptive results on all the questions and sub-questions that were asked, with explicit annotation to each question number and clear reference to the formulated question by appending the full survey instrument to the manuscript. We designed the survey as a descriptive and not an analytical study, contrary to traditional epidemiology studies that investigate a specific exposure-outcome relationship.

      Changes made: Page 12, Lines 366-368.

      In the spirit of other papers in the special issue on COVID-19 and cancer, curated by eLife, we measured the impact of the pandemic on the process of care like many other eLife articles did. The eLife collection is a snapshot of a period when not only was cancer control disrupted, but the ability to conduct valid research was also severely curtailed. The reviewer will likely agree that our paper is not the only one to suffer from these methodological shortcomings. Yet, taken together, the gestalt value of the eLife collection will inform epidemiologic modellers for the next long while on how this period affected cancer control. We are happy to contribute with this paper a few more pieces of the puzzle, adding to that which eLife published for many other jurisdictions.

      Comment 6: The replies to the open-ended questions are summarized in a table and in the text. The main conclusion of the content analysis of the answers to the direct questions, and one of the main conclusions of the study, is that the majority favors HPV self-sampling in light of the pandemic. However, this not-surprising view is taken by only 80 responders while almost as many (n=60) had no knowledge about HPV self-sampling.

      Another aim of our survey was to identify the windows of opportunity that were created by the pandemic and pinpoint positive aspects that could enable the transformation of cervical cancer screening (i.e., HPV primary based screening and HPV self-sampling). We found that 33% of respondents were of the opinion that the pandemic context could facilitate the implementation of self-sampling and that 50.1% were in favor of the implementation of this new screening practice (described in Results Theme 1: Screening Practice and Stable 5).

      Changes made: Page 4, Lines 93-97.

      The reviewer is correct that in the open-ended sub-question of Question 23 “Are you in favor of the implementation of HPV self-sampling as an alternative screening method in your clinical practice?”, 60 respondents justified their answer to the nominal question by their lack of familiarity with HPV self-sampling, compared to 80 who shared positive comments. However, we would like to draw the reviewer’s attention to the responses to the nominal part of the question in Stable 5. Of those who answered “Maybe”, 47.1% said that they were not familiar enough to express a favorable or unfavorable opinion. We would also like to draw the reviewer’s attention to the results of our cross-tabulation of profession and the question of relevance (described in Results Theme 1: Screening Practice). The lack of familiarity with novel screening practices such as self-sampling can be explained by the fact that most (75.0%) of those who expressed these views were primary healthcare professionals, and not secondary and tertiary specialists.

      Changes made: Page 12, Lines 344-346

      Comment 7: The authors conclude that their study identified the need for recommendations and strategies and building resilience in the screening system. No one would dispute the need, but the additional weight this study adds, unfortunately, is low, from a scientific standpoint.

      Although no one would dispute the need, as the reviewer suggests, as epidemiologists we needed to collect this empirical evidence. We urge the reviewer to consider that this article is meant to contribute to a more complete picture in the collective process of discovering the impact of the pandemic, initiated by eLife’s special issue.

      No modifications were made in the manuscript to address this comment.

      Comment 8: The conclusion I draw from this study is that the authors have done a good job in identifying some possible areas within the Canadian screening programs where the SARS-Cov2 pandemic had negative effects and received some support for that in a survey. Furthermore, they listed a few actions that could be taken to alleviate the vulnerability of the program in a future similar situation, and received limited support for that. No more, no less.

      We thank the Reviewer for the positive feedback provided in the first part of the comment. As for the rest, we believe we have addressed above the reviewer’s concerns.

      Reviewer #2 (Public Review):

      The study aimed to provide information on the extent to which the COVID-19 pandemic impacted cervical cancer (CC) screening and treatment in 3 Canadian provinces. The survey methodology is appropriate, and the results provide detailed descriptive statistics by province and type of practice. The results support the authors' conclusions. This evidence together with data gathered from other national surveys may provide baseline data on the impact of the pandemic on CC outcomes such as late-stage diagnoses and CC treatment outcomes due to these delays.

      We are flattered by the Reviewer’s overall assessment of our manuscript.

      Comment: This study relies mostly on descriptive statistics and open-ended questions that provide details about what CC screening and treatment procedures were delayed. It is unclear how the reader would use the results to affect current or future practice.

      As mentioned in our reply above to a similar comment raised by reviewer 1, our overarching aim was to portray in a purely descriptive manner the negative and positive impacts of the COVID-19 pandemic on cervical cancer screening-related activities, as perceived by healthcare professionals. Please refer to arguments above.

      Changes made: Page 12, Lines 366-368; Page 15, Lines 437-440.

    1. Author Response

      Reviewer #1 (Public Review):

      In this study, the authors set out to determine the degree to which early language experience affects neural representations of concepts. To do so, they use fMRI to measure responses to 90 words in adults who are deaf. One group of deaf adults (n=16) were native signers (and thus had early language exposure); a second group (n=21) was exposed to sign language later on. The groups were relatively well-matched in other respects. The primary finding was that the high dimensional representations of concepts in the left lateral anterior temporal lobe (ATL) differed between native and delayed signers, suggesting a role for early language experience in concept representation.

      The analyses are carefully conducted and reflect a number of thoughtful choices. These include the "inverted MDS" method for constructing semantic RDMs, a normal hearing comparison group for both behavioral and fMRI data, and care taken to avoid bias in defining functional ROIs. And, comparing early and delayed signing groups is a clever way to study the role of early language experience on adult language representations.

      We greatly appreciate the reviewer’s positive evaluation and constructive comments on our study.

      One interesting result that I struggled to put in a broader context relates to the disconnect between behavioral and neural results. Specifically, the behavioral semantic RDMs (Figure 1a) did not differ between any of the groups of participants. This suggests that the representations of the 90 concepts are represented similarly in all of the participants. However, the similarity of the neural RDMs in left lateral ATL differs between the native and delayed signing groups (but not in other regions). Given the similarity of the behavioral semantic RDMs, it is unclear how to interpret the difference in left lateral ATL representations. In other words, the neural differences in left ATL do not affect behavior (semantic representation). The importance of the differences in neural RDMs is therefore questionable.

      Thank you for this comment. In the Revision we have added explicit discussions about this important issue of the relationship between the behavioral and neural profiles for semantics:

      Introduction (pages 4-5): “(previous) studies have reported little effects on semantics behaviors, including semantic interference effects in the picture-sign paradigm (Baus et al., 2008), scalar implicature (Davidson and Mayberry, 2015), or accuracy scores of several written word semantic tasks (e.g., synonym judgment) (Choubsaz and Gheitury, 2017). However, as shown by the color knowledge in the congenitally blind studies (e.g., Wang et al., 2020), similar semantic behaviors may arise from (partly) different neural representations. Semantic processing is supported by a multifaceted cognitive system and a complex neural network entailing distributed semantic regions (Bi, 2021; Binder and Desai, 2011; Lambon Ralph et al., 2017; Martin, 2016), and thus focal neural changes may not necessarily lead to semantic behavioral changes. Neurally, neurophysiological signatures assumed to reflect semantic processes showed incongruent effects across studies: N400 effects in the semantic violation of written sentences were not affected (Skotara et al., 2012), whereas M400 in the picture-sign matching task showed atypical activation patterns (reduced recruitment of left fronto-temporal regions and involvement of right parietal and occipital regions) (Ferjan Ramirez et al., 2016, 2014; Mayberry et al., 2018). It remains to be tested whether and where delayed L1 acquisition affects how semantics are neurally represented, using imaging techniques with higher spatial resolutions.”

      Discussion (pages 17-18): “Notably, different from phonological and syntactic processes, where both visible behavioral underdevelopment (e.g., Caselli et al., 2021; Cheng and Mayberry, 2021; Mayberry et al., 2002) and brain functional changes (Mayberry et al., 2011; Richardson et al., 2020; Twomey et al., 2020) were observed, for semantics we only observed brain functional changes in dATL but no visible behavioral effects. Consistent with the literature where deaf delayed signers did not show differences to controls in semantic interference effects in the picture-sign paradigm (Baus et al., 2008), scalar implicature (Davidson and Mayberry, 2015), or N400 measures (Skotara et al., 2012), we did not observe visible differences in terms of semantic distance structures (Figure 1a) or reaction time of lexical decision and word-triplet semantic judgment (Supplementary file 1). As reasoned in the Introduction, this seeming neuro-behavior discrepancy might be related to the multifaceted, distributed nature of the cognitive and neural basis of semantics more broadly. The general semantic behavioral tasks we employed could be achieved with representations derived from multiple types of experiences, supported by highly distributed neural systems (e.g., Bi, 2021; Binder and Desai, 2011; Lambon Ralph et al., 2017; Martin, 2016), including those not affected by the delayed L1 acquisition in regions beyond the dATL. This finding invites future studies to specify the exact developmental mechanisms in the left dATL (Fu et al., 2022; Unger and Fisher, 2021) and to uncover semantic behavioral consequences related to the functionality of this area.”

      An important point is that, if I understand correctly, the semantic space is defined by the 90 experimental items. That is, behavioral RDMs were created by having normal hearing participants arrange 90 items spatially, and neural RDMs were created by comparing patterns of responses to these 90 experimental items. This 90-dimensional space is thus both (a) lower dimensional than many semantic space models that include hundreds of directions and (b) constrained by the specific 90 experimental items chosen. On the one hand, this seems to limit the generalizability of the findings for semantic representations more broadly.

      Indeed, for the RDM the spaces were constructed from the relations among the 90 items, as is standard practice for current RSA analyses. Regarding the dimensionality issue, we would like to clarify that although the space is a 90 x 90 matrix, the semantic distance for each pair was obtained from the subjects’ ratings, i.e., the psychological space, which is likely to be high-dimensional in nature. That is, we compressed the potentially high-dimensional psychological construct into one measure to construct the 90 x 90 matrix. If we understood correctly, the semantic space models with hundreds of directions that the reviewer referred to are various types of embedding and/or distributional models. There, although each word is projected onto a high-dimensional vector, the distance for each pair is still extracted (e.g., by cosine similarity) to construct the cross-item similarity matrix for RSA. Regarding the generalization of the findings across items, we greatly appreciate this concern; indeed, that was one of the reasons why we extracted the categorical structure based on the clustering of the items (see also our response to the next Comment). We also examined the univariate abstractness contrast, which looked at the broad categorical effects rather than specific items. We have made clarifications accordingly in the Revision to address these concerns (page 8).
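
      The point about embedding models can be made concrete with a minimal sketch. It assumes hypothetical toy word vectors (not the study's stimuli or data) and uses Pearson correlation as the RSA comparison purely for illustration; the response does not state which correlation measure was used. Each high-dimensional vector pair is reduced to one cosine distance, giving an item-by-item RDM of the same shape as one built from behavioral ratings.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def build_rdm(vectors):
    """Build a symmetric item-by-item dissimilarity matrix (RDM)."""
    n = len(vectors)
    rdm = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = cosine_distance(vectors[i], vectors[j])
        rdm[i][j] = rdm[j][i] = d
    return rdm


def upper_triangle(rdm):
    """Flatten the unique item pairs (above the diagonal) into a list."""
    n = len(rdm)
    return [rdm[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical 3-item "embedding" space, for illustration only.
toy_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
toy_rdm = build_rdm(toy_vectors)
toy_pairs = upper_triangle(toy_rdm)
```

      With 90 items, `upper_triangle` yields 90 × 89 / 2 = 4005 unique pairs; in an RSA, a vector of this kind from a model RDM would be correlated with the corresponding vector from a neural RDM.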

      The logic behind using a categorical semantic RDM (e.g., Figure 2a) was not clear. The behavioral semantic RDMs (Figure 1a) clearly show gradations in dissimilarity, particularly for the abstract categories. It would seem that using the behavioral semantic RDM would capture a more accurate representation of the semantic space than the categorical one.

      Thank you for this suggestion. We opted for the categorical structural similarity based on the clustering analyses to boost signal and to allow for better generalization across items (i.e., along the categorical structure). Agreeing with the reviewer that such an approach may lose the important graded space especially for the abstract items, we added an analysis using continuous semantic distances specifically focused on the abstract items (page 10):

      “1) Types of semantic distance measures: While semantic categories for concrete/object words are robust and well-documented, the semantic categorization within the abstract/nonobject words is much fuzzier and remains controversial (Catricalà et al., 2014; Wang et al., 2021). The behavioral semantic RDM in Figure 1a indeed shows gradations in dissimilarity for abstract/nonobject words. We thus checked the two groups’ semantic RDMs using the continuous behavioral measures and further examined whether group differences in the left dATL were affected by the types of semantic distance (categorical vs. continuous) being used for abstract/nonobject words. The two deaf groups showed comparable similarities to the hearing benchmark (by correlating each deaf subject’s RDM with the group-averaged RDM of hearing subjects, Welch’s t23.0 = -0.12, two-tailed p = .90). RSA was performed by correlating each deaf subject’s neural RDM in the left dATL with these two types of semantic RDMs. Significant group differences were observed (Figure 3), for both the categorical RDM (Welch’s t31.0 = 3.06, two-tailed p = .005, Hedges’ g = 0.98) and the continuous behavioral semantic RDM (Welch’s t36.7 = 2.47, two-tailed p = .018, Hedges’ g = 0.76), with significant semantic encoding in dATL observed in both analyses for native signers (one-tailed ps < .003) and in neither for delayed signers (one-tailed ps > .42). These results indicate that the reduced dATL encoding of abstract/nonobject word meanings induced by delayed L1 acquisition was reliable across semantic distance measures.”

      As the reviewer suggested, we could also carry out RSA using the 90-word behavioral semantic RDM. We did observe similar group differences with this RDM, with delayed signers showing a trend of semantic encoding reduction in the left dATL relative to native signers (native signers, mean (SD): 0.019 (0.023); delayed signers, mean (SD): 0.006 (0.022), Welch’s t31.5 = 1.78, two-tailed p = .085; a delayed signer was excluded from this analysis for being an outlier beyond 3 standard deviations). It appears that the behavioral semantic RDM yielded smaller effect sizes in group differences than the categorical RDM, but the ANOVA (the within-subject factor - RDM-type: categorical, behavioral; the between-subject factor – group: native, delayed) revealed no significant effects of RDM-type or its interaction with the group (ps > .71), but a significant main effect of group (F(1,36) = 9.19, p = .004). The seemingly weaker group differences using the behavioral semantic RDM should not be over-interpreted.
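
      For readers unfamiliar with the statistics reported above, the following sketch shows how a Welch's t statistic, its fractional degrees of freedom (the reason values such as 31.5 appear), and Hedges' g are commonly computed. It is an illustration under stated assumptions: the function names are hypothetical, the Hedges' g formula is the usual pooled-SD version with the approximate small-sample correction, and the authors' exact formula is not given in the response. The p-value is omitted because it requires a t-distribution CDF from a statistics library.

```python
import math
import numpy as np

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    vx = x.var(ddof=1) / len(x)  # squared standard error of mean(x)
    vy = y.var(ddof=1) / len(y)  # squared standard error of mean(y)
    t = (x.mean() - y.mean()) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

def hedges_g(x, y):
    """Cohen's d with pooled SD, scaled by the approximate small-sample correction."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    d = (x.mean() - y.mean()) / math.sqrt(pooled_var)
    return d * (1.0 - 3.0 / (4.0 * (nx + ny) - 9.0))

# Hypothetical per-subject RSA correlations for two small groups.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
t_value, dof = welch_t(group_a, group_b)
g_value = hedges_g(group_a, group_b)
```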

      Reviewer #2 (Public Review):

      The authors investigated patterns of fMRI activation for familiar words in two groups of deaf people. One "language rich" group received exposure to sign from birth, whereas the "language poor" group included kids born to hearing parents who had limited exposure to language during the first few years of life. The primary findings involved group differences in BOLD activation patterns across different areas of interest within the semantic network when participants made intermittent 1-back category judgments for words appearing in succession.

      There was much to be liked about this study, including the rigor of the methods and the novel contrasts of two deaf samples. These strengths were balanced by a number of questions about the assumptions and theoretical interpretations underlying the data. I will elaborate on the major points in the paragraphs to follow, but briefly, the ways in which the authors are framing critical period constraints in language fundamentally differ from the standard nativist perspectives (e.g., Chomsky, Lenneberg). The assumptions of what constitutes a deprivation model require further justification and perhaps recasting to avoid unnecessary stigma (i.e., this reviewer was uncomfortable with the assertion that being born deaf to hearing parents by default constitutes deprivation). The introduction lacked principled hypotheses that motivated the choice of comparing abstract and concrete words, and potential accounts of group differences were underdeveloped (e.g., how do parents in China typically react to having a deaf child, and what supports are in place for preventing language deprivation? Are newborn infants universally screened for hearing loss in China? The answers to these questions might help the readers to understand why/how deaf children in this circumstance might experience deprivation).

      We appreciate the reviewer’s positive evaluations and constructive comments on our study. We have revised the manuscript substantially in light of these comments (see below).

      References to critical periods require a bit more elaboration with respect to lexical-semantic vs. semantic acquisition. The nature of the critical period in language acquisition remains controversial with respect to its constraints. Lenneberg and Chomsky speculated that the limit of the critical period for language acquisition was about puberty (13ish years of age). This is much older than the deaf sample tested here so arguments about aging out of the critical period at least for language acquisition need more nuance. Another issue relates to learning semantic mappings vs. learning language as falling under the same critical period umbrella. This seems highly unlikely as semantic acquisition in early childhood is aided by linguistic labeling but would likely occur in parallel even in the context of language deprivation. Much of the prior literature on critical periods and nativist approaches to language development has focused on syntactic acquisition and elements such as recursion rather than a mapping of symbols to conceptual referents. This makes the critical period group comparison somewhat tenuous because what you are really interested in is a critical period for word meaning acquisition not the more general case of syntactic competency.

      The point above is highlighted in the following statement underlying one of the primary assumptions of the study:

      Pg. 3, "Here, we take advantage of a special early-life language-deprivation human model: individuals who were born profoundly deaf in hearing families and thus had very limited natural language exposure (speech or sign) during the critical period of language acquisition in early childhood"

      "hypofunction of the language system as a result of missing the critical period of language acquisition" (pg 3), same critique as previous - the critical period window is thought to be 13ish years old.

      There are a couple of problems with this assertion/assumption. Although it is true that most children who are born deaf have hearing parents, it is not justifiable to label this condition an early-life deprivation model. Hearing parents who are extremely motivated to learn sign language and pursue related language enrichment strategies can successfully offset many of these effects. Similarly, it is not inconceivable that a deaf child born to a deaf parent might be neglected or abandoned without the benefit of early sign exposure. My argument here is that classifying deaf children born to hearing parents as automatically 'language deprived' is potentially both stigmatizing and scientifically unjustified.

      We originally used the term “language deprivation” because it has been recently advocated in the deaf field mainly to increase society’s awareness of the risks of language deprivation and the lifelong impact that deaf and hard-of-hearing children face (e.g., Hall, 2017, Maternal and Child Health Journal; Lillo-Martin & Henner, 2020, Annual Review of Linguistics). In the current context, we agree with the reviewer that “early-life deprivation” model may not precisely describe the language acquisition condition of delayed signers. Indeed, for some of the delayed subjects in our study, their hearing parents actively tried to provide additional aids of exposure to signs (via preschool special education programs; learning signs by themselves) or speech (via hearing aids). In the revision, we avoided the term “language deprivation” and used the terms “subjects with varying amounts and qualities of early language exposure” or “delayed L1 acquisition” to more precisely describe our experimental manipulation throughout the revised manuscript.

      We fully agree with the reviewer that the “critical period” of language acquisition is too much of an umbrella term, which may be taken to refer to critical periods for different, specific aspects of cognitive and/or neural development in the literature. In the Revision we avoided using this term to reduce ambiguity. Instead, we now make explicit throughout the specific processes being discussed (phonology, syntax, semantics). The effects of early language experience (reduced in delayed L1 acquisition) on the behavioral and neural patterns relating to phonology, syntax, and semantics are now elaborated and discussed separately and explicitly in both the Introduction and Discussion (pages 3-4, 17-18).

      Regarding the potential nonlinguistic socio-environmental differences (e.g., coping strategies after deafness awareness), we have added further clarifications (page 15): “Notably, routine nation-wide neonate hearing screening in China did not start until 2009, years after the early childhood of our participants (born before 2000), and some hearing parents may nonetheless try to give deaf children additional aids of exposure to signs (via preschool special education programs) or speech (via hearing aids). Critically, our positive results of the robust group differences in dATL suggest that early homesign/aid measures and later formal education for sign and written language experiences are insufficient for typical dATL neurodevelopment; the full-fledged language experience during early infancy and childhood (before school age) plays a necessary role in this process.” Relevant information has also been added in the Method/Result sections.

      Pg. 6 "It should be noted that the neural semantic abstractness effect does not equate with language-derived semantic knowledge, as it might arise from some nonverbal cognitive processes that are more engaged in abstract word processing (Binder et al., 2016)." - I had great difficulty understanding what this meant.

      We have revised this sentence as follows: “While the abstractness effect has often been used to reflect linguistic processes (e.g., (Wang et al., 2010)), “abstractness” is not a single dimension and instead relates to both linguistic and nonlinguistic (e.g., emotion) cognitive processes (Binder et al., 2016; Troche et al., 2014; Wang et al., 2018).” (page 11)

    1. Author Response

      Reviewer #1 (Public Review):

      In this paper, the authors present a method for discovering response properties of neurons, which often have complex relationships with other experimentally measured variables, like stimuli and animal behaviors. To find these relationships, the authors fit neural data with artificial neural networks, which are chosen to have an architecture that is tractable and interpretable. To interpret the results, they examine the first- and second-order approximations of the fitted artificial neural network models. They apply their method profitably to two datasets.

      The strength of this paper is in the problem it is attempting to solve: it is important for the field to develop more useful ways to analyze and understand the massive neural datasets collected with modern imaging techniques.

      The weaknesses of this paper lie in its claims (1) to be model free and (2) to distinguish the method from prior methods for systems identification, including spike triggered averaging and covariance (or rather their continuous response equivalents). On the first claim, the systems identification methods are arguably substantially more model free approach. On the second claim, this reviewer would require more evidence that the presented approach is substantially different from or an improvement on systems identification methods in common use applied directly to the data.

      We thank the reviewer for carefully engaging with the manuscript and believe that our revisions address these points of critique both through novel analysis and through clarifications.

      First claim: We fully agree that systems identification approaches are in theory truly model-free while MINE imposes constraints through the chosen architecture. However, our new analysis comparing MINE to direct fitting of the kernels of a Volterra expansion highlights that this is not really the case in practice. In order to obtain good fits, the model-free-ness has to be substantially reduced by imposing constraints on the degrees of freedom. We quantify this reduction in Figure S3 and directly compare it to the effective degrees of freedom of the CNN. Reducing degrees of freedom is also a theme that can be found throughout the literature on systems identification, especially when the analysis does not involve Gaussian white noise as input stimuli. We therefore stand by our claim that MINE is “essentially model-free” in the sense that, much like systems identification, it does not rely on defining a model a priori. We also clarify our choice of calling the method “model-free” in the introduction, where we state: “While the architecture and hyper-parameters of the CNN used by MINE do impose constraints on which relationships can be modeled, we consider the convolutional network “model-free” because it does not make any explicit assumptions about the underlying probability distributions or functional forms of the data.”

      Second claim: We believe that our new analysis for the comparison with the Volterra expansion approach of systems identification addresses this point. By directly fitting Volterra kernels instead of relying on spike-triggered analysis we put the comparison on a more equal footing than our previous STA/STC exposition. We can show that while the methods are equivalent for Gaussian white noise stimuli, MINE is superior for highly correlated input stimuli. We show that imposing constraints on the regression used to identify the Volterra kernels can overcome this gap to a large extent, but MINE still produces a model that has higher predictive power and MINE also does more than extracting receptive fields. We are also not entirely sure to what extent Wiener/Volterra analysis has been applied to calcium imaging data. While there is a vast body of literature on systems identification, there is little evidence that it has been widely applied to data in which both inputs and outputs are highly correlated across time, such as calcium imaging experiments using naturalistic stimuli. While this doesn’t have to mean anything in and of itself it might point to the fact that this analysis is not easily accessible and requires ample tuning. These are precisely two problems that MINE aims to overcome. We now more explicitly state in the manuscript that we believe this accessibility to be one of the core strengths of MINE.
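
      As a rough illustration of what directly fitting a Volterra kernel with a constrained regression can look like, the sketch below estimates only the first-order (linear) kernel by ridge regression on time-lagged copies of the stimulus. This is an assumption-laden toy, not the authors' implementation: the function names, the exponential ground-truth kernel, the lag count, and the ridge penalty are all chosen here for illustration, and the second-order kernel (which would add products of lagged columns) is left out.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Design matrix whose column k holds the stimulus delayed by k samples (zero-padded)."""
    s = np.asarray(stimulus, dtype=float)
    padded = np.concatenate([np.zeros(n_lags - 1), s])
    start = n_lags - 1
    return np.stack([padded[start - k : start - k + len(s)] for k in range(n_lags)], axis=1)

def fit_linear_kernel(stimulus, response, n_lags, ridge=0.0):
    """First-order kernel via ridge regression; ridge > 0 constrains the fit."""
    X = lagged_design(stimulus, n_lags)
    gram = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(gram, X.T @ np.asarray(response, dtype=float))

# Simulated example with Gaussian white noise input and a known kernel.
rng = np.random.default_rng(1)
true_kernel = np.exp(-np.arange(10) / 3.0)
white_noise = rng.normal(size=5000)
response = lagged_design(white_noise, 10) @ true_kernel + 0.1 * rng.normal(size=5000)
estimate = fit_linear_kernel(white_noise, response, n_lags=10, ridge=1e-3)
```

      With white noise input the lagged columns are nearly uncorrelated, so a very small penalty suffices. With temporally correlated stimuli the Gram matrix becomes ill-conditioned, which is where stronger constraints on the regression become necessary.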

      Reviewer #2 (Public Review):

      This paper describes a relatively unbiased and sensitive method for identifying the contributions of different behavioral parameters to neural activity. Their approach addresses, in an elegant way, several difficulties that arise in modeling of neuronal responses in population imaging data, namely variations in temporal filtering and latency, the effects of calcium indicator kinetics, interactions between different variables, and non-linear computations. Typical approaches to solving these problems require the introduction of prior knowledge or assumptions that bias the output, or involve a trade-off between model complexity and interpretability. The authors fit individual neuron's responses using neural network models that allow for complex non-linear relationships between behavioral variables and outputs, but combine this with analysis, based on Taylor series approximations of the network function, that gives insight into how different variables are contributing to the model.

      The authors have thoroughly validated their method using simulated data as well as showing its applicability to example state of the art data sets from mouse and zebrafish. They provide evidence that it can outperform current approaches based on linear regression for the identification of neurons carrying behaviorally relevant signals. They also demonstrate use cases showing how their approach can be used to classify neurons based on computational features. They have provided Python code for the implementation and have explained the methods well, so it will be easy for other groups to replicate their work. The method could be applied productively to many types of experiments in behavioral and systems neuroscience across different model systems. Overall, the paper is clearly written and the experiments are well designed and analysed, and represent a useful contribution to the neuroscience field.

      We thank the reviewer for their favorable assessment of our work.

      Reviewer #3 (Public Review):

      In the current study, the authors present a novel and original approach (termed MINE) to analyze neuronal recordings in terms of task features. The method proposed combines the interpretability of regressor-based methods with the flexibility of convolutional neural networks and the aim is to provide an unbiased, "model-free" approach to this very important problem.

      In my opinion, the authors succeed in most of these aspects. They use three datasets: an artificially-generated one that provides a ground-truth, a published dataset from wide-scale cortical mouse recordings and a novel one that studies thermosensation in larval zebrafish. MINE compares favorably in all three cases.

      I believe that the paper would mostly benefit from an increased effort in clear exposition of the Taylor expansion approach, which is at the core of the method. The methods section describes the mathematics, but I wonder whether it would be possible to illustrate or schematize this in a main Figure, e.g. as an addition to Figure 1 or as a new figure. Around line 185, the manuscript reads: "We therefore perform local Taylor expansions of the network at different experimental timepoints. In other words, we differentiate the network's learned transfer function that transforms predictors into neural activity."

      It would help to explicitly state with respect to what the derivative is being computed (i.e. time) and maybe a diagram (which I had to draw to understand the paper) in which a neuronal activity trace is shown and from time t onwards a prediction is computed using terms in the Taylor expansion would be very instructive (showing on an actual trace how disregarding certain terms changes the prediction and hence the conclusions about the actual dependence of the trace on the behavioral features). The formulation in terms of Jacobians and Hessians can then be restricted to the Methods section and the paper will be easier to read for a wider audience.

      We agree with the reviewer that readability is key. We hope that our re-write and re-organization of the manuscript makes it easier to follow. We now start with a unified description of complexity and non-linearity both derived from a Taylor decomposition around the data-average. We use this section (starting Line 91) to lay out the logic of the Taylor expansion and explicitly state that the derivatives describe the expected change in output given any change in predictors. We did not want to remove the math entirely from the paper, simply because we found it hard to explain the concept entirely without it. We have provided an annotation to the formula parts in the new Figure 2 and a small schematic to illustrate the pointwise expansion of the Taylor metric in the new Figure 4.

      The method is presented as a "model-free" approach (title and introduction). I think it would help to discuss this with some precision. The Taylor expansion approach does imply certain beliefs on the structure of the data (which are well founded in most cases). Do the authors agree that MINE would encapsulate any regression model where both linear and interaction terms are allowed to include an arbitrary non-linearity (in the case of the interaction terms, different non-linearities for both variables)? If this is the case, maybe an explicit statement would allow the reader to quickly identify the versatility of MINE.

      We are now attempting to make the statement of model-free more precise through quantifications in our rewritten section on deriving receptive fields. We now provide an explanation in the introduction for why we believe that “model-free” is justified. We state: “While the architecture and hyper-parameters of the CNN used by MINE do impose constraints on which relationships can be modeled, we consider the convolutional network “model-free” because it does not make any explicit assumptions about the underlying probability distributions or functional forms of the data.”

      In principle, MINE can accommodate higher-order interactions as well (say of the form xyz or x*y^2) and it certainly has flexibility in applying nonlinear transformations. However, we did not find a satisfying way to quantify the space of possible models MINE can represent exactly and therefore do not feel comfortable to make a precise statement about this.

      I find the section relating to non-linearities interesting, but was slightly disappointed to find that the authors do not propose a single method. In Figure 3E, the authors show that a logistic regression model that combines the curvature and NLC approaches outperforms either, but the model is not described in any sort of detail. I appreciate the attempt made by the authors to apply this to the zebrafish imaging dataset in Figure 7, but it was still unclear to me how non-linearities and complexity are related.

      We fully agree with the reviewer. We have now merged non-linearity and complexity determination. We hope that this a) simplifies the paper and b) creates a metric that likely generalizes better and in which specific values are more interpretable. In brief, we now define both the nonlinearity and complexity based on truncations of the Taylor expansion around the data average. This new result section (Lines 90-142) also gives us a chance to (hopefully) better introduce the Taylor expansion approach.
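
      The idea of judging nonlinearity and complexity from truncations of a Taylor expansion around the data average can be sketched on a toy function. The assumptions are substantial: `toy_model` is a hand-written stand-in for a fitted network, derivatives are taken by finite differences rather than by differentiating a trained CNN, and all names are hypothetical.

```python
import numpy as np

def gradient(f, x0, eps=1e-4):
    """Central-difference estimate of the first derivatives of f at x0."""
    x0 = np.asarray(x0, dtype=float)
    g = np.zeros_like(x0)
    for i in range(len(x0)):
        step = np.zeros_like(x0)
        step[i] = eps
        g[i] = (f(x0 + step) - f(x0 - step)) / (2 * eps)
    return g

def hessian(f, x0, eps=1e-3):
    """Central-difference estimate of the matrix of second derivatives of f at x0."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    h = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            h[i, j] = (f(x0 + ei + ej) - f(x0 + ei - ej)
                       - f(x0 - ei + ej) + f(x0 - ei - ej)) / (4 * eps ** 2)
    return h

def taylor_predict(f, x0, x, order):
    """Prediction at x from the Taylor expansion around x0, truncated at `order` (0, 1 or 2)."""
    dx = np.asarray(x, dtype=float) - np.asarray(x0, dtype=float)
    pred = f(x0)
    if order >= 1:
        pred += gradient(f, x0) @ dx
    if order >= 2:
        pred += 0.5 * dx @ hessian(f, x0) @ dx
    return float(pred)

def toy_model(x):
    """Stand-in for a fitted network: one linear term plus one interaction term."""
    return 2.0 * x[0] + x[0] * x[1]

center = np.zeros(2)           # plays the role of the data average
point = np.array([1.0, 1.0])   # a predictor value away from the average
linear_only = taylor_predict(toy_model, center, point, order=1)
with_second_order = taylor_predict(toy_model, center, point, order=2)
```

      In this toy case the first-order truncation misses the interaction term while the second-order truncation recovers the model output, so the gap between truncations indicates how much of the response depends on nonlinear terms.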

    1. Author Response

      Reviewer #1 (Public Review):

      Li et al investigated the behavioral response and fMRI activations associated with deep brain stimulation (DBS) of the lateral habenula (LHb) in 2 distinct rodent models of depression. They found that a) LHb DBS reduces depressive and anxiety behaviors using multiple behavioral tests: sucrose preference, forced swim, and open field. These results held across multiple models of depression and multiple tests, and generally restored results of these behavioral tests to parity with controls. Furthermore, fMRI activations of brain regions with known connectivity to LHb strongly correlated with behavioral responses to LHb DBS, particularly in limbic regions. These behavioral responses clearly depended on electrode location, with more medial placements within the LHb producing a more robust behavioral effect.

      The conclusions of this paper are generally well supported by the data, with the primary weaknesses of the study being 1) limited novelty due to LHb already being a well-established target for DBS in depression, and 2) the questionable validity of rodent models of depression in general. The authors deal with the first point (novelty) by extending their study to electrode localization and fMRI correlates with the behavioral response, leading to insight into surgical targeting as well as mechanism of effect, respectively. They also partially mitigate fundamental problems with rodent models of depression by using 2 different models and showing consistent responses to LHb DBS across both. The methods used in this study were sound, with high-quality techniques used for electrode implantation, confirmation of electrode placement, fMRI acquisition, anesthesia and physiological monitoring, as well as an appropriate statistical analytic approach.

      We thank the reviewer deeply for the positive assessment of our work.

      Reviewer #2 (Public Review):

      This important paper is a real tour de force and combines functional MRI, behaviour, and brain stimulation to characterise the effect of stimulation of the lateral habenula in a rodent model for depression. The results are stunning and the data presented seems compelling.

      My only comment is I would like more discussion on the relevance of these results for the treatment of depression in humans, both in terms of the rodent model and in terms of the results shown in this study.

      We thank the reviewer deeply for the positive assessment of our work. We have added discussion on the relevance of our findings for the treatment of depression in humans on Page 17 of the revised manuscript as follows:

      “The WKY and LPS-treated depressive rat models share similar characteristics, including abnormalities in various neurotransmitter and endocrine systems and emotional changes resulting from inflammatory stimuli. These models are widely used in pharmacological and nonpharmacological depression treatment studies (Caldarone et al., 2015; Aleksandrova et al., 2019; Lasselin et al., 2020). Previous research indicates that classic antidepressants used in humans, such as selective serotonin reuptake inhibitors, also cause an antidepressant reaction in WKY rats. Ketamine, a rapid-acting antidepressant in clinical practice, has been shown to be effective in both WKY and LPS-treated rats (Aleksandrova et al., 2019; J. Zhao et al., 2020). In WKY rats, DBS of the NAc increased exploratory activity and exerted anxiolytic effects, and NAc-DBS was found to be effective for TRD treatment in humans (Dandekar et al., 2018; Aleksandrova et al., 2019). These results suggest that the depression rat models can provide valuable information about the efficacy of various pharmacological and nonpharmacological therapies. In a recent case report, researchers observed acute stimulation effects in addition to long-term clinical improvements in depression, anxiety, and sleep in a patient with TRD upon administering LHb-DBS (Wang et al., 2020). This finding supports the clinical relevance of our observations. However, no animal model of depression can completely replicate human symptoms, and further research is necessary to validate our findings in human patients. Additionally, the long-term efficacy and side effects of LHb-DBS require further investigation. Nevertheless, we believe that our findings propose a promising addition to the rapid-acting therapeutic options for the most refractory depression patients.”

    1. Author Response

      Reviewer #2 (Public Review):

      This manuscript is clear in that it shows no/minimal weight gain in a mouse model of trisomy 21 compared to the control mouse, even under a high-calorie diet. The difference is the clear demonstration of the increased expression of sarcolipin. It is important that the expression of SERCA was also shown not different between the genotypes. Additionally, an important result is that manipulating the skeletal muscle was sufficient to promote weight loss without the need for hypermetabolism in other tissues such as adipose tissue.

      • A clear explanation of why the expression of sarcolipin/hypermetabolism is different between mouse and human under the same condition would be useful.

      Overexpression of sarcolipin is only seen in this particular mouse model carrying the near complete human chromosome 21. In another widely used mouse model (Ts65Dn) of Down syndrome where all the triplicated genes (~40% of the human Chr21 orthologs) are of mouse origin, we did not observe the same overexpression of sarcolipin (PMID: 36587842). The reason for this is presently unknown. Human Chr21 contains a significant number of non-coding human genes (>400) with uncertain effects on the mouse transcriptome. The data in Figure 8 represent our efforts to understand what drives the overexpression of the mouse sarcolipin (Sln) gene in the TcMAC21 mouse model. Although we narrowed it down and highlighted some potential candidate transcriptional drivers for Sln overexpression (Fig. 8), future work is clearly needed to establish whether any of those candidates is a bona fide driver.

      • p.12-13 and15. The language around 'futile' cycling is not correct because Ca movement through the sarcoplasmic reticulum of the resting fiber is essential to the function of the muscle. Firstly, the cycle of Ca through the SR is through the ryanodine receptor (RyR) as well as due to slippage through the SERCA (PMID: 11306667, PMID: 35311921). This is not made clear anywhere in the manuscript. Ca leak out of the SR through RyR is an essential component to the control/setting of the resting cytoplasmic [Ca2+] via the activation of store-operated Ca2+ entry, which is in a balance with the activation of the PMCA on the t-system membrane (PMID: 35218018). The SERCA resequesters the leaked Ca2+ from the SR. It is not possible that the resting [Ca2+] is set by the reduced efficiency of the SERCA, as indicated in the ms (PMID: 20709761). It is expected that the mito [Ca2+] steady state is set by the raised resting cyto [Ca2+] (PMID: 20709761). Ca2+ transients during EC coupling will promote transient increases in mito Ca2+ (PMID: 21795684, PMID: 36121378), but not steady-state increases. Some of these problems are highlighted by the errors in the diagram Fig 5D: please change/correct (i) the invagination of the sarcolemma is called the t-system; (ii) the cycle of Ca leak through the SR starts with RyR Ca leak, where the Ca is resequestered by the SERCA, in addition to Ca slippage through the pump. Draw a RyR opposite the t-system on the SR terminal cisternae. The heat generated by SERCA is absorbed in the cytoplasm, metabolites enter the mito and the OxPhos generates heat (PMID: 31346851). (iii) Ca does not enter mito because it cannot get into the SR (the resting cyto Ca is controlled by the t-system/plasma membrane, PMID: 20709761, PMID: 35218018). Please redraw.

      We have redrawn the diagram in Fig. 6D as suggested by the reviewer. We have also clarified the information presented in the revised Fig. 6D in the text and figure legend. Heat is generated by mitochondrial oxidative activity. In addition, ATP hydrolysis by the Ca2+ ATPase (SERCA pump) also generates heat (PMID: 12512777; PMID: 34826239; PMID: 11342561; PMID: 17018526; PMID: 12887329). In resting muscle, for every ATP hydrolyzed by the SERCA pump, two Ca2+ ions are transported into the sarcoplasmic reticulum (SR) (PMID: 15189143). In the presence of sarcolipin (SLN), more ATP needs to be hydrolyzed to move the same number of Ca2+ ions into the SR, due to Ca2+ slippage (PMID: 34826239; PMID: 23341466). In essence, ATP hydrolysis and Ca2+ transport into the SR by SERCA become uncoupled in the presence of SLN. This uncoupling of the SERCA pump, in the context of Ca2+ cycling in and out of the SR (also involving Ryr1), represents the ATP-consuming futile cycle in the skeletal muscle (PMID: 34741717). Since SLN is persistently overexpressed, the ATP-consuming futile activity of the SERCA pump is presumably happening in resting muscle, as well as during EC coupling (since the TcMAC21 mice are also hyperactive).

      • The changing of the properties of the muscle towards oxidative properties is consistent with the expression of sarcolipin in mouse muscle (all of it is in type II fibers). It is important to show whether the muscles have fiber-type shifts. Please report the fiber types of the muscles that have been surveyed in this project.

      In the qPCR data shown in Figure 6C, we profiled many genes associated with slow- and fast-twitch muscle fibers in gastrocnemius, and little if any change was noted. At least at the transcript level, there is no indication of fiber-type switching in gastrocnemius muscle. However, we did not perform the same qPCR analyses for the other muscle types isolated (i.e., EDL, quadriceps, plantaris, soleus, and tongue). The main reason is that we used all of these muscle tissues in our respirometry analysis, as shown in Figure 6O-Q and Figure 6-Figure Supplement 4-9. Unfortunately, we did not have any leftover muscle tissue with which to profile muscle fiber types.

      • Non-shivering thermogenesis (NST) is mentioned in this manuscript as the means of hypermetabolism, as has the lengthened duration of the cyto Ca transients during EC coupling. It is not clear at all what the contribution of NST compared to the increased work of the SERCA to clear released Ca from the cyto to the hypermetabolism. What are the relative proportions? If sarcolipin is largely for NST, then hypermetabolism is about the resting muscle.

      In our view, the hypermetabolism we observed in the TcMAC21 mice is primarily due to SLN-mediated uncoupling of the SERCA pump. Chronic SLN overexpression elevates ATP consumption by the SERCA pump and drives the catabolic process (i.e., increased mitochondrial OXPHOS) to generate the ATP needed to meet the demand created by the persistent uncoupling of the SERCA pump. However, the TcMAC21 mice are also hyperactive, and this can also contribute to increased metabolic rate. Since the mice are both hyperactive and hypermetabolic, we do not know the relative contribution of each to the overall phenotype of the mice.

      • The link that SLN is causing more ATP use at the pump but the heat generated by OxPhos in mito is important and should be made, see Barclays' work (eg. PMID: 31346851). A direct link between the SERCA function and mito function is occurring but I currently don't see one being made in the ms. This could be made clear in Fig 5D diagram.

      We have modified and clarified Figure 6D as suggested.

      p.22. "The reprogramming of glycolytic...elevated Ca transients...". The language is wrong here. Oxidative fibers do not have elevated Ca transients compared to glycolytic. The amplitude of Ca release is greater in glycolytic and the duration of the transient is longer in the oxidative (eg. PMID: 12813151).

      We have corrected this in the text and added the citation.

      • p.22. "as less calcium is being transported into the SR due to uncoupling of the SERCA pumps". The same amount of Ca is being transported, just at the expense of more ATP than would be the case in the absence of SLN. Otherwise, the SR Ca2+ content would not be at a steady state while the SR continuously leaks Ca2+.

      We have corrected this in the revised text. The incorrect statement has been deleted.

      • p.23. Tavi & Westerblad (PMID: 21911615) show how Ca transient amplitude and frequency signal in slow and fast twitch fibres. Here, we are not concerned with what is happening in myotubes, where the SR is less developed than in adult fibres.

      We did not use any myotubes in the present study. The myotube was mentioned in the context of discussing a published work (PMID: 30208317).

      Reviewer #3 (Public Review):

      Sarver et al., propose that TcMAC21 mice are hypermetabolic and that this is the cause of their reduced weight. Unfortunately, the developmental defects of TcMAC21 mice make this a challenging question to definitively answer. The authors claim that TcMAC21 mice are hypermetabolic due to a futile calcium cycling in skeletal muscle, which is caused by up-regulation of SLN. However, all of the data that would go into the energy balance equation (food intake, energy absorption, and energy expenditure) have been improperly analyzed. TcMAC21 pups are 8.5 g lighter than euploid littermates. The body weight data and images in Fig. 3A indicate that TcMAC21 mice runted. This difference is primarily a result of lower lean mass (FIG. 2B). This is important as it sets up many concerns that need to be addressed. Specific comments are noted below.

      There is no overt developmental defect in the TcMAC21 mice, as their birth weights are not different from those of the euploid controls (PMID: 32597754). A “runted” mouse is considered very small, poorly developed, and less competitive (PMID: 22822473). The lean phenotype of TcMAC21 mice is due to their hypermetabolism and not the result of developmental defects. The absolute lean mass of TcMAC21 mice is lower than that of the euploid controls. This is to be expected. A human being who weighs 150 pounds will have less lean mass than another person weighing 250 pounds. Lean mass scales with body weight. This does not mean that there is a muscle deficit in the person weighing 150 pounds. That is the reason why lean mass is also generally presented as % lean mass (after normalizing to body weight). This normalization can tell us whether the amount of lean mass is appropriate (or normal) for a given weight. The % lean mass is either not different between TcMAC21 and euploid mice fed a control chow (Fig. 2B) or significantly higher in TcMAC21 mice fed a high-fat diet (Fig. 3B). This tells us that there is no developmental deficit in the skeletal muscle (the biggest contributor to lean mass) of TcMAC21 mice. The amount of lean mass seen in TcMAC21 mice scales appropriately with their lower body weight. Our food intake and energy absorption experiments were correctly done and analyzed (addressed below). In fact, TcMAC21 mice have the same or slightly higher food intake (absolute amount without normalization) despite weighing much less than the euploid controls (Fig. 2C and Fig. 3A, and Supplementary File 2 and Supplementary File 5). A sick or runted mouse generally consumes much less food and is physically much less active. The TcMAC21 mice are actually hyperactive (Fig. 2D-F and Fig. 4D-F). All our data argue against the notion of “runting” or “developmental defects” in TcMAC21 mice, and instead support our conclusion that TcMAC21 mice are lean due to elevated activity and hypermetabolism.

      Specific comments:

      1) It is incorrect to normalize EE to lean mass if this parameter is different between groups. Normalizing the EE data to lean mass makes it appear as though TcMAC21 mice exhibited increased EE when in fact this is a mathematical artefact. EE data should simply be plotted as ml/h (or kcal/h) per mouse. Alternatively, ANCOVA can be applied using lean mass as a covariate. Excellent reviews on this topic have been written (PMID: 20103710; PMID: 22205519).

      Energy expenditure (EE) data should not be plotted as kcal/h per mouse, as indicated in the review article that the reviewer alluded to (PMID: 22205519). It is a given that EE increases as a function of body weight, as a larger body mass requires more energy to maintain. Plotting EE data per mouse (i.e., kcal/h) would lead to the erroneous conclusion that a fat mouse has a higher EE than a lean mouse. Because lean mass is metabolically much more active than fat mass, normalizing EE data to lean mass is an acceptable, although not ideal, way to plot EE data, as indicated by the review article the reviewer alluded to (PMID: 20103710). Oftentimes, normalizing EE to lean mass gives results similar to those of ANCOVA, as pointed out by the authors (PMID: 22205519). However, both review articles recommend ANCOVA (using body mass as a covariate of EE) as the preferred method to plot and evaluate EE data. Alongside the EE data (normalized to lean mass), we have now also included the ANCOVA data (Fig. 2D-F and Fig. 4D-F), where we used body weight as a covariate as recommended (PMID: 22205519). The results clearly indicate that the TcMAC21 mice have significantly higher EE compared to the euploid controls.
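      As an illustration of what an ANCOVA-style adjustment does, the following minimal sketch computes group EE means adjusted to a common body weight using a pooled within-group slope. The function name and all numbers are hypothetical and chosen only to make the arithmetic easy to follow. It is not the analysis pipeline used for the figures, and it does not compute a p-value.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Classical one-covariate ANCOVA adjustment with a common (pooled) slope.

    `groups` maps a group label to a list of (covariate, response) pairs,
    for example (body weight in g, EE in kcal/h).
    Returns (pooled_slope, {label: response mean adjusted to the grand
    mean of the covariate}).
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = mean(all_x)

    sxx = 0.0  # pooled within-group sum of squares of the covariate
    sxy = 0.0  # pooled within-group sum of cross-products
    group_means = {}
    for label, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_means[label] = (mx, my)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)

    slope = sxy / sxx
    adjusted = {
        label: my - slope * (mx - grand_x)
        for label, (mx, my) in group_means.items()
    }
    return slope, adjusted


# Hypothetical (body weight g, EE kcal/h) values, NOT data from the study.
example = {
    "euploid": [(30, 0.50), (32, 0.52), (34, 0.54)],
    "TcMAC21": [(22, 0.52), (24, 0.54), (26, 0.56)],
}
slope, adjusted = ancova_adjusted_means(example)
print(slope, adjusted)
```

      In this toy example the lighter group has only slightly higher raw EE per mouse, but once both groups are compared at the same body weight the difference becomes larger. A full ANCOVA would additionally test the group effect and check that the slopes are homogeneous across groups.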

      2) It makes no sense to normalize food intake to weight, as it makes no sense to divide metabolic rate by weight as well (see above). If food intake is not normalized, this will clearly show that TcMAC21 mice eat much less than controls, and if plotted as cumulative food intake will show that TcMAC21 are smaller and gain less weight on a high-fat diet because they simply eat less. This further indicates that the major tenet of this paper is not correct.

      It is expected that a smaller mouse will eat less food than a bigger mouse. Normalizing food intake to body weight can tell you whether the amount of food intake is appropriate (or normal) for a given weight. Strikingly, despite a much lower body weight, ad libitum fed TcMAC21 mice consumed the same or a slightly higher absolute amount of food, without normalizing the data to body weight (Fig. 2C and Fig. 4A and Supplementary File 2 for the chow-fed group and Supplementary File 5 for the HFD-fed group). In fact, the absolute food intake (without normalization) in the refeeding period, after a fast, was significantly higher in the TcMAC21 mice relative to euploid controls (17.7 ± 0.082 vs. 13 ± 0.87 kcal, P = 0.002; Supplementary File 5). Thus, relative to their body weight, ad libitum fed TcMAC21 mice consumed significantly more calories (Fig. 2C and Fig. 4A). For transparency, we chose to show the absolute and relative food intake data side by side. These results, along with the rest of the data, provide compelling evidence that hypermetabolism, and not reduced food intake, underlies the lean phenotype of the TcMAC21 mice.

      3) The authors have tried to address the smaller weight of TcMAC21 mice by including weight-matched wild-type mice. However, they only focus on analyzing surface temperature, which is not an indicator of thermogenesis. Moreover, there is no information on whether these weight-matched wild-type mice are similar in age or body composition to the TcMAC21 mice. Nevertheless, the increased surface temperature can also indicate increased heat conservation, which is opposite to thermogenesis. It would make sense that TcMAC21 mice with massive reductions in lean mass would activate compensatory mechanisms of heat conservation to offset increased heat dissipation to the environment. This does seem to be the case, based on the data shown in Fig. 6D (see below).

      Skin temperature has been widely and extensively used as a proxy for thermogenesis, often in association with thermogenesis of brown adipose tissue (BAT), which is located just deep to the skin over the shoulder blades of the mouse. Mice fed a high-fat diet lose the “brownness” of their brown adipose tissue as excess circulating lipid is stored in this depot. This is a well-known phenomenon. One can see this clearly in Figure 4K, where the euploid BAT has accumulated a significant amount of lipid while the TcMAC21 BAT has not. The weight-matched mice were added solely to help indicate whether or not the BAT was a major contributor to the TcMAC21 hypermetabolic phenotype.

      We did not conduct body composition analysis on the weight-matched mice. With a body weight of less than 30 grams, these wild-type mice represent a similarly lean and healthy adult mouse. They are not age-matched (the control mice are younger) because this is not possible. A wild-type mouse of the same age of TcMAC21 (already on high-fat diet for 12 weeks or longer) will weigh significantly more than the TcMAC21, just as the age-matched euploid littermates weighed significantly more than the TcMAC21 mice.

      The idea of heat conservation is possible, but our data clearly indicate the TcMAC21 mice have elevated thermogenesis. The supporting data include: 1) increased deep colonic temperature; 2) activation of oxidative and thermogenic gene program in skeletal muscle; 3) overexpression of sarcolipin in the skeletal muscle, leading to futile SERCA pump activity and heat generation; 4) Increased skeletal muscle mitochondrial respiration; 5) elevated T3 levels; 6) increased physical activity level; 7) increased energy expenditure (EE normalized to lean mass or ANCOVA using body weight as a covariate). Taken together, these data provide compelling evidence to support our conclusion that the TcMAC21 mice are indeed hypermetabolic and have elevated thermogenesis.

      4) A more optimal method of testing whether increased heat dissipation plays a role in the EE of TcMAC21 mice, is to measure EE at thermoneutrality, where energy dissipation to the environment will be minimized. Here the authors have attempted this in Fig. 6D. Unfortunately, the authors normalized EE to lean mass, artefactually elevating TcMAC21 EE. Despite this mistake, it now looks as though the large differences in EE that were seen at room temp have been attenuated, and only significantly limited to the dark phase. This indicates that in addition to the normalization artefact, higher heat dissipation from smaller TcMAC21 mice may also contribute to the elevated EE at 22C.

      It is well known that mice markedly reduce their EE at thermoneutrality. Therefore, it is not surprising that the TcMAC21 mice housed at thermoneutrality have lower EE than the TcMAC21 mice housed at room temperature. This also holds true for the euploid controls. This is to be expected. Yet, remarkably, the TcMAC21 mice still have significantly higher EE than the euploid controls when housed at thermoneutrality. The TcMAC21 mice never reduce their EE to the level of the euploid controls. We have now included the ANCOVA data for EE using body weight as a covariate as recommended (PMID: 22205519) (Fig. 7F). The results clearly indicate that the TcMAC21 mice have significantly higher EE compared to euploid controls even at thermoneutrality. The data obtained at thermoneutrality, as well as the body weight-matched control experiment shown in Figure 4I, argue against heat dissipation as the driver of increased EE. Instead, our data support hyperactivity and hypermetabolism as the driver of increased EE.

      5) In Fig. 6D, why is the hourly plot not shown here (like 2D and 4C)? The data clearly are not as striking as the EE data at 22C?

      Because of space limitations in Figure 7, we did not include the hourly tracing data and instead showed the overall energy expenditure (EE) during the light and dark cycles as bar graphs. Per the reviewer's request, we have now included the hourly tracing data in Fig. 7F, along with the ANCOVA data. The data clearly indicate that TcMAC21 mice, housed at thermoneutrality, have higher EE, especially in the dark cycle when they are active. This is quite remarkable. We know from many published studies that mice significantly reduce their EE when housed at thermoneutrality. And yet, the TcMAC21 mice never reduce their EE to the level of euploid controls when housed at thermoneutrality.

      6) GTT was similar between TcMAC21 and controls (Fig. 3I). However, the smaller insulin response could be due to the fact that glucose was normalized to body weight. It would be better to normalize to lean mass, since that is different as well, or simply give all mice the same amount of glucose that the control group receives since this is how it is done in humans.

      Dosing the glucose injection in a GTT by mouse body weight is widely and extensively practiced across the metabolic community. The TcMAC21 mice are markedly more insulin sensitive, as supported by multiple independent lines of evidence: 1) Overnight fasting blood glucose and insulin levels are significantly lower in TcMAC21 mice relative to euploid controls (Figure 3G). 2) Insulin tolerance tests clearly indicate a substantial improvement in insulin sensitivity in TcMAC21 mice even though the insulin dose injected was much smaller (i.e., insulin dose was based on body weight) (Figure 3K). 3) The insulin response during refeeding, after an overnight fast, is dramatically lower even though the refeeding blood glucose levels rise to the same levels as in the euploid controls (Fig. 3L-M). This is similar to the GTT data, where the rate of glucose clearance in TcMAC21 mice is the same as in the euploid controls despite a dramatically lower insulin response (Fig. 3I-J). Taken together, these data clearly indicate a markedly heightened insulin sensitivity in TcMAC21 mice relative to euploid controls.

      7) The fecal energy in Fig. 4B only measures the concentration of energy per gram of feces. However, this analysis has failed to take into account total fecal excretion, which should be multiplied by the energy density of the feces. Thus, these data are incomplete and not sufficient to exclude absorption differences between the groups. It is also curious that, if all other metabolic measurements (even though wrong), such as food intake and EE, are normalized to body weight, the authors have not normalized the feces data to body weight. Is this because doing so would show a massive elevation in fecal energy in TcMAC21 mice and thus falsify their hypothesis?

      The fecal data the reviewer requested was originally in the supplemental figure section. We have now moved these data to the main figure to ensure that this will not be missed by any reader. As indicated in the text and in Fig. 4B, TcMAC21 mice fed a HFD show no difference in fecal frequency (movements/day), fecal weight (g/movement), fecal energy composition (cal/g) and total fecal energy (kcal/day). These data clearly indicate that the fecal energy content is not different between TcMAC21 and euploid mice. These results, along with the rest of the data in the paper, provide compelling evidence that hypermetabolism, and not reduced nutrient absorption in the gut, underlies the lean phenotype and resistance of TcMAC21 mice to weight gain when fed a high-fat diet.
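      For clarity, the relationship between the four fecal measures listed above can be sketched as follows. The function name and the numbers are illustrative assumptions only and are not values from the study; the units follow those quoted in the response (movements/day, g/movement, cal/g, kcal/day).

```python
def total_fecal_energy_kcal_per_day(movements_per_day, grams_per_movement, cal_per_gram):
    """Total daily fecal energy = frequency x mass per movement x energy density.

    The energy density is given in cal/g, so the product is in cal/day
    and is divided by 1000 to express it in kcal/day.
    """
    cal_per_day = movements_per_day * grams_per_movement * cal_per_gram
    return cal_per_day / 1000.0


# Hypothetical inputs for illustration only (not measured values).
print(total_fecal_energy_kcal_per_day(50, 0.02, 3500))
```

      Because total fecal energy is the product of the three measured quantities, two groups that do not differ in any of them will not differ in total daily fecal energy either.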

      8) I cannot find any indication of sample size in any of the EE experiments, aside from the bar graph in Fig. 6D. In any case, this experiment has only an n=4 to 5 per group. This is an extremely small number for these types of experiments, so how can the authors be sure of reproducibility with such a low sample size? Are all of the other EE experiments also of similarly small sample sizes?

      Sample sizes for all EE experiments were clearly indicated in the original text, figure legends, and figures themselves, as well as in all supplemental figures and Supplementary files. In addition, for transparency, we always include individual data points, whenever possible, in all our data figures. The experiments were sufficiently powered (n = 8-9 per genotype) and the effect size was large. Sample sizes for all thermoneutral experiments were lower than for the chow-fed and HFD-fed experiments because these mice are hard to breed and in limited supply.

    1. Author Response

      Reviewer #1 (Public Review):

      How morphogens spread within tissues remains an important question in developmental biology. Here the authors revisit the role of glypicans in the formation of the Dpp gradient in wing imaginal discs of Drosophila. They first use sophisticated genome engineering to demonstrate that the two glypicans of Drosophila are not equivalent despite being redundant for viability. They show that Dally is the relevant glypican for Dpp gradient formation. They then provide genetic evidence that, surprisingly, the core domain of Dally suffices to trap Dpp at the cell surface (suggesting a minor role for GAGs). They conclude with a model that Dally modulates the range of Dpp signaling by interfering with Dpp's degradation by Tkv. These are important conclusions, but more independent (biochemical/cell biological) evidence is needed.

      As indicated above, the genetic evidence for the predominant role of Dally in Dpp protein/signalling gradient formation is strong. In passing, the authors could discuss why overexpressed Dlp has a negative effect on signaling, especially in the anterior compartment. The authors then move on to determine the role of GAG (=HS) chains of Dally. They find that in an overexpression assay, Dally lacking GAGs traps Dpp at the cell surface and, counterintuitively, suppresses signaling (fig 4 C, F). Both findings are unexpected and therefore require further validation and clarification, as outlined in a and b below.

      a) In loss of function experiments (dallyDeltaHS replacing endogenous dally), Dpp protein is markedly reduced (fig 4R), as much as in the KO (panel Q), suggesting that GAG chains do contribute to trapping Dpp at the cell surface. This is all the more significant given that, according to the overexpression assays, DallyDeltaHS seems more stable than WT Dally (by the way, this difference should also be assessed in the knock-ins, which is possible since they are YFP-tagged). The authors acknowledge that HS chains of Dally are critical for Dpp distribution (and signaling) under physiological conditions. If this is true, one can wonder why overexpressed dally core 'binds' Dpp and whether this is a physiologically relevant activity.

      According to the overexpression assay, DallyDeltaHS seems more stable than WT Dally (Fig. 4B’, E’, 5H, I). As the reviewer suggested, we addressed the difference using the two knock-in alleles and found that DallyDeltaHS is more stable than WT Dally (Fig.4 L, M inset), further emphasizing the insufficient role of core protein of Dally for extracellular Dpp distribution.

      (While revising our figure, we found a labeling mistake in Fig. 4M, N and Fig. 4Q, R and corrected the genotypes.)

      In summary, we showed that, although Dally interacts with Dpp mainly through its core protein from the overexpression assay (Fig. 4E, I), HS chains are essential for extracellular Dpp distribution (Fig. 4R). Thus, the core protein of Dally alone is not sufficient for extracellular Dpp distribution under physiological conditions. These results raise a question about whether the interaction of core protein of Dally with Dpp is physiologically relevant. Since the increase of HS upon dally expression but not upon dlp expression resulted in the accumulation of extracellular Dpp (Fig. 2) and this accumulation was mainly through the core protein of Dally (Fig. 4E, I), we speculate that the interaction of the core protein of Dally with Dpp gives ligand specificity to Dally under physiological conditions.

      To understand the importance of the interaction of core protein of Dally with Dpp under physiological conditions, it is important to identify a region responsible for the interaction. Our preliminary results overexpressing a dally mutant lacking the majority of core protein (but keeping the HS modified region intact) showed that HS chains modification was also lost. Although this is consistent with our results that enzymes adding HS chains also interact with the core protein of Dally (Fig. 4D), the dally mutant allele lacking the core protein would hamper us from distinguishing the role of core protein of Dally from HS chains.

      Nevertheless, we can infer the importance of the interaction of core protein of Dally with Dpp using dally[3xHA-dlp, attP] allele, where dlp is expressed in dally expressing cells. Since Dally-like is modified by HS chains but does not interact with Dpp (Fig. 2, 4), dally[3xHA-dlp, attP] allele mimics a dally allele where HS chains are properly added but interaction of core protein with Dpp is lost. As we showed in Fig.3O, S, the allele could not rescue dallyKO phenotypes, consistent with the idea that interaction of core protein of Dally with Dpp is essential for Dpp distribution and signaling and HS chain alone is not sufficient for Dpp distribution.

      b) The authors infer that dallycore (at least if overexpressed) can bind Dpp. This assertion needs independent validation by a biochemical assay, ideally with surface plasmon resonance or similar so that an affinity can be estimated. I understand that this will require a method that is outside the authors' core expertise but there is no reason why they could not approach a collaborator for such a common technique. In vitro binding data is, in my view, essential.

      We agree with the reviewer that a biochemical assay such as SPR helps us characterize the interaction of core protein of Dally and Dpp (if the interaction is direct), although the biochemical assay also would not demonstrate the interaction under the physiological conditions.

      However, SPR has never been applied in the case of Dpp, probably because functional refolded Dpp dimer purified from bacteria has been found to be stable only at low pH and to precipitate in buffer at normal pH (Groppe J, et al., 1998)(Matsuda et al., 2021). As the reviewer suggests, collaborating with experts is an important step for the future.

      Nevertheless, SPR has been applied to the interaction between BMP4 and Dally (Kirkpatrick et al., 2006), probably because BMP4 is more stable in normal buffer. Although the binding affinity was not calculated, SPR showed that BMP4 directly binds to Dally and that this interaction was only partially inhibited by a molar excess of exogenous HS, suggesting that BMP4 can interact with the core protein of Dally as well as with its HS chains. In addition, the same study performed Co-IP experiments using S2 cell lysates and showed that Dpp and the core protein of Dally are co-immunoprecipitated, although this does not demonstrate whether the interaction is direct.

      In a subsequent set of experiments, the authors assess the activity of a form of Dpp that is expected not to bind GAGs (DppDeltaN). Overexpression assays show that this protein is trapped by DallyWT but not dallyDeltaHS. This is a good first step validation of the deltaN mutation, although, as before, an in vitro binding assay would be preferable.

      Our overexpression assays actually showed that DppDeltaN is trapped by DallyWT and by dallyDeltaHS at similar levels (Fig. 5H-J), indicating that interaction of DppDeltaN and HS chains of Dally is largely lost but DppDeltaN can still interact with core protein of Dally.

      (Related to this, we found a typo in the sentence “In contrast, the relative DppΔN accumulation upon DallyΔHS expression in JAX;dppΔN was comparable to that upon DallyΔHS expression in JAX;dppΔN (Fig. 5H-J).” and corrected it as follows: “In contrast, the relative DppΔN accumulation upon Dally expression in JAX;dppΔN was comparable to that upon DallyΔHS expression in JAX;dppΔN (Fig. 5H-J).”)

      We thank the reviewer for suggesting the in vitro experiment. Although we decided not to develop biophysical experiments such as SPR for Dpp in this study, for the reasons discussed above, we would like to point out that our result is consistent with a previous Co-IP experiment using S2 cells showing that DppDeltaN loses interaction with heparin (Akiyama2008).

      However, in contrast to our results, the same study also proposed, based on Co-IP experiments using S2 cells, that DppDeltaN loses interaction with Dally (Akiyama2008). It is hard to draw a conclusion from those data, since the western blot was saturated and lacked loading controls and normalization (Fig. 1C in Akiyama 2008), and negative in vitro results do not necessarily demonstrate a lack of interaction in vivo. One explanation for why the interaction was missed in the previous study is that some factors required for the interaction of DppDeltaN with the core protein of Dally are missing in S2 cells. In this case, the in vivo interaction assay we used in this study has the advantage of robustly detecting the interaction.

      Nevertheless, the authors show that DppDeltaN is surprisingly active in a knock-in strain. At face value (assuming that DeltaN fully abrogates binding to GAGs), this suggests that interaction of Dpp with the GAG chains of Dally is not required for signaling activity. This leads the authors to suggest (as shown in their final model) that GAG chains could be involved in mediating the interactions of Dally with Tkv (and not with Dpp). This is an interesting idea, which would need to be reconciled with the observation that the distribution of Dpp is affected in dallyDeltaHS knock-ins (item a above). It would also be strengthened by biochemical data (although more technically challenging than the experiments suggested above). In an attempt to determine the role of Dally (GAGs in particular) in the signaling gradient, the paper next addresses its relation to Tkv. They first show that reducing Tkv leads to Dpp accumulation at the cell surface, a clear indication that Tkv normally contributes to the degradation of Dpp. From this they suggest that Tkv could be required for Dpp internalisation although this is not shown directly. The authors then show that a Dpp gradient still forms upon double knockdown (Dally and Tkv). This intriguing observation shows that Dally is not strictly required for the spread of Dpp, an important conclusion that is compatible with early work by Lander suggesting that Dpp spreads by free diffusion. These results show that Dally is required for gradient formation only when Tkv is present. They suggest therefore that Dally prevents Tkv-mediated internalisation of Dpp. Although this is a reasonable inference, internalisation assays (e.g. with anti-Ollas or anti-HA Ab) would strengthen the authors' conclusions especially because they contradict a recent paper from the Gonzalez-Gaitan lab.

      Thanks for suggesting the internalization assay. As we discussed in the discussion, our results suggest that extracellular Dpp distribution is severely reduced in dally mutants due to Tkv mediated internalization of Dpp (Fig. 6). Thus, extracellular Dpp available for labelling with nanobody is severely reduced in dally mutants, which can explain the reduced internalization of Dpp in dally mutants in the internalization assay. Therefore, we think that the nanobody internalization assay would not distinguish the two contradicting possibilities.

      The paper ends with a model suggesting that HS chains have a dual function of suppressing Tkv internalisation and stimulating signaling. This constitutes a novel view of a glypican's mode of action and possibly an important contribution of this paper. As indicated above, further experiments could considerably strengthen the conclusion. Speculation on how the authors imagine that GAG chains have these activities would also be warranted.

      Thank you very much!

      Reviewer #2 (Public Review):

      The authors are trying to distinguish between four models of the role of glypicans (HSPGs) on the Dpp/BMP gradient in the Drosophila wing, schematized in Fig. 1: (1) "Restricted diffusion" (HSPGs transport Dpp via repetitive interaction of HS chains with Dpp); (2) "Hindered diffusion" (HSPGs hinder Dpp spreading via reversible interaction of HS chains with Dpp); (3) "Stabilization" (HSPGs stabilize Dpp on the cell surface via reversible interaction of HS chains with Dpp that antagonizes Tkv-mediated Dpp internalization); and (4) "Recycling" (HSPGs internalize and recycle Dpp).

      To distinguish between these models, the authors generate new alleles for the glypicans Dally and Dally-like protein (Dlp) and for Dpp: a Dally knock-out allele, a Dally YFP-tagged allele, a Dally knock-out allele with 3HA-Dlp, a Dlp knock-out allele, a Dlp allele containing 3-HA tags, and a Dpp lacking the HS-interacting domain. Additionally, they use an OLLAS-tag Dpp (OLLAS being an epitope tag against which extremely high affinity antibodies exist). They examine OLLAS-Dpp or HA-Dpp distribution, phospho-Mad staining, adult wing size.

      They find that over-expressed Dally - but not Dlp - expands Dpp distribution in the larval wing disc. They find that the Dally[KO] allele behaves like a Dally strong hypomorph Dally[MH32]. The Dally[KO] - but not the Dlp[KO] - caused reduced pMad in both anterior and posterior domains and reduced adult wing size (particularly in the Anterior-Posterior axis). These defects can be substantially corrected by supplying an endogenously tagged YFP-tagged Dally. By contrast, they were not rescued when a 3xHA Dlp was inserted in the Dally locus. These results support their conclusion that Dpp interacts with Dally but not Dlp.

      They next wanted to determine the relative contributions of the Dally core or the HS chains to the Dpp distribution. To test this, they over-expressed UAS-Dally or UAS-Dally[deltaHS] (lacking the HS chains) in the dorsal wing. Dally[deltaHS] over-expression increased the distribution of OLLAS-Dpp but caused a reduction in pMad. Then they write that after they normalize for expression levels, they find that Dally[deltaHS] only mildly reduces pMad and this result indicates a major contribution of the Dally core protein to Dpp stability.

      Thanks for the comments. We actually showed that compared with Dally overexpression, Dally[deltaHS] overexpression only mildly reduces extracellular Dpp accumulation (Fig. 4I). This indicates a major contribution of the Dally core protein to interaction with Dpp, although the interaction is not sufficient to sustain extracellular Dpp distribution and signaling gradient.

      The "normalization" is a key part of this model, but how the normalization was done is not mentioned. When they do the critical experiment, making the Dally[deltaHS] allele, they find that loss of the HS chains is nearly as severe as total loss of Dally (i.e., Dally[KO]). Additionally, experimental approaches are needed here to prove the role of the Dally core.

      Since the expression level of Dally[deltaHS] is higher than that of Dally when overexpressed, we normalized extracellular Dpp distribution (a-Ollas staining) against the GFP fluorescent signal (Dally or Dally[deltaHS]). To do this, we first extracted both signals along the A-P axis from the same ROI. The ratio was calculated by dividing the intensity of a-Ollas staining by the intensity of the GFP fluorescent signal at a given position x. The average of the normalized profiles was then generated and plotted using the script described in the methods (wingdisc_comparison.py), as for the other pMad or extracellular staining profiles.

      Although this analysis provides normalized extracellular Dpp accumulation at different positions along the A-P axis, we are more interested in the total amount of Dpp or DppDeltaN accumulation upon Dally or dallyDeltaHS expression. Therefore, we plan to analyze the total amount of Dpp normalized against the GFP fluorescent signal (Dally or Dally[deltaHS]) in the revised ms. In this case, normalization will be performed by dividing the total signal intensity of extracellular Dpp staining (ExOllas staining) by the total GFP fluorescent signal (Dally or Dally[deltaHS]) in the ROI of each wing disc.
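
      The two normalizations described above can be sketched as follows. This is only a minimal illustration, not the wingdisc_comparison.py script itself: the function names, the plain-list inputs, and the choice to return NaN where the GFP signal is zero are all assumptions made for the example.

```python
from statistics import fmean


def positional_ratio(ollas_profile, gfp_profile):
    """Ratio of a-Ollas to GFP intensity at each position x along the A-P axis.

    Both profiles are assumed to be sampled from the same ROI, so they must
    have the same length. Positions with no GFP signal yield NaN (assumption).
    """
    if len(ollas_profile) != len(gfp_profile):
        raise ValueError("profiles must be extracted from the same ROI")
    return [
        ollas / gfp if gfp > 0 else float("nan")
        for ollas, gfp in zip(ollas_profile, gfp_profile)
    ]


def total_ratio(ollas_roi, gfp_roi):
    """Total a-Ollas intensity in the ROI divided by total GFP intensity."""
    gfp_total = sum(gfp_roi)
    if gfp_total <= 0:
        raise ValueError("GFP signal in the ROI must be positive")
    return sum(ollas_roi) / gfp_total


def mean_profile(profiles):
    """Average several normalized profiles (one per disc) position by position."""
    return [fmean(values) for values in zip(*profiles)]


# Hypothetical intensities for two wing discs, for illustration only.
disc_a = positional_ratio([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
disc_b = positional_ratio([3.0, 3.0, 3.0], [1.0, 1.0, 1.0])
average = mean_profile([disc_a, disc_b])
per_disc_total = total_ratio([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```

      The positional ratio keeps the shape of the gradient along the A-P axis, whereas the total ratio gives a single value per wing disc, which is easier to compare statistically between genotypes.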

      We agree with the reviewer that additional experimental approaches are needed to address the role of the core protein of Dally. As we discussed in the response to Reviewer #1, to understand the importance of the interaction of the core protein of Dally with Dpp, it is important to identify a region responsible for the interaction. Our preliminary results overexpressing a dally mutant lacking the majority of the core protein (but keeping the HS-modified region intact) showed that HS chain modification was also lost. Although this is consistent with our results that enzymes adding HS chains also interact with the core protein of Dally (Fig. 4D), a dally mutant allele lacking the core protein would prevent us from distinguishing the role of the core protein of Dally from that of the HS chains.

      Nevertheless, we can infer the importance of the interaction of core protein of Dally with Dpp using dally[3xHA-dlp, attP] allele, where dlp is expressed in dally expressing cells. Since Dally-like is modified by HS chains but does not interact with Dpp (Fig. 2, 4), dally[3xHA-dlp, attP] allele mimics a dally allele where HS chains are properly added but interaction of core protein with Dpp is lost. As we showed in Fig.3O, S, the allele could not rescue dallyKO phenotypes, consistent with the idea that interaction of core protein of Dally with Dpp is essential for Dpp distribution and signaling.

      Prior work has shown that a stretch of 7 amino acids in the Dpp N-terminal domain is required to interact with heparin but not with Dpp receptors (Akiyama, 2008). The authors generated an HA-tagged Dpp allele lacking these residues (HA-dpp[deltaN]). It is an embryonic lethal allele, but they can get some animals to survive to larval stages if they also supply a transgene called “JAX” containing dpp regulatory sequences. In the JAX; HA-dpp[deltaN] mutant background, they find that the distribution and signaling of this Dpp molecule is largely normal. While over-expressed Dally can increase the distribution of HA-dpp[deltaN], over-expression of Dally[deltaHS] cannot. These latter results support the model that the HS chains in Dally are required for Dpp function but not because of a direct interaction with Dpp.

      Our overexpression assays actually showed that both Dally and Dally[deltaHS] can accumulate Dpp upon overexpression and that the accumulation of Dpp is comparable after normalization (Fig. 5H-J), consistent with the idea that the interaction of DppdeltaN with HS chains is largely lost. As the reviewer pointed out, these results support the model that the HS chains in Dally are required for Dpp function but not because of a direct interaction with Dpp.

      In the last part of the results, they attempt to determine if the Dpp receptor Thickveins (Tkv) is required for Dally-HS chains interaction. The 2008 (Akiyama) model posits that Tkv activates pMad downstream of Dpp and also internalizes and degrades Dpp. A 2022 (Romanova-Michaelides) model proposes that Dally (not Tkv) internalizes Dpp.

      To distinguish between these models, the authors deplete Tkv from the dorsal compartment of the wing disc and found that extracellular Dpp increased and expanded in that domain. These results support the model that Tkv is required to internalize Dpp.

      They then tested the model that Dally antagonizes Tkv-mediated Dpp internalization by determining whether the defective extracellular Dpp distribution in Dally[KO] mutants could be rescued by depleting Tkv. Extracellular Dpp did increase in the D vs V compartment, potentially providing some support for their model. However, no statistics were performed, and these are needed for full confidence in the results. The lack of statistics is particularly problematic (1) when they state that extracellular Dpp does not rise in ap>tkv RNAi vs ap>tkv RNAi, dally[KO] wing discs (Fig. 6E) or (2) when they state that the extracellular Dpp gradient expanded in the dorsal compartment when tkv was dorsally depleted in dally[deltaHS] mutants (Fig. 6I). These last two experiments are important for their model but the differences are assessed only visually. In fact, extracellular Dpp in ap>tkv RNAi, dally[KO] (Fig. 6B) appears to be lower than extracellular Dpp in ap>tkv RNAi (Fig. 6A) and the histogram of Dpp in ap>tkv RNAi, dally[KO] is actually a bit lower than Dpp in ap>tkv RNAi, but the authors claim that there is no difference between the two. Their conclusion would be strengthened by statistical analyses of the two lines.

      We will provide the statistical analyses in the revised ms.

      Strengths:

      1) New genomically-engineered alleles

      A considerable strength of the study is the generation and characterization of new Dally, Dlp and Dpp alleles. These reagents will be of great use to the field.

      Thanks. We hope that these resources are indeed useful to the field.

      2) Surveying multiple phenotypes

      The authors survey numerous parameters (Dpp distribution, Dpp signaling (pMad) and adult wing phenotypes) which provides many points of analysis.

      Thanks!

      Weaknesses:

      1) Confusing discussion regarding the Dally core vs HS in Dpp stability. They don't provide any measurements or information on how they "normalize" for the level of Dally vs Dally[deltaHS]. This is an important part of their model that currently is not supported by any measurements.

      We explained how we normalized in the above section. We will update the analysis in the revised ms.

      2) Lacking quantifications and statistical analyses:

      a) Why is statistical significance for the histograms (pMad and Dpp distribution) not supplied? These histograms provide the key results supporting the authors' conclusions but no statistical tests/results are presented. This is a pervasive shortcoming in the current study.

      Thanks. We will provide statistics in the revised ms.

      b) dpp[deltaN] with JAX transgene - it would strengthen the study to supply quantitative data on the percent survival/lethal stage of dpp[deltaN] mutants with or without the JAX transgene.

      In this study, we are interested in the role of dpp[deltaN] during the wing disc development. Therefore, we decided not to perform the detailed analysis on the percent survival/lethal stage of dpp[deltaN] mutants with or without the JAX transgene in the current study. Nevertheless, the fact that dpp[deltaN] allele is maintained with a balanced stock and JAX;dpp[deltaN] allele can be maintained as homozygous stock indicates that the lethality of dpp[deltaN] allele comes from the early stages. Indeed, our preliminary results showed that pMad signal is severely lost in the dpp[deltaN] embryo without JAX (data not shown), indicating that the allele is lethal at early embryonic stages.

      c) The graphs on wing size etc should start at zero.

      Thanks. We corrected this in the current ms.

      d) The sizes of histograms and graphs in each figure should be increased so that the reader can properly assess them. Currently, they are very small.

      Thanks. We changed the sizes in the current ms.

      The authors' model is that Dally (not Dlp) is required for Dpp distribution and signaling but that this is not due to a direct interaction with Dpp. Rather, they posit that Dally-HS antagonize Tkv-mediated Dpp internalization. Currently the results of the experiments could be considered consistent with their model, but as noted above, the lack of statistical analyses of some parameters is a weakness.

      Thanks. We will perform the statistical analyses in the revised ms.

      One problematic part of their result for me is the role of the Dally core protein (Fig. 7B). There is a mis-match between the over-expression results and Dally allele lacking HS (but containing the core). Finally, their results support the idea that one or more as-yet unidentified proteins interact with Dally-HS chains to control Dpp distribution and signaling in the wing disc.

      Our results simply suggest that Dpp can interact with Dally mainly through core protein but this interaction is not sufficient to sustain extracellular Dpp gradient formation under physiological conditions (dallyDeltaHS) (Fig. 4Q). We find that the mis-match is not problematic if the role of Dally is not simply mediated through interaction with Dpp. We speculate that interaction of Dpp and core protein of Dally is transient and not sufficient to sustain the Dpp gradient without HS chains of Dally stabilizing extracellular Dpp distribution by blocking Tkv-mediated Dpp internalization.

      There is much debate and controversy in the Dpp morphogen field. The generation of new, high quality alleles in this study will be useful to the Drosophila community, and the results of this study support the concept that Tkv but not Dally regulates Dpp internalization. Thus the work could be impactful and fuel new debates among morphogen researchers.

      Thanks.

      The manuscript is currently written in a manner that really is only accessible to researchers who work on the Dpp gradient. It would be very helpful for the authors to re-write the manuscript and carefully explain in each section of the results (1) the exact question that will be asked, (2) the prior work on the topic, (3) the precise experiment that will be done, and (4) the predicted results. This would make the study more accessible to developmental biologists outside of the morphogen gradient and Drosophila communities.

      Thanks. We will modify our texts to help non-experts understand our story in the revised ms.

    1. Author Response

      Reviewer #2 (Public Review):

      Major points:

      1). This study does not provide any evidence about the cell death of the transplanted cells. The immunostaining of the Caspase-3 or TUNEL staining should be used to address this issue.

      We have conducted immunostaining of Caspase-3 at 7 days after transplantation using the human-specific STEM121 antibody to demonstrate the transplanted cells. We have added the results to Figure 3A and modified the text accordingly (Page 8, Line 156-165).

      2). The authors showed that the neurological functions (evaluated by balance beam, ladder rung, rotarod test and Modified Neurological Severity Score (mNSS) up to 8 weeks after treatment (Figure 1C)) were significantly improved in the NES+Exo group compared to their control groups. However, these cells (transplanted cells) are progenitors (Nestin+) or undifferentiated cells (Tuj1+) at this stage (Figure 3). Thus, I was curious how immature neurons can perform neurological functions. This point should be explained.

      We agree with the reviewer’s insightful comments. We have performed immunostaining using antibodies against the post-mitotic mature neuron marker RBFOX3/NeuN, post-synaptic marker PSD-95 and human-specific STEM121 at 4 weeks after transplantation. The results confirmed that NeuN+/STEM121+ and PSD-95+/STEM121+ mature neurons appeared in NSC group and increased in NSC+Exo group (Figure 3B and Figure 3 - supplement 1D). Furthermore, our additional data showed that the expression of presynaptic marker SYN1 was increased in both NSC and NSC+Exo groups at 8 weeks after treatment. Therefore, we believe that there are mature neurons and newly formed synapses involved in neurological functions.

      3). The authors used the Golgi staining to show the NES+Exo can improve dendritic density and length. How do you know these neurons are transplanted cells?

      Our data show that mature neurons and synapses are generated by the transplanted cells (please also see the response to Reviewer #2, major point #2). We believe that the newly generated neurons partly contribute to the improved dendritic density and length. However, we agree that the neurons with increased dendritic density and length may be both surviving local neurons and those generated by the transplanted cells.

      4). The cell morphology of tdTomato+ cells is fuzzy and it is difficult to distinguish the cell body. It looks as though these cells are out of whack.

      We have performed immunostaining using the human-specific STEM121 antibody to demonstrate the transplanted cells and additional neuronal markers such as RBFOX3/NeuN to identify NSC differentiation (Figure 3A and 3B; Figure 3 - supplement 1C and 1D).

    1. Author Response

      Reviewer #1 (Public Review):

      Lemerle et al utilize elegant imaging and molecular biology approaches to convincingly demonstrate the presence of Bin1 and caveolae containing rings capable of tubulation in developing muscle. The data is of fundamental potential significance as it advances our understanding of t-tubule biogenesis, which represents a major knowledge gap in muscle biology. The paper will be of broad interest to skeletal and cardiac muscle biologists and physiologists. The paper is well written, with a comprehensive yet concise introduction, clearly presented results, and an appropriate discussion. The imaging is spectacular, and the use of CLEM provides compelling validation of the protein constituents of ring structures identified via EM. When combined with time-lapse imaging, the combination of approaches provides powerful nanoscale structural information alongside temporal dynamics and live-cell confirmation of tubulating ability by Bin1-Cav3 containing rings. The data indicate that Bin1 is sufficient to generate circular structures that are subsequently decorated by caveolae which facilitate tubule formation at the membrane, and they support the requirement of both Bin1 and Cav3 for efficient tubule initiation and elongation. The authors also utilize myotubes from patients with cav3 mutations to explore whether altered ring formation may contribute to muscle pathology - however, this section requires additional controls and validation to confer pathological insight. Further, additional quantification of imaging data across the study is required to increase the rigor and strength of the conclusions of this work.

      We would like to thank reviewer #1 for his appreciation of our work, in particular the imaging experiments, and for deeming our overall conclusions convincing. We have now performed additional experiments on patient myotubes including a rescue of Cav3, performed rigorous quantifications of rings and tubules under our different experimental conditions, and rewrote the corresponding parts of the discussion to increase the strength of our conclusions.

      Reviewer #2 (Public Review):

      In this work Lemerle et al. provide long-awaited insight into how transverse tubules develop in skeletal muscle. Together with the sarcoplasmic reticulum transverse tubules form the triad, a specialized structure required for excitation-contraction coupling in skeletal muscle. Defects in transverse tubules or the triad can lead to problems such as muscular dystrophy. Whilst the involvement of specialist membrane structures (caveolae) and the membrane-bending protein Bin1 have long been recognized the precise mechanism of how caveolae and Bin1 cause transverse tubules to form and extend has remained unknown. This work provides compelling evidence, correlating antibody labelling with electron microscopy, to support the concept that caveolae rings form underneath the cell membrane which is surrounded by the endo/sarcoplasmic reticulum. These rings contain caveolin-3 and Bin1 and the authors show Bin1 enriched tubes extend from multiple points on these rings. Their data suggest that Bin1 assembles to initially form these scaffolds that then recruit the caveolae to form the ring. In addition, tubules appear continuous with the extracellular environment which is necessary for their function of facilitating calcium release during excitation-contraction coupling. In patients with mutations in caveolin-3 the caveolin ring formation as well as Bin1 tubulation were defective which may play a role in the pathology. The elegant experiments including time-lapse work clearly support the conclusions of the authors.

      The ability of the authors to combine labelling studies with advanced microscopy to show the underlying structures provides very strong evidence for the proposed mechanisms. The authors suggest that the muscle-specific isoforms of BIN1 are key to tubule extension from caveolae rings but it would be interesting for them to discuss how this fits with studies suggesting that constitutive Bin1 isoforms can also form transverse tubules. It would also be interesting to understand the authors' views on whether caveolae rings are involved in the turnover of transverse tubules in adult myotubes as well as the initial formation and, additionally, if the caveolae rings are restricted to the region just under the surface membrane.

      Insight into how transverse tubules are formed sets the groundwork for future therapies. This is clearly important for skeletal muscle myopathies but should also be considered in the heart. Cardiac transverse tubule loss and disorder play an important role in dysfunction in heart failure and atrial fibrillation and as such lessons learned in skeletal muscle may be successfully applied to the heart.

      We would like to thank reviewer #2 for this appreciation of our work. We agree with the points raised and have updated our discussion section to highlight these points.

      Reviewer #3 (Public Review):

      T-tubules are an elaborate series of membrane invaginations that bring membrane voltage-activated Ca2+ channels in close apposition to the sarcoplasmic reticulum containing RyR, allowing for Ca2+-induced Ca2+ release. They serve as critical hubs of excitation-contraction coupling and play a central role in myopathies and inherited and acquired cardiomyopathies. Several membrane structures and proteins have been implicated in striated muscle t-tubule biogenesis, but the specific mechanisms of early t-tubule biogenesis are not defined. Lemerle et al here investigate the biogenesis of transverse tubules in skeletal muscle. They use skeletal myoblasts from murine and human muscle as well as sophisticated high-resolution microscopy, live cell imaging, and adenoviral targeting to forward a model of BIN1-mediated caveolae ring formation which give rise to DHPR-enriched t-tubules and associate with SR. While they demonstrate that BIN1 and Cav3 enriched caveolae act together to form t-tubules, the precise pathophysiological mechanisms by which this process acts in disease remain unclear. Strengths of the study consist in the use of both murine and human skeletal muscle experiments, suggesting a conserved molecular mechanism; the innovative approach of correlative light and electron microscopy, and the use of pathological specimens. The live cell timelapse provides imaging evidence of Cav3-enriched caveolae-rings forming in centers of high BIN1 enrichment, from which t-tubules emanate. This is novel evidence in support of the biogenesis model proposed by the authors. The pathological correlation of their model is promising but limited. Specifically, while the study of Cav3 mutant specimens is used to show the Cav3 dependence of BIN1 action (in experiments using BIN1 overload), the authors have not tested the sufficiency of their proposed mechanism by rescuing the pathologic state. Moreover, the conditions of development likely have an important effect on the studied mechanism - such as mechanical loading, contractile state, neurohormonal environment, and so on. Furthermore, a more complete description of the precise molecular binding sites between BIN1 and Cav3 would be important. While exon11 is required for tubulation, BIN1 not expressing exon 11 appears sufficient to assemble caveolar rings, suggesting this is mediated by other specific BIN1 regions.

      Overall, the study provides new details on early t-tubule biogenesis in skeletal muscle (likely shared with other striated muscle) and lays the foundations for further definition of the precise molecular mechanisms.

      We would like to thank reviewer #3 for the appreciation of our work. We have now performed additional experiments on patient myotubes including rescue experiments, analysis of key excitation-contraction coupling proteins by Western blot and quantification of caveolae rings and tubules to strengthen our claims with patient myotubes.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this manuscript, Mastrototaro et al. perform a series of experiments in transgenic murine models assessing the function of Palladin (PALLD) in the heart. Global PALLD KOs are embryonic lethal, precluding the assessment of the roles of this protein in adulthood. To circumvent this limitation, the authors generated a floxed Palld allele and ablated it with two cardiomyocyte-specific Cres: the constitutively active Myh6-Cre and the tamoxifen-inducible aMHC-MerCreMer. Interestingly, ablation with the constitutive Cre (cKO) did not produce any overt phenotype, but ablation in adulthood (cKOi) resulted in compromised cardiac function. These observations suggest a compensation mechanism that takes place when cardiomyocytes develop in the complete absence of this protein but not when cardiomyocytes develop in a wild-type background and are deprived of this protein after achieving full maturation. These experiments were complemented with yeast two-hybrid techniques to identify novel partners that bind to a region of PALLD for which no interactants had been previously identified. Experiments in human samples revealed an upregulation of PALLD transcripts in the hearts of patients.

      This manuscript adds important information to our understanding of sarcomeric proteins. Data are generally of good quality and well presented in figures. The numbers of animals in echocardiographic studies are also adequate for proper conclusions. Authors achieve most of their goals, including the identification of novel partners of PALLD and the identification of a requirement for PALLD in cardiomyocytes for normal heart function. However, given that all experiments performed in this study were focused on the loss-of-function of PALLD, it is not clear what is the relevance of the PALLD upregulation observed in human patients. Authors should clearly state this limitation in their results.

      Considering that authors have observed evidence for nuclear PALLD, which could hint at potential major gene expression changes when this protein is ablated, it would be interesting to perform an unbiased assessment of transcriptional alterations (RNA-seq) in cardiomyocytes isolated from control and cKOi hearts. In addition, to test if the compensation observed in the embryonic cKO involves mechanisms of transcriptional adaptation, it would be interesting to compare RNA-seq results from cKOi and cKO (genes encoding proteins similar to PALLD that are upregulated in cKO but not cKOi cardiomyocytes would be very strong candidates). However, these transcriptomic data are not essential to support current findings and can be performed in follow-up studies.

      We agree with the reviewer that it would be interesting to perform RNA-Seq on isolated cardiomyocytes from cPKOi mice and we are in fact planning to do this in a follow-up study.

      Reviewer #2 (Public Review):

      The role of the actin-binding protein palladin (PALLD) in cardiomyocyte development, growth, and function has not been defined. In order to address this question, the authors first identified that CARP and FHOD1 interact with PALLD in cardiomyocytes. They then performed cardiomyocyte selective deletion of PALLD in embryonic and adult mice and discovered that deletion of PALLD in adult mice leads to dilated cardiomyopathy (DCM) and intercalated disc ultrastructural changes. In contrast, embryonic deletion of cardiomyocyte PALLD did not cause a cardiomyopathy phenotype in neonatal or adult animals.

      1. The divergent cardiac phenotypes of the embryonic deletion of cardiomyocyte PALLD (no cardiomyopathy) versus the adult deletion of cardiomyocyte PALLD (dilated cardiomyopathy (DCM)) is an interesting result. The authors speculate that embryonic deletion of PALLD induces compensatory pathways that prevent the development of adult cardiomyopathy in these mice. However, these compensatory pathways remain unexplored.

      2. The authors discovered that mice with adult cardiomyocyte deletion of PALLD had significant changes in the cardiomyocyte intercalated disc (ICD) ultrastructure. They suggest these changes in ICD ultrastructure contribute to DCM formation in the adult PALLD deletion mice (line 270). However, it remains unclear if these changes in ICD ultrastructure are specific to mice with adult deletion of PALLD.

      3. The different transgenic Cre mouse lines may be an alternative explanation for the divergent cardiac phenotypes in the embryonic versus adult deletion of cardiomyocyte PALLD. The tamoxifen dose administered for the inducible Myh6:MerCreMer mice was 30mg/kg/day x 5 which has been reported to lead to the induction of cardiomyocyte DNA damage response pathways (Dis Model Mech. 2013 Nov; 6(6): 1459-1469, J Cardiovasc Aging 2022;2:8). The electron micrograph experiments in Figure 5 did not include a group of Myh6:MerCreMer mice administered tamoxifen. The authors only compared PALLD fl/fl and Myh6:MerCreMer/PALLD fl/fl mice.

      In the papers that the Reviewer refers to, it was shown that administration of tamoxifen to Myh6:MerCreMer mice at a dose of 30 mg/kg/day for 3 (Bersell et al., Dis Model Mech. 6, 1459-1469, 2013) or 5 days (Rouhi et al., J Cardiovasc Aging 2, 8, 2022) is not associated with apoptosis. Bersell et al. found that amounts ≥40 mg/kg/day for 3 days are associated with apoptosis, and Rouhi et al. showed that injection of 30 mg/kg/day for 5 days causes transient minor changes in gene expression with no discernible effects on cardiac function, myocardial fibrosis, apoptosis, or induction of double-stranded DNA breaks. We chose to inject tamoxifen at 30 mg/kg/day for 5 days precisely because this amount has been shown not to be associated with severe effects and has been widely used in the literature.

      4. The apoptosis assessment was performed 24 weeks after administration of tamoxifen to the Myh6:MerCreMer/PALLD fl/fl mice. However, cardiomyocyte apoptosis may have occurred much earlier if it was secondary to Myh6:MerCreMer tamoxifen-induced cardiotoxicity (or related to PALLD deletion).

      5. The animal studies in Fig 3D show a DCM phenotype in mice with adult deletion of cardiomyocyte 200kDa PALLD which suggests a potential loss of function mechanism for DCM formation. However, the authors then report in Fig 6 that human DCM heart tissue samples have a ~2.5-fold increase in mRNA expression of the 200kDa PALLD transcript which would suggest a possible gain of function mechanism for DCM formation. How do the authors reconcile these divergent results with regard to palladin's role in cardiomyocyte homeostasis and cardiomyopathy formation?

      In the revised manuscript we demonstrate that the transcriptional changes in PALLD expression are not reflected at the protein level.

      Reviewer #3 (Public Review):

      This study shows for the first time changes in palladin expression under disease conditions and mRNA alterations in human samples. The authors have identified novel binding partners for the protein as a first step toward determining how palladin mediates its effects in the heart. Finally, through the use of mouse models to decrease palladin expression they identify a crucial role for palladin in the cardiac response to pathological stress, with some interesting findings that show the effects of palladin depend on when the protein is altered.

      We appreciate that the Reviewer finds our study interesting. However, we did not show a role of PALLD in the cardiac response to pathological stress. On the contrary, we demonstrated that mice with constitutive knockout of PALLD in the heart (cPKO mice) show no pathological cardiac phenotype either under basal conditions or in response to mechanical pressure overload by transaortic constriction. On the other hand, deletion of PALLD in adult mice resulted in DCM under basal conditions within 8 weeks after tamoxifen induction.

      The novel findings of the study are supported by the data presented, but there are several instances where clarification is needed or where the conclusions drawn from the data reach beyond what is presented in the Results section.

      The focus on only male mice is a significant limitation of the paper, as it is well known that there are profound sex differences in the response to pathological stressors. While the ability to obtain sufficient heart samples from male and female patients may be a reasonable justification for focusing on males, the preclinical mouse model should have been examined in both sexes and the limitation of this choice should be clearly noted in the paper.

      Due to the three Rs and the high costs associated with breeding the large number of mice required for the project, we chose to focus only on male mice.

      In lines 537-539, we stated: “All experiments were performed on male mice as females often develop a less severe cardiac phenotype due to the cardioprotective role of estrogen (Brower, Gardner, & Janicki, 2003; Du, 2004).”

      The changes in myopalladin expression were not measured in the disease model (TAC), which limits the ability to determine if myopalladin was altered in the disease state. This addition would strengthen the study.

      We have previously demonstrated that myopalladin protein levels are significantly reduced after TAC in wildtype mice (Figure 6K, L in Filomena et al., eLife 10:e58313, 2021). We did not measure myopalladin levels in cPKO mice subjected to TAC and unfortunately do not have tissue from cPKO mice to perform the measurements.

      Finally, the myofilament data are presented as evidence that changes in the contractile apparatus are contributors to the observed contractile dysfunction at the organ level. But these studies were conducted using levels of calcium that far exceed what is seen in vivo and, therefore, do not support the conclusion drawn.

      The reviewer is right that the myofibril experiments were conducted at Ca2+ concentrations that cannot be reached under the physiological conditions of cardiac contraction. However, the result clearly demonstrates that the intrinsic force-generating capacity of the cardiac sarcomeres of cPKOi mice is impaired 8 weeks after TAM, independently of any changes in myofilament Ca2+ sensitivity and cardiomyocyte Ca2+ handling. Experiments at lower (more physiological) Ca2+ concentrations would have produced less clear results in the absence of a full investigation of the relation between force and [Ca2+]. Since the data demonstrate that cross-bridge mechanics and kinetics are not affected, the reported finding supports the idea that a myofibril structural defect is responsible for the lower maximal force of the KO sarcomeres.

    1. Author Response:

      Reviewer #1 (Public Review):

      This study presents a resource aiming to unify language and rules used in the literature to describe, curate and assess biology experiments, published or not. Focusing on host-pathogen interactions, the work presents a new ontology and controlled vocabulary, as well as rules to describe 'metagenotypes', a term coined for the joint description of interacting host-pathogen genotypes. 'PHI-Canto' extends a previous resource by also enabling using UniProtKB IDs to curate proteins. Among other important by-products, PHI-Canto could contribute to damping proliferating names and acronyms for genes, processes, and interactions; a chronic annoyance in the biosciences.

      The tool does give the impression that, with sufficient time and usage, it could become a rich and robust resource. Just addressing the UniProt IDs issue is a nice move.

      We thank the reviewer for their positive comments and acknowledgement of the importance of using unified language in literature curation. We are pleased to see that our effort to improve interoperability and use existing resources has been recognized. We are also pleased that this reviewer recognizes the additional benefits of choosing to use UniProtKB accession numbers. 

      Reviewer #2 (Public Review):

      In this paper, the authors propose a system for annotating and curating scientific publications in the context of interspecies host-pathogen interactions. This system, called PHI-Canto (the Pathogen-Host Interaction Community Annotation Tool), is an extension of an existing tool (called Canto). In addition, they present the development of new concepts, controlled vocabularies, and an ontology for annotating relevant aspects in this domain, called PHIPO (Pathogen-Host Interaction Phenotype Ontology).

      The approach has been empirically validated by annotating ten publications. The application's source code is available, as well as the associated ontologies and vocabularies and an example of the data resulting from the annotation process.

      We thank the reviewer for their positive comments on our framework for curating interspecies interactions literature. We are pleased that the reviewer has recognized that the source code, associated ontologies and curated data are freely available for others to use. We are delighted that the reviewer found the curation of ten trial publications in PHI-Canto informative and benefited from the worked curation examples.

      Reviewer #3 (Public Review):

      In this work, the authors have built a framework for the annotation of interactions between species. The framework includes ontologies, methodologies, and an annotation tool called PHI-Canto. The framework makes use of multiple existing ontologies that are in wide use in the biocuration community. In addition, the authors have built their own project-specific controlled vocabularies and ontologies for the capture of pathogen-host interaction phenotypes (PHIPO), diseases (PHIDO), and environmental conditions (PHI-ECO). Their work builds on and extends methods that have been developed within the Gene Ontology Consortium and model organism databases. The tool PHI-Canto is an extension of the tool Canto developed by PomBase for curation. The authors used this framework to annotate pathogen-host interactions within the Pathogen-Host Interactions Database.

      Strengths: The manuscript is well-written and includes significant detail regarding curation policies/methods and the use of the actual PHI-Canto tool. The appendices are very detailed and provide useful illustrations of the annotation practices and tool interface. The work has built upon and extended well-established standards and methods that have proven their utility over many years of use in the biocuration community. The authors have rigorously tested their framework with the curation of a variety of publications providing a diverse assortment of annotation challenges. The concept of a "metagenotype" is important and providing such a structured system for the capture of this information is useful. All of the materials produced by the work are completely freely available for use by the wider community.

      Weaknesses: There are some areas of the manuscript and appendices which are a bit confusing and could be improved. The authors have developed their own set of disease terms (PHIDO) but do not comment on why existing disease terminologies (such as Mondo or DO) were not used or if the PHIDO terms relate to those other vocabularies. There is no discussion of the possible use of a graph representation for the capture of this complex information (which is being done in many settings including the Gene Ontology with GO Causal Activity Models (GO-CAMs)) or why such a structure was not used. Although the abstract talks about the use of the framework within the PHI database as a test case for broader use regarding interspecies interactions, there is no mention of extending the use of the tool to other species interaction communities beyond pathogen-host interactions.

      We thank the reviewer for their detailed response. We are pleased that the reviewer found the manuscript to be well-written and informative with useful examples. We thank the reviewer for their helpful suggestions to improve the appendices and manuscript text.

      We would like to clarify that PHIDO is not intended to compete with existing disease ontologies: it is instead being used as a placeholder, until the time when its terms can be replaced with terms from existing disease ontologies. PHIDO was an expedient solution, in the sense that it provided the fastest way for us to test the process of curating diseases with PHI-Canto. This is because we only had to convert the existing list of disease names already in PHI-base into a controlled vocabulary, thus removing the need to wait for maintainers of other ontologies to add terms for us (as reported in Urban et al., 2022).

      Additionally, we were required to use terms from PHIDO due to the lack of representation for plant and animal diseases in existing ontologies or vocabularies. Plant disease, in particular, is very underrepresented, with the ontologies we surveyed having either inappropriate semantics (e.g. the Plant Trait Ontology focusing on traits related to disease, rather than the diseases themselves) or still being in development (e.g. the Plant Stress Ontology). The majority of source ontologies used by MONDO are human-centric, and DO is exclusively for human disease, yet human disease represents only part of the focus of PHI-base (~35%). Furthermore, our choice of vocabularies is limited by the fact that Canto currently only supports ontologies in OBO format (for historical reasons).

      We have begun the process of harmonizing disease names in PHI-base with terms from existing disease ontologies – such as MONDO, DO, and the National Cancer Institute Thesaurus – with the ultimate aim of using terms from those ontologies in curation, instead of terms from PHIDO. As general vocabularies for animal and plant disease emerge or are identified, we will extend this procedure to those diseases.

      With regard to a graph representation of the data, we are aware of the examples the reviewer described, and we agree that this type of representation could be preferable. However, our data model is currently constrained by the developers of Canto, who use a relational data model and currently have no plans to implement a graph data model or a graph representation. We acknowledge that query languages like GraphQL can provide a graph-based interface to an existing relational data model, but we believe this would require a significant technological investment. For PHI-base, we plan to enable a graph representation of the data by integrating with existing knowledge graph tools, such as KnetMiner (www.knetminer.com; doi.org/10.1111/pbi.13583), which will provide graph-based queries on PHI-base (albeit only on select species for which knowledge graphs will be provided, i.e. Arabidopsis, rice, wheat, eight plant- and human-infecting fungal ascomycete pathogens, and two non-pathogenic yeast species). We will also use KnetMiner integration to embed subgraphs of the complete knowledge graph into the gene-centric pages on the PHI-base 5 website.

      We acknowledge the lack of discussion about extending the tool for broader interspecies interactions. These examples may have been omitted from a previous draft due to journal word count limits. Possible future uses of the PHI-Canto schema could include insect–plant interactions (both beneficial and detrimental), endosymbiotic relationships such as mycorrhiza–plant rhizosphere interactions, nodulating bacteria–plant rhizosphere interactions, fungi–fungi interactions, plant–plant interactions or bacteria–insect interactions, and non-pathogenic relationships in natural environments, such as bulk soil, rhizosphere, phyllosphere, air, freshwater, estuarine water or seawater, and tissues or organs (e.g. the gut, lungs, and skin of humans, birds, or other animals). The schema could also be extended to situations where phenotype relations to genes or genotypes have been established for predator–prey relationships, or where there is competition in herbivore–herbivore, predator–predator, or prey–prey relationships in the air, on land or in the water. Customizing Canto to use other ontologies and controlled vocabularies is as simple as editing a configuration file within the source code.

    1. Author Response:

      We appreciate the Reviewers’ feedback. The manuscript was extensively revised and ultimately accepted for publication (Petrican and Fornito, 2023, Developmental Cognitive Neuroscience). The revisions address the Reviewers’ key concerns, including the theoretical basis of the link between MDD and AD, the rationale for studying this link in adolescence, clear references to significant genetic associations between the two, detailed assessment of CCA and PLS model generalisability and reliability, quantification of resilience, residualization of confounders, and corrections for multiple comparisons. We also note that the details concerning the receptor density maps we use in our analysis have now been published (Hansen et al., 2022, Nature Neuroscience; Markello et al., 2022, Nature Methods).

    1. Author Response

      Reviewer #1 (Public Review):

      By performing immunopeptidomics of macrophages infected with virulent M. tuberculosis, the authors were able to appropriately address whether Mtb proteins are able to enter the MHC-I antigen processing pathway. Their interrogation provides convincing evidence that substrates of Mtb's type VII secretion systems (T7SS) are a significant contributor to the Mtb-derived peptides presented on MHC-I. Compelling data are provided to demonstrate that ESX-1 activity is required for the MHC-I presentation of these newly identified peptides.

      Strength

      Employing a virulent strain of Mtb for infection of human monocyte-derived macrophages to identify Mtb proteins that access the MHC-I antigen processing pathways and the associated mechanisms.

      Weakness

      The immunogenicity of at least some of the identified peptides should have been evaluated.

      Although obtaining T cells from a cohort of TB-exposed patients was not within the scope of this study, we are also eager to assess the immunogenicity of the epitopes we identified in future work. In addition to the references we made in our initial submission to prior work showing that many of the proteins from which the epitopes we identified derive elicit T cell responses in Mtb-exposed humans, we’ve added references to prior studies that show that a few of the specific epitopes we identified are immunogenic, providing at least a preliminary indication that MHC-I peptides identified by MS can be immunogenic T cell epitopes (lines 420-423): “Individual peptides we identified by MS have also been previously shown to be recognized by human T cells, including EsxJ24-34 (Grotzke et al., 2010; Lewinsohn et al., 2013) and EsxA28-36 (Tully et al., 2005), providing a proof of concept that particular epitopes identified by MS can be immunogenic.”

    1. Author Response

      Reviewer #1 (Public Review):

      The authors have performed scATACseq on multiple timepoints during mouse male gonadogenesis and germ cell maturation during the fetal to neonatal transition (E18.5 and postnatal days 1,2,5). Clustering of thousands of cells revealed striking cellular diversity and led to the identification of cell populations that were not known before. This work may have far reaching implications, but additional validation is needed.

      We would like to start by expressing our appreciation for the reviewer’s valuable comments and feedback on our manuscript. We would also like to express our sincere apologies for the delay in submitting our revised manuscript. The COVID-19 pandemic has had a significant impact on academic research and publication, and we encountered several challenges during this time. Both co-first authors of this manuscript were promoted to new roles, which required additional time and effort to transition into these new positions. Furthermore, we experienced significant delays in obtaining the necessary research materials due to longer shipment times for antibodies and other reagents during the pandemic, which further contributed to the delay. We understand that our delay may have caused inconvenience, but we want to assure you that we have carefully addressed all of the reviewer comments, and we deeply appreciate your understanding and patience during these challenging times.

      The identification of a novel transitional spermatogonia population in Figure 4D is intriguing. Independent validation by flow cytometry or in testis cross-sections to better allow the colocalization of Nr5a1 and Oct4 and other germ cell markers would be important. Additional validation is needed to ensure that populations 1 and 2 in Figure 4D are not due to doublets. Providing violin plots for both soma and germ cell markers will be helpful. Is SF1 the only gene expressed in this unique germ cell population, or are many other somatic markers expressed in the population? Do these cells express well-recognized SPG markers like Oct4, PLZF, GFRA?

      We have performed immunostaining of NR5A1 in testicular sections and showed that NR5A1+ germ cells (TRA98+ cells) exist in P5.5 testis (Figure 4D). We appreciate the reviewer's comment and understand the concern regarding potential doublets in Figure 4D. We examined the expression of various markers in both scATAC-seq (gene score) and scRNA-seq (mRNA) datasets and provided violin plots. Sertoli cell markers and germ cell markers showed variable levels in the unknown 1 and 2 populations, while the Leydig cell marker did not (Supplementary Figure S6D).

      As additional evidence supporting our finding that a subset of somatic markers is expressed in the unique germ cell population we identified, we reference a study where cells in the spermatogonial signature 3 cluster showed high levels of mRNAs characteristic of Sertoli cells, including Nr5a1, Sox9, and Wt1 (PMID: 25568304). This indicates that cells with germ cell identity can express somatic cell genes, which is consistent with our findings. Additionally, another study reported the expression of the somatic cell marker WT1 in some germ cells through immunostaining (Figure 3B, PMID: 34815802). We have included this information in the revised manuscript to further support our conclusion (line 301). In addition, as we have isolated nuclei rather than whole cells, it is less likely that germ cells and Sertoli cells stuck together during single-cell capture. We hope that the additional evidence and analysis provided will help to ease the reviewer's concerns and further support the conclusions drawn from our data.

      The IF validation in Figure 5F is not convincing that these cells are potentially Sertoli stem cells. IF in cross-sections will be easier to interpret, especially when co-stained with several germ, somatic, or novel markers of that population. Purification of these cells and further characterization is needed. A hallmark of fetal Sertoli cells is to mediate the migration of endothelial cells to the seminiferous tubules during testicular cord formation. Is it possible to purify these cells to determine whether they have functional Sertoli cell properties in vitro using human umbilical vein endothelial cells (HUVECs)? Do these cells have immune privilege properties, i.e. can they suppress proliferation of Jurkat E6 cells?

      Following the reviewer’s suggestions, we conducted further immunostaining of MBD3 and AMH in Sertoli cells (Figure 5F). The observed staining results not only confirm the properties of MBD3+ cells (MBD3-high/AMH-high) but also highlight the heterogeneity of Sertoli cells, as evidenced by the presence of various expression patterns such as MBD3-low/AMH-high (cluster SC3 in Figure 5A) and MBD3-low/AMH-low (cluster SC2/4/5/6 in Figure 5A). This further emphasizes the complexity and diversity within the Sertoli cell population.

      However, we understand that it is premature to definitively conclude that MBD3-high cells are Sertoli stem cells without functional studies. We appreciate the suggestion of using additional functional assays such as in vitro co-culture with HUVECs and immune privilege assays to further characterize the potential Sertoli stem cell population. These are valuable experiments to consider for future research in order to gain a deeper understanding of the properties and functions of these cells. To more accurately reflect the scope of our study and avoid potential misinterpretation, we have revised the language to reflect that we have identified subpopulations of Sertoli cells with unique characteristics, rather than using the term "stem cell". We hope that our revised data adequately addresses the reviewer’s concerns.

      Reviewer #2 (Public Review):

      Liao et al. performed single-cell ATAC sequencing to reveal chromatin status in various cell types in the perinatal mouse testes. The chromatin status was then used to define cell types and identify potential transcription factors that control the progress of differentiation. This work could provide new insights into how various cell types acquire their fate in early testis development and establish a genomic framework that can be used to correlate with human data for infertility. The strength lies in the novelty of the single-cell analyses. The weaknesses include a lack of statistical power, the uncertainty on the correlation between chromatin status, gene expression, and transcription factor activity, and insufficient information and confirmation on some of the experiments and results.

      We would like to start by expressing our appreciation for the reviewer’s valuable comments and feedback on our manuscript. We would also like to express our sincere apologies for the delay in submitting our revised manuscript. The COVID-19 pandemic has had a significant impact on academic research and publication, and we encountered several challenges during this time. Both co-first authors of this manuscript were promoted to new roles, which required additional time and effort to transition into these new positions. Furthermore, we experienced significant delays in obtaining the necessary research materials due to longer shipment times for antibodies and other reagents during the pandemic, which further contributed to the delay. We understand that our delay may have caused inconvenience, but we want to assure you that we have carefully addressed all of the reviewer comments, and we deeply appreciate your understanding and patience during these challenging times.

    1. Author Response

      Reviewer #1 (Public Review):

      This is a well-performed and carefully executed and quantified study. There is however a point that needs clarification:

      We thank the reviewer for these motivating comments and appreciate the careful reflection of our work.

      The authors state that acute regeneration occurs between 5-10dpt. However, the graphs in Fig 1D, F, and 2F indicate that most PC generation occurs from 20-30 days. What happens in this period? Does proliferation increase? Can the authors perform BrdU incorporation between 6 days and 1 month?

      The reviewer is right that PC regeneration seems to be more intense from 20-30 days. Yet during this stage wildtype larvae also add a number of PCs to their PC population pool. We therefore consider only PCs added in surplus to the number of regularly added PCs as a contribution to regeneration, and in the quantified samples we see the largest increase of regenerating PCs during 8-10 days post-treatment, with 20.9 and 23.2 additional (surplus) PCs on average, respectively.

      This question also relates to the first comment of reviewer 3, who asked for a combined BrdU and EdU labeling approach to address the cell cycle length of PC progenitors. We have therefore performed this experiment with the first pulse of BrdU labeling at 18 days after PC ablation to accommodate the request stated here for BrdU labeling at later stages of regeneration. Again, no significant difference in BrdU-positive PC progenitors was found at this later stage of PC regeneration, but a small number of PC progenitors underwent additional rounds of proliferation compared to controls, which provides an explanation of how the entire PC population is replenished and why complete PC regeneration requires several months. Please see also our answer to question 1 of reviewer 3. These new findings are now presented in an additional Supplementary Figure (Figure 1-figure supplement 3) and have been added to the last paragraph of the section reporting the findings presented in Figure 1.

      Related to this, as the authors indicate in lines 129-131, the regeneration of new PCs overlaps with normal development. Are other neuronal cell types generated in appropriate numbers?

      This is an interesting question raised by the reviewer. However, it is very general, relating to all cerebellar neuronal cell types, and addressing it in full is beyond our capacity. We considered eurydendroid cells as the cell population most likely to be affected in number by PC ablation and regeneration, because eurydendroid cells share the same ptf1a+-expressing progenitor cells with Purkinje cells. Eurydendroid cells – the zebrafish equivalents to deep nuclei neurons in mammals – can be identified by their expression of olig2. We have therefore quantified the number of eurydendroid cells in the cerebellum of double transgenic PC-ATTAC/olig2:GFP larvae 15 days after PC ablation. No significant difference in olig2:GFP-positive cells could be observed between PC-regenerating and control zebrafish, suggesting that eurydendroid cells are not affected in their quantity and are generated in appropriate numbers in PC-regenerating larvae. These findings are presented in a new Supplementary Figure (Figure 3-figure supplement 3) and are described together with findings about eurydendroid cells presented in the main Figure 3.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Gonzalez et al. investigated the dynamics of dopamine signals, measured with optophysiological methods in the lateral shell of the nucleus accumbens (LNAc), in response to different types of visual stimuli. Contrary to most current theories of dopamine signaling, the authors found that LNAc dopamine transients tracked sensory transitions in visual stimulation rather than any immediately apparent motivational variable. This unorthodox finding is of potential interest to the field, as it suggests that dopamine in this particular area of the striatum supports a very different, albeit unclear, behavioral function than what has been previously attributed to this neuromodulator. Many of the approaches used by the authors were very elegant, like the careful selection of visual stimuli parameters and the use of Gnat1/2 KO mice to demonstrate that the dopamine responses were directly dependent on the visual stimulation of rods and cones. That said, the authors did not discuss how their findings relate to much previously published work, some of which offers potential alternative explanations for their results. It is also not clear from the manuscript text which mice were used for which experiments, and how testing history might affect the results.

      We would like to thank the reviewer for their careful review of our manuscript. In our revised manuscript, we reworked our Materials and Methods to better detail the experimental workflow, which is highlighted in yellow. We have also added new data in stimulus-naïve animals to better examine the effect of exposure history on the dopaminergic response to light. To provide validation of our recording sites, we have included a new figure (Figure 1-Figure Supplement 1) that contains a representative histological image showing the location of the optical fiber/virus expression, as well as a schematic demonstrating optical fiber placements. Finally, the reviewer’s point about discussing the current results in the context of previous literature is well taken, and we have added three new paragraphs of text in the Discussion to highlight these findings.

      Reviewer #2 (Public Review):

      In this elegant work, the authors investigated dopamine release (measured by dLight sensor fiber photometry) in the nucleus accumbens shell, in response to salient luminance change. They show that abrupt visual stimuli - including stimuli not detectable by the human eye - can evoke robust dopamine release in the accumbens shell.

      The fact that dopamine signals can be evoked by salient sensory stimuli is not itself novel, but the paper manages to make several important and new findings:

      1) The authors show that the dopamine signal is not related to the level of threat evoked by the visual stimuli.

      2) They provide important detail about the stimuli parameters relevant to dopamine release. For instance, they show that the rate of luminance change (or abruptness) is a key factor in evoking dopamine responses.

      3) They show that robust dopamine responses can be evoked by visual stimuli of low intensity, including stimuli not perceptible by the human eye.

      4) They show that these dopamine responses can be evoked by all wavelengths in the visible spectrum (with some higher sensitivity at certain wavelengths).

      5) Finally, by recording dopamine responses in two knockout mice strains, the authors show that the light-evoked dopamine release critically relies on rod and cone photoreceptors, but not melanopsin phototransduction.

      These results add to a series of recent findings showing that dopamine signals are not restricted to the encoding of reward prediction error, but instead contribute to signaling environmental changes more broadly. The study has been skillfully executed, the results are clear and appropriately analyzed, and the manuscript is very well written. Although the work did not include control mice lacking the dLight sensor, the fact that light-evoked dopamine responses were not observed in mice lacking cone + rod phototransduction is strong evidence that the fiber photometry signals were not due to direct light artifacts.

      We would like to thank the reviewer for taking their valuable time over the holidays to review our manuscript. We appreciate their feedback and have responded to their concerns below.

      Comment/concerns are minor:

      1) The authors show that the dopamine response evoked by a brief visual stimulus is drastically reduced when the visual stimulus is repeated in rapid succession (stimulus train). The authors interpret this as evidence for the HABITUATION of this light-evoked dopamine release. An alternative explanation is that it is the prediction of the stimulus that is responsible for canceling the dopamine response (i.e. sensory prediction error). The authors should discuss this alternative explanation for this finding.

      This is a valid point, which we have now addressed in the revised Discussion section (Paragraph 3).

      2) Although the study largely focuses on dopamine responses to visual stimuli, the results are largely consistent with previous studies showing dopamine signals encoding value-neutral changes in sensory inputs (i.e. sensory prediction errors) in different modalities (taste or odors; cf. Takahashi et al., 2017, Neuron; Howard & Kahnt, 2018, Nat. Comm.). The authors might want to cite those papers (note that I am not affiliated with those papers).

      This is similar to the point brought up by Reviewer 1, namely that several key pieces of literature were not discussed in the original manuscript. We agree that this was an oversight and hope we have remedied it in the revised Discussion, as detailed in the response to Reviewer 1. We have included both citations in the new text.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript describes efforts to understand how independence from ribonucleotide reduction might evolve in obligate intracellular bacterial pathogens using E. coli as a model for this process. The authors successfully deleted the three ribonucleotide reductase (RNR) operons present in E. coli and showed that growth of this knockout strain can be achieved with deoxyribonucleotide supplementation. They also performed evolutionary experiments and analysis of cell growth and morphology under conditions of low nucleotide availability. In this work, they established that certain genes are consistently mutated to compensate for the loss of RNR activity and the low availability of deoxynucleotides. Comparison to genomes of intracellular pathogens that lack RNR genes shows that these patterns are largely conserved.

      While the experimental results support the conclusions of the study, the authors do report changes in cell morphology upon the growth of the RNR knockout strains with low concentrations of nucleotides. It would be ideal to note this complication earlier in the manuscript and to clarify how the possibility of cell elongation might affect the OD measurements in Figure 3, which describes the experiments establishing that dC is necessary for growth in the knockout strain. It would also be ideal to provide a more detailed explanation for that observation in the discussion.

      Thank you for the feedback. We have now added mention of cell morphology in the final paragraph of the introduction, where we summarise key findings.

      For establishing if there is either growth or no growth under various conditions, as we have done, a qualitative assessment such as the one presented in Figure 3 is sufficient. The issue of whether OD is impacted by cell elongation has been documented by Stevenson et al. (https://www.nature.com/articles/srep38828), and becomes a problem if trying to quantify parameters such as doubling time or when trying to estimate cell counts. We do not do either of these, as calculation of both requires an assumption of normal cell morphology in E. coli. We have added a note to clarify this in the first paragraph of the Discussion section, as per the suggestion from Reviewer #1.

      Reviewer #2 (Public Review):

      Ribonucleotide reductase (RNR) is crucial for de novo synthesis of the dNTP building blocks needed for DNA synthesis and is essential in nearly all organisms. In the current study, all three E. coli RNRs have been removed and the essential function of the enzyme is bypassed by the introduction of an exogenous deoxyribonucleoside kinase that enables dNTP production via salvage synthesis. This leads to a complete dependency on exogenously supplied deoxyribonucleosides (dNs), loss of control of dNTP regulation, and a highly increased mutation rate. The bacteria could also grow with only supplied deoxycytidine (and no other dNs), indicating that all dNTPs could be synthesized from deoxycytidine. An evolutionary analysis of the recombinant E. coli strain grown in multiple generations showed that mutations accumulated in genes involved in the catabolism of deoxycytidine and deoxyribose-1-P, supporting a model that all the other deoxyribonucleosides can be produced by a phosphorylase using nucleobases and deoxyribose-1-P as substrates and that the deoxycytidine (besides being a precursor of dCTP) could be a substrate to produce the deoxyribose-1-P needed by the phosphorylase working in the opposite direction.

      The story is very interesting with novel findings, and the experiments are well performed. There are a few missing pieces of information, but on the other hand, it is many steps to cover if everything is going to be shown in a single paper and I came to the conclusion that the data is enough at this stage. One of the missing points for future research is to check what happens with the dNTP pools. RNR is a very important enzyme to control the dNTP levels and it is likely that it is unbalanced dNTP pools that lead to the increased mutation rates. However, it would be interesting to really measure the dNTP pools and connect them to the mutations reported. Another missing piece is to identify which nucleoside phosphorylase is involved and investigate its substrate specificity to better understand why the cells can live on deoxycytidine but not other dNs.

      We thank the reviewer for these comments. It is certainly possible that the mutational biases we observe across the genomes of our evolved lines are related to skewed pools. We hope to examine this in a follow-up study. Likewise, it will be interesting to investigate the biochemical basis for our lines being able to grow solely on deoxycytidine, and to ascertain how this might also impact mutation.

      Reviewer #3 (Public Review):

      The study addresses a compelling question about a largely indispensable mechanism, ribonucleotide reduction. The authors generate a unique bacterial strain in which the ribonucleotide reductase operon is deleted entirely. They grow the mutant strain in environments that have various amounts of the necessary deoxyribonucleoside levels; further, they perform evolution experiments to see whether and how the evolved lines would be able to adapt to the limited deoxyribonucleosides. Finally, researchers identify key mutations and generate key isogenic genetic constructs where target mutants are deleted. A summary postulation based on the evolutionary trajectory of ribonucleotide reduction by bacteria is presented. Overall, the study is well presented, well-justified, and builds on fairly classic genetic and evolution experiments. The select question and hypotheses and the overall framing of the story are fairly novel for the respective communities. The results should be interesting to evolutionary biology researchers, especially those interested in RNA>DNA directional evolution, as well as molecular microbiologists interested in the ribonucleotide reception dependence and selection by the environment. A discussion on the limitations of the laboratory study for the broader understanding of the host dependence during endosymbiosis and parasitism would be a good addition given the emphasis on this phenomenon as a part of the broader impacts of the study.

      We thank the reviewer for the suggestion that we consider the broader implications of our work. We have now added a final paragraph which addresses the question of why loss of ribonucleotide reduction appears so rare.

    1. Author Response:

      What is novel here is that we calculated the time-varying retinal motion patterns generated during the gait cycle using a 3D reconstruction of the terrain. This allows calculation of the actual statistics of retinal motion experienced by walkers over a broad range of normal experience. We certainly do not mean to claim that stabilizing gaze is novel, and agree that the general patterns follow directly from the geometry as worked out very elegantly by Koenderink and others.  We spend time describing the terrain-linked gaze behavior because it is essential for understanding the paper. We do not claim that the basic saccade/stabilize/saccade behavior is novel and now make this clearer.

      The other novel aspect is that the motion patterns vary with gaze location which in turn varies with terrain in a way that depends on behavioral goals. So while some aspects of the general patterns are not unexpected, the quantitative values depend on the statistics of the behavior.  The actual statistics require these in situ measurements, and this has not previously been done, as stated in the abstract.

      The measured statistics provide a well-defined set of hypotheses about the pattern of direction and speed tuning across the visual field in humans. Points of comparison in the existing literature are hard to find because the stimuli have not been closely matched to actual retinal flow patterns, and the statistics will vary with the species in question. However, recent advances allow for neurophysiological measurements and eye tracking during experiments with head-fixed running, head-free, and freely moving animals. These emerging paradigms will allow the study of retinal optic flow processing in contexts that do not require simulated locomotion. While the exact relation between the retinal motion statistics we have measured and the response properties of motion-sensitive cells remains unresolved, the emerging tools in neurophysiology and computation make similar approaches with different species more feasible.

      A more detailed description of the methods including the photogrammetry and the reference frames for the measurements has been added primarily to the Methods section.

      Reviewer #1 (Public Review):

      Much experimental work on understanding how the visual system processes optic flow during navigation has involved the use of artificial visual stimuli that do not recapitulate the complexity of optic flow patterns generated by actual walking through a natural environment. The paper by Muller and colleagues aims to carefully document "retinal" optic flow patterns generated by human participants walking a straight path in real terrains that differ in "smoothness". By doing so, they gain unique insights into an aspect of natural behavior that should move the field forward and allow for the development of new, more principled, computational models that may better explain the visual processing taking place during walking in humans.

      Strengths:

      Appropriate, state-of-the-art technology was used to obtain a simultaneous assessment of eye movements, head movements, and gait, together with an analysis of the scene, so as to estimate retinal motion maps across the central 90 deg of the visual field. This allowed the team to show that walkers stabilize gaze, causing low velocities to be concentrated around the fovea and faster velocities at the visual periphery (albeit more the periphery of the camera used than the actual visual field). The study concluded that the pattern of optic flow observed around the visual field was most likely related to the translation of the eye and body in space, and the rotations and counter-rotations this entailed to maintain stability. The authors were able to specify what aspects of the retinal motion flow pattern were impacted by terrain roughness, and why (concentration of gaze closer to the body, to control foot placement), and to differentiate this from the impact of lateral eye movements. They were also able to identify generalizable aspects of the pattern of retinal flow across terrains by subsampling identical behaviors in different conditions.

      Weaknesses:

      While the study has much to commend, it could benefit from additional methodological information about the computations performed to generate the data shown. In addition, an estimation of inter-individual variability, and the role of sex, age, and optical correction would increase our understanding of factors that could impact these results, thus providing a clearer estimate of how generalizable they are outside the confines of the present experiments.

      Properties of gait depend on the passive dynamics of the body and on factors such as leg length and subject-specific cost functions, which are influenced by image quality and therefore by optical correction. In this experiment all subjects had normal acuity or were corrected to normal (with no information regarding their uncorrected vision). This is now noted in the Methods. The goal of the present work was to calculate average statistics over a range of observers and conditions in order to constrain the experience-dependent properties one might see in neurophysiology. We have added between-subjects error bars to Figure 2 and added gaze angle distributions as a function of terrain for individual observers in the Supplementary materials. Figure 4b and 4d now show standard errors across subjects. Individual subject plots are shown in the Supplementary materials. For Figure 2, most variability between subjects occurs in the Flat and Bark terrains, where individual choices about energetic cost versus speed, stability, and similar factors might come into play. This is supported by our subsequent unpublished work on factors influencing foothold choice. We have also found that leg length determines path choices and thus will influence the retinal motion. Differences between observers are now noted in the text. These individual subject differences should indicate the range of variability that might be expected in the underlying neural properties and perhaps in behavioral sensitivity. Because of the small size of our dataset (n=11) it is not feasible to make comparisons of sex or age. There were equal numbers of males and females and ages ranged from 24 to 54; this is now noted in the Methods section.

      Reviewer #2 (Public Review):

      The goal of this study was to provide in situ measurements of how combined eye and body movements interact with real 3D environments to shape the statistics of retinal motion signals. To achieve this, they had human walkers navigate different natural terrains while they measured information about eyes, body, and the 3D environment. They found average flow fields that resemble the Gibsonian view of optic flow, an asymmetry between upper and lower visual fields, low velocities at the fovea, a compression of directions near the horizontal meridian, and a preponderance of vertical directions modulated by lateral gaze positions.

      Strengths of the work include the methodological rigor with which the measurements were obtained. The 3D capture and motion capture systems, which have been tested and published before, are state-of-the-art. In addition, the authors used computer vision to reconstruct the 3D terrain structure from the recorded video.

      Together this setup makes for an exciting rig that should enable state-of-the-art measurements of eye and body movements during locomotion. The results are presented clearly and convincingly and reveal a number of interesting statistical properties (summarized above) that are a direct result of human walking behavior.

      A weakness of the article concerns tying the behavioral results and statistical descriptions to insights about neural organization. Although the authors relate their findings about the statistics of retinal motion to previous literature, the implications of their findings for neural organization remain somewhat speculative and inconclusive. An efficient coding theory of visual motion would indeed suggest that some of the statistics of retinal motion patterns should be reflected in the tuning of neural populations in the visual cortex, but, as it stands, the present findings could not be convincingly tied to known findings about the neural code of vision. Thus, the behavioral results remain strong, but the link to neural organization principles appears somewhat weak.

      We agree, but we think that strengthening the neural links requires future studies. As mentioned above, it is very difficult to relate the measured statistics to existing neurophysiological literature and we have tried to make this clearer in the Discussion (p14, 15, 16). This is because the stimuli chosen are typically arbitrary and not chosen to be realistic examples of patterns consistent with natural motion across a ground plane. Other stimuli are simply inconsistent with self-motion together with gaze stabilization (e.g., not zero velocity at the fovea). It has also been technically difficult to map cell properties across the visual field. We have made the comparisons we thought were useful. The point of the paper is to provide a hypothesis about the pattern of direction and speed tuning across the visual field. So the challenge for neurophysiology is to show how the observed cell properties vary across the visual field. Note also that the motion patterns will be influenced by the body motion of the animal in question, and because of this we are now collaborating with a group who are attempting to record from monkey MT/MST during locomotion while tracking eyes and body. Similarly, we are training neural networks to learn the patterns generated by human gait to develop more specific hypotheses about receptive field properties.

      Reviewer #3 (Public Review):

      Gaze-stabilizing motor coordination and the resulting patterns of retinal image flow are computed from empirically recorded eye movement and motion capture data. These patterns are assessed in terms of the information that would be potentially useful for guiding locomotion that the retinal signals actually yield. (As opposed to the "ecological" information in the optic array, defined as independent of a particular sensor and sampling strategy).

      While the question posed is fundamental, and the concept of the methodology shows promise, there are some methodological details to resolve. Also, some terminological ambiguities remain, which are the legacy of the field not having settled on a standardized meaning for several technical terms that would be consistent across laboratory setups and field experiments.

      Technical limits and potential error sources should be discussed more. Additional ideas about how to extend/scale up the approach to tasks with more complex scenes, higher speed or other additional task demands and what that might reveal beyond the present results could be discussed.

      This issue is addressed in more detail in the Discussion, second paragraph, and also the second last paragraph.

    1. Author Response

      Reviewer #1 (Public Review):

      This work presents a unification model (of sorts) for explaining how the flow of evidence through networks can be controlled during decision-making. The authors combine two general frameworks previously used as neural models of cortical decision-making, dynamic normalization (which implements value encoding via firing activity) and recurrent network models (which capture winner-take-all selection processes), into a unified model called the local disinhibition-based decision model (LDDM). The simple motif of the LDDM allows for the disinhibition of excitatory cells that represent the engagement of individual actions that happens through a recurrent inhibitory loop (i.e., a leaky competing accumulator). The authors show how the LDDM works well at explaining both decision dynamics and the properties of cortical cells during perceptual decision-making tasks.

      All in all, I thought this was an interesting study with an ambitious goal. But like any good study, there are some open issues worth noting and correcting.

      MAJOR CONCERNS

      1. Big picture

      This was a comprehensive and extremely well-vetted set of theoretical experiments. However, the scope and complexity also made the take-home message hard to discern. The abstract and most of the introduction focus on the framing of LDDM as a hybrid of dynamic normalization models (DNM) and recurrent network models (RNMs). This is sold as a unification of value normalization and selection into a novel unified framework. Then the focus shifts to the role of disinhibition in decision-making. Then in the Discussion, the goal is stated as to determine whether the LDDM generates persistent activity and does this activity differ from RNMs. As a reader, it seems like the paper jumps between two high-level goals: 1) the unification of DNM and RNM architectures, and 2) the role of disinhibition. This constant changing makes it hard to focus as the reader goes on. So what is the big picture goal specifically?

      Also, the framing of value normalization and WTA as a novel computational goal is a bit odd as this is a major focus of the field of reinforcement learning (both abstractly at the computational level and more concretely in models of the circuits that regulate it). I know that the authors do not think they are the first to unify value judgements with selection criteria. The writing just comes across that way and should be clarified.

      We thank the Reviewer for their thoughtful consideration of the overall framing of the big picture goals of the paper. Upon reflection, we agree that the paper really centers on the importance of incorporating disinhibition into computational circuit-based models of decision-making. Thus, we have significantly revised the Introduction and Discussion to focus on the theoretical and empirical importance of incorporating disinhibition into computational models of decision-making, and use the integration of value normalization and WTA selection as an example of how disinhibition increases the richness of circuit decision models. Please see the response to recommendations below for more detail on the changes.

      2. Link to other models

      The LDDM is described as a novel unification of value normalization and winner-take-all (WTA) selection, combining value processing and selection. While the authors do an excellent job of referencing a significant chunk of the decision neuroscience literature (160 references!) the motif they end up designing has a highly similar structure to a well-known neural circuit linked to decision-making: the cortico-basal ganglia pathways. Extensive work over the past 20+ years has highlighted how cortical-basal ganglia loops work via disinhibition of cortical decision units in a similar way as the LDDM (see the work by Michael Frank, Wei Wei, Jonathan Rubin, Fred Hamker, Rafal Bogacz, and many others). It was surprising to not see this link brought up in the paper as most of the framing was on the possibility of the LDDM representing cortical motifs, yet as far as I know, there does not exist evidence for such architectures in the cortex, but there is in these cortical-basal ganglia systems.

      We thank the Reviewer for the suggestion to link the LDDM to disinhibition in cortico-basal ganglia (CBG) models; this is indeed an important body of empirical and computational work that we overlooked in the original manuscript. We have now added text to the Discussion to highlight the link between LDDM and these CBG disinhibition models, focusing on how they are conceptually similar and how they differ. Please see our response to recommendations below for a more detailed discussion of the revisions.

      3. Model evaluations

      The authors do a great job of extensively probing the LDDM under different conditions and against some empirical data. However, most of the time there is no "control" model or current state-of-the-art model that the LDDM is being compared against. In a few of the simulation experiments, the LDDM is compared against the DNM and RNM alone, so as to show how the two components of the LDDM motif compare against the holistic model itself. But this component model comparison is inconsistently used across simulation experiments.

      Also, it is worth asking whether the DNM and RNM are appropriate comparison models to vet the LDDM against for two reasons. First, these are the components of the full LDDM. So these tests show us how the two underlying architectural systems that go into LDDM perform independently, but not necessarily how the LDDM compares against other architectures without these features. Second, as pointed out in my previous comment, the LDDM is a more complex model, with more parameters, than either the DNM or RNM. The field of decision neuroscience is awash in competing decision models (including probabilistic attractor models, non-recurrent integrators, etc.). If we really want to understand the utility of the LDDM, it would be good to know how it performs against similarly complex models, as opposed to its two underlying component models.

      We greatly appreciate the Reviewer’s comments on the point of model comparison, which points out that our original manuscript failed to clearly convey a very important difference between the LDDM and the existing RNM(s). In the revision, we now make it clearer that the fundamental difference between the LDDM and the RNMs is the architecture of disinhibition (see the revised Introduction, especially p. 8 lines 164-168). The LDDM is not simply a combination of the DNM model with RNM architecture (a point we may have mistakenly conveyed in the original manuscript): the introduction of disinhibition separates LDDM inhibition into option-selective subpopulations, as opposed to the single pooled inhibition of RNM models. Given this fact, the LDDM predicts unique selective inhibition dynamics shown in recent optogenetic and calcium imaging results, a finding inconsistent with the common-pooled and non-selective inhibition assumed in the existing RNMs and many of their variants. Thus, we believe that a comparison between the LDDM and the RNM, which share a similar level of complexity and number of parameters, is important.

      We also appreciated the Reviewer’s concern about testing the LDDM against alternative models. In order to better connect to the existing literature, we now compare the LDDM to another standard circuit model of decision-making - the leaky competing accumulator (LCA) model. The LCA is a circuit model that captures many of the aspects of perceptual decision-making seen in the mathematical drift diffusion model (DDM), but with a construction that allows for fitting to behavioral data and comparison of underlying unit activities. Please see our response to recommendations below for further detail.

      4. Comparison to physiological data

      I quite enjoyed the comparisons of the excitatory cell activity to empirical data from the Shadlen lab experiments. However, these were largely qualitative in nature. In conjunction with my prior point on the models that the LDDM is being compared against, it would be ideal to have a direct measure of model fits that can be used to compare the performance of different competing "control" models. These measures would have to account for differences in model complexity (e.g., AIC or BIC), but such an analysis would help the reader understand the utility of the LDDM in connecting with empirical data much better.

      We agree with the Reviewer that a quantitative comparison of the match between model neural predictions and empirical neurophysiological data is important. First, we wish to clarify that the model neural predictions are simulated from models fit to the behavioral data (choice and RT), not from fits to the neural activity traces – a point we now clarify in the text. While directly fitting dynamic models (LDDM, RNM, or LCA) to the neurophysiological data is appealing, there are currently several obstacles to this approach. The first problem is the complexity of the dynamic neural traces. Despite the long history of the random-dot motion paradigm, detailed features of the dynamics are still not understood. For example, the stereotyped initial dip after stimulus onset may reflect a reset of the network state to improve signal-to-noise ratio (Conen and Padoa-Schioppa, 2015) or simply reflect a surround suppression-like lateral inhibition in visual processing. A second problem is that the primary difference between the models is the activity of inhibitory (and disinhibitory) neurons, which are typically not recorded in neurophysiological experiments; thus, there is a lack of empirical data to which to fit the models. In the revision, we clarified that the model fitting to the Roitman & Shadlen data is for behavioral data only, and model unit activity traces are derived from models fit to behavioral data.

      That being said, we agree that a quantitative comparison of model activity predictions is helpful. Because the models are fit not to the neural data but to the behavioral data, rather than using likelihood-based measures like AIC and BIC we used a simple RMSE measure to compare the match between predicted and neural activity patterns (revised Fig. 6E, Fig 6-S4E, Fig 6-S5E). Please see response to recommendations below for details.
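
      As a minimal sketch of the kind of RMSE comparison described above, the following computes the root-mean-square error between a predicted activity trace and a recorded one. The function name `rmse` and the example traces are hypothetical illustrations, not the authors' data or analysis code.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length activity traces."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("traces must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical firing-rate traces (arbitrary units), for illustration only.
observed_trace = [10.0, 12.0, 15.0, 20.0]
model_a_trace = [11.0, 12.0, 14.0, 22.0]
model_b_trace = [8.0, 15.0, 19.0, 14.0]

rmse_a = rmse(model_a_trace, observed_trace)
rmse_b = rmse(model_b_trace, observed_trace)

# The model with the lower RMSE matches the recorded trace more closely.
better_model = "A" if rmse_a < rmse_b else "B"
```

      Unlike AIC or BIC, this measure carries no penalty for model complexity, which is acceptable here only because the models are not fit to the neural traces being compared.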

      Reviewer #2 (Public Review):

      The aim of this article was to create a biologically plausible model of decision-making that can both represent a choice's value and reproduce winner-take-all ramping behavior that determines the choice, two fundamental components of value-based decision-making. Both of these aspects have been studied and modeled independently but empirical studies have found that single neurons can switch between both of the aspects (i.e., from representing value to winner-take-all ramping behavior) in ways that are not well described by current biologically plausible models of decision making.

      The current article provides a thorough investigation of a new model (the local disinhibition decision model; LDDM) that has the goal of combining value representations and winner-takes-all ramping dynamics related to choice. Their model uses biologically plausible disinhibition to control the levels of inhibition in a local network of simulated neurons. Through a careful series of simulation experiments, they demonstrate that their network can first represent the value of different options, then switch to winner-takes-all ramping dynamics when a choice needs to be made. They further demonstrate that their single model reproduces key components of value-based and winner-takes-all dynamics found in both neural and behavioral data. They additionally conduct simulation studies to demonstrate that recurrent excitatory properties in their network produce value-persistence behavior that could be related to memory. They end by conducting a careful simulation study of the influence of GABA agonists that provide clear and testable predictions of their proposed role of inhibition in the neural processes that underlie decision-making. This last piece is especially important as it provides a clear set of predictions and experiments to help support or falsify their model.

      There are overall many strengths to this paper. As the authors note, current network models do not explain both value-based and ramping-like decision-making properties. Their thorough simulation studies and their validation against empirical neural and behavioral data will be of strong interest to neuroscientists and psychologists interested in value-based decision-making. The simulations related to persistence and the GABA-agonist experiments they propose also provide very clear guidelines for future research that would help advance the field of decision-making research.

      Although the methods and model were generally clear, there was a fair amount of emphasis on the role of recurrence in the LDDM, but very little evidence that recurrence was important or necessary for any of the empirical data examined. The authors do demonstrate the importance of recurrence in some of their simulation studies (particularly in their studies of persistence), but these would need to be compared against empirical data to be validated. Nevertheless, the model and thorough simulation investigations will likely help develop more precise theories of value-based decision-making.

      We appreciate the Reviewer’s thoughtful comments. These comments - especially about anatomic recurrence and its relationship to the parameter 𝛼 - inspired us to think more about what distinguishes the current circuit from others, especially the implications related to the parameters 𝛼 (i.e., self-excitation) and 𝛽 (i.e., local disinhibition). Recurrence is required to drive winner-take-all competition in the standard RNM of decision-making. However, we show here with both analytical and numerical approaches that recurrence helps WTA competition but is not necessary in our model. Instead, the key feature of the LDDM is to utilize disinhibition in conjunction with lateral inhibition to realize winner-take-all competition. That leads to many predictions that distinguish the current model from existing models, such as selective inhibition and flexible control of dynamics.

      In response to the Reviewer’s points and after careful consideration of the differential equations, we realized that in our model fitting, the 𝛼 parameter fitting to zero does not necessarily mean recurrence should be zero. The 𝛼 parameter shares a lot of similarity to the baseline gain control (parameter BG in our revision), and thus is unidentifiable in the current dataset. In the interest of parsimony, we did not include the parameter BG in the original manuscript, but now include it because it reveals the difficulty of interpreting fit 𝛼 values as simply the level of recurrence.

      Overall, disinhibition (𝛽) in the LDDM is required for WTA activity while recurrence (𝛼) can contribute but is not necessary; however, 𝛼 is theoretically important for generating persistent activity, with the caveat that in the current framework there is an unclear relationship between fit 𝛼 and recurrence. Regardless, we agree that the contribution of 𝛼 to the LDDM framework is worth further testing and examining with future empirical data.

      Reviewer #3 (Public Review):

      Shen et al. attempt to reconcile two distinct features of neural responses in frontoparietal areas during perceptual and value-guided decision-making into a single biologically realistic circuit model. First, previous work has demonstrated that value coding in the parietal cortex is relative (dependent on the value of all available choice options) and that this feature can be explained by divisive normalization, implemented using adaptive gain control in a recurrently connected circuit model (Louie et al, 2011). Second, a wealth of previous studies on perceptual decision-making (Gold & Shadlen 2007) have provided strong evidence that competitive winner-take-all dynamics implemented through recurrent dynamics characterized by mutual inhibition (Wang 2008) can account for categorical choice coding. The authors propose a circuit model whose key feature is the flexible gating of 'disinhibition', which captures both types of computation - divisive normalization and winner-take-all competition. The model is qualitatively able to explain the 'early' transients in parietal neural responses, which show signatures of divisive normalization indicating a relative value code, persistent activity during delay periods, and 'late' accumulation-to-bound type categorical responses prior to the report of choice/action onset.
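
      To illustrate what a relative value code means, here is a minimal sketch of the generic, static textbook form of divisive normalization, in which each option's response is its value divided by a constant plus the summed value of all options. This is an assumption-laden illustration: the names `sigma` and `r_max` and the example values are hypothetical, and the code does not reproduce the dynamic circuit equations of the manuscript under review.

```python
def divisive_normalization(values, sigma=1.0, r_max=1.0):
    """Static divisive normalization: R_i = r_max * V_i / (sigma + sum_j V_j).

    sigma (semi-saturation constant) and r_max (maximum response) are
    illustrative parameters, not values taken from the manuscript.
    """
    total = sum(values)
    return [r_max * v / (sigma + total) for v in values]


# Two options of equal value.
two_options = divisive_normalization([2.0, 2.0])

# Adding a third option lowers the response to the first option even though
# its own value is unchanged, which is the signature of relative value coding.
three_options = divisive_normalization([2.0, 2.0, 4.0])
```

      The drop in the first option's response between the two calls is the context dependence that the reviewer summarizes as value coding being "dependent on the value of all available choice options".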

      The attempt to integrate these two sets of findings by a unified circuit model is certainly interesting and would be useful to those who seek a tighter link between biologically realistic recurrent neural network models and neural recordings. I also appreciate the effort undertaken by the authors in using analytical tools to gain an understanding of the underlying dynamical mechanism of the proposed model. However, I have two major concerns. First, the manuscript in its current form lacks sufficient clarity, specifically in how some of the key parameters of the model are supposed to be interpreted (see point 1 below). Second, the authors overlook important previous work that is closely related to the ideas that are being presented in this paper (see point 2 below).

      1) The behavior of the proposed model is critically dependent on a single parameter 'beta' whose value, the authors claim, controls the switch from value-coding to choice-coding. However, the precise definition/interpretation of 'beta' seems inconsistent in different parts of the text. I elaborate on this issue in sub-points (1a-b) below:

      1a). For instance, in the equations of the main text (Equations 1-3), 'beta' is used to denote the coupling from the excitatory units (R) to the disinhibitory units (D). However, in the main figures (Fig 2) and in the methods (Equations 5-8), 'beta' is instead used to refer to the coupling between the disinhibitory (D) and the inhibitory gain control units (G). Based on my reading of the text (and the predominant definition used by the authors themselves in the main figures and the methods), it seems that 'beta' should be the coupling between the D and G units.

      1b). A more general and critical issue is the failure to clearly specify whether this coupling of D-G units (parameterized by 'beta') should be interpreted as a 'functional' one, or an 'anatomical' one. A straightforward interpretation of the model equations (Equations 5-8) suggests that 'beta' is the synaptic weight (anatomical coupling) between the D and G units/populations. However, significant portions of the text seem to indicate otherwise (i.e., a 'functional' coupling). I elaborate on this in subpoints (i-iii) below:

      (1b-i). One of the main claims of the paper is that the value of 'beta' is under 'external' top-down control (Figure 2 caption, lines 124-126). When 'beta' equals zero, the model is consistent with the previous DNM model (dynamic normalization, Louie et al 2011), but for moderate/large non-zero values of 'beta', the network exhibits WTA dynamics. If 'beta' is indeed the anatomical coupling between D and G (as suggested by the equations of the model), then, are we to interpret that the synaptic weight between D-G is changed by the top-down control signal within a trial? My understanding of the text suggests that this is not in fact the case. Instead, the authors seem to want to convey that top-down input "functionally" gates the activity of D units. When the top-down control signal is "off", the disinhibitory units (D) are "effectively absent" (i.e., their activity is clamped at zero as in the schematic in Fig 2B), and therefore do not drive the G units. This would in turn be equivalent to there being no "anatomical coupling" between D and G. However when the top-down signal is "on", D units have non-zero activity (schematic in Fig 2B), and therefore drive the G units, ultimately resulting in WTA-like dynamics.

      (1b-ii). Therefore, it seems like when the authors say that beta equals zero during the value coding phase they are almost certainly referring to a functional coupling from D to G, or else it would be inconsistent with their other claim that the proposed model flexibly reconfigures dynamics only through a single top-down input but without a change to the circuit architecture (reiterated in lines 398-399, 442-444, 544-546, 557-558, 579-590). However, such a 'functional' definition of 'beta' would seem inconsistent with how it should actually be interpreted based on the model equations, and also somewhat misleading considering the claim that the proposed network is a biologically realistic circuit model.

      (1b-iii). The only way to reconcile the results with an 'anatomical' interpretation of 'beta' is if there is a way to clamp the values of the 'D' units to zero when the top-down control signal is 'off'. Considering that the D units also integrate feed-forward inputs from the excitatory R units (Fig 2, Equations 1-3 or 5-8), this can be achieved either via a non-linearity, or if the top-down control input multiplicatively gates the synapse (consistent with the argument made in lines 115-116 and 585-586 that this top-down control signal is 'neuromodulatory' in nature). Neither of these two scenarios seems to be consistent with the basic definition of the model (Equations 1-3), which therefore confirms my suspicion that the interpretation of 'beta' being used in the text is more consistent with a 'functional' coupling from D to G.

      We thank the reviewer for pointing out this confusion. We apologize that the original illustrations (Fig. 2A) and the differential equations in Methods (Eqs. 5-8) did not convey our ideas well. 𝛽 is intended to refer to the coupling from R to D, not to a change in the weights between D and G units. We realize there was some confusion on this point due to inconsistency between our original figures, text, and supplementary material.

      Given the lack of clarity in the previous version as well as the Reviewer’s questions, we now emphasize that 𝛽 represents a functional coupling between the R and D neurons. The biological assumption of the disinhibitory architecture is based on recent findings that VIP neurons in the cortex always inhibit other neighboring inhibitory cells, such as SST and PV neurons, and consequently disinhibit the neighboring primary neurons (e.g., Fu et al., 2014; Karnani et al., 2014, 2016). We did not see evidence in the literature of fast-changing (anatomical) connections between VIP and SST/PV. However, there is evidence that the responsiveness of VIP neurons to excitatory neurons can be modulated by changing the concentrations of neuromodulators, such as acetylcholine and serotonin (Prönneke et al., 2020). While the stereotype of neuromodulator action is slow dynamics, recent findings show that, for example, basal forebrain cholinergic neurons respond to reward and punishment with surprising speed and precision (18 ± 3 ms) (Hangya et al., 2015) to modulate arousal, attention, and learning in the neocortex. Given the large number of studies that identify long-term projections and neuromodulatory inputs to VIP neurons (e.g., Pfeffer et al., 2013; Pi et al., 2013; Alitto & Dan, 2013; Tremblay et al., 2016), we believe it is more plausible to assume that the connection weights between R and D in our case are quickly modulated within a trial.

      To clarify this issue in the revised manuscript, we made the following corrections:

      1. We repositioned the 𝛽 parameter in Fig. 2A onto the connection from R to D, to align with the description in the main text of 𝛽 modulating the R-to-D connection.

      2. We modified the differential equations 5-8 (now numbered as Eqs. 28-32) in Methods (p. 61) to include the disinhibitory unit D as an independent control from the inhibitory unit I, in order to be consistent with the disinhibitory D units in LDDM. This change makes only small differences in the model predictions (please see dynamics simulated after the change in Fig. 2-figure supplement 1B).

      3. We updated the neural circuit motif in Fig. 2-figure supplement 1A accordingly.
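
      As an illustration only, the following minimal sketch shows how a gain 𝛽 on the R-to-D connection could act as a functional gate. The model equations are not reproduced in this response, so the specific form below (first-order rate equations, divisive gain control of R by G, subtractive input from D to G, rectification of G at zero) and all parameter values are assumptions made for the sketch. `simulate_lddm_sketch` is a hypothetical name, not the code used for the manuscript.

```python
def simulate_lddm_sketch(v, beta, alpha=0.0, w=1.0, tau=1.0, dt=0.01, t_end=50.0):
    """Euler-integrate an ASSUMED R/G/D rate circuit for option values v.

    Assumed equations (illustrative, not taken from the manuscript):
        tau * dR_i/dt = -R_i + (v_i + alpha * R_i) / (1 + G_i)
        tau * dG_i/dt = -G_i + w * sum_j R_j - D_i
        tau * dD_i/dt = -D_i + beta * R_i
    """
    n = len(v)
    r = [0.0] * n  # excitatory units R
    g = [0.0] * n  # inhibitory gain-control units G
    d = [0.0] * n  # disinhibitory units D
    for _ in range(int(round(t_end / dt))):
        pooled = w * sum(r)  # every G unit pools all R units with weight w
        dr = [(-r[i] + (v[i] + alpha * r[i]) / (1.0 + g[i])) / tau for i in range(n)]
        dg = [(-g[i] + pooled - d[i]) / tau for i in range(n)]
        dd = [(-d[i] + beta * r[i]) / tau for i in range(n)]  # beta gates R -> D
        r = [r[i] + dt * dr[i] for i in range(n)]
        # Assumed rectification: keeps the divisor 1 + G at or above 1.
        g = [max(0.0, g[i] + dt * dg[i]) for i in range(n)]
        d = [d[i] + dt * dd[i] for i in range(n)]
    return r, g, d


if __name__ == "__main__":
    for beta in (0.0, 0.5):
        r, g, d = simulate_lddm_sketch([2.0, 1.0], beta=beta)
        print(f"beta={beta}: R={r}, G={g}, D={d}")
```

      With `beta=0.0` the D units stay at zero, and with the defaults `alpha=0.0` and `w=1.0` the steady state satisfies R_i = V_i / (1 + sum of R), which is a divisive-normalization regime. A nonzero `beta` lets D subtract from the drive to G. Whether that yields winner-take-all behavior depends on parameter values that are not specified in this response.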

      2) The main contribution of the manuscript is to integrate the characteristics of the dynamic normalization model (Louie et al, 2011) and the winner-take-all behavior of recurrent circuit models that employ mutual inhibition (Wang, 2008), into a circuit motif that can flexibly switch between these two computations. The main ingredient for achieving this seems to be the dynamical 'gating' of the disinhibition, which produces a switch in the dynamics, from point-attractor-like 'stable' dynamics during value coding to saddle-point-like 'unstable' dynamics during categorical choice coding. While the specific use of disinhibition to switch between these two computations is new, the authors fail to cite previous work that has explored similar ideas that are closely related to the results being presented in their study. It would be very useful if the authors can elaborate on the relationship between their work and some of these previous studies. I elaborate on this point in (a-b) below:

      2a) While the authors may be correct in claiming that RNM models based on mutual inhibition are incapable of relative value coding, it has already been shown previously that RNM models characterized by mutual inhibition can be flexibly reconfigured to produce dynamical regimes other than those that just support WTA competition (Machens, Romo & Brody, 2005). Similar to the behavior of the proposed model (Fig 9), the model by Machens and colleagues can flexibly switch between point-attractor dynamics (during stimulus encoding), line-attractor dynamics (during working memory), and saddle-point dynamics (during categorical choice) depending on the task epoch. It achieves this via a flexible reconfiguration of the external inputs to the RNM. Therefore, the authors should acknowledge that the mechanism they propose may just be one of many potential ways in which a single circuit motif is reconfigured to produce different task dynamics. This also brings into question their claim that the type of persistent activity produced by the model is "novel", which I don't believe it is (see Machens et al 2005 for the same line-attractor-based mechanism for working memory)

      We thank the Reviewer for pointing out the conceptual similarities between the LDDM and the Machens Romo Brody model, and now include a discussion of the link between the two early in the revised Discussion (p. 38, lines 826-837). Please see response to recommendations below for a more detailed discussion of this point.

      2b) The authors also fail to cite or describe their work in relation to previous work that has used disinhibition-based circuit motifs to achieve all 3 proposed functions of their model: (i) divisive normalization (Litwin-Kumar et al, 2016), (ii) flexible gating/decision making (Yang et al, 2016), and (iii) working memory maintenance (Kim & Sejnowski, 2021).

      The Reviewer notes several relevant papers, and we have now discussed them and their relationship to the LDDM in a revised Discussion section (pp. 35-36). Please see response to recommendations below for more details.

    1. Author Response

      Reviewer #2 (Public Review):

      The two new micropeptides are well characterized in the manuscript and appear to be functionally important with some chromatin-level consequences of their loss (which can be either direct or indirect), but the finding that lincRNA sequences encode micropeptides is not novel, and the two described in the paper appear to be zebrafish-specific and their function was tested only in zebrafish, which limits the interest in these genes. The use of ribosome profile data along with behavioral screening to identify micropeptides is interesting and important, but the scope of the screen, the candidates selected for testing, etc. are not clear enough as presented. The ChIP-seq analysis of the new proteins is very interesting but is not described in any detail. Overall, the experimental part is well designed and the phenotypes reported by the authors appear to be strong and convincing, but the mechanistic understanding of what the two new proteins do and how, and the general interest in the results given the current scope of understanding of micropeptides, are limited.

      We apologize for the misunderstanding that these genes are zebrafish-specific. In this revision, we have clarified throughout the text and with additional data that these genes are not zebrafish-specific, but that linc-mipep and linc-wrb are homologous to human Hmgn1.

    1. Author Response

      Reviewer #1 (Public Review):

      Francou et al. examine the dynamics of cell ingression at the primitive streak during mouse gastrulation and correlate this with the localization of elements of the apical Crumbs complex and the actomyosin cytoskeleton. Using time-lapse live imaging, they show that cells at the primitive streak ingress in a stochastic manner, by constricting their apical surface through a ratcheting shrinkage of individual junctions. Meticulous evaluation of immunofluorescent staining for many elements of the actomyosin contractile process as well as junctional and apical domain elements reveals anisotropic localization of Crumbs2, ZO1, and ppMLC. In addition, the localization of two groups of proteins showed a close correlation - actomyosin regulators and apical and junctional components - but there was a lack of correlation of localization of these two groups of proteins to each other. The localization of actomyosin and its activity, was altered and more homogeneous in Crumbs2-/- embryos, and there was a significant decrease in aPKC and Rock1. The authors conclude from these observations that Crumbs2 regulates anisotropic actomyosin contractility to promote apical constriction and cell ingression.

      The strengths of this manuscript are the very detailed observations on the process of apical constriction and the meticulous evaluation of the localization of the many proteins likely to be involved in the process. While many of the general observations are not new, Francou et al. provide a much richer understanding of this process, as well as a paradigm with which to evaluate the effects of mutations on the gastrulation process. The figures are beautiful, clear, and informative, and support the conclusions made by the authors. The data provide a very compelling picture of both the dynamics of cell behavior and the anisotropies in protein localization associated with it.

      However, much of the Crumbs2 mutant phenotype is not sufficiently explained by the authors' data or conclusions. First, the loss of Crumbs2 does not prevent ingression, as there are mesoderm cells evident between the epiblast and endoderm (Ramkumar et al., 2016, Xiao et al., 2011). There are certainly fewer, and the biggest effect appears to be during the elongation of the axis from E7.75 onward and not during the earlier migratory period (E6.5-E7.75) according to data from both previously published work (Xiao et al., 2011; Ramkumar et al., 2015, 2016) and the data presented here.

      • The reviewer makes a good point regarding the defects observed in Crumbs2 mutant embryos. It is true that in this mutant, a first wave of gastrulation EMT, taking place around E6.5, does not appear to be affected. We interpret this to mean that the gastrulation EMT is a sequential process under differential regulation, and that Crumbs2 is not required for the first wave of cell ingression through the primitive streak, at the onset of gastrulation. Consequently, a small number of early mesodermal cells are produced in Crumbs2 mutants. However, within 24 hours of the onset of gastrulation, corresponding to around E7.75, ingression defects are evident in Crumbs2 mutant embryos.

      • For simplicity, these distinct sequential phases of gastrulation regulation, initially independent of Crumbs2, but subsequently dependent, were not initially discussed in our manuscript. We have now elaborated these details in the revised manuscript.

      Nor does the loss of Crumbs2 prevent apical constriction. Ramkumar et al. in their 2016 paper show by live imaging that the major effect of the Crumbs2 mutation is to prevent the cells from detaching from the epithelium, but that the apical domain does undergo constriction, leading to many elongated flask-shaped cells still attached at the apical end. These observations do not fit well with the model proposed by the authors of Crumbs2 regulating anisotropic actomyosin contractility to promote apical constriction and suggest a more complicated story.

      • We thank the reviewer for bringing this up, as it is an important point that we now discuss in greater detail and clarify in the revised manuscript.

      • Importantly, we do not believe our data are in disagreement with the previous study of Ramkumar et al. The precise details of the defect observed in Crumbs2 mutants are still not totally clear. However, we would like to point out that in Ramkumar et al., the timelapse imaging data did not depict cells constricting their surfaces, but rather these data revealed that cells having small apical surfaces failed to detach and delaminate out of the epiblast layer. Thus, this previous study focused on the subsequent step in the process of ingression (delamination), to that being addressed in the present work.

      • Furthermore, epiblast cells outside the domain occupied by the primitive streak, and even some cells positioned on the lateral sides of the embryo, were reported by Ramkumar and colleagues to exhibit abnormally small apical surfaces in Crumbs2 mutants. These cells, at a distance from the primitive streak, will not normally constrict their apical surfaces, since they are not going to undergo the gastrulation EMT, a behavior restricted to the region of the primitive streak. Thus, these previous data do not directly address nor demonstrate that epiblast cells in Crumbs2 mutants undergo apical constriction.

      • Moreover, in Crumbs2 mutants a large number of cells were reported to fail to ingress at the primitive streak, and consequently they were seen to accumulate within the epiblast epithelial layer. Indeed, we believe that the small apical surfaces first reported in Crumbs2 mutants by Ramkumar and colleagues, most likely result from the crowding/jamming of cells within the epiblast layer, and that this causes changes in the shape and volume of cells due to them being spatially constrained. Thus, increased crowding of epithelial cells within a spatially constrained tissue, likely drives a reduction in apical surface area and extensive apico-basal elongation, as observed in Crumbs2 mutants.

      However, the complications of the Crumbs2 mutant do not detract from the value of the basic observations presented in this manuscript, which are solid and well-documented, and will be a valuable resource for the field.

      Reviewer #2 (Public Review):

      In their manuscript, Francou and colleagues study the delamination of epiblast cells into the mesodermal layers using live imaging of mouse embryos cultured ex vivo. By segmenting the apical area of delaminating cells, they quantify extensively the dynamic behavior of delaminating cells. Using immunostaining and crumbs2 mutants, they propose that apical constriction of cells results from pulsed contractions, which could be guided by crumbs2 signals.

      The manuscript is interesting and provides extremely valuable data for our understanding of mouse gastrulation. Occasionally, the manuscript can be a bit confusing and contains a few inaccuracies.

      However, the main issues I have are with some of the interpretations from the authors, which may be incorrect due to limited time resolution (with a 5 min time resolution that was used, it might be difficult to distinguish pulses from measurement noise) and the analysis of immunostaining data, which would require more rigorous quantification.

      • We acknowledge the reviewer’s comments and agree that a shorter time resolution would be ideal to facilitate the detection of constriction pulses of apical surfaces. However, we need to consider that imaging the apical surface of cells within the epiblast layer, which constitutes the most internal surface inside the embryo, is technically challenging in a gastrulating mouse embryo.

      • As suggested by the reviewer, we attempted to image with a time interval shorter than 5 min on several different microscope systems and modalities available at our institution (including two different laser point scanning confocals, a spinning disc system, as well as light-sheet microscopes with both upright and inverted configurations) and were not successful in acquiring usable images (having a shorter time resolution) with the ZO1GFP knock-in reporter. We also need to consider that single-copy GFP knock-in reporters are often dim, thereby exacerbating the issue. In our hands, a high-speed resonant scanning confocal (Nikon A1RHD25) was the system that gave us the best signal-to-noise ratio, spatial resolution and temporal resolution, and was the set-up we used for our most recent live imaging experiments. Using this system, we were able to acquire a limited number of time-lapses with a time resolution of 2 min, but none with a shorter time interval, and from our analyses, we determined that movies with a 2 min time interval did not yield sufficient additional detail over movies with 5 min time intervals to warrant a detailed reanalysis. We have provided additional detail relating to these technical issues within the revised manuscript and edited some of the conclusions.

      • We acknowledge that immunostaining is not the most quantitative method, but we were unable to come up with alternative methods that can be used with our samples. We believe the junctional reduction of Myosin, aPKC and Rock1 is generally due to a lack of recruitment or activation of these proteins at junctions, and does not reflect their reduced expression at the gene or protein level. We do not believe that methods such as RT-qPCR or Western blotting would be informative in the context in which we are looking, especially since they do not yield spatial resolution. Furthermore, we would need to isolate primitive streak cells to consider applying these methods, and we do not believe they would provide a sufficient improvement over immunostaining.

      • In contrast to the live imaging, which was performed by placing the objective at the posterior side of the embryo in closest proximity to the outer visceral endoderm layer, for fixed tissue imaging, embryos were microdissected to recover the posterior side containing the primitive streak. Microdissected posterior regions were imaged on the side of the cavity by placing the objective in closest proximity to the inner epiblast layer, which permitted direct access to the apical surface of epiblast cells at the primitive streak. In this fixed tissue imaging configuration, the apical surfaces of cells in WT and Crumbs2 mutants were in closest proximity to the imaging objective and thus directly accessible. Thus, any difference in tissue thickness on the other side of the epithelium did not interfere with light penetration. We have edited the figures and included schematics to clarify how the objective positions are flipped with respect to the primitive streak regions at the embryo’s posterior for live vs. fixed tissue imaging.

      • We have now measured the signal intensity in the cytoplasmic region of WT and Crumbs2 mutant embryos, and junctional intensity measurements have been normalized to cytoplasmic intensities.
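
      A minimal sketch of this kind of normalization is given below, assuming that each measurement is a list of pixel intensities and that the normalized value is the ratio of the mean junctional intensity to the mean cytoplasmic intensity. The function name and the example numbers are hypothetical, and the exact procedure used for the manuscript may differ.

```python
from statistics import mean


def normalized_junctional_intensity(junction_pixels, cytoplasm_pixels):
    """Return mean junctional intensity divided by mean cytoplasmic intensity.

    Both arguments are sequences of pixel intensities from the same image,
    so the ratio is unitless and comparable across embryos that were
    stained or imaged at different overall brightness.
    """
    cytoplasm_mean = mean(cytoplasm_pixels)
    if cytoplasm_mean <= 0:
        raise ValueError("cytoplasmic mean intensity must be positive")
    return mean(junction_pixels) / cytoplasm_mean


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    print(normalized_junctional_intensity([300, 500], [100, 100]))
```

      Dividing by the cytoplasmic signal from the same image is one simple way to make a uniform change in overall staining or illumination cancel out of the comparison between genotypes.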

      Reviewer #3 (Public Review):

      The manuscript by Francou et al investigated cellular mechanisms of epiblast ingression during mouse gastrulation. The authors wanted to know whether/how epiblast cell-cell junctional dynamics correlate with apical constriction and subsequent ingression. Because mouse gastrula adopts an inverted-cup morphology (as a result of differential invasive behavior of polar and mural trophoblast cells), epiblast cells are located in the innermost position and are difficult to image. This is more so when one wants to perform live imaging of epiblast cells' apical surface. The authors tackled such problems/limitations by using a combination of ZO-1 GFP line, confocal time-lapse microscopy, fixed embryo immunostaining, and Crumbs2 mutant embryos. The authors observed that apical constriction was associated with cell ingression, that this constriction occurred in a pulsed fashion (i.e., 2-4 cycles with phases of contraction and expansion, eventually leading to reduction of apical surface and ingression), that this constriction took place asynchronously (i.e., neighboring epiblast cells did not exhibit coordinated behavior) and that junctional shrinkage during apical constriction also occurred in a pulsed and asynchronous manner. The authors also investigated localization/co-localization of several apical proteins (Crumbs2, Myosin2B, pMLC, ppMLC, Rock1, F-actin, PatJ, and aPKC) in fixed samples, uncovering somewhat reciprocal distribution of two groups of proteins (represented by Myosin2B in one group, and Crumbs2 in the other). Finally, the authors showed that Crumbs2 -/- embryos had disturbed actomyosin distribution/levels without affecting junctional integrity (partially explaining the ingression defect reported in Crumbs2 -/- mutant embryos). Overall, this manuscript offers high-quality live imaging data on the dynamic remodeling of epiblast apical junctions during mouse gastrulation.

      It would be interesting to see whether phenomena reported in this manuscript can be extended to the entire primitive streak (or are they specific only to a subset of mesoderm precursors) and to the entire period of mesendoderm formation. More importantly, it would be interesting to see whether the ingression behavior seen here is representative of all eutherian mammals regardless of their gastrular topography.

      • The reviewer raises a very interesting and important point. We focused our data analysis on a middle region in the proximo-distal axis of the embryo, because this is the most optically accessible and the flattest region of the posterior of the embryo to analyze. We also focused on the E7.5 stage of development when the primitive streak is fully elongated, so as to capture as many ingression events within a single time-lapse experiment as possible. Due to the difficulties associated with live imaging the apical epiblast layer of embryos at these stages, we chose to focus our analysis on a defined region of the embryo and a defined period of time. We acknowledge that it will be important to analyze different regions of the primitive streak and at different stages of gastrulation to glean any general versus more distinct modes of epiblast cell ingression, but given the technical difficulties discussed we believe that any extended analysis is beyond the scope of the current study.

      • We also agree that it would be interesting to know if the ingression behavior we observe in the mouse embryo is representative of all mammals, and even more generally of amniotes, but this is beyond the scope of our study.

    1. Author Response

      Reviewer #2 (Public Review):

      Throughout the manuscript, the authors aim to distinguish signal from the lack of it. All conclusions depend on the success of this process. In such an endeavor, the sensitivity of the applied methods is critical. Thus, the authors must use the most sensitive tools to draw meaningful conclusions. The latest iGluSnFR has amazing sensitivity allowing the detection of single AP-evoked responses. This is not the case for vGpH, which requires a hundred APs to get a meaningful signal. Similarly, synthetic Ca2+ dyes have much better dynamic range, linearity and sensitivity compared to GCaMP6f.

      The rate of silent boutons at 2 mM [Ca2+]e is lower for a single AP compared to 20 or 200 APs. The overall failure rate cannot be increased with increasing the number of APs. This clearly indicates a technical issue (e.g. insufficient sensitivity of vGpH and GCaMP6f).
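
      The reasoning behind this point can be made concrete under a deliberately simplified assumption: if each AP triggers release independently with a fixed probability p (ignoring depletion, recovery, and facilitation, none of which are characterized here), then the probability that a bouton shows no release at all after n APs is (1 - p)^n, which can only fall as n grows. The sketch below uses hypothetical names and values.

```python
def failure_probability(p_release, n_aps):
    """Probability of zero release events in n_aps action potentials.

    Assumes independent release with a fixed per-AP probability p_release,
    which is an idealization: real terminals show depletion and facilitation.
    """
    if not 0.0 <= p_release <= 1.0:
        raise ValueError("p_release must lie in [0, 1]")
    if n_aps < 0:
        raise ValueError("n_aps must be non-negative")
    return (1.0 - p_release) ** n_aps


if __name__ == "__main__":
    # Hypothetical per-AP release probability, for illustration only.
    for n in (1, 20, 200):
        print(n, failure_probability(0.1, n))
```

      Under this idealization, a terminal that appears silent after a long train but responsive to a single AP points to a difference in detection sensitivity between methods rather than a difference in release.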

      We thank the reviewer for raising this concern. We attribute the relatively lower rate of silencing with 1 AP in [Ca2+]e 2.0 mM in neurons expressing iGluSnFR to its sensitivity to detect glutamate exocytosed from neighboring, possibly non-transfected terminals. This limitation is described in the manuscript (page 7, line 26 – page 8, line 5). The overall agreement in the proportion of silencing with iGluSnFR compared to physin-GCaMP or vGpH at lower [Ca2+]e, where the contributions from neighboring terminals are likely greatly diminished, supports this interpretation.

      The authors used three different measuring tools and used three different stimulation protocols, making the interpretation of the data challenging. It is impossible to tell how the failure rate changes from 1 to 20 APs without knowing the release probability, the pool size, depletion, recovery of SVs, and facilitation. These are all unknown.

      In an ideal world, a measure of release probability during a train of stimuli at varied [Ca2+]e would provide the most insight, but this is difficult to achieve with any of the existing methods, including the remarkable new iGluSnFR. The challenge we face is that, with our approach, it is impossible to exclude signals from neighboring axons that are closely packed near the axon harboring the indicator. This limitation is described in the manuscript (page 7, line 26 – page 8, line 5). Given this, we felt that showing that silencing can be revealed with all the different techniques was the most conservative approach to address the issue. Because we have focused on this phenomenon, the number of APs is experimentally important only to ensure that an adequate response could be detected. We have also included, in the discussion, an acknowledgement of the possibility that we are failing to detect minimal Ca2+ entry (see response to #8 from the synthesized review).

      The last experiment with the GABAB agonist has little novelty in its present form. The authors demonstrate that GABAB agonism increases the rate of silent terminals. The interesting issue would be to reveal how the effect of GABAB activation depends on the [Ca2+]e. This information is essential to see whether there is indeed a shoulder in its effectiveness curve.

      We are grateful to the reviewer for this recommendation and we have performed additional experiments (see response to #7 from the synthesized review).

      The authors refer to a theoretical set-point in [Ca2+]e below which the function of the terminals is fundamentally different. From the presented experiments, the reviewer does not see any data that is inconsistent with a continuum. 'Thus, as with Ca2+ influx, SV recycling is modulated in an all-or-none manner by modest changes in [Ca2+]e around the physiological set point.' This statement is not supported by the data. The reviewer cannot see a set point.

      We appreciate the reviewer’s criticism and wish to clarify that we mean the normal physiologic [Ca2+]e in the CSF. We have changed the text to clarify this point (page 7, line 20).