  1. Aug 2024
    1. eLife assessment

      Working with a diverse panel of field-grown rice accessions, this valuable study measures changes in transcript abundance, tests for patterns of selection on gene expression, and maps the genetic basis of variation in gene expression in normal and high salinity conditions. The authors provide solid evidence that salinity treatment increases the number of genes with mean expression levels away from the optimum, and that a relatively small number of genes are hotspots for genetic variants that affect genome-wide patterns of variation in gene expression under high salinity. The design, clarity, and interpretation of several statistical analyses can be improved, additional opportunities for integration among datasets and analyses could be realized, and genetic manipulation would be required to confirm the functional involvement of any specific genes in regulatory networks or organismal traits that confer adaptation to higher salinity conditions. The manuscript will not only be of interest to evolutionary biologists studying the genetics of complex traits, but it will also be a resource for plant biologists studying mechanisms of abiotic stress tolerance.

    2. Reviewer #1 (Public Review):

      Summary:

      Understanding the mechanisms of how organisms respond to environmental stresses is a key goal of biological research. Assessment of transcriptional responses to stress can provide some insights into those underlying mechanisms. The researchers quantified traits, fitness, and gene expression (transcriptional) response to salinity stress (control vs stress treatments) for 130 accessions of rice (three replicates for each accession), which were grown in the field in the Philippines. This experimental design allowed for many different types of downstream analyses to better understand the biology of the system. These analyses included estimating the strength of selection imposed on transcription in each environment, evaluating possible trade-offs in gene expression, testing whether salinity induces transcriptional decoherence, and conducting various eQTL-type analyses.

      Strengths:

      The study provides an extensive analysis of gene expression responses to stress in rice and offers some insights into underlying mechanisms of salinity responses in this important crop system. The fact that the study was conducted under field conditions is a major plus, as the gene expression responses to soil salinity are more realistic than if the study was conducted in a greenhouse or growth chamber. The preprint is generally well-written and the methods and results are mostly well-described.

      Weaknesses:

      While the study makes good use of the dataset, it is not clear how the current work advances our understanding of gene regulatory evolution or plant responses to soil salinity generally. Overall, the results are consistent with prior studies of gene expression and studies of selection across environmental conditions. Some of the framing of the paper suggests more novelty than the study actually offers. That said, the results will certainly be useful for those working in rice and should be interesting to scientists interested in how gene expression responses to stress occur under field conditions. I detail other concerns I had about the preprint below:

      The abstract on lines 33-35 illustrates some of my concerns about the overstatement of the novelty of the current study. For example, is it really true that the role of gene expression in mediating stress response and adaptation is largely unexplored? There have been numerous studies that have evaluated gene expression responses to stresses in a wide range of organisms. Perhaps, I am missing something critically different about this study. If so, I would recommend that the authors reword this sentence to clarify what gap is being filled by this study. Further, is it really the case that none of them have evaluated how the correlational structure of gene expression changes in response to stresses in plants, as implied in lines 263-265? Don't the various modules and PC analyses of gene expression get at this question?

      There were some places in the methods of the preprint that required more information to properly evaluate. For example, more information should be provided on lines 664-668 about how G, E, and GxE effects were established, especially since this is so central to this study. What programs/software (R? SAS? Other?) were used for these analyses? If R, how were the ANOVAs/models fit? What type of ANOVA was used? How exactly was significance determined for each term? Which effects were considered fixed and which were random? If the goal was to fit mixed models, why not use an approach like voom-limma (Law et al. 2014 Genome Biology)? More details should also be added to lines 688-709 about these analyses, including what software/programs were used for these analyses.
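      To make concrete what a partition into G, E, and GxE effects involves, the following is a minimal sketch of a fixed-effects two-way sums-of-squares decomposition for a single transcript. It is not the authors' model: the function name, the accession labels, and the toy expression values are hypothetical, and the arithmetic is only valid for a balanced design (equal replicates in every genotype-by-environment cell).

```python
from collections import defaultdict
from statistics import mean


def two_way_sums_of_squares(records):
    """Partition variation for one transcript in a balanced G x E design.

    records: iterable of (genotype, environment, expression) tuples with the
    same number of replicates in every genotype-by-environment cell.
    """
    records = list(records)
    grand_mean = mean([value for _, _, value in records])

    by_genotype = defaultdict(list)
    by_environment = defaultdict(list)
    by_cell = defaultdict(list)
    for genotype, environment, value in records:
        by_genotype[genotype].append(value)
        by_environment[environment].append(value)
        by_cell[(genotype, environment)].append(value)

    def between_group_ss(groups):
        # Sum over groups of n_group * (group mean - grand mean)^2.
        return sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

    ss_g = between_group_ss(by_genotype)
    ss_e = between_group_ss(by_environment)
    # Interaction = variation among cell means not captured by main effects.
    ss_gxe = between_group_ss(by_cell) - ss_g - ss_e
    ss_residual = sum(
        (value - mean(by_cell[(genotype, environment)])) ** 2
        for genotype, environment, value in records
    )
    return {"G": ss_g, "E": ss_e, "GxE": ss_gxe, "residual": ss_residual}


# Hypothetical toy data: two accessions with crossing reaction norms.
toy_cells = {
    ("acc_A", "normal"): [9.0, 10.0, 11.0],
    ("acc_A", "saline"): [7.0, 8.0, 9.0],
    ("acc_B", "normal"): [7.0, 8.0, 9.0],
    ("acc_B", "saline"): [9.0, 10.0, 11.0],
}
toy_records = [
    (genotype, environment, value)
    for (genotype, environment), values in toy_cells.items()
    for value in values
]
ss = two_way_sums_of_squares(toy_records)
print(ss)
```

      In this toy case the reaction norms cross, so all of the structured variation falls in the GxE term and the main effects are zero. An actual analysis would also need the significance tests and the fixed-versus-random specification that the questions above ask for.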

      One thing that I found a bit confusing throughout was the intermixing of different terms and types of selection. In particular, there seemed to be some inconsistencies with the usage of quantitative genetics terms for selection (e.g. directional, stabilizing) vs molecular evolution terms for selection (e.g. positive, purifying). I would encourage the authors to think carefully about what they mean by each of these terms and make sure that those definitions are consistently applied here.

      It would be useful to clarify the reasons for the inherent bias in the detection of conditional neutrality (CN) and antagonistic pleiotropy (AP; Lines 187-196). As currently written, it is also not clear to me what the authors did to deal with the bias in terms of adjusting P-value thresholds for CN and AP. Further, I found the discussion of antagonistic pleiotropy and conditional neutrality to be a bit confusing for a couple of reasons, especially around lines 489-491. First of all, does it really make sense to contrast gene expression versus local adaptation, when lots of local adaptation likely involves changes in gene expression? Second, the implication that antagonistic pleiotropy is more common for local adaptation than the results found in this study seems questionable. Conditional neutrality appears to be more common for local adaptation as well: see Table 2 of Wadgymar et al. 2017 Methods in Ecology and Evolution. That all said, it is always difficult to conclude that there are no trade-offs (antagonistic pleiotropy) for a particular locus, as trade-offs may only manifest in some years and not others, and detecting them can require large sample sizes if they are subtle in effect.

    3. Reviewer #2 (Public Review):

      The authors investigate the gene expression variation in a rice diversity panel under normal and saline growth conditions to gain insight into the underlying molecular adaptive response to salinity. They present a convincing case to demonstrate that environmental stress can induce selective pressure on gene expression, which is in agreement with their earlier study (Groen et al, 2020). The data seem to be a good fit for their study and overall the analytic approach is robust.

      (1) The work started by investigating the effects of genotype, environment, and their interaction at each transcript level using 3'-end-biased mRNA sequencing, and detecting a widespread GxE effect. Later, using the total filled grain number as a proxy of fitness, they estimated the strength of selection on each transcript and reported stronger selective pressure in a saline environment. However, this current framework relies on precise estimation of fitness and therefore can be sensitive to the choice of fitness proxy.

      (2) Furthermore, the authors decomposed the genetic architecture of expression variation into cis- and trans-eQTL in each environment separately and reported more unique environment-specific trans-eQTLs than cis-. The relative contribution of cis- and trans-eQTL depends on both the abundance and effect size. I wonder why the latter was not reported while comparing these two different genetic architectures. If the authors were to compare the variation explained by these two categories of eQTL instead of their frequency, would the inference that trans-eQTLs are primarily associated with expression variation still hold?

      (3) Next, the authors investigated the relationship between cis- and trans-eQTLs at the transcript level and revealed an excess of reinforcement over the compensation pattern. Here, I struggle to understand the motivation for testing the relationship by comparing the effect of cis-QTL with the mean effect of all trans-eQTLs of a given transcript. My concern is that taking the mean can diminish the effect of small trans-eQTLs potentially biasing the relationship towards the large-effect eQTLs.
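      The concern in point (3) can be illustrated with a small sketch. It assumes, hypothetically, that a transcript is labelled by comparing the sign of its cis effect with the sign of the mean trans effect, and contrasts that with a simple majority vote over individual trans-eQTL signs. The function name, the majority-vote alternative, and the effect sizes are illustrative assumptions, not the manuscript's procedure.

```python
from statistics import mean


def classify_cis_trans(cis_effect, trans_effects):
    """Label one transcript as reinforcing or compensating in two ways.

    by_mean: sign of the cis effect versus sign of the mean trans effect.
    by_majority: sign of the cis effect versus the direction taken by most
    individual trans-eQTLs, ignoring their magnitudes.
    """
    mean_trans = mean(trans_effects)
    n_same = sum(1 for t in trans_effects if t * cis_effect > 0)
    n_opposite = sum(1 for t in trans_effects if t * cis_effect < 0)

    def label(same_direction, opposite_direction):
        if same_direction:
            return "reinforcing"
        if opposite_direction:
            return "compensating"
        return "undetermined"

    return {
        "by_mean": label(cis_effect * mean_trans > 0, cis_effect * mean_trans < 0),
        "by_majority": label(n_same > n_opposite, n_opposite > n_same),
    }


# Hypothetical transcript: one large trans-eQTL acts with the cis effect,
# four small trans-eQTLs act against it.
example = classify_cis_trans(0.5, [0.9, -0.1, -0.1, -0.1, -0.1])
print(example)
```

      Here the single large-effect trans-eQTL sets the sign of the mean, so the transcript is called reinforcing even though four of its five trans-eQTLs oppose the cis effect.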

    4. Reviewer #3 (Public Review):

      In this work, the authors conducted a large-scale field trial of 130 indica accessions in normal vs. moderate salt stress conditions. The experiment consists of 3 replicates for each accession in each treatment, making it 780 plants in total. Leaf transcriptome, plant traits, and final yield were collected. Starting from a quantitative genetics framework, the authors first dissected the heritability and selection forces acting on gene expression. After summarizing the selection force acting on gene expression (or plant traits) in each environment, the authors described the difference in gene expression correlation between environments. The final part consists of eQTL investigation and categorizing cis- and trans-effects acting on gene expression.

      Building on the group's previous study and using a similar methodology (Groen et al. 2020, 2021), the unique aspect of this study is in incorporating large-scale empirical fieldwork and combining gene expression data with plant traits. Unlike many systems biology studies, this study strongly emphasizes the quantitative genetics perspective and investigates the empirical fitness effects of gene expression data. The large amounts of RNAseq data (one sample for each plant individual) also allow heritability calculation. This study also utilizes the population genetics perspective to test for traces of selection around eQTL. As there are too many genes to fit in multiple regression (for selection analysis) and to construct the G-matrix (for breeder's equation), grouping genes into PCs is a very good idea.

      Building on large amounts of data, this study conducted many analyses and described some patterns, but a central message or hypothesis would still be necessary. Currently, the selection analysis, transcript correlation structure change, and eQTL parts seem to be independent. The manuscript currently looks like a combination of several parallel works, and this is reflected in the Results, where each part has its own short introduction (e.g., 185-187, 261-266, 349-353). It would be great to discuss how these patterns observed could be translated to larger biological insights. On a related note, since this and the previous studies (focusing on dry-wet environments) use a similar methodology, one would also wonder what the conclusions from these studies would be. How do they agree or disagree with each other?

      Many analyses were done separately for each environment, and results from these two environments are listed together for comparison. Especially for the eQTL part, no specific comparison was discussed between the two environments. It would be interesting to consider whether one could fit the data in more coherent models specifically modeling the X-by-environment effects, where X might be transcripts, PCs, traits, transcript-transcript correlation, or eQTLs.

      As stated, grouping genes into PCs is a good idea, but although in theory, the PCs are orthogonal, each gene still has some loadings on each PC (ie. each PC is not controlled by a completely different set of genes). Another possibility is to use any gene grouping method, such as WGCNA, to group genes into modules and use the PC1 of each module. There, each module would consist of completely different sets of genes, and one would be more likely to separate the biological functions of each module. I wonder whether the authors could discuss the pros and cons of these methods.

    5. Reviewer #4 (Public Review):

      The manuscript examines how patterns of selection on gene expression differ between a normal field environment and a field environment with elevated salinity based on transcript abundances obtained from leaves of a diverse panel of rice germplasm. In addition, the manuscript also maps expression QTL (eQTL) that explains variation in each environment. One highlight from the mapping is that a small group of trans-mapping regulators explains some gene expression variation for large sets of transcripts in each environment. The overall scope of the datasets is impressive, combining large field studies that capture information about fecundity, gene expression, and trait variation at multiple sites. The finding related to patterns indicating increased LD among eQTLs that have cis-trans compensatory or reinforcing effects is interesting in the context of other recent work finding patterns of epistatic selection. However, other analyses in the manuscript are less compelling or do not make the most of the value of collected data. Revisions are also warranted to improve the precision with which field-specific terminology is applied and the language chosen when interpreting analytical findings.

      Selection of gene expression:

      One strength of the dataset is that gene expression and fecundity were measured for the same genotypes in multiple environments. However, the selection analyses are largely conducted within environments. The addition of phenotypic selection analyses that jointly analyze gene expression across environments and/or selection on reaction norms would be worthwhile.

      Gene expression trade-offs:

      The terminology, and possibly the methods, involved in the section on gene expression trade-offs need amendment. I specifically recommend discontinuing reference to the analysis presented as an analysis of antagonistic pleiotropy (rather than more general trade-offs) because pleiotropy is defined as a property of a genotype, not a phenotype. Gene expression levels are a molecular phenotype, influenced by both genotype and the environment. By conducting analyses of selection within environments as reported, the analysis does not account for the fact that the distribution of phenotypic values, the fitness surface, or both may differ across environments. Thus, this presents a very different situation than asking whether the genotypic effect of a QTL on fitness differs across environments, which is the context in which the contrasting terms antagonistic pleiotropy and conditional neutrality have been traditionally applied. A more interesting analysis would be to examine whether the covariance of phenotype with fitness has truly changed between environments or whether the phenotypic distribution has just shifted to a different area of a static fitness surface.
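      The distinction drawn in the last sentence can be sketched numerically. The example below assumes a hypothetical Gaussian fitness surface that is identical in every environment and only moves the phenotypic distribution. The function names, optimum, width, and phenotype values are all illustrative assumptions rather than anything estimated from the manuscript's data.

```python
import math
from statistics import mean


def fitness(z, optimum=0.0, width=1.0):
    """Hypothetical static Gaussian fitness surface (same in all environments)."""
    return math.exp(-((z - optimum) ** 2) / (2 * width ** 2))


def linear_selection_gradient(phenotypes):
    """Slope of relative fitness on phenotype: Cov(z, w) / Var(z)."""
    absolute = [fitness(z) for z in phenotypes]
    mean_fitness = mean(absolute)
    relative = [w / mean_fitness for w in absolute]
    z_bar = mean(phenotypes)
    w_bar = mean(relative)
    cov = mean([(z - z_bar) * (w - w_bar) for z, w in zip(phenotypes, relative)])
    var = mean([(z - z_bar) ** 2 for z in phenotypes])
    return cov / var


# Three hypothetical expression distributions on the same fitness surface.
below_optimum = [(i - 15) / 10 for i in range(11)]   # -1.5 ... -0.5
at_optimum = [(i - 5) / 10 for i in range(11)]       # -0.5 ...  0.5
above_optimum = [(i + 5) / 10 for i in range(11)]    #  0.5 ...  1.5

beta_below = linear_selection_gradient(below_optimum)
beta_at = linear_selection_gradient(at_optimum)
beta_above = linear_selection_gradient(above_optimum)
print(beta_below, beta_at, beta_above)
```

      The directional gradient is positive below the optimum, approximately zero at it, and negative above it, although the fitness surface never changes. A sign change in within-environment gradients is therefore not sufficient on its own to show that the phenotype-fitness relationship differs between environments.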

      Biological processes under selection / Decoherence:

      PCs are likely not the most ideal way to cluster genes to generate consolidated metrics for a selection gradient analysis. Because individual genes will contribute to multiple PCs, the current fractional majority-rule method applied to determine whether a PC is under direct or indirect selection for increased or decreased expression comes across as arbitrary and with the potential for double-counting genes. A gene co-expression network analysis could be more appropriate, as genes only belong to one module and one can examine how selection is acting on the eigengene of a co-expression module. Building gene co-expression modules would also provide a complementary and more concrete framework for evaluating whether salinity stress induces "decoherence" and which functional groups of genes are most impacted.

      Selection of traits:

      Having paired organismal and molecular trait data is a strength of the manuscript, but the organismal trait data are underutilized. The manuscript as written only makes weak indirect inferences based on GO categories or assumed gene functions to connect selection at the organismal and molecular levels. Stronger connections could be made, for instance, by showing selection on co-expression module eigengene values that are also correlated with traits that show similar patterns of selection, or by demonstrating that GWAS hits for trait variation co-localize to cis-mapping eQTL.

      Genetic architecture of gene expression variation:

      The descriptive statistics of the eQTL analysis summarize counts of eQTLs observed in each environment, but these numbers are not broken down to the molecular trait level (e.g., what are the median and range of cis- and trans-eQTLs per gene). In addition, genetic architecture is a combination of the numbers and relative effect sizes of the QTLs. It would be useful to provide information about the relative distributions of phenotypic variance explained by the cis- vs. trans- eQTLs and whether those distributions vary by environment. The motivation for examining patterns of cis-trans compensation specifically for the results obtained under high salinity conditions is unclear to me. If the lines sampled have predominantly evolved under low salinity conditions and the hypothesis being evaluated relates to historical experience of stabilizing selection, then my intuition is that evaluating the eQTL patterns under normal conditions provides the more relevant test of the hypothesis.
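      As an illustration of the per-eQTL quantity requested above, the sketch below computes the proportion of expression variance explained by a single marker as the squared Pearson correlation between genotype and expression. It assumes homozygous lines coded 0/1 and one marker at a time. The function name, genotype vectors, and expression values are hypothetical, and a real analysis would also need to account for population structure and linkage between markers.

```python
from statistics import mean


def variance_explained(genotypes, expression):
    """Squared Pearson correlation between one marker and one transcript."""
    g_bar = mean(genotypes)
    y_bar = mean(expression)
    s_gy = sum((g - g_bar) * (y - y_bar) for g, y in zip(genotypes, expression))
    s_gg = sum((g - g_bar) ** 2 for g in genotypes)
    s_yy = sum((y - y_bar) ** 2 for y in expression)
    if s_gg == 0 or s_yy == 0:
        return 0.0  # a monomorphic marker or a constant transcript explains nothing
    return s_gy ** 2 / (s_gg * s_yy)


# Hypothetical transcript measured in eight accessions.
expression = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2]
cis_marker = [0, 0, 0, 0, 1, 1, 1, 1]     # hypothetical large-effect cis-eQTL
trans_marker = [0, 1, 0, 1, 0, 1, 0, 1]   # hypothetical small-effect trans-eQTL

pve_cis = variance_explained(cis_marker, expression)
pve_trans = variance_explained(trans_marker, expression)
print(round(pve_cis, 3), round(pve_trans, 3))
```

      Tabulating such values separately for cis- and trans-eQTLs in each environment would give the distributions asked for, alongside the per-gene counts.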

    6. Author response:

      Reviewer #1:

      (1) Clarification of Novelty and Contribution:

      - We agree that the novelty of our study could have been better articulated. We will more clearly define the specific gaps in knowledge our study addresses. We will also clarify the novelty in our analysis of the correlational structure of gene expression under stress.

      (2) Methodological Details:

      - We acknowledge the need for additional detail in the methods section regarding the estimation of G, E, and GxE effects. We will expand this section to include the software used (R), the specific ANOVA models applied, and how significance was determined. We will also clarify which effects were treated as fixed or random effects.

      (3) Terminology Consistency:

      - We will thoroughly review the manuscript to ensure consistent use of selection-related terminology. This will involve distinguishing between quantitative genetics terms (e.g., directional, stabilizing) and molecular evolution terms (e.g., positive, purifying) to avoid any confusion.

      (4) Bias in Conditional Neutrality and Antagonistic Pleiotropy:

      - We appreciate the suggestion to clarify the discussion around conditional neutrality (CN) and antagonistic pleiotropy (AP). We will elaborate on the inherent bias in detecting CN and AP and specify how we adjusted P-value thresholds. Additionally, we will try to refine the discussion to address the concerns raised about the comparison of gene expression and local adaptation, incorporating relevant literature.

      Reviewer #2:

      (1) Sensitivity of Fitness Proxy:

      - We acknowledge the limitations of using the total filled grain number as a fitness proxy. We will include a discussion on the potential sensitivity of our results to this choice.

      (2) Cis- and trans-eQTL Contributions:

      - We appreciate the suggestion to report effect sizes in addition to the frequency of cis- and trans-eQTLs. We will incorporate this into our analysis and discuss whether our conclusions regarding the predominance of trans-eQTLs in expression variation hold when considering effect sizes.

      (3) Cis-Trans Relationship Analysis:

      - Since we wanted to estimate compensating vs. reinforcing effects, this essentially entails identifying genes that have opposing directionality of cis- and trans-effects. To get the total trans-effect, we decided to take the mean effect of trans-eQTLs. This mean was only used to identify the compensating/reinforcing genes, and although taking the mean diminishes the effect of small trans-eQTLs, this metric was not used in downstream analyses.

      Reviewer #3:

      (1) Integration of Analyses:

      - We acknowledge that the manuscript currently presents some analyses in a somewhat independent manner. Although it would be ideal to have a central hypothesis/message, our study is meant to broadly outline the various responses and fitness effects of salinity stress on rice. Throughout the manuscript, we have also included comparisons between our findings and those of our previous studies on drought stress to highlight any consistent themes or novel insights.

      (2) X-by-Environment Effects:

      - We do plan to consider fitting models that explicitly incorporate X-by-environment interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      (3) Gene Grouping Methods:

      - We will try to discuss the pros and cons of using PCA versus gene co-expression network analysis (e.g., WGCNA) for grouping genes. We will also explore applying WGCNA in our analysis to see if it offers any additional insights or clarity.

      Reviewer #4:

      (1) Selection Analysis Across Environments:

      - We do plan to consider fitting models that explicitly incorporate G×E interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.

      (2) Gene Expression Trade-Offs Terminology:

      - We will revise our terminology to better reflect the nature of the trade-offs observed, and explore variation in covariance between phenotype and fitness between the two environments.

      (3) Biological Processes and Decoherence:

      - We will explore applying WGCNA in our analysis to see if it offers any additional insights or clarity.

      (4) Underutilization of Organismal Traits:

      - We did perform GWAS for all the traits measured in both environments, but did not find any significant hits. We will examine whether selection on co-expression modules is correlated with the traits, and may incorporate it in our manuscript depending on the results.

      (5) Detailed eQTL Analysis:

      - We will expand our eQTL analysis to include detailed statistics at the molecular trait level, including the phenotypic variance explained by cis- and trans-eQTLs and how these vary by environment.

      Although we focus on salinity conditions in our cis-trans compensation analysis in the main results, we have provided comparisons for all our eQTL analyses between normal and salinity conditions in the main text (with figures as supplementary).

      We are confident that these revisions will significantly strengthen our manuscript and address the concerns raised by the reviewers. We look forward to submitting a revised version that better communicates the significance and robustness of our findings.

      Thank you again for your valuable feedback.

    1. eLife assessment

      This study provides new insights into the expression profile of ILCs that demonstrate a history of RAG expression. It examines in part the potential intrinsic regulation of RAG expression and seeks to understand how the epigenetic state of ILCs is established, although a full understanding of intrinsic factors is incomplete. The work provides an important molecular dataset, and with further strengthening of the understanding of intrinsic regulation, this paper would be of interest more broadly to cell biologists seeking to understand immune cell development.

    2. Reviewer #1 (Public Review):

      The study starts with the notion that in an AD-like disease model, ILC2s in the Rag1 knock-out were expanded and contained relatively more IL-5+ and IL-13+ ILC2s. This was confirmed in the Rag2 knock-out mouse model.

      By using a chimeric mouse model in which wild-type splenocytes were injected into irradiated Rag1 knock-out mice, it was shown that even though the adaptive lymphocyte compartment was restored, there were increased AD-like symptoms and increased ILC2 expansion and activity. Moreover, in the reverse chimeric model, i.e. injecting a mix of wild-type and Rag1 knock-out splenocytes into irradiated wild-type animals, it was shown that the Rag1 knock-out ILC2s expanded more and were more active. Therefore, the authors could conclude that the RAG1-mediated effects were ILC2 cell-intrinsic.

      Subsequent fate-mapping experiments using the Rag1Cre;reporter mouse model showed that there were indeed RAGnaïve and RAGexp ILC2 populations within naïve mice. Lastly, the authors performed multi-omic profiling, using single-cell RNA sequencing and ATAC-sequencing, in which a specific gene expression profile was associated with ILC2. These included well-known genes but the authors notably also found expression of Ccl1 and Ccr8 within the ILC2. The authors confirmed their earlier observations that in the RAGexp ILC2 population, the Th2 regulome was more suppressed, i.e. more closed, compared to the RAGnaïve population, indicative of the suppressive function of RAG on ILC2 activity. I do agree with the authors' notion that the main weakness was that this study lacks the mechanism by which RAG regulates these changes in ILC2s.

      The manuscript is very well written and easy to follow, and the compelling conclusions are well supported by the data. The experiments are meticulously designed and presented. I wish to commend the authors for the study's quality.

      Even though the study is compelling and well supported by the presented data, some additional context could increase the significance:

      (1) The presence of the RAGnaïve and RAGexp ILC2 populations raises some questions on the (different?) origin of these populations. It is known that there are different waves of ILC2 origin (most notably shown in the Schneider et al Immunity 2019 publication, PMID 31128962). I believe it would be very interesting to further discuss or possibly show if there are different origins for these two ILC populations.

      Several publications describe the presence and origin of ILC2s in/from the thymus (PMIDs 33432227, 24155745). Could the authors discuss whether there might be a common origin for the RAGexp ILC2 and Th2 cells from a thymic lineage? If the two populations are indeed derived from different sources, e.g. embryonic (possibly RAGnaïve) vs. adult bone marrow/thymus (possibly RAGexp), this would show a unique functional difference between embryonic-derived and adult ILC2s.

      (2) On line 104 & Figures 1C/G etc. the authors describe that in the RAG knock-out ILC2 are relatively more abundant in the lineage negative fraction. On line 108 they further briefly mentioned that this observation is an indication of enhanced ILC2 expansion. Since the study includes an extensive multi-omics analysis, could the authors discuss whether they have seen a correlation of RAG expression in ILC2 with regulation of genes associated with proliferation, which could explain this phenomenon?

    3. Reviewer #2 (Public Review):

      Summary:

      The study by Ver Heul et al., investigates the consequences of RAG expression for type 2 innate lymphoid cell (ILC2) function. RAG expression is essential for the generation of the receptors expressed by B and T cells and their subsequent development. Innate lymphocytes, which arise from the same initial progenitor populations, are in part defined by their ability to develop in the absence of RAG expression. However, it has been described in multiple studies that a significant proportion of innate lymphocytes show a history of Rag expression. In compelling studies several years ago, members of this research team revealed that early Rag expression during the development of Natural Killer cells (Karo et al., Cell 2014), the first described innate lymphocyte, had functional consequences.

      Here, the authors revisit this topic, a worthwhile endeavour given the broad history of Rag expression within all ILCs and the common use of RAG-deficient mice to specifically assess ILC function. Focusing on ILC2s and utilising state-of-the-art approaches, the authors sought to understand whether early expression of Rag during ILC2 development had consequences for activity, fitness, or function. Having identified cell-intrinsic effects in vivo, the authors investigated the causes of this, identifying epigenetic changes associated with the accessibility of genes associated with core ILC2 functions.

      The manuscript is well written and does an excellent job of supporting the reader through reasonably complex transcriptional and epigenetic analyses, with considerate use of explanatory diagrams. Overall I think that the conclusions are fair, the topic is thought-provoking, and the research is likely of broad immunological interest. I think that the extent of functional data and mechanistic insight is appropriate.

      Strengths:

      - The logical and stepwise use of mouse models to first demonstrate the impact on ILC2 function in vivo and a cell-intrinsic role. Initial analyses show enhanced cytokine production by ILC2 from RAG-deficient mice. Then through two different chimeric mice (including BM chimeras), the authors convincingly show this is cell intrinsic and not simply as a result of lymphopenia. This is important given other studies implicating enhanced ILC function in RAG-/- mice reflect altered competition for resources (e.g. cytokines).

      - Use of Rag expression fate mapping to support analyses of how cells were impacted - this enables a robust platform supporting subsequent analyses of the consequences of Rag expression for ILC2.

      - Use of snRNA-seq supports gene expression and chromatin accessibility studies - these reveal clear differences in the data sets consistent with altered ILC2 function.

      - Convincing evidence of epigenetic changes associated with loci strongly linked to ILC2 function. This forms a detailed analysis that potentially helps explain some of the altered ILC2 functions observed in ex vivo stimulation assays.

      - Provision of a wealth of expression data and bioinformatics analyses that can serve as valuable resources to the field.

      Weaknesses:

      - Lack of insight into precisely how early RAG expression mediates its effects, although I think this is beyond the scale of this current manuscript. Really this is the fundamental next question from the data provided here.

      - The epigenetic analyses provide evidence of differences in the state of chromatin, but there is no data on what may be interacting or binding at these sites, impeding understanding of what this means mechanistically.

      - Focus on ILC2 from skin-draining lymph nodes rather than the principal site of ILC2 activity itself (the skin). This may well reflect the ease with which cells can be isolated from different tissues.

      - Comparison with ILC2 from other sites would have helped to substantiate findings and compensate for the reliance on data on ILC2 from skin-draining lymph nodes, which are not usually assessed amongst ILC2 populations.

      - The studies of how ILC2 are impacted are a little limited, focused exclusively on IL-13 and IL-5 cytokine expression.

    4. Reviewer #3 (Public Review):

      In this study, Ver Heul et al. investigate the role of RAG expression in ILC2 functions. While RAG genes are not required for the development of ILCs, previous studies have reported a history of expression in these cells. The authors aim to determine the potential consequences of this expression in mature cells. They demonstrate that ILC2s from RAG1 or RAG2 deficient mice exhibit increased expression of IL-5 and IL-13 and suggest that these cells are expanded in the absence of RAG expression. However, it is unclear whether this effect is due to a direct impact of RAG genes or a consequence of the lack of T and B cells in this condition. This ambiguity represents a key issue with this study: distinguishing the direct effects of RAG genes from the indirect consequences of a lymphopenic environment.

      The authors focus their study on ILC2s found in the skin-draining lymph nodes, omitting analysis of tissues where ILC2s are more enriched, such as the gut, lungs, and fat tissue. This approach is surprising given the goal of evaluating the role of RAG genes in ILC2s across different tissues. The study shows that ILC2s derived from RAG-/- mice are more activated than those from WT mice, and RAG-deficient mice show increased inflammation in an atopic dermatitis (AD)-like disease model. The authors use an elegant model to distinguish ILC2s with a history of RAG expression from those that never expressed RAG genes. However, this model is currently limited to transcriptional and epigenomic analyses, which suggest that RAG genes suppress the type 2 regulome at the Th2 locus in ILC2s.

      The authors report a higher frequency of ILC2s in RAG-/- mice in skin-draining lymph nodes, which is expected as these mice lack T and B cells, leading to ILC expansion. Previous studies have reported hyper-activation of ILCs in RAG-deficient mice, suggesting that this is not necessarily an intrinsic phenomenon. For example, RAG-/- mice exhibit hyperphosphorylation of STAT3 in the gut, leading to hyperactivation of ILC3s. This study does not currently provide conclusive evidence of an intrinsic role of RAG genes in the hyperactivation of ILC2s. The splenocyte chimera model is artificial and does not reflect a normal environment in tissues other than the spleen. Similarly, the mixed BM model does not demonstrate an intrinsic role of RAG genes, as RAG1-/- BM cells cannot contribute to the B and T cell pool, leading to an expected expansion of ILC2s. As the data are currently presented it is expected that a proportion of IL-5-producing cells will come from the RAG1-/- BM.

      Overall, the level of analysis could be improved. Total cell numbers are not presented, the response of other immune cells to IL-5 and IL-13 (except the eosinophils in the splenocyte chimera mice) is not analyzed, and the analysis is limited to skin-draining lymph nodes.

      The authors have a promising model in which they can track ILC2s that have expressed RAG or not. They need to perform a comprehensive characterization of ILC2s in these mice, which develop in a normal environment with T and B cells. Approximately 50% of the ILC2s have a history of RAG expression. It would be valuable to know whether these cells differ from ILC2s that never expressed RAG, in terms of proliferation and expression of IL-5 and IL-13. These analyses should be conducted in different tissues, as ILC2s adapt their phenotype and transcriptional landscape to their environment. Additionally, the authors should perform their AD-like disease model in these mice.

      The authors provide a valuable dataset of single-nuclei RNA sequencing (snRNA-seq) and ATAC sequencing (snATAC-seq) from RAGexp (RAG fate map-positive) and RAGnaïve (RAG fate map-negative) ILC2s. This elegant approach demonstrates that ILC2s with a history of RAG expression are epigenomically suppressed. However, key genes such as IL-5 and IL-13 do not appear to be differentially regulated between RAGexp and RAGnaïve ILC2s according to Table S5. Although the authors show that the regulome activity of IL-5 and IL-13 is decreased in RAGexp ILC2s, how do the authors explain that these genes are not differentially expressed between the RAGexp and RAGnaïve ILC2? I think that it is important to validate this in vivo.

    1. eLife assessment

      This study provides a valuable characterization of the contractility and synchrony of individual sarcomeres in spontaneously beating cardiomyocytes as a function of substrate stiffness. The authors, however, provide an incomplete explanation for the observed heterogeneous and stochastic dynamics, so that the work remains mainly descriptive. The work will be of interest to scientists working on muscle biophysics, nonlinear dynamics, and synchronization phenomena in biological systems.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors experimentally demonstrated the heterogeneous behavior of sarcomeres in cardiomyocytes and that a stochastic component exists in their contractile activity, which cancels out at the level of myofibrils.

      Strengths:

      The experiments and data analysis are robust and valid. With very good statistics and unbiased methods, they show cellular activity at the individual level and highlight the heterogeneity between biological networks. The similarity of the results to the study cited in [24] demonstrates the validity of the in vitro setup for answering these questions and the feasibility of such in-vitro systems to extend our knowledge of physiology.

      Weaknesses:

      Compared to the current literature ([24]), the study does not show a high degree of innovation. It mainly confirms what has been established in the past. The authors complemented the published experiments by developing an in vitro setup with stem cells and by changing the stiffness of the substrate to simulate pathological conditions. However, the experiments they performed do not allow them to explain more than the study in [24], and the conclusions of their study are based on interpretation and speculation about the possible mechanism underlying the observations.

    3. Reviewer #2 (Public Review):

      Summary:

      Sarcomeres, the contractile units of skeletal and cardiac muscle, contract in a concerted fashion to power myofibril and thus muscle fiber contraction.

      Muscle fiber contraction depends on the stiffness of the elastic substrate of the cell, yet it is not known how this dependence emerges from the collective dynamics of sarcomeres. Here, the authors analyze the contraction time series of individual sarcomeres using live imaging of fluorescently labeled cardiomyocytes cultured on elastic substrates of different stiffness. They find that reduced collective contractility of muscle fibers on unphysiologically stiff substrates is partially explained by a lack of synchronization in the contraction of individual sarcomeres.

      This lack of synchronization is at least partially stochastic, consistent with the notion of a tug-of-war between sarcomeres on stiff substrates. A particular irregularity of sarcomere contraction cycles is 'popping', the extension of sarcomeres beyond their rest length. The statistics of 'popping' suggest that this is a purely random process.

      Strengths:

      This study thus marks an important shift of perspective from whole-cell analysis towards an understanding of the collective dynamics of coupled, stochastic sarcomeres.

      Weaknesses:

      Further insight into mechanisms could be provided by additional analyses and/or comparisons to mathematical models.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript of Haertter and coworkers studied the variation of length of a single sarcomere and the response of microfibrils made by sarcomeres of cardiomyocytes on soft gel substrates of varying stiffnesses.

      The measurements at the level of a single sarcomere are an important new result of this manuscript. They are done by combining the labeling of the sarcomere Z-line using genetic manipulation and a sophisticated tracking program using machine learning. This single sarcomere analysis shows strong heterogeneities of the sarcomeres that can show fast oscillations not synchronized with the average behavior of the cell and what the authors call popping events, which are large amplitude oscillations. Another important result is the fact that cardiomyocyte contractility decreases with the substrate stiffness although the properties of single sarcomeres do not seem to depend on substrate stiffness.

      The authors suggest that the cardiomyocyte cell behavior is dominated by sarcomere heterogeneity. They show that the heterogeneity between sarcomeres is stochastic and that the contribution of static heterogeneity (such as composition differences between sarcomeres) is small.

      Strengths:

      All the results are to my knowledge new and original and deserve attention.

      Weaknesses:

      However, I find the manuscript a bit frustrating because the authors only give very qualitative explanations of the phenomena that they observe. They mention that popping could be explained by a nonlinear force-velocity relation of the sarcomere leading to a rapid detachment of all motors. However, they do not explicitly provide a theoretical description. How would the popping depend on the parameters and in particular on the substrate stiffness? Would the popping statistics be affected by the stiffness? It is also not clear to me how the dependence on the soft gel stiffness of the cardiomyocyte cell can be explained by the stochasticity of the sarcomere properties. Can any of the results found by the authors be explained by existing theories of cardiomyocytes? The only one I know is that of Safran and coworkers.

      I also found the paper very difficult to read. The authors should perhaps reorganize the structure of the presentation in order to highlight what the new and important results are.

    1. eLife assessment

      This study demonstrates a novel role for SIRT4, a mitochondrial deacetylase that is shown to translocate into nuclei, where it regulates RNA alternative splicing by modulating U2AF2 and the gene expression of CCN2 in tubular cells in response to TGF-β. This fundamental work substantially advances our understanding of kidney fibrosis development and offers a potential therapeutic approach. The evidence supporting the conclusions of a SIRT4-U2AF2-CCN2 axis activated by TGF-β is compelling and adds a new layer of complexity to the pathogenesis of chronic kidney disease.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Yang et al report a novel regulatory role of SIRT4 in the progression of kidney fibrosis. The authors showed that in the fibrotic kidney, SIRT4 exhibited an increased nuclear localization. Deletion of Sirt4 in renal tubule epithelium attenuated the extent of kidney fibrosis following injury, while overexpression of SIRT4 aggravates kidney fibrosis. Employing a battery of in vitro and in vivo experiments, the authors demonstrated that SIRT4 interacts with U2AF2 in the nucleus upon TGF-β1 stimulation or kidney injury and deacetylates U2AF2 at K413, resulting in elevated CCN2 expression through alternative splicing of Ccn2 gene to promote kidney fibrosis. The authors further showed that the translocation of SIRT4 is through the BAX/BAK pore complex and is dependent on the ERK1/2-mediated phosphorylation of SIRT4 at S36, and consequently the binding of SIRT4 to importin α1. This fundamental work substantially advances our understanding of the progression of kidney fibrosis and uncovers a novel SIRT4-U2AF2-CCN2 axis as a potential therapeutic target for kidney fibrosis.

      Strengths:

      Overall, this is an extensive, well-performed study. The results are convincing, and the conclusions are mostly well supported by the data. The message is interesting to a wider community working on kidney fibrosis, protein acetylation, and SIRT4 biology.

      Weaknesses:

      The manuscript could be further strengthened if the authors could address a few points listed below:

      (1) In the results part 3.9, an in vitro deacetylation assay employing recombinant SIRT4 and U2AF2 should be included to support the conclusion that SIRT4 is a deacetylase of U2AF2. Similarly, an in vitro binding assay can be included to confirm whether SIRT4 and U2AF2 interact directly.

      (2) In Figure 6D, the Western blot data using U2AF2-K453Q are confusing, disconnected from the rest of the data, and not explained. These data should either be removed or the authors should explain why U2AF2-K453Q is employed here.

      (3) Although ERK inhibitor U0126 blocked the nuclear translocation of SIRT4 in vivo, have the authors checked whether treatment with U0126 could affect the expression of kidney fibrosis markers in UUO mice?

      (4) The format of gene and protein abbreviations in the manuscript should be standardized.

      (5) There are a few grammar issues throughout the manuscript. The English/grammar could be stronger, thus improving the overall accessibility of the science to readers.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript presents a novel and significant investigation into the role of SIRT4 in CCN2 expression in response to TGF-β by modulating U2AF2-mediated alternative splicing and its impact on the development of kidney fibrosis.

      Strengths:

      The authors' main conclusion is that SIRT4 plays a role in kidney fibrosis by regulating CCN2 expression via pre-mRNA splicing. Additionally, the study reveals that SIRT4 translocates from the mitochondria to the cytoplasm through the BAX/BAK pore under TGF-β stimulation. In the cytoplasm, TGF-β activated the ERK pathway and induced the phosphorylation of SIRT4 at Ser36, further promoting its interaction with importin α1 and subsequent nuclear translocation. In the nucleus, SIRT4 was found to deacetylate U2AF2 at K413, facilitating the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Overall, the findings are fully convincing. The current study, to some extent, shows potential importance in this field. 

      Weaknesses:

      (1) Exosomes containing anti-SIRT4 antibodies were found to effectively mitigate UUO-induced kidney fibrosis in mice. However, the protein loading capacity and loading methods were not mentioned.

      (2) The method section is incomplete, and many methods like cell culture, cell transfection, gene expression profiling analysis, and splicing analysis, were not introduced in detail.

      (3) The authors should compare their results with previous studies and mention clearly how their work is important in comparison to what has already been reported in the Discussion section.

    4. Reviewer #3 (Public Review):

      Summary:

      Yang et al reported in this paper that TGF-beta induces SIRT4 activation, that TGF-beta-activated SIRT4 then modulates U2AF2 alternative splicing, and that U2AF2 in turn drives CCN2 expression. The mechanism is described as follows: mitochondrial SIRT4 is transported into the cytoplasm in response to TGF-β stimulation, is phosphorylated by ERK in the cytoplasm, and then undergoes nuclear translocation by forming a complex with importin α1. In the nucleus, SIRT4 can then deacetylate U2AF2 at K413 to facilitate the splicing of CCN2 pre-mRNA to promote CCN2 protein expression. Moreover, they used exosomes to deliver Sirt4 antibodies to mitigate renal fibrosis in a mouse model. TGF-beta has been widely reported for its role in fibrosis induction.

      Strengths:

      TGF-beta induction of SIRT4 translocation from mitochondria to nuclei for epigenetics or gene regulation remains largely unknown. The findings presented here that SIRT4 is involved in U2AF2 deacetylation and CCN2 expression are interesting.

      Weaknesses:

      SIRT4 plays a critical role in mitochondria, where it is involved in the respiratory chain reaction. This role of SIRT4 is critically involved in many cell functions. It is hard to rule out such a mitochondrial activity of SIRT4 in renal fibrosis. Moreover, the major concern is what kind of message mitochondrial SIRT4 proteins receive from TGF-beta. Although nuclear SIRT4 is increased in response to TNF treatment, it is likely that de novo synthesized SIRT4 proteins can also undergo nuclear translocation upon cytokine stimulation. TGF-beta-induced mitochondrial calcium uptake and acetyl-CoA should be evaluated, because calcium and acetyl-CoA may contribute to the regulation of gene expression in nuclei.

    1. eLife assessment

      Zhao et al. report valuable adverse effects on cell proliferation, differentiation and gene expression, possibly linked to reduced binding activity of the transcription factor GTF2IRD1 to the transthyretin (TTR) promoter, in a human forebrain organoid model of Williams Syndrome (WS). The authors provide incomplete evidence of the effects of GTF2IRD1, a mutated gene in WS, on altering MAPK/ERK pathway activity, a well-recognized target in cell proliferation.

    2. Reviewer #1 (Public Review):

      Summary:

      Zhao et al. used a human forebrain organoid model, a transgenic mouse model, and embryonic neural progenitor cells to investigate the mutation previously identified in Williams Syndrome. They found abnormal proliferation and differentiation induced by this mutation, as well as altered expression profiles corresponding with aberrant cell clusters. This is regulated through the binding of GTF2IRD1 to transthyretin (TTR) promoter regions, and the resulting neurodevelopmental deficits were tested in the three models mentioned above.

      Strengths:

      The authors have applied cell culture, organoid culture, and an in vivo model to test the previously reported mutation found in Williams Syndrome. They investigated cell behavior including proliferation and differentiation, while using NGS techniques to identify potential signaling pathways that are highly involved and can serve as candidates to rescue the phenotype.

    3. Reviewer #2 (Public Review):

      Summary:

      The study by Xingsen Zhao et al on "A human forebrain organoid model reveals the essential function of GTF2IRD1-TTR-ERK axis for the neurodevelopmental deficits of Williams Syndrome" presents a forebrain organoid model for WS and has identified defects in neurogenesis. The authors performed scRNAseq on these patient-derived forebrain organoids, showing upregulated expression of genes related to cell proliferation, while genes involved in neuronal differentiation were downregulated. The major findings presented in this study are an increase in the size of the SOX2+ ventricular zone in WS forebrain organoids with an altered developmental trajectory and aberrant excitatory neurogenesis. The study also presents evidence that transthyretin (TTR) has a reduced expression in WS organoids, and its expression is regulated by the transcription factor GTF2IRD1. The authors then go on to identify mechanistic details of TTR function on the MAPK/ERK pathway, which is known to be involved in brain development. Overall, this is a well-constructed study revealing the function of one of the key genes that is deleted in WS and provides novel insights into mechanisms underlying the abnormal neurogenesis in the WS brain.

      Strengths:

      WS patients have neurocognitive disorders which most likely stem from defects in early neurodevelopment. This study has investigated a WS forebrain organoid model with scRNAseq and identified differences in cell proliferation and differentiation. This study has presented some new evidence regarding the function and regulation of TTR and its regulator GTF2IRD1 during brain development.

      Weaknesses:

      The evidence presented for the mechanism of action of TTR on the MAPK pathway is unclear and lacks depth. Strengthening it would require identifying downstream targets of TTR and showing how TTR interacts with the MAPK pathway.

    1. eLife assessment

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

    2. Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward. The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses. Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes. The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      Strengths:

      The task is interesting.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial? This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task. Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      It is not clear that the animals on this task were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are a lot of potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", even "random choices" should be considered.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.
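      One way to make the choice of k less arbitrary is to compare a cluster-quality index across candidate values of k. The sketch below is illustrative only and is not the authors' pipeline: it runs a simple one-dimensional k-means and computes the mean silhouette width for k = 2, 3, and 4. The session proportions are made up, and the function names (`kmeans_1d`, `mean_silhouette`) and the quantile-based initialization are assumptions of this example.

```python
def kmeans_1d(values, k, n_iter=100):
    """Lloyd's algorithm on 1-D data; returns (sorted data, cluster labels)."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: evenly spaced order statistics (an assumption of this sketch).
    centers = [data[int((i + 0.5) * n / k)] for i in range(k)]
    labels = [0] * n
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in data]
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return data, labels


def mean_silhouette(data, labels):
    """Mean silhouette width; values near 1 indicate well-separated clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, x in enumerate(data):
        own = [abs(x - y) for j, y in enumerate(data) if labels[j] == labels[i] and j != i]
        if not own or len(clusters) < 2:
            scores.append(0.0)  # singleton cluster or a single cluster: score set to 0
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(abs(x - y) for y, lab in zip(data, labels) if lab == c)
            / sum(1 for lab in labels if lab == c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical per-session proportions of delayed-lever choices (not real data).
proportions = [0.10, 0.12, 0.15, 0.18, 0.20, 0.80, 0.82, 0.85, 0.88, 0.90]
scores = {k: mean_silhouette(*kmeans_1d(proportions, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

      For these made-up proportions the two-cluster solution scores highest. Reporting such a comparison alongside the raw distribution would show whether k=3 is supported by the data.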

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)
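      To make the two modeling points concrete, the following sketch contrasts exponential with hyperbolic discounting and an epsilon-greedy choice rule with a softmax rule. It is a minimal illustration under stated assumptions, not the model used in the manuscript: the reward size (6 pellets), the discount rate (k = 0.25 per second), and all function names are hypothetical.

```python
import math


def exponential_value(amount, delay_s, k):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay_s)


def hyperbolic_value(amount, delay_s, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_s)


def p_delayed_epsilon_greedy(v_delayed, v_immediate, epsilon):
    """Epsilon-greedy: choose at random with probability epsilon, else take the larger value."""
    if v_delayed > v_immediate:
        greedy = 1.0
    elif v_delayed < v_immediate:
        greedy = 0.0
    else:
        greedy = 0.5
    return epsilon * 0.5 + (1.0 - epsilon) * greedy


def p_delayed_softmax(v_delayed, v_immediate, beta):
    """Softmax with inverse temperature beta: probability varies smoothly with the value gap."""
    return 1.0 / (1.0 + math.exp(-beta * (v_delayed - v_immediate)))


# Hypothetical values: 6 pellets on the delayed lever, discount rate 0.25 per second.
amount, k = 6.0, 0.25
for delay in (4, 8):
    print(delay, exponential_value(amount, delay, k), hyperbolic_value(amount, delay, k))
```

      Under the exponential form, each additional 4 s multiplies the value by the same factor. Under the hyperbolic form, the drop from 4 s to 8 s is proportionally smaller than the drop from 0 s to 4 s, which is why the choice of discounting function changes the predicted difference between the 4s and 8s conditions.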

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses? Are there changes in cellular information (both at the individual and ensemble level) over time in the session? How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well and depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press. These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays. That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is provided that going to the delayed side takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource and resistance-based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta = resistance?) but I'm not sure where the data fall on this dichotomy.
      a. An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.
      b. In the intro, results, and discussion, it may help to relate each point to this dichotomy.
      c. What would resource-based signals look like? What would resistance-based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?
      d. I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      (2) The task is not clear to me.
      a. I wonder if a task schematic and a flow chart of training would help readers.
      b. This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.
      c. How many total sessions were completed with ascending delays? Were there criteria for surgeries? How many total recording sessions per animal (of the 54?)
      d. How many trials were completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      (3) Figure 1 is unclear to me.

      a. Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.

      b. How many animals and sessions go into each data point?

      c. Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?

      d. Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots

      e. Does the animal move differently (i.e., RTs) in G1 vs. G2?

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.

      a. This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are.

      b. Were there objective clustering criteria that defined the clusters?

      c. Why discuss G3 at all? Can these sessions be removed from analysis?

      (5) The same applies to neuronal analyses in Fig 3 and 4.

      a. What does a single neuron peri-event raster look like? I would include several of these.

      b. What do PC1, 2 and 3 look like for G1, G2, and G3?

      c. Certain PCs are selected, but I'm not sure how they were selected - was a criterion used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?

      d. If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      (6) I had questions about the spectral analysis.

      a. Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands: delta (1-4 Hz), theta (4-7 Hz), and beta (13-30 Hz)? These bands are of particular importance because they have been associated with errors and dopamine, and are abnormal in schizophrenia and Parkinson's disease.

      b. Power spectra and time-frequency analyses may justify the authors' focus. I would show these (y-axis: frequency, x-axis: time, z-axis: power).

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.

    4. Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort: 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subject's behavioural strategies made some unusual and arguably implausible assumptions:

      i) The goal of the agent was to maximise the value of the immediate reward option (ival), rather than the standard assumption in RL modelling that the goal is to maximise long-run (e.g. temporally discounted) reward. It is not obvious why the rats should be expected to care about maximising the value of only one of their two choice options rather than distributing their choices to try and maximise long run reward.

      ii) The modelling assumed that the subject's choice could occur in 7 different states, defined by the history of their recent choices, such that every successive choice was made in a different state from the previous choice. This is a highly unusual assumption (most modelling of 2AFC tasks assumes all choices occur in the same state), as it causes learning on one trial not to generalise to the next trial, but only to other future trials where the recent choice history is the same.

      iii) The value update was non-standard in that rather than using the trial outcome (i.e. the amount of reward obtained) as the update target, it instead appeared to use some function of the value of the immediate reward option (it was not clear to me from the methods exactly how the fival and fqmax terms in the equation are calculated) irrespective of whether the immediate reward option was actually chosen.

      iv) The model used an e-greedy decision rule such that the probability of choosing the highest value option did not depend on the magnitude of the value difference between the two options. Typically, behavioural modelling uses a softmax decision rule to capture a graded relationship between choice probability and value difference.

      v) Unlike typical RL modelling where the learned value differences drive changes in subjects' choice preferences from trial to trial, to capture sensitivity to the value of the immediately rewarding option the authors had to add in a bias term which depended directly on this value (not mediated by any trial-to-trial learning). It is not clear how the rat is supposed to know the current trial ival if not by learning over previous trials, nor what purpose the learning component of the model serves if not to track the value of the immediate reward option.

      Given the task design, a more standard modelling approach would be to treat each choice as occurring in the same state, with the (temporally discounted) value of the outcomes obtained on each trial updating the value of the chosen option, and choice probabilities driven in a graded way (e.g. softmax) by the estimated value difference between the options. It would be useful to explicitly perform model comparison (e.g. using cross-validated log-likelihood with fitted parameters) of the authors' proposed model against more standard modelling approaches to test whether their assumptions are justified. It would also be useful to use logistic regression to evaluate how the history of choices and outcomes on recent trials affects the current trial choice, and compare these granular aspects of the choice data with simulated data from the model.
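
      As an illustration of the contrast between the two decision rules discussed above, the following is a minimal sketch, not the authors' model. The function names, the inverse temperature `beta`, and the example values are hypothetical. It shows that under an epsilon-greedy rule the probability of picking the higher-valued option is fixed by epsilon alone, whereas under a two-option softmax it grows with the value difference; a simple delta-rule update of the chosen option's value is included for completeness.

```python
import math


def epsilon_greedy_p_best(epsilon, n_options=2):
    """Probability of choosing the higher-valued option under epsilon-greedy.

    With probability epsilon the agent picks uniformly at random among the
    options, so the best option is still chosen epsilon / n_options of the time.
    """
    return (1.0 - epsilon) + epsilon / n_options


def softmax_p_delayed(q_delayed, q_immediate, beta):
    """Probability of choosing the delayed option under a two-option softmax.

    For two options the softmax reduces to a logistic function of the value
    difference, scaled by the inverse temperature beta (hypothetical name).
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_delayed - q_immediate)))


def delta_rule_update(q_chosen, reward, alpha):
    """Move the chosen option's value toward the obtained reward."""
    return q_chosen + alpha * (reward - q_chosen)


if __name__ == "__main__":
    # Hypothetical value differences (delayed minus immediate), in pellets.
    for value_difference in (0.5, 2.0, 4.0):
        p_eps = epsilon_greedy_p_best(epsilon=0.4)
        p_soft = softmax_p_delayed(value_difference, 0.0, beta=1.0)
        print(value_difference, round(p_eps, 3), round(p_soft, 3))
```

      The epsilon-greedy column stays constant across value differences while the softmax column increases, which is the graded relationship the reviewer refers to.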

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.
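
      The concern about chance-level correlation can be illustrated with a small simulation. This is a sketch on pure noise under assumed settings (40 trials, 20 candidate components, all names hypothetical), not a reanalysis of the data. Selecting the component that best correlates with ival and scoring it on the same trials yields a clearly positive correlation even when no relationship exists, whereas selecting on one half of the trials and scoring on the other half yields a correlation near zero.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def selection_demo(n_trials=40, n_components=20, n_repeats=200, seed=0):
    """Return (mean in-sample |r|, mean held-out signed r) for pure noise."""
    rng = random.Random(seed)
    half = n_trials // 2
    in_sample, held_out = [], []
    for _ in range(n_repeats):
        # Noise "ival" and noise "components": no true relationship exists.
        ival = [rng.gauss(0, 1) for _ in range(n_trials)]
        comps = [[rng.gauss(0, 1) for _ in range(n_trials)]
                 for _ in range(n_components)]

        # Non-cross-validated: select and score on the same trials.
        best = max(comps, key=lambda c: abs(pearson_r(c, ival)))
        in_sample.append(abs(pearson_r(best, ival)))

        # Cross-validated: select (and fix the sign) on the first half,
        # then score on the untouched second half.
        best = max(comps, key=lambda c: abs(pearson_r(c[:half], ival[:half])))
        sign = math.copysign(1.0, pearson_r(best[:half], ival[:half]))
        held_out.append(sign * pearson_r(best[half:], ival[half:]))
    return sum(in_sample) / n_repeats, sum(held_out) / n_repeats


if __name__ == "__main__":
    print(selection_demo())
```

      A shuffle-based null (recomputing the selected correlation after permuting ival across trials) would be an alternative way to estimate the same chance level.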

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.
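
      The balancing step suggested above could be sketched as follows. This is a hypothetical illustration, separate from the reviewer's linked notebook: trials are represented as `(ival, activity)` pairs, and the function name and toy values are assumptions. For every ival level present in both session types, the same number of trials is drawn at random from each, so that both regressions see the same ival distribution.

```python
import random
from collections import Counter, defaultdict


def balance_by_ival(trials_a, trials_b, seed=0):
    """Subsample two trial lists so each ival level has equal counts in both.

    Each trial is an (ival, activity) tuple. Levels that occur in only one
    of the two lists are dropped.
    """
    rng = random.Random(seed)
    by_level_a, by_level_b = defaultdict(list), defaultdict(list)
    for trial in trials_a:
        by_level_a[trial[0]].append(trial)
    for trial in trials_b:
        by_level_b[trial[0]].append(trial)

    balanced_a, balanced_b = [], []
    for level in sorted(set(by_level_a) & set(by_level_b)):
        n_keep = min(len(by_level_a[level]), len(by_level_b[level]))
        balanced_a.extend(rng.sample(by_level_a[level], n_keep))
        balanced_b.extend(rng.sample(by_level_b[level], n_keep))
    return balanced_a, balanced_b


if __name__ == "__main__":
    # Toy data: G1 sessions skewed to high ival, G2 sessions to low ival.
    g1_trials = [(5, 1.0)] * 6 + [(3, 0.8)] * 3 + [(1, 0.5)] * 1
    g2_trials = [(5, 1.1)] * 1 + [(3, 0.7)] * 4 + [(1, 0.4)] * 5
    bal_g1, bal_g2 = balance_by_ival(g1_trials, g2_trials)
    print(Counter(t[0] for t in bal_g1), Counter(t[0] for t in bal_g2))
```

      Because subsampling discards trials at random, repeating it with several seeds and averaging the resulting regression fits would reduce the dependence on any single draw.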

    5. Author response:

      eLife assessment

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

      The reviewers have provided several excellent suggestions and pointed out important shortcomings of our manuscript. We are grateful for their efforts. To address these concerns, we are planning a major revision to the manuscript. In the revision, our goal is to address each of the reviewers' concerns and codify the evidence for resistance- and resource-based control signals in the rat anterior cingulate cortex. We have provided a non-exhaustive list of the items we plan to address in the point-by-point responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward.

      Please note that at the time of training and testing the rats were > 4 months old.

      The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Several studies parametrically vary the immediate lever (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183). While most versions of the task will yield qualitatively similar estimates of discounting, the adjusting amount is preferred as it provides the most consistent estimates (PMID: 22445576). More specifically, this version of the task avoids the contrast effects that result from changing the delay during the session (PMID: 23963529, 24780379, 19730365, 35661751), which complicate value estimates.

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses.

      We are in discussions about how to address this valid concern. This includes simply splitting the data by delay. This approach, however, has conceptual problems that we will also lay out in a full revision.  

      Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes.

      We apologize for not doing a better job of explaining the advantages of this type of model for the present purposes. Nevertheless, given the clear lack of enthusiasm, we felt it was better to simply update the model as suggested by the Reviewers. The straightforward modifications have now been implemented and we are currently in discussion about how the new results fit into the larger narrative.

      The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      We plan to streamline the existing analysis and add statistics, where required, to address this concern.

      Strengths:

      The task is interesting.

      Thank you for the positive comment.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial?

      Animals tend to make more immediate choices as the delay is extended, which is reflected in Figure 1. We will add more detail and additional statistics to address these questions. 

      This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task.

      Human tasks implement a similar task structure (PMID: 26779747). Please note the response above that outlines the benefits of using this task.

      Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      This is a good suggestion. However, rats do not like waiting for rewards, even over small delays. Going from the 4 sec to the 8 sec delay results in more immediate choices, indicating that the rats forgo waiting in favor of a smaller reinforcer more often at the 8 sec delay than at the 4 sec delay.

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      These are excellent suggestions. We are looking into implementing them.

      It is not clear that the animals on this task were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are a lot of potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", even "random choices" should be considered.

      The strategies the Reviewer mentioned are descriptors of the actual choices the rats made. For example, perseveration means the rat is choosing one of the levers at an excessively high rate whereas alternation means it is choosing the two levers more or less equally, independent of payouts. But the question we are interested in is why? We are arguing that the type of cognitive control determines the choice behavior but cognitive control is an internal variable that guides behavior, rather than simply a descriptor of the behavior. For example, the animal opts to perseverate on the delayed lever because the cognitive control required to track ival is too high. We then searched the neural data for signatures of the two types of cognitive control.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The side bias clearly does not impact performance as the animals prefer the delay lever at shorter delays, which works against this bias.

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.

      These are excellent points and, as stated above, we are in the process of revisiting the group assignments in an effort to allay these criticisms.

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)

      Please see our response above. We agree that the approach was not justified, but we do not agree that it is invalid. Simply stated, a softmax approach gives the best fit to the choice behavior, whereas our epsilon-greedy approach attempted to reproduce the choice behavior using a naïve agent that progressively learns the values of the two levers on a choice-by-choice basis. The epsilon-greedy approach can therefore tell us whether it is possible to reproduce the choice behavior by an agent that is only tracking ival. Given our discovery of an ival-tracking signal in ACC, we believed that this was a critical point (although admittedly we did a poor job of communicating it). However, we also appreciate that important insights can be gained by fitting a model to the data as suggested. In fact, we had implemented this approach initially and are currently reconsidering what it can tell us in light of the Reviewers' comments.
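
      The reviewer's point about exponential versus hyperbolic discounting at the 4 s and 8 s delays can be made concrete with the two standard functional forms. The sketch below is illustrative only: the reward amount, the discount rate `k`, and the function names are hypothetical and are not fitted to the data. The exponential form is V = A * exp(-k * D) and the hyperbolic form is V = A / (1 + k * D), where A is the reward amount and D the delay.

```python
import math


def exponential_value(amount, delay_s, k):
    """Exponentially discounted value: amount * exp(-k * delay)."""
    return amount * math.exp(-k * delay_s)


def hyperbolic_value(amount, delay_s, k):
    """Hyperbolically discounted value: amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay_s)


if __name__ == "__main__":
    amount = 6.0  # hypothetical delayed reward, in pellets
    k = 0.2       # hypothetical discount rate, per second
    for delay_s in (4.0, 8.0):
        print(delay_s,
              round(exponential_value(amount, delay_s, k), 3),
              round(hyperbolic_value(amount, delay_s, k), 3))
    # Proportion of the 4 s value that survives doubling the delay to 8 s.
    print(exponential_value(amount, 8.0, k) / exponential_value(amount, 4.0, k),
          hyperbolic_value(amount, 8.0, k) / hyperbolic_value(amount, 4.0, k))
```

      For the same `k`, doubling the delay removes a larger share of the value under the exponential form than under the hyperbolic form. In practice `k` would be fitted separately for each model, so this comparison only shows that the two forms need not agree on how much the step from 4 s to 8 s matters.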

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      Exactly. The model results indicated that a naïve agent that relied only on ival tracking would not behave in this manner. It was therefore unlikely that the G1 animals were using an ival-tracking strategy, even though a strong ival-tracking signal was present in ACC.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      While the reviewer is justified in criticizing the clarity of the figures, the statement that “they do not show variability, statistics or conclusive results” is demonstrably false. Each of the figures presented in the manuscript, except Figure 3, is accompanied by statistics and measures of variability. This comment is hyperbolic and not justified.

      Figure 3 was an attempt to show raw neural data to better demonstrate how robust the ivalue tracking signal is.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses?

      We provide several figures describing how neurons change firing rates in response to varying reward. We are unsure what the reviewer means by “traditional analysis”, especially since this is immediately followed by a request for an assessment of neural manifolds. That said, we are developing ways to make the analysis more intuitive and, hopefully, more “traditional”.

      Are there changes in cellular information (both at the individual and ensemble level) over time in the session?

      We provide several analyses of how firing rate changes over trials in relation to ival over time in the session.

      How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      It is not clear to us how this analysis addresses our hypothesis regarding control signals in ACC.

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      Figure 3 will be folded into one of the other figures that contains the summary statistics.

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      This analysis included forced trials. The maximum per session is 40 choice trials. We will clarify this in the revised manuscript.

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well, depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      We plan to revisit this analysis and the RL model.

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press.

      Thank you for the positive comment.

      These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays.

      Provisional analysis indicates that the results hold up over delays, rather than the groupings in the paper. We will address this in a full revision of the manuscript.

      That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      We are unclear what the reviewer means by “this description”.

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      We view the strong evidence for ival tracking presented herein as a potentially critical component of resource-based cognitive effort. We hope to explain more clearly how this task engaged cognitive effort.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is made that going to the delayed side takes effort.

      There is a well-developed literature that rats and mice do not like waiting for delayed reinforcers. We contend that enduring something you don’t like takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      We will better clarify how our measure of theta power relates to synchrony. There is a well-developed literature showing that rats and mice do not like waiting for delayed reinforcers.

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

      This is proposed as an alternative explanation to the ivalue signal. We provide this as a possibility, never a conclusion. We will clarify this in the revised text. 

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Thank you for the endorsement of our work.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource and resistance based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta=resistance?) but I'm not sure where the data fall on this dichotomy.

      a. An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.

      b. In the intro, results, and discussion, it may help to relate each point to this dichotomy.

      c. What would resource-based signals look like? What would resistance based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?

      d. I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      These are excellent suggestions, and we intend to implement each of them, where possible.

      (2) The task is not clear to me.

      a. I wonder if a task schematic and a flow chart of training would help readers.

      Yes, excellent idea, we intend to include this.

      b. This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.

      Indeed, this task has been used in rats in several prior studies. Please see the following references (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183).

      c. How many total sessions were completed with ascending delays? Were there criteria for surgeries? How many total recording sessions per animal (of the 54)?

      Please note that the delay does not change within a session. There were no criteria for surgery. In addition, we will update Table 1 to make the number of recording sessions clearer.

      d. How many trials completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      Every animal in this data set completed 40 trials. We will update the task description to clarify this issue. There are no errors in this task; rather, the task is designed to measure the tendency to make an impulsive choice (smaller reward now). We will clarify this issue in the revision of the manuscript.

      (3) Figure 1 is unclear to me.

      a. Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.

      We will clarify the colors and look into schemes to graph the data set.

      b. How many animals and sessions go into each data point?

      This information is in Table 1, but this could be clearer, and we will update the manuscript.

      c. Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?

      Table 1 is accurate, and we can add the number of neurons from each animal.

      d. Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots

      e. Does the animal move differently (i.e., RTs) in G1 vs. G2?

      We will look into ways to incorporate this information.

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.

      a. This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are

      b. Were there objective clustering criteria that defined the clusters?

      c. Why discuss G3 at all? Can these sessions be removed from analysis?

      These are all excellent suggestions and points. We plan to revisit the strategy to assign sessions to groups, which we hope will address each of these points.

      (5) The same applies to neuronal analyses in Fig 3 and 4

      a. What does a single neuron peri-event raster look like? I would include several of these.

      b. What does PC1, 2 and 3 look like for G1, G2, and G3?

      c. Certain PCs are selected, but I'm not sure how they were selected - was a criterion used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?

      d. If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      We will make several updates to enhance the clarity of the neural data analysis, including adding more representative examples. We feel the need to balance the inclusion of representative examples with group statistics, given the concerns raised by R1.

      (6) I had questions about the spectral analysis

      a. Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands (delta: 1-4 Hz; theta: 4-7 Hz; and beta: 13-30 Hz)? These bands are of particular importance because they have been associated with errors, dopamine, and are abnormal in schizophrenia and Parkinson's disease.

      This designation comes mainly from the hippocampal and ACC literature in rodents. In addition, this range best captured the peak in the power spectrum in our data. Note that we focus our analysis on theta given the literature regarding theta in the ACC as a correlate of cognitive control (references in manuscript). We did interrogate other bands as a sanity check and the results were mostly limited to theta. Given the scope of our manuscript and the concerns raised regarding complexity, we are concerned that adding frequency analyses beyond theta would obscure the take-home message. However, we think this is worthy, and we will determine if this can be done in a brief, clear, and effective manner.

      b. Power spectra and time-frequency analyses may justify the authors' focus. I would show these (y-axis - frequency, x-axis - time, z-axis - power).

      This is an excellent suggestion that we look forward to incorporating. 

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.

      Excellent suggestion. We will look into the phantom oscillation issue. Note that PCA provided a way to classify neurons that exhibited peaks in the autocorrelation at theta frequencies. While spike-field coherence is a rigorous tool, it addresses a slightly different question (LFP entrainment). Notwithstanding, we plan to address this issue.  

      Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12 Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Thank you for the positive comments.

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      This is an important issue that we plan to address with additional analysis in the manuscript update.
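      As an illustration of the reviewer's point about treating the subject, rather than the trial or neuron, as the experimental unit, the following is a minimal sketch and not the authors' analysis. It collapses values to one mean per subject and then compares groups with a permutation test. The function names and the input layout (a dictionary keyed by subject) are assumptions made for this example; a mixed-effects model with a random effect per subject would be another option.

```python
import random
from statistics import mean


def subject_means(values_by_subject):
    """Collapse trial- or neuron-level values to one mean per subject."""
    return {subj: mean(vals) for subj, vals in values_by_subject.items()}


def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    group_a and group_b are lists of subject-level means, so the
    subject (not the trial or neuron) is the unit being shuffled.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)
```

      With only 3 subjects in one group and 7 in the other, there are just 120 distinct ways to split the subjects, so the smallest achievable p-value is limited by the number of animals and not by the number of neurons or trials.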

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subject's behavioural strategies made some unusual and arguably implausible assumptions:

      i) The goal of the agent was to maximise the value of the immediate reward option (ival), rather than the standard assumption in RL modelling that the goal is to maximise long-run (e.g. temporally discounted) reward. It is not obvious why the rats should be expected to care about maximising the value of only one of their two choice options rather than distributing their choices to try and maximise long run reward.

      ii) The modelling assumed that the subject's choice could occur in 7 different states, defined by the history of their recent choices, such that every successive choice was made in a different state from the previous choice. This is a highly unusual assumption (most modelling of 2AFC tasks assumes all choices occur in the same state), as it causes learning on one trial not to generalise to the next trial, but only to other future trials where the recent choice history is the same.

      iii) The value update was non-standard in that rather than using the trial outcome (i.e. the amount of reward obtained) as the update target, it instead appeared to use some function of the value of the immediate reward option (it was not clear to me from the methods exactly how the fival and fqmax terms in the equation are calculated) irrespective of whether the immediate reward option was actually chosen.

      iv) The model used an e-greedy decision rule such that the probability of choosing the highest value option did not depend on the magnitude of the value difference between the two options. Typically, behavioural modelling uses a softmax decision rule to capture a graded relationship between choice probability and value difference.

      v) Unlike typical RL modelling where the learned value differences drive changes in subjects' choice preferences from trial to trial, to capture sensitivity to the value of the immediately rewarding option the authors had to add in a bias term which depended directly on this value (not mediated by any trial-to-trial learning). It is not clear how the rat is supposed to know the current trial ival if not by learning over previous trials, nor what purpose the learning component of the model serves if not to track the value of the immediate reward option.

      Given the task design, a more standard modelling approach would be to treat each choice as occurring in the same state, with the (temporally discounted) value of the outcomes obtained on each trial updating the value of the chosen option, and choice probabilities driven in a graded way (e.g. softmax) by the estimated value difference between the options. It would be useful to explicitly perform model comparison (e.g. using cross-validated log-likelihood with fitted parameters) of the authors' proposed model against more standard modelling approaches to test whether their assumptions are justified. It would also be useful to use logistic regression to evaluate how the history of choices and outcomes on recent trials affects the current trial choice, and compare these granular aspects of the choice data with simulated data from the model.

      Each of the issues outlined above with the RL model is very important. We are currently re-evaluating the RL modeling approach in light of these comments. Please see comments to R1 regarding the model as they are relevant for this as well.
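      The "more standard" model the reviewer describes can be sketched roughly as follows. This is a hypothetical illustration, not the authors' model: it assumes a single state, hyperbolic discounting of the obtained outcome (one common choice among several), a delta-rule update of the chosen option only, and a two-option softmax. All names and parameters (`alpha`, `beta`, `k`) are assumptions introduced for the example.

```python
import math


def softmax_p_delayed(q_delayed, q_immediate, beta):
    """Probability of choosing the delayed option.

    Unlike epsilon-greedy, this is graded in the value difference.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_delayed - q_immediate)))


def discounted(reward, delay_s, k):
    """Hyperbolically discounted value of a reward (an assumed form)."""
    return reward / (1.0 + k * delay_s)


def log_likelihood(choices, outcomes, delays, alpha, beta, k):
    """Log-likelihood of a choice sequence under the single-state model.

    choices:  'D' (delayed) or 'I' (immediate) on each trial
    outcomes: pellets obtained on each trial
    delays:   seconds waited for the outcome on each trial
    """
    q = {"D": 0.0, "I": 0.0}
    ll = 0.0
    for c, r, d in zip(choices, outcomes, delays):
        p_d = softmax_p_delayed(q["D"], q["I"], beta)
        ll += math.log(p_d if c == "D" else 1.0 - p_d)
        # Only the chosen option is updated, toward the discounted outcome.
        q[c] += alpha * (discounted(r, d, k) - q[c])
    return ll
```

      For the model comparison the reviewer suggests, parameters would be fitted on a subset of sessions and `log_likelihood` evaluated on held-out sessions, with the same procedure applied to the competing model.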

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.

      This is an astute observation and we plan to address this concern. We agree that cross-validation may provide an appropriate tool here.
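      The reviewer's concern about selection without cross-validation can be illustrated with a small simulation. This is a sketch under stated assumptions: the "components" are pure noise, the target is a random stand-in for ival, and the odd/even trial split is arbitrary. It does not reproduce the MCML analysis.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def select_and_score(components, target, train_idx, test_idx):
    """Pick the component best correlated with the target on training
    trials, then report its |r| on training and on held-out trials."""
    def corr_on(comp, idx):
        return pearson([comp[i] for i in idx], [target[i] for i in idx])

    best = max(components, key=lambda c: abs(corr_on(c, train_idx)))
    return abs(corr_on(best, train_idx)), abs(corr_on(best, test_idx))


rng = random.Random(0)
n_trials, n_components = 40, 20
target = [rng.gauss(0, 1) for _ in range(n_trials)]       # stand-in for ival
components = [[rng.gauss(0, 1) for _ in range(n_trials)]  # pure noise
              for _ in range(n_components)]
train_idx = list(range(0, n_trials, 2))
test_idx = list(range(1, n_trials, 2))
r_train, r_test = select_and_score(components, target, train_idx, test_idx)
```

      Because the best of many noise components is chosen on the training trials, `r_train` is biased upward even though no real relationship exists. `r_test` is evaluated on trials that played no part in the selection, so it gives an unbiased estimate of chance level.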

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).

      This is also an excellent point that we plan to address in the manuscript update.
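      A coefficient of partial determination of the kind the reviewer mentions can be sketched for the simplest case of one covariate. This is an assumption made for brevity, and several covariates such as a full choice history would need multiple regression. For a single added predictor, the coefficient equals the squared correlation between the residuals of activity and of ival after each has been regressed on the covariate. Function and variable names are hypothetical.

```python
def residuals(y, x):
    """Residuals of a simple least-squares regression of y on x
    (with intercept)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


def partial_r2(activity, ival, covariate):
    """Fraction of the variance in activity left unexplained by the
    covariate that is explained by ival."""
    res_y = residuals(activity, covariate)
    res_x = residuals(ival, covariate)
    sxy = sum(a * b for a, b in zip(res_x, res_y))
    return sxy ** 2 / (sum(a * a for a in res_x) * sum(b * b for b in res_y))
```

      For example, `covariate` could be the current choice coded as 0 or 1. A value near zero would indicate that the apparent ival signal is largely accounted for by choice.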

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.

      Excellent point and thank you for the notebook. We explored a similar approach previously but did not pursue it to completion. We will re-investigate this issue.
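      The trial-balancing control suggested by the reviewer could look roughly like the sketch below. It is not the reviewer's notebook or the authors' code. The trial representation and the `level_of` accessor are assumptions. Levels that occur in only one of the two session types are dropped.

```python
import random
from collections import defaultdict


def balance_by_level(trials_a, trials_b, level_of, seed=0):
    """Subsample two trial lists so that every level (e.g. ival) is
    represented by the same number of trials in both."""
    rng = random.Random(seed)
    by_level_a, by_level_b = defaultdict(list), defaultdict(list)
    for t in trials_a:
        by_level_a[level_of(t)].append(t)
    for t in trials_b:
        by_level_b[level_of(t)].append(t)

    out_a, out_b = [], []
    # Only levels present in both session types can be balanced.
    for level in sorted(set(by_level_a) & set(by_level_b)):
        n = min(len(by_level_a[level]), len(by_level_b[level]))
        out_a += rng.sample(by_level_a[level], n)
        out_b += rng.sample(by_level_b[level], n)
    return out_a, out_b
```

      The regression would then be run on the two subsampled lists, ideally repeated over several seeds to check that the result does not depend on one particular draw.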

    1. Joint Public Review:

      Summary:

      This study retrospectively analyzed clinical data to develop a risk prediction model for pulmonary hypertension in high-altitude populations. This finding holds clinical significance as it can be used for intuitive and individualized prediction of pulmonary hypertension risk in these populations. The strength of evidence is high, utilizing a large cohort of 6,603 patients and employing statistical methods such as LASSO regression. The model demonstrates satisfactory performance metrics, including AUC values and calibration curves, enhancing its clinical applicability.

      Strengths:

      (1) Large Sample Size: The study utilizes a substantial cohort of 6,603 subjects, enhancing the reliability and generalizability of the findings.

      (2) Robust Methodology: The use of advanced statistical techniques, including least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression, ensures the selection of optimal predictive features.

      (3) Clinical Utility: The developed nomograms are user-friendly and can be easily implemented in clinical settings, particularly in resource-limited high-altitude regions.

      (4) Performance Metrics: The models demonstrate satisfactory performance, with strong AUC values and well-calibrated curves, indicating accurate predictions.

      Weaknesses:

      (1) Lack of External Validation: The models were validated internally, but external validation with cohorts from other high-altitude regions is necessary to confirm their generalizability.

      (2) Simplistic Predictors: The reliance on ECG and basic demographic data may overlook other potential predictors that could improve the models' accuracy and predictive power.

      (3) Regional Specificity: The study's cohort is limited to Tibet, and the findings may not be directly applicable to other high-altitude populations without further validation.

    1. Reviewer #5 (Public Review):

      After reading the manuscript and the concerns raised by reviewer 2 I see both sides of the argument - the relative location of trigeminal nucleus versus the inferior olive is quite different in elephants (and different from previous studies in elephants), but when there is a large disproportionate magnification of a behaviorally relevant body part at most levels of the nervous system (certainly in the cortex and thalamus), you can get major shifting in the location of different structures. In the case of the elephant, it looks like there may be a lot of shifting. Something that is compelling is that the number of modules separated by the myelin bands corresponds to the number of trunk folds, which is different in the different elephants. This sort of modular division based on body parts is a general principle of mammalian brain organization (demonstrated beautifully for the cuneate and gracile nucleus in primates, VP in most species, S1 in a variety of mammals such as the star-nosed mole and duck-billed platypus). I don't think these relative changes in the brainstem would require major genetic programming - although some surely exist. Rodents and elephants have been independently evolving for over 60 million years so there is a substantial amount of time for changes in each lineage to occur.

      I agree that the authors have identified the trigeminal nucleus correctly, although comparisons with more out-groups would be needed to confirm this (although I'm not suggesting that the authors do this). I also think the new figure (which shows previous divisions of the brainstem versus their own) allows the reader to consider these issues for themselves. When reviewing this paper, I actually took the time to go through atlases of other species and even look at some of my own data from highly derived species. Establishing homology across groups based only on relative location is tough especially when there appears to be large shifts in the relative location of structures. My thoughts are that the authors did an extraordinary amount of work on obtaining, processing and analyzing this extremely valuable tissue. They document their work with images of the tissue and their arguments for their divisions are solid. I feel that they have earned the right to speculate - with qualifications - which they provide.

    2. eLife assessment

      This valuable study uses neuroanatomical techniques to investigate somatosensory projections from the elephant trunk to the brainstem. Given its unique specializations, understanding how the elephant trunk is represented within the brain is of general interest to evolutionary and comparative neuroscientists. The authors present solid evidence for the existence of a novel isomorphism in which the folds of the trunk are mapped onto the trigeminal nucleus; however, due to their unusual structure, some uncertainty remains about the identification and anatomical organization of nuclei within the elephant brainstem.

    3. Reviewer #1 (Public Review):

      This manuscript remains an intriguing investigation of the elephant brainstem, with particular attention drawn to possible sensory and motor representation of the renowned trunk of African and Asian elephants. As the authors note, this area has traditionally been identified as part of the superior olivary complex and associated with the fine motor control of the trunk; however, notable patterns within myelin stripes suggest that its parcellation may relate to specific regions/folds found along the long axis of the trunk, including elaborated regions for the trunk "finger" distal end.

      In this iteration of the manuscript, the researchers have provided peripherin antibody staining within the regions they have identified as the trigeminal nucleus and the superior olive. These data, with abundant peripherin expression within climbing fibers of the presumed superior olive and relatively lower expression within the trigeminal nucleus, bolster their interpretation of having comprehensively identified the trigeminal nucleus and trunk representation via a battery of neuroanatomical methods.

      All other conclusions remain the same, and these data have provoked intriguing and animated discussion on classification of neuroanatomical structure, particularly in species with relatively limited access to specimens. Most significantly, these discussions have underscored the fundamental nature of comparative methods (from protein to cellular to anatomical levels), including interpreting homologous structures among species of varying levels of relatedness.

    4. Reviewer #2 (Public Review):

      Here I submit my previous review and a great deal of additional information following on from the initial review and the response by the authors.

      * Initial Review *

      Assessment:

      This manuscript is based upon the unprecedented identification of an apparently highly unusual trigeminal nuclear organization within the elephant brainstem, related to a large trigeminal nerve in these animals. The apparently highly specialized elephant trigeminal nuclear complex identified in the current study has been classified as the inferior olivary nuclear complex in four previous studies of the elephant brainstem. The entire study is predicated upon the correct identification of the trigeminal sensory nuclear complex and the inferior olivary nuclear complex in the elephant, and if this is incorrect, then the remainder of the manuscript is merely unsupported speculation. There are many reasons indicating that the trigeminal nuclear complex is misidentified in the current study, rendering the entire study, and associated speculation, inadequate at best, and damaging in terms of understanding elephant brains and behaviour at worst.

      Original Public Review:

      The authors describe what they assert to be a very unusual trigeminal nuclear complex in the brainstem of elephants, and based on this, follow with many speculations about how the trigeminal nuclear complex, as identified by them, might be organized in terms of the sensory capacity of the elephant trunk.

      The identification of the trigeminal nuclear complex/inferior olivary nuclear complex in the elephant brainstem is the central pillar of this manuscript from which everything else follows, and if this is incorrect, then the entire manuscript fails, and all the associated speculations become completely unsupported.

      The authors note that what they identify as the trigeminal nuclear complex has been identified as the inferior olivary nuclear complex by other authors, citing Shoshani et al. (2006; 10.1016/j.brainresbull.2006.03.016) and Maseko et al (2013; 10.1159/000352004), but fail to cite either Verhaart and Kramer (1958; PMID 13841799) or Verhaart (1962; 10.1515/9783112519882-001). These four studies are in agreement, but the current study differs.

      Let's assume for the moment that the four previous studies are all incorrect and the current study is correct. This would mean that the entire architecture and organization of the elephant brainstem is significantly rearranged in comparison to ALL other mammals, including humans, previously studied (e.g. Kappers et al. 1965, The Comparative Anatomy of the Nervous System of Vertebrates, Including Man, Volume 1 pp. 668-695) and the closely related manatee (10.1002/ar.20573). This rearrangement necessitates that the trigeminal nuclei would have had to "migrate" and shorten rostrocaudally, specifically and only, from the lateral aspect of the brainstem where these nuclei extend from the pons through to the cervical spinal cord (e.g. the Paxinos and Watson rat brain atlases), to the spatially restricted ventromedial region of specifically and only the rostral medulla oblongata. According to the current paper the inferior olivary complex of the elephant is very small and located lateral to their trigeminal nuclear complex, and the region from where the trigeminal nuclei are located by others appears to be just "lateral nuclei" with no suggestion of what might be there instead.

      Such an extraordinary rearrangement of brainstem nuclei would require a major transformation in the manner in which the mutations, patterning, and expression of genes and associated molecules during development occur. Such a major change is likely to lead to lethal phenotypes, making such a transformation extremely unlikely. Variations in mammalian brainstem anatomy are most commonly associated with quantitative changes rather than qualitative changes (10.1016/B978-0-12-804042-3.00045-2).

      The impetus for the identification of the unusual brainstem trigeminal nuclei in the current study rests upon a previous study from the same laboratory (10.1016/j.cub.2021.12.051) that estimated that the number of axons contained in the infraorbital branch of the trigeminal nerve that innervate the sensory surfaces of the trunk is approximately 400 000. Is this number unusual? In a much smaller mammal with a highly specialized trigeminal system, the platypus, the number of axons innervating the sensory surface of the platypus bill skin comes to 1 344 000 (10.1159/000113185). Yet, there is no complex rearrangement of the brainstem trigeminal nuclei in the brain of the developing or adult platypus (Ashwell, 2013, Neurobiology of Monotremes), despite the brainstem trigeminal nuclei being very large in the platypus (10.1159/000067195). Even in other large-brained mammals, such as large whales that do not have a trunk, the number of axons in the trigeminal nerve ranges between 400,000 and 500,000 (10.1007/978-3-319-47829-6_988-1). The lack of comparative support for the argument forwarded in the previous and current study from this laboratory, and that the comparative data indicates that the brainstem nuclei do not change in the manner suggested in the elephant, argues against the identification of the trigeminal nuclei as outlined in the current study. Moreover, the comparative studies undermine the prior claim of the authors, informing the current study, that "the elephant trigeminal ganglion ... point to a high degree of tactile specialization in elephants" (10.1016/j.cub.2021.12.051). While clearly the elephant has tactile sensitivity in the trunk, it is questionable as to whether what has been observed in elephants is indeed "truly extraordinary".

      But let's look more specifically at the justification outlined in the current study to support their identification of the unusually located trigeminal sensory nuclei of the brainstem.

      (1) Intense cytochrome oxidase reactivity
      (2) Large size of the putative trunk module
      (3) Elongation of the putative trunk module
      (4) Arrangement of these putative modules corresponds to elephant head anatomy
      (5) Myelin stripes within the putative trunk module that apparently match trunk folds
      (6) Location apparently matches other mammals
      (7) Repetitive modular organization apparently similar to other mammals
      (8) The inferior olive described by other authors lacks the lamellated appearance of this structure in other mammals

      Let's examine these justifications more closely.

      (1) Cytochrome oxidase histochemistry is typically used as an indicative marker of neuronal energy metabolism. The authors indicate, based on the "truly extraordinary" somatosensory capacities of the elephant trunk, that any nuclei processing this tactile information should be highly metabolically active, and thus should react intensely when stained for cytochrome oxidase. We are told in the methods section that the protocols used are described by Purkart et al (2022) and Kaufmann et al (2022). In neither of these cited papers is there any description, nor mention, of the cytochrome oxidase histochemistry methodology; thus, we have no idea of how this histochemical staining was done. In order to obtain the best results for cytochrome oxidase histochemistry, either the tissue is processed very rapidly after buffer perfusion to remove blood, or recently perfusion-fixed tissue is used (e.g., 10.1016/0165-0270(93)90122-8). Given: (1) the presumably long post-mortem interval between death and fixation - "it often takes days to dissect elephants"; (2) subsequent fixation of the brains in 4% paraformaldehyde for "several weeks"; (3) the intense cytochrome oxidase reactivity in the inferior olivary complex of the laboratory rat (Gonzalez-Lima, 1998, Cytochrome oxidase in neuronal metabolism and Alzheimer's diseases); and (4) the lack of any comparative images from other stained portions of the elephant brainstem; it is difficult to support the justification as forwarded by the authors. It is likely that the histochemical staining observed is background reactivity from the use of diaminobenzidine in the staining protocol. Thus, this first justification is unsupported.

      Justifications (2), (3), and (4) are sequelae from justification (1). In this sense, they do not count as justifications, but rather unsupported extensions.

      (4) and (5) These are interesting justifications, as the paper has clear internal contradictions, and (5) is a sequela of (4). The reader is led to the concept that the myelin tracts divide the nuclei into sub-modules that match the folding of the skin on the elephant trunk. One would then readily presume that these myelin tracts are the incoming sensory axons from the trigeminal nerve. However, the authors note that this is not the case: "Our observations on trunk module myelin stripes are at odds with this view of myelin. Specifically, myelin stripes show no tapering (which we would expect if axons divert off into the tissue). More than that, there is no correlation between myelin stripe thickness (which presumably correlates with axon numbers) and trigeminal module neuron numbers. Thus, there are numerous myelinated axons, where we observe few or no trigeminal neurons. These observations are incompatible with the idea that myelin stripes form an axonal 'supply' system or that their prime function is to connect neurons. What do myelin stripe axons do, if they do not connect neurons? We suggest that myelin stripes serve to separate rather than connect neurons." So, we are left with the observation that the myelin stripes do not pass afferent trigeminal sensory information from the "truly extraordinary" trunk skin somatic sensory system, and rather function as units that separate neurons - but to what end? It appears that the myelin stripes are more likely to be efferent axonal bundles leaving the nuclei (to form the olivocerebellar tract). This justification is unsupported.

      (6) The authors indicate that the location of these nuclei matches that of the trigeminal nuclei in other mammals. This is not supported in any way. In ALL other mammals in which the trigeminal nuclei of the brainstem have been reported they are found in the lateral aspect of the brainstem, bordered laterally by the spinal trigeminal tract. This is most readily seen and accessible in the Paxinos and Watson rat brain atlases. The authors indicate that the trigeminal nuclei are medial to the facial nerve nucleus, but in every other species, the trigeminal sensory nuclei are found lateral to the facial nerve nucleus. This is most salient when examining a close relative, the manatee (10.1002/ar.20573), where the location of the inferior olive and the trigeminal nuclei matches that described by Maseko et al (2013) for the African elephant. This justification is not supported.

      (7) The dual to quadruple repetition of rostro-caudal modules within the putative trigeminal nucleus as identified by the authors relies on the fact that in the neurotypical mammal, there are several trigeminal sensory nuclei arranged in a column running from the pons to the cervical spinal cord. These include (nomenclature from Paxinos and Watson, in roughly rostral to caudal order) the Pr5VL, Pr5DM, Sp5O, Sp5I, and Sp5C. But these nuclei are all located far from the midline and lateral to the facial nerve nucleus, unlike what the authors describe in the elephants. These rostrocaudal modules are expanded upon in Figure 2, and it is apparent from what is shown that the authors are attributing other brainstem nuclei to the putative trigeminal nuclei to confirm their conclusion. For example, what they identify as the inferior olive in Figure 2D is likely the lateral reticular nucleus as identified by Maseko et al (2013). This justification is not supported.

      (8) In primates and related species, there is a distinct banded appearance of the inferior olive, but what has been termed the inferior olive in the elephant by other authors does not have this appearance, rather, and specifically, the largest nuclear mass in the region (termed the principal nucleus of the inferior olive by Maseko et al, 2013, but Pr5, the principal trigeminal nucleus in the current paper) overshadows the partial banded appearance of the remaining nuclei in the region (but also drawn by the authors of the current paper). Thus, what is at debate here is whether the principal nucleus of the inferior olive can take on a nuclear shape rather than evince a banded appearance. The authors of this paper use this variance as justification that this cluster of nuclei could not possibly be the inferior olive. Such a "semi-nuclear/banded" arrangement of the inferior olive is seen in, for example, giraffe (10.1016/j.jchemneu.2007.05.003), domestic dog, polar bear, and most specifically the manatee (a close relative of the elephant) (brainmuseum.org; 10.1002/ar.20573). This justification is not supported.

      Thus, all the justifications forwarded by the authors are unsupported. Based on methodological concerns, prior comparative mammalian neuroanatomy, and prior studies in the elephant and closely related species, the authors fail to support their notion that what was previously termed the inferior olive in the elephant is actually the trigeminal sensory nuclei. Given this failure, the justifications provided above that are sequelae also fail. In this sense, the entire manuscript and all the sequelae are not supported.

      What the authors have not done is to trace the pathway of the large trigeminal nerve in the elephant brainstem, as was done by Maseko et al (2013), which clearly shows the internal pathways of this nerve, from the branch that leads to the fifth mesencephalic nucleus adjacent to the periventricular grey matter, through to the spinal trigeminal tract that extends from the pons to the spinal cord in a manner very similar to all other mammals. Nor have they shown how the supposed trigeminal information reaches the putative trigeminal nuclei in the ventromedial rostral medulla oblongata. These are but two examples of many specific lines of evidence that would be required to support their conclusions. Clearly, tract-tracing methods, such as cholera toxin tracing of peripheral nerves, cannot be done in elephants; thus, the neuroanatomy must be done properly and with attention to detail to support the major changes indicated by the authors.

      So what are these "bumps" in the elephant brainstem?

      Four previous authors indicate that these bumps are the inferior olivary nuclear complex. Can this be supported?

      The inferior olivary nuclear complex acts "as a relay station between the spinal cord (n.b. trigeminal input does reach the spinal cord via the spinal trigeminal tract) and the cerebellum, integrating motor and sensory information to provide feedback and training to cerebellar neurons" (https://www.ncbi.nlm.nih.gov/books/NBK542242/). The inferior olivary nuclear complex is located dorsal and medial to the pyramidal tracts (which were not labelled in the current study by the authors but are clearly present in Fig. 1C and 2A) in the ventromedial aspect of the rostral medulla oblongata. This is precisely where previous authors have identified the inferior olivary nuclear complex and what the current authors assign to their putative trigeminal nuclei. The neurons of the inferior olivary nuclei project, via the olivocerebellar tract, to the cerebellum, where they terminate as the climbing fibres of the cerebellar cortex.

      Elephants have the largest (relative and absolute) cerebellum of all mammals (10.1002/ar.22425); this cerebellum contains 257 × 10^9 neurons (10.3389/fnana.2014.00046; three times more than the entire human brain, 10.3389/neuro.09.031.2009). Each of these neurons appears to be more structurally complex than the homologous neurons in other mammals (10.1159/000345565; 10.1007/s00429-010-0288-3). In the African elephant, the neurons of the inferior olivary nuclear complex are described by Maseko et al (2013) as being both calbindin and calretinin immunoreactive. Climbing fibres in the cerebellar cortex of the African elephant are clearly calretinin immunopositive and also are likely to contain calbindin (10.1159/000345565). Given this, would it be surprising that the inferior olivary nuclear complex of the elephant is enlarged enough to create a very distinct bump in exactly the same place where these nuclei are identified in other mammals?

      What about the myelin stripes? These are most likely to be the origin of the olivocerebellar tract and probably only have a coincidental relationship to the trunk. Thus, given what we know, the inferior olivary nuclear complex as described in other studies, and the putative trigeminal nuclear complex as described in the current study, is the elephant inferior olivary nuclear complex. It is not what the authors believe it to be, and they do not provide any evidence that discounts the previous studies. The authors are, quite simply put, wrong. All the speculations that flow from this major neuroanatomical error are therefore science fiction rather than useful additions to the scientific literature.

      What do the authors actually have?

      The authors have interesting data, based on their Golgi staining and analysis, of the inferior olivary nuclear complex in the elephant.

      * Review of Revised Manuscript *

      Assessment:

      There is a clear dichotomy between the authors and this reviewer regarding the identification of specific structures, namely the inferior olivary nuclear complex and the trigeminal nuclear complex, in the brainstem of the elephant. The authors maintain the position that in the elephant alone, irrespective of all the published data on other mammals and previously published data on the elephant brainstem, these two nuclear complexes are switched in location. The authors maintain that their interpretation is correct, but this reviewer maintains that this interpretation is erroneous. The authors expressed concern that the remainder of the paper was not addressed by the reviewer, but the reviewer maintains that these sequelae to the misidentification of nuclear complexes in the elephant brainstem render any of these speculations irrelevant as the critical structures are incorrectly identified. It is this reviewer's opinion that this paper is incorrect. I provide a lot of detail below in order to provide support to the opinion I express.

      Public Review of Current Submission:

      As indicated in my previous review of this manuscript (see above), it is my opinion that the authors have misidentified, and indeed switched, the inferior olivary nuclear complex (IO) and the trigeminal nuclear complex (Vsens). It is this specific point only that I will address in this second review, as this is the crucial aspect of this paper - if the identification of these nuclear complexes in the elephant brainstem by the authors is incorrect, the remainder of the paper does not have any scientific validity.

      The authors, in their response to my initial review, claim that I "bend" the comparative evidence against them. They further claim that as all other mammalian species exhibit a "serrated" appearance of the inferior olive, and as the elephant does not exhibit this appearance, what was previously identified as the inferior olive is actually the trigeminal nucleus and vice versa.

      For convenience, I will refer to IOM and VsensM as the identification of these structures according to Maseko et al (2013) and other authors and will use IOR and VsensR to refer to the identification forwarded in the study under review.

      The IOM/VsensR certainly does not have a serrated appearance in elephants. Indeed, from the plates supplied by the authors in response (Referee Fig. 2), the cytochrome oxidase image supplied and the image from Maseko et al (2013) show a very similar appearance. There is no doubt that the authors are identifying structures that closely correspond to those provided by Maseko et al (2013). It is solely a contrast in what these nuclear complexes are called and the functional sequelae of the identification of these complexes (are they related to the trunk sensation or movement controlled by the cerebellum?) that is under debate.

      Elephants are part of the Afrotheria, thus the most relevant comparative data to resolve this issue will be the identification of these nuclei in other Afrotherian species. Below I provide images of these nuclear complexes, labelled in the standard nomenclature, across several Afrotherian species.

      (A) Lesser hedgehog tenrec (Echinops telfairi)

      Tenrec brains are the most intensively studied of the Afrotherian brains; these extensive neuroanatomical studies were undertaken primarily by Heinz Künzle. Below I append images (coronal sections stained with cresyl violet) of the IO and Vsens (labelled in the standard mammalian manner) in the lesser hedgehog tenrec. It should be clear that the inferior olive is located in the ventral midline of the rostral medulla oblongata (just like the rat) and that this nucleus is not distinctly serrated. The Vsens is located in the lateral aspect of the medulla skirted laterally by the spinal trigeminal tract (Sp5). These images and the labels indicating structures correlate precisely with those provided by Künzle (1997, 10.1016/S0168-0102(97)00034-5), see his Figure 1K,L. Thus, in the first case of a related species, there is no serrated appearance of the inferior olive, the location of the inferior olive is confirmed through connectivity with the superior colliculus (a standard connection in mammals) by Künzle (1997), and the location of Vsens is what is considered to be typical for mammals. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Review Image 1.

      (B) Giant otter shrew (Potomogale velox)

      The otter shrews are close relatives of the Tenrecs. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see hints of the serration of the IO as defined by the authors, but we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Response Image 2.

      (C) Four-toed sengi (Petrodromus tetradactylus)

      The sengis are close relatives of the Tenrecs and otter shrews, these three groups being part of the Afroinsectiphilia, a distinct branch of the Afrotheria. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see vague hints of the serration of the IO (as defined by the authors), and we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Response Image 3.

      (D) Rock hyrax (Procavia capensis)

      The hyraxes, along with the sirens and elephants form the Paenungulata branch of the Afrotheria. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per the standard mammalian anatomy. Here we see hints of the serration of the IO (as defined by the authors), but we also see evidence of a more "bulbous" appearance of subnuclei of the IO (particularly the principal nucleus), and we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Review Image 4.

      (E) West Indian manatee (Trichechus manatus)

      The sirens are the closest extant relatives of the elephants in the Afrotheria. Below I append images of cresyl violet (top) and myelin (bottom) stained coronal sections (taken from the University of Wisconsin-Madison Brain Collection, https://brainmuseum.org, and while quite low in magnification they do reveal the structures under debate) through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see the serration of the IO (as defined by the authors). Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Review Image 5.

      These comparisons and the structural identification, with which the authors agree as they only distinguish the elephants from the other Afrotheria, demonstrate that the appearance of the IO can be quite variable across mammalian species, including those with a close phylogenetic affinity to the elephants. Not all mammal species possess a "serrated" appearance of the IO. Thus, it is more than just theoretically possible that the IO of the elephant appears as described prior to this study.

      So what about elephants? Below I append a series of images from coronal sections through the African elephant brainstem stained for Nissl, myelin, and immunostained for calretinin. These sections are labelled according to standard mammalian nomenclature. In these complete sections of the elephant brainstem, we do not see a serrated appearance of the IOM (as described previously and in the current study by the authors). Rather the principal nucleus of the IOM appears to be bulbous in nature. In the current study, no image of myelin staining in the IOM/VsensR is provided by the authors. However, in the images I provide, we do see the reported myelin stripes in all stains - agreement between the authors and reviewer on this point. The higher magnification image to the bottom left of the plate shows one of the IOM/VsensR myelin stripes immunostained for calretinin, and within the myelin stripes axons immunopositive for calretinin are seen (labelled with an arrow). The climbing fibres of the elephant cerebellar cortex are similarly calretinin immunopositive (10.1159/000345565). In contrast, although not shown at high magnification, the fibres forming the Sp5 in the elephant (in the Maseko description, unnamed in the description of the authors) show no immunoreactivity to calretinin.

      Peer Review Image 6.

      Peripherin Immunostaining

      In their revised manuscript the authors present immunostaining of peripherin in the elephant brainstem. This is an important addition (although it does replace the only staining of myelin provided by the authors, which is unusual as the word myelin is in the title of the paper) as peripherin is known to specifically label peripheral nerves. In addition, as pointed out by the authors, peripherin also immunostains climbing fibres (Errante et al., 1998). The understanding of this staining is important in determining the identification of the IO and Vsens in the elephant, although it is not ideal for this task as there is some ambiguity. Errante and colleagues (1998; Fig. 1) show that climbing fibres are peripherin-immunopositive in the rat. But what the authors do not evaluate is the extensive peripherin staining in the rat Sp5 in the same paper (Errante et al, 1998, Fig. 2). The image provided by the authors of their peripherin immunostaining (their new Figure 2) shows what I would call the Sp5 of the elephant to be strongly peripherin immunoreactive, just like the rat shown in Errante et al (1998), and moreover in the precise position of the rat Sp5! This makes sense as this is where the axons subserving the "extraordinary" tactile sensitivity of the elephant trunk would be found (in the standard model of mammalian brainstem anatomy). Interestingly, the peripherin immunostaining in the elephant is clearly lamellated...this coincides precisely with the description of the trigeminal sensory nuclei in the elephant by Maseko et al (2013) as pointed out by the authors in their rebuttal. Errante et al (1998) also point out peripherin immunostaining in the inferior olive, but according to the authors this is only "weakly present" in the elephant IOM/VsensR. This latter point is crucial. Surely if the elephant has an extraordinary sensory innervation from the trunk, with 400,000 axons entering the brain, the VsensR/IOM should be highly peripherin-immunopositive, including the myelinated axon bundles?! In this sense, the authors argue against their own interpretation - either the elephant trunk is not a highly sensitive tactile organ, or the VsensR is not the trigeminal nuclei it is supposed to be.

      Summary:

      (1) Comparative data of species closely related to elephants (Afrotherians) demonstrates that not all mammals exhibit the "serrated" appearance of the principal nucleus of the inferior olive.

      (2) The location of the IO and Vsens as reported in the current study (IOR and VsensR) would require a significant, and unprecedented, rearrangement of the brainstem in the elephants independently. I argue that the underlying molecular and genetic changes required to achieve this would be so extreme that it would lead to lethal phenotypes. Arguing that the "switcheroo" of the IO and Vsens does occur in the elephant (and no other mammals) and thus doesn't lead to lethal phenotypes is a circular argument that cannot be substantiated.

      (3) Myelin stripes in the subnuclei of the inferior olivary nuclear complex are seen across all related mammals as shown above. Thus, the observation made in the elephant by the authors in what they call the VsensR, is similar to that seen in the IO of related mammals, especially when the IO takes on a more bulbous appearance. These myelin stripes are the origin of the olivocerebellar pathway and are indeed calretinin immunopositive in the elephant as I show.

      (4) What the authors see aligns perfectly with what has been described previously, the only difference being the names that nuclear complexes are being called. But identifying these nuclei is important, as any functional sequelae, as extensively discussed by the authors, are entirely dependent upon accurately identifying these nuclei.

      (5) The peripherin immunostaining scores an own goal - if peripherin is marking peripheral nerves (as the authors and I believe it is), then why is the VsensR/IOM only "weakly positive" for this stain? This either means that the "extraordinary" tactile sensitivity of the elephant trunk is non-existent, or that the authors have misinterpreted this staining. That there is extensive staining in the fibre pathway dorsal and lateral to the IOR (which I call the spinal trigeminal tract) supports the idea that the authors have misinterpreted their peripherin immunostaining.

      (6) Evolutionary expediency. The authors argue that what they report is an expedient way in which to modify the organisation of the brainstem in the elephant to accommodate the "extraordinary" tactile sensitivity. I disagree. As pointed out in my first review, the elephant cerebellum is very large and comprised of huge numbers of morphologically complex neurons. The inferior olivary nuclei, in all mammals studied in detail to date, give rise to the climbing fibres that terminate on the Purkinje cells of the cerebellar cortex. It is more parsimonious to argue that, in alignment with the expansion of the elephant cerebellum (for motor control of the trunk), the inferior olivary nuclei (specifically the principal nucleus) have had additional neurons added to accommodate this cerebellar expansion. Such an addition of neurons to the principal nucleus of the inferior olive could readily lead to the loss of the serrated appearance of the principal nucleus of the inferior olive and would require far fewer modifications in the developmental genetic program that forms these nuclei. This type of quantitative change appears to be the primary way in which structures are altered in the mammalian brainstem.

    5. Reviewer #3 (Public Review):

      Summary:

      The study claims to investigate trunk representations in elephant trigeminal nuclei located in the brainstem. The researchers identify large protrusions visible from the ventral surface of the brainstem, which they examined using a range of histological methods. However, this ventral location is usually where the inferior olivary complex is found, which challenges the authors' assertions about the nucleus under analysis. They find that this brainstem nucleus of elephants contains repeating modules, with a focus on the anterior and largest unit, which they define as the putative nucleus principalis trunk module of the trigeminal. The nucleus exhibits low neuron density, with glia outnumbering neurons significantly. The study also utilizes synchrotron X-ray phase contrast tomography to suggest that myelin-stripe-axons traverse this module. The analysis maps myelin-rich stripes in several specimens and concludes that, based on their number and patterning, they likely correspond with trunk folds; however, this conclusion is not well supported if the nucleus has been misidentified.

      Strengths:

      The strength of this research lies in its comprehensive use of various anatomical methods, including Nissl staining, myelin staining, Golgi staining, cytochrome oxidase labeling, and synchrotron X-ray phase contrast tomography. The inclusion of quantitative data on cell numbers and sizes, dendritic orientation and morphology, and blood vessel density across the nucleus adds a quantitative dimension. Furthermore, the research is commendable for its high-quality and abundant images and figures, effectively illustrating the anatomy under investigation.

      Weaknesses:

      While the research provides potentially valuable insights if revised to focus on the structure that appears to be an inferior olivary nucleus, there are certain additional weaknesses that warrant further consideration. First, the suggestion that myelin stripes solely serve to separate sensory or motor modules rather than functioning as an "axonal supply system" lacks substantial support due to the absence of information about the neuronal origins and the termination targets of the axons. Postmortem fixed brain tissue limits the ability to trace full axon projections. While the study acknowledges these limitations, it is important to exercise caution in drawing conclusions about the precise role of myelin stripes without a more comprehensive understanding of their neural connections.

      Second, the quantification presented in the study lacks comparison to other species or other relevant variables within the elephant specimens (i.e., whole brain or brainstem volume). The absence of comparative data to different species limits the ability to fully evaluate the significance of the findings. Comparative analyses could provide a broader context for understanding whether the observed features are unique to elephants or more common across species. This limitation in comparative data hinders a more comprehensive assessment of the implications of the research within the broader field of neuroanatomy. Furthermore, the quantitative comparisons between African and Asian elephant specimens should include some measure of overall brain size as a covariate in the analyses. Addressing these weaknesses would enable a richer interpretation of the study's findings.

    6. Reviewer #4 (Public Review):

      Summary:

      The authors report a novel isomorphism in which the folds of the elephant trunk are recognizably mapped onto the principal sensory trigeminal nucleus in the brainstem. Further, they identify the enlarged nucleus as being situated in this species in an unusual ventral midline position.

      Strengths:

      The identity of the purported trigeminal nucleus and the isomorphic mapping with the trunk folds is supported by multiple lines of evidence: enhanced staining for cytochrome oxidase, an enzyme associated with high metabolic activity; dense vascularization, consistent with high metabolic activity; prominent myelinated bundles that partition the nucleus in a 1:1 mapping of the cutaneous folds in the trunk periphery; near absence of labeling for the anti-peripherin antibody, specific for climbing fibers, which can be seen as expected in the inferior olive; and a high density of glia.

      Weaknesses:

      Despite the supporting evidence listed above, the identification of the gross anatomical bumps, conspicuous in the ventral midline, is problematic. This would be the standard location of the inferior olive, with the principal trigeminal nucleus occupying a more dorsal position. This presents an apparent contradiction which at a minimum needs further discussion. Major species-specific specializations and positional shifts are well-documented for cortical areas, but nuclear layouts in the brainstem have been considered as less malleable.

    7. Author Response:

      The following is the authors’ response to the previous reviews.

      We carefully read through the second-round reviews and the additional reviews. To us, the review process is somewhat unusual and very much dominated by referee 2, who aggressively insists that we mixed up the trigeminal nucleus and inferior olive and that as a consequence our results are meaningless. We think the stance of referee 2 and the focus on one single issue (the alleged mix-up of trigeminal nucleus and inferior olive) is somewhat unfortunate and leaves out many of our findings. We debated at length how to deal with further revisions. In the end, we decided to again give priority to addressing the criticism of referee 2, because it is hard to go on with a heavily attacked paper without resolving the matter at stake. The following is a summary of what we did:

      Additional experimental work:

      (1) We checked if the peripherin-antibody indeed reliably identifies climbing fibers.

      To this end, we sectioned the elephant cerebellum and stained sections with the peripherin-antibody. We find: (i) the cerebellar white matter is strongly reactive for peripherin-antibodies; (ii) cerebellar peripherin-antibody staining has an axonal appearance; (iii) cerebellar Purkinje cell somata appear to be ensheathed by peripherin-antibody staining; (iv) the peripherin-antibody reactivity gradually decreases from Purkinje cell somata to the pia in the cerebellar molecular layer. This work is shown in our revised Figure 2. All four features align with the distribution of climbing fibers (which arrive through the white matter, are axons, ensheathe Purkinje cell somata, and innervate Purkinje cells proximally, not reaching the pia). In line with previous work, which showed similar cerebellar staining patterns in several species (Errante et al. 1998), we conclude that elephant climbing fibers are strongly reactive for peripherin-antibodies.

      (2) We delineated the elephant olivo-cerebellar tract.

      The strong peripherin-antibody reactivity of elephant climbing fibers enabled us to delineate the elephant olivo-cerebellar tract. We find the elephant olivo-cerebellar tract is a strongly peripherin-antibody reactive, well-delineated fiber tract several millimeters wide and about a centimeter in height. The unstained olivo-cerebellar tract has a greyish appearance. In the anterior regions of the olivo-cerebellar tract, we find that peripherin-antibody reactive fibers run in the dorsolateral brainstem and approach the cerebellar peduncle, where the tract gradually diminishes in size, presumably because climbing fibers discharge into the peduncle. Indeed, peripherin-antibody reactive fibers can be seen entering the cerebellar peduncle. Towards the posterior end of the peduncle, the olivo-cerebellar tract disappears in the dorsal brainstem directly below the peduncle. We note that the olivo-cerebellar tract was referred to as the spinal trigeminal tract by Maseko et al. 2013. We think the tract in question cannot be the spinal trigeminal tract for two reasons: (i) This tract is the sole brainstem source of peripherin-positive climbing fibers entering the peduncle/cerebellum; this is the defining characteristic of the olivo-cerebellar tract. (ii) The tract in question is much smaller than the trigeminal nerve, disappears posterior to where the trigeminal nerve enters the brainstem (see below), and has no continuity with the trigeminal nerve, whereas continuity with the trigeminal nerve is the defining characteristic of the spinal trigeminal tract.

      The anterior regions of the elephant olivo-cerebellar tract are similar to the anterior regions of olivo-cerebellar tract of other mammals in its dorsolateral position and the relation to the cerebellar peduncle. In its more posterior parts, the elephant olivo-cerebellar tract continues for a long distance (~1.5 cm) in roughly the same dorsolateral position and enters the serrated nucleus that we previously identified as the elephant inferior olive. The more posterior parts of the elephant olivo-cerebellar tract therefore differ from the more posterior parts of the olivo-cerebellar tract of other mammals, which follows a ventromedial trajectory towards a ventromedially situated inferior olive. The implication of our delineation of the elephant olivo-cerebellar tract is that we correctly identified the elephant inferior olive.

      (3) An in-depth analysis of peripherin-antibody reactivity also indicates that the trigeminal nucleus receives no climbing fiber input.

      We also studied the peripherin-antibody reactivity in and around the trigeminal nucleus. We had also noted in the previous submission that the trigeminal nucleus is weakly positive for peripherin, but that the staining pattern is uniform and not the type of axon bundle pattern that is seen in the inferior olive of other mammals. To us, this observation already argued against the presence of climbing fibers in the trigeminal nucleus. We also noted that the myelin stripes of the trigeminal nucleus were peripherin-antibody-negative. In the context of our olivo-cerebellar tract tracing we now also scrutinized the surroundings of the trigeminal nucleus for peripherin-antibody reactivity. We find that the ventral brainstem surrounding the trigeminal nucleus is devoid of peripherin-antibody reactivity. Accordingly, no climbing fibers (which we have shown to be strongly peripherin-antibody-positive, see our point 1) arrive at the trigeminal nucleus. The absence of climbing fiber input indicates that previous work that identified the (trigeminal) nucleus as the inferior olive (Maseko et al. 2013) is unlikely to be correct.

      (4) We characterized the entry of the trigeminal nerve into the elephant brain.

      To better understand how trigeminal information enters the elephant’s brain, we characterized the entry of the trigeminal nerve. This analysis indicated to us that the trigeminal nerve is not continuous with the olivo-cerebellar tract (the spinal trigeminal tract of Maseko et al. 2013), contrary to what Maseko et al. 2013 claimed. We show some of this evidence in Referee-Figure 1 below. The reason we think the trigeminal nerve is discontinuous with the olivo-cerebellar tract is the size discrepancy between the two structures. We first show this for the tracing data of Maseko et al. 2013. In the Maseko et al. 2013 data the trigeminal nerve (Referee-Figure 1A, their plate Y) has 3-4 times the diameter of the olivo-cerebellar tract (the alleged spinal trigeminal tract, Referee-Figure 1B, their plate Z). Note that most if not all trigeminal fibers are thought to continue from the nerve into the trigeminal tract (see our rat data below). We plotted the diameter of the trigeminal nerve and the diameter of the olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013) from the Maseko et al. 2013 data (Referee-Figure 1C) and we found that the olivo-cerebellar tract has a fairly consistent diameter (46 ± 9 mm2, mean ± SD). Statistical considerations and anatomical evidence suggest that the tracing of the trigeminal nerve into the olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013) is almost certainly wrong. The most anterior point of the alleged spinal trigeminal tract has a diameter of 51 mm2, which is more than 15 standard deviations different from the most posterior diameter (194 mm2) of the trigeminal nerve. For this assignment to be correct, three-quarters of trigeminal nerve fibers would have to spontaneously disappear, something that does not happen in the brain.
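      The size argument above reduces to two small calculations, sketched here in Python. This is only an illustration that reuses the figures quoted in the text (46 ± 9 mm2 for the tract, 51 mm2 and 194 mm2 for the two end points); the variable names are ours, and the "fraction lost" rests on the simplifying assumption that fiber number scales with the measured size.

```python
# Values as quoted in the response (mm2); variable names are illustrative.
tract_sd = 9.0           # SD of the olivo-cerebellar tract measurements
tract_anterior = 51.0    # most anterior point of the alleged spinal trigeminal tract
nerve_posterior = 194.0  # most posterior measurement of the trigeminal nerve

# How many tract standard deviations separate the two measurements?
z_score = (nerve_posterior - tract_anterior) / tract_sd

# If the structures were continuous, what share of the nerve's size
# would have to vanish at the transition? (Assumes size tracks fiber number.)
fraction_lost = 1.0 - tract_anterior / nerve_posterior

print(f"difference in SD units: {z_score:.1f}")
print(f"fraction that would have to disappear: {fraction_lost:.2f}")
```

      Under these assumptions the difference comes out just under 16 standard deviations and the implied loss at about 74%, consistent with the "more than 15 standard deviations" and "three-quarters" stated above.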
      We also made similar observations in the African elephant Bibi, where the trigeminal nerve (Referee-Figure 1D) is much larger in diameter than the olivo-cerebellar tract (Referee-Figure 1E). We could also show that the olivo-cerebellar tract disappears into the peduncle posterior to where the trigeminal nerve enters (Referee-Figure 1F). Our data are very similar to those of Maseko et al., indicating that their outlining of structures was done correctly. What appears to have been oversimplified is the assignment of structures as continuous. We also quantified the diameter of the trigeminal nerve and the spinal trigeminal tract in rats (from the Paxinos & Watson atlas; Referee-Figure 1G); as expected, we found the trigeminal nerve and spinal trigeminal tract diameters are essentially continuous.

      In our hands, the trigeminal nerve does not continue into a well-defined tract that could be traced after its entry. In this regard, it differs both from the olivo-cerebellar tract of the elephant or the spinal trigeminal tract of the rodent, both of which are well delineated. We think the absence of a well-delineated spinal trigeminal tract in elephants might have contributed to the putative tracing error highlighted in our Referee-Figure 1A-C.

      We conclude that a size mismatch indicates trigeminal fibers do not run in the olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013).

      Author response image 1.

      The trigeminal nerve is discontinuous with the olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013)

      A, Trigeminal nerve (orange) in the brain of African elephant LAX as delineated by Maseko et al. 2013 (coronal section; their plate Y).

      B, Most anterior appearance of the spinal trigeminal tract of Maseko et al. 2013 (blue; coronal section; their plate Z). Note the much smaller diameter of the spinal trigeminal tract compared to the trigeminal nerve shown in A, which argues against the continuity of the two structures. Indeed, our peripherin-antibody staining showed that the spinal trigeminal tract of Maseko et al. corresponds to the olivo-cerebellar tract and is discontinuous with the trigeminal nerve.

      C, Plot of the diameters of the trigeminal nerve and the olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013) along the anterior-posterior axis. The trigeminal nerve is much larger in diameter than the olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013). A, B: measurements for which sections are shown in panels A and B, respectively. The olivo-cerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013) has a consistent diameter; data replotted from Maseko et al. 2013. At mm 25 the inferior olive appears.

      D, Trigeminal nerve entry in the brain of African elephant Bibi; our data, coronal section, the trigeminal nerve is outlined in orange, note the large diameter.

      E, Most anterior appearance of the olivo-cerebellar tract in the brain of African elephant Bibi; our data, coronal section, approximately 3 mm posterior to the section shown in D, the olivo-cerebellar tract is outlined in blue. Note the smaller diameter of the olivo-cerebellar tract compared to the trigeminal nerve, which argues against the continuity of the two structures.

      F, Plot of the trigeminal nerve and olivo-cerebellar tract diameter along the anterior-posterior axis. The nerve and olivo-cerebellar tract are discontinuous and the trigeminal nerve is much larger in diameter than the olivocerebellar tract (the spinal trigeminal tract according to Maseko et al. 2013); our data. D, E measurements, for which sections are shown in panels D and E respectively. At mm 27 the inferior olive appears.

      G, In the rat the trigeminal nerve is continuous in size with the spinal trigeminal tract. Data replotted from Paxinos and Watson.

      Reviewer 2 (Public Review):

      As indicated in my previous review of this manuscript (see above), it is my opinion that the authors have misidentified, and indeed switched, the inferior olivary nuclear complex (IO) and the trigeminal nuclear complex (Vsens). It is this specific point only that I will address in this second review, as this is the crucial aspect of this paper - if the identification of these nuclear complexes in the elephant brainstem by the authors is incorrect, the remainder of the paper does not have any scientific validity.

      Comment: We agree with the referee that it is most important to sort out the identity of the inferior olivary nuclear complex (IO) and the trigeminal nuclear complex.

      Change: We did additional experimental work to resolve this matter as detailed at the beginning of our response. Specifically, we ascertained that elephant climbing fibers are strongly peripherin-positive. Based on elephant climbing fiber peripherin-reactivity we delineated the elephant olivo-cerebellar tract. We find that the olivo-cerebellar tract connects the structure we refer to as the inferior olive to the cerebellum (the referee refers to this structure as the trigeminal nuclear complex). We also found that the trigeminal nucleus (the structure the referee refers to as the inferior olive) appears to receive no climbing fibers. We provide indications that the tracing of the trigeminal nerve into the olivo-cerebellar tract by Maseko et al. 2013 was erroneous (Author response image 1). These novel findings support our ideas but are very difficult to reconcile with the referee’s partitioning scheme.

      The authors, in their response to my initial review, claim that I "bend" the comparative evidence against them. They further claim that as all other mammalian species exhibit a "serrated" appearance of the inferior olive, and as the elephant does not exhibit this appearance, that what was previously identified as the inferior olive is actually the trigeminal nucleus and vice versa. 

      For convenience, I will refer to IOM and VsensM as the identification of these structures according to Maseko et al (2013) and other authors and will use IOR and VsensR to refer to the identification forwarded in the study under review.

      The IOM/VsensR certainly does not have a serrated appearance in elephants. Indeed, from the plates supplied by the authors in response (Referee Fig. 2), the cytochrome oxidase image supplied and the image from Maseko et al (2013) show a very similar appearance. There is no doubt that the authors are identifying structures that closely correspond to those provided by Maseko et al (2013). It is solely a contrast in what these nuclear complexes are called and the functional sequelae of the identification of these complexes (are they related to the trunk sensation or movement controlled by the cerebellum?) that is under debate.

      Elephants are part of the Afrotheria, thus the most relevant comparative data to resolve this issue will be the identification of these nuclei in other Afrotherian species. Below I provide images of these nuclear complexes, labelled in the standard nomenclature, across several Afrotherian species. 

      (A) Lesser hedgehog tenrec (Echinops telfairi) 

      Tenrec brains are the most intensively studied of the Afrotherian brains, these extensive neuroanatomical studies having been undertaken primarily by Heinz Künzle. Below I append images (coronal sections stained with cresyl violet) of the IO and Vsens (labelled in the standard mammalian manner) in the lesser hedgehog tenrec. It should be clear that the inferior olive is located in the ventral midline of the rostral medulla oblongata (just like the rat) and that this nucleus is not distinctly serrated. The Vsens is located in the lateral aspect of the medulla skirted laterally by the spinal trigeminal tract (Sp5). These images and the labels indicating structures correlate precisely with those provided by Künzle (1997, 10.1016, see his Figure 1K,L). Thus, in the first case of a related species, there is no serrated appearance of the inferior olive, the location of the inferior olive is confirmed through connectivity with the superior colliculus (a standard connection in mammals) by Künzle (1997), and the location of Vsens is what is considered to be typical for mammals. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Review Image 1.

      (B) Giant otter shrew (Potomogale velox) 

      The otter shrews are close relatives of the Tenrecs. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see hints of the serration of the IO as defined by the authors, but we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Response Image 2.

      (C) Four-toed sengi (Petrodromus tetradactylus) 

      The sengis are close relatives of the Tenrecs and otter shrews, these three groups being part of the Afroinsectiphilia, a distinct branch of the Afrotheria. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see vague hints of the serration of the IO (as defined by the authors), and we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report. 

      Peer Response Image 3.

      (D) Rock hyrax (Procavia capensis) 

      The hyraxes, along with the sirens and elephants form the Paenungulata branch of the Afrotheria. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per the standard mammalian anatomy. Here we see hints of the serration of the IO (as defined by the authors), but we also see evidence of a more "bulbous" appearance of subnuclei of the IO (particularly the principal nucleus), and we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report. 

      Peer Review Image 4.

      (E) West Indian manatee (Trichechus manatus) 

      The sirens are the closest extant relatives of the elephants in the Afrotheria. Below I append images of cresyl violet (top) and myelin (bottom) stained coronal sections (taken from the University of Wisconsin-Madison Brain Collection, https://brainmuseum.org, and while quite low in magnification they do reveal the structures under debate) through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see the serration of the IO (as defined by the authors). Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Peer Review Image 5.

      These comparisons and the structural identification, with which the authors agree as they only distinguish the elephants from the other Afrotheria, demonstrate that the appearance of the IO can be quite variable across mammalian species, including those with a close phylogenetic affinity to the elephants. Not all mammal species possess a "serrated" appearance of the IO. Thus, it is more than just theoretically possible that the IO of the elephant appears as described prior to this study. 

      So what about elephants? Below I append a series of images from coronal sections through the African elephant brainstem stained for Nissl, myelin, and immunostained for calretinin. These sections are labelled according to standard mammalian nomenclature. In these complete sections of the elephant brainstem, we do not see a serrated appearance of the IOM (as described previously and in the current study by the authors). Rather the principal nucleus of the IOM appears to be bulbous in nature. In the current study, no image of myelin staining in the IOM/VsensR is provided by the authors. However, in the images I provide, we do see the reported myelin stripes in all stains - agreement between the authors and reviewer on this point. The higher magnification image to the bottom left of the plate shows one of the IOM/VsensR myelin stripes immunostained for calretinin, and within the myelin stripes axons immunopositive for calretinin are seen (labelled with an arrow). The climbing fibres of the elephant cerebellar cortex are similarly calretinin immunopositive (10.1159/000345565). In contrast, although not shown at high magnification, the fibres forming the Sp5 in the elephant (in the Maseko description, unnamed in the description of the authors) show no immunoreactivity to calretinin. 

      Peer Review Image 6.

      Comment: We appreciate the referee’s additional comments. We concede the possibility that some relatives of elephants have a less serrated inferior olive than most other mammals. We maintain, however, that the elephant inferior olive (our Figure 1J) has the serrated appearance seen in the vast majority of mammals.

      Change: None.

      Peripherin Immunostaining 

      In their revised manuscript the authors present immunostaining of peripherin in the elephant brainstem. This is an important addition (although it does replace the only staining of myelin provided by the authors which is unusual as the word myelin is in the title of the paper) as peripherin is known to specifically label peripheral nerves. In addition, as pointed out by the authors, peripherin also immunostains climbing fibres (Errante et al., 1998). The understanding of this staining is important in determining the identification of the IO and Vsens in the elephant, although it is not ideal for this task as there is some ambiguity. Errante and colleagues (1998; Fig. 1) show that climbing fibres are peripherin-immunopositive in the rat. But what the authors do not evaluate is the extensive peripherin staining in the rat Sp5 in the same paper (Errante et al, 1998, Fig. 2). The image provided by the authors of their peripherin immunostaining (their new Figure 2) shows what I would call the Sp5 of the elephant to be strongly peripherin immunoreactive, just like the rat shown in Errante et al (1998), and moreover in the precise position of the rat Sp5! This makes sense as this is where the axons subserving the "extraordinary" tactile sensitivity of the elephant trunk would be found (in the standard model of mammalian brainstem anatomy). Interestingly, the peripherin immunostaining in the elephant is clearly lamellated...this coincides precisely with the description of the trigeminal sensory nuclei in the elephant by Maseko et al (2013) as pointed out by the authors in their rebuttal. Errante et al (1998) also point out peripherin immunostaining in the inferior olive, but according to the authors this is only "weakly present" in the elephant IOM/VsensR. This latter point is crucial.
Surely if the elephant has an extraordinary sensory innervation from the trunk, with 400 000 axons entering the brain, the VsensR/IOM should be highly peripherin-immunopositive, including the myelinated axon bundles?! In this sense, the authors argue against their own interpretation - either the elephant trunk is not a highly sensitive tactile organ, or the VsensR is not the trigeminal nuclei it is supposed to be. 

      Comment: We made sure that elephant climbing fibers are strongly peripherin-positive (our revised Figure 2). As we already noted in our previous ms, we see weak diffuse peripherin-reactivity in the trigeminal nucleus (the inferior olive according to the referee), but no peripherin-reactive axon bundles (i.e. climbing fibers) of the kind seen in the inferior olive of other species. We also see no peripherin-reactive axon bundles (i.e. the olivo-cerebellar tract) arriving in the trigeminal nucleus, as the tissue surrounding the trigeminal nucleus is devoid of peripherin-reactivity. Again, this finding is incompatible with the referee’s ideas. As far as we can tell, the trigeminal fibers are not reactive for peripherin in the elephant, i.e. we did not observe peripherin-reactivity very close to the nerve entry, but unfortunately, we did not stain for peripherin-reactivity in the nerve itself. As the referee alludes to, the absence of peripherin-reactivity in the trigeminal tract is a difference between rodents and elephants.

      Change: Our novel Figure 2.

      Summary: 

      (1) Comparative data of species closely related to elephants (Afrotherians) demonstrates that not all mammals exhibit the "serrated" appearance of the principal nucleus of the inferior olive. 

      (2) The location of the IO and Vsens as reported in the current study (IOR and VsensR) would require a significant, and unprecedented, rearrangement of the brainstem in the elephants independently. I argue that the underlying molecular and genetic changes required to achieve this would be so extreme that it would lead to lethal phenotypes. Arguing that the "switcheroo" of the IO and Vsens does occur in the elephant (and no other mammals) and thus doesn't lead to lethal phenotypes is a circular argument that cannot be substantiated. 

      (3) Myelin stripes in the subnuclei of the inferior olivary nuclear complex are seen across all related mammals as shown above. Thus, the observation made in the elephant by the authors in what they call the VsensR, is similar to that seen in the IO of related mammals, especially when the IO takes on a more bulbous appearance. These myelin stripes are the origin of the olivocerebellar pathway, and are indeed calretinin immunopositive in the elephant as I show. 

      (4) What the authors see aligns perfectly with what has been described previously, the only difference being the names that nuclear complexes are being called. But identifying these nuclei is important, as any functional sequelae, as extensively discussed by the authors, is entirely dependent upon accurately identifying these nuclei. 

      (5) The peripherin immunostaining scores an own goal - if peripherin is marking peripheral nerves (as the authors and I believe it is), then why is the VsensR/IOM only "weakly positive" for this stain? This either means that the "extraordinary" tactile sensitivity of the elephant trunk is non-existent, or that the authors have misinterpreted this staining. That there is extensive staining in the fibre pathway dorsal and lateral to the IOR (which I call the spinal trigeminal tract), supports the idea that the authors have misinterpreted their peripherin immunostaining.

      (6) Evolutionary expediency. The authors argue that what they report is an expedient way in which to modify the organisation of the brainstem in the elephant to accommodate the "extraordinary" tactile sensitivity. I disagree. As pointed out in my first review, the elephant cerebellum is very large and comprised of huge numbers of morphologically complex neurons. The inferior olivary nuclei in all mammals studied in detail to date, give rise to the climbing fibres that terminate on the Purkinje cells of the cerebellar cortex. It is more parsimonious to argue that, in alignment with the expansion of the elephant cerebellum (for motor control of the trunk), the inferior olivary nuclei (specifically the principal nucleus) have had additional neurons added to accommodate this cerebellar expansion. Such an addition of neurons to the principal nucleus of the inferior olive could readily lead to the loss of the serrated appearance of the principal nucleus of the inferior olive, and would require far fewer modifications in the developmental genetic program that forms these nuclei. This type of quantitative change appears to be the primary way in which structures are altered in the mammalian brainstem.

      Comment: We still disagree with the referee. We note that our conclusions rest on the analysis of 8 elephant brainstems, which we sectioned in three planes and stained with a variety of metabolic and antibody stains, and in which we assigned two structures (the inferior olive and the trigeminal nucleus). Most of the evidence cited by the referee stems from a single paper, in which 147 structures were identified based on the analysis of a single brainstem sectioned in one plane and stained with a limited set of antibodies. Our synopsis of the evidence is the following.

      (1) We agree with the referee that concerning brainstem position our scheme of a ventromedial trigeminal nucleus and a dorsolateral inferior olive deviates from the usual mammalian position of these nuclei (i.e. a dorsolateral trigeminal nucleus and a ventromedial inferior olive).

      (2) Cytoarchitectonics support our partitioning scheme. The compact cellular appearance of our ventromedial trigeminal nucleus is characteristic of trigeminal nuclei. The serrated appearance of our dorsolateral inferior olive is characteristic of the mammalian inferior olive; we acknowledge that the referee claims exceptions here. To our knowledge, nobody has described a mammalian trigeminal nucleus with a serrated appearance (which would apply to the elephant in case the trigeminal nucleus is situated dorsolaterally).

      (3) Metabolic staining (cytochrome oxidase reactivity) supports our partitioning scheme. Specifically, our ventromedial trigeminal nucleus shows intense cytochrome oxidase reactivity, as seen in the trigeminal nuclei of trigeminal tactile experts.

      (4) Isomorphism. The myelin stripes on our ventromedial trigeminal nucleus are isomorphic to trunk wrinkles. Isomorphism is a characteristic of somatosensory brain structures (barrels, barrelettes, nose-stripes, etc.) and we know of no case where such isomorphism was misleading.

      (5) The large-scale organization of our ventromedial trigeminal nuclei in anterior-posterior repeats is characteristic of the mammalian trigeminal nuclei. To our knowledge, no such organization has ever been reported for the inferior olive.

      (6) Connectivity analysis supports our partitioning scheme. According to our delineation of the elephant olivo-cerebellar tract, our dorsolateral inferior olive is connected via peripherin-positive climbing fibers to the cerebellum. In contrast, our ventromedial trigeminal nucleus (the referee’s inferior olive) is not connected via climbing fibers to the cerebellum.

      Change: As discussed, we advanced further evidence in this revision. Our partitioning scheme (a ventromedial trigeminal nucleus and a dorsolateral inferior olive) is better supported by data and makes more sense than the referee’s suggestion (a dorsolateral trigeminal nucleus and a ventromedial inferior olive). It should be published.

      Reviewer #3 (Public Review):

      Summary: 

      The study claims to investigate trunk representations in elephant trigeminal nuclei located in the brainstem. The researchers identify large protrusions visible from the ventral surface of the brainstem, which they examined using a range of histological methods. However, this ventral location is usually where the inferior olivary complex is found, which challenges the authors' assertions about the nucleus under analysis. They find that this brainstem nucleus of elephants contains repeating modules, with a focus on the anterior and largest unit which they define as the putative nucleus principalis trunk module of the trigeminal. The nucleus exhibits low neuron density, with glia outnumbering neurons significantly. The study also utilizes synchrotron X-ray phase contrast tomography to suggest that myelin-stripe-axons traverse this module. The analysis maps myelin-rich stripes in several specimens and concludes that, based on their number and patterning, they likely correspond with trunk folds; however, this conclusion is not well supported if the nucleus has been misidentified.

      Comment: The referee provides a summary of our work. The referee also notes that the correct identification of the trigeminal nucleus is critical to the message of our paper.

      Change: In line with these assessments we focused our revision efforts on the issue of trigeminal nucleus identification, please see our introductory comments and our response to Referee 2.

      Strengths: 

      The strength of this research lies in its comprehensive use of various anatomical methods, including Nissl staining, myelin staining, Golgi staining, cytochrome oxidase labeling, and synchrotron X-ray phase contrast tomography. The inclusion of quantitative data on cell numbers and sizes, dendritic orientation and morphology, and blood vessel density across the nucleus adds a quantitative dimension. Furthermore, the research is commendable for its high-quality and abundant images and figures, effectively illustrating the anatomy under investigation.

      Comment: We appreciate this positive assessment.

      Change: None

      Weaknesses: 

      While the research provides potentially valuable insights if revised to focus on the structure that appears to be inferior olivary nucleus, there are certain additional weaknesses that warrant further consideration. First, the suggestion that myelin stripes solely serve to separate sensory or motor modules rather than functioning as an "axonal supply system" lacks substantial support due to the absence of information about the neuronal origins and the termination targets of the axons. Postmortem fixed brain tissue limits the ability to trace full axon projections. While the study acknowledges these limitations, it is important to exercise caution in drawing conclusions about the precise role of myelin stripes without a more comprehensive understanding of their neural connections. 

      Comment: We understand these criticisms and the need for cautious interpretation. As we noted previously, we think that the eLife publishing scheme, where critical referee commentary is published along with our ms, will make this contribution particularly valuable.

      Change: Our additional efforts to secure the correct identification of the trigeminal nucleus.

      Second, the quantification presented in the study lacks comparison to other species or other relevant variables within the elephant specimens (i.e., whole brain or brainstem volume). The absence of comparative data to different species limits the ability to fully evaluate the significance of the findings. Comparative analyses could provide a broader context for understanding whether the observed features are unique to elephants or more common across species. This limitation in comparative data hinders a more comprehensive assessment of the implications of the research within the broader field of neuroanatomy. Furthermore, the quantitative comparisons between African and Asian elephant specimens should include some measure of overall brain size as a covariate in the analyses. Addressing these weaknesses would enable a richer interpretation of the study's findings. 

      Comment: We understand why the referee asks for additional comparative data, which would make our study more meaningful. We note that we already published a quantitative comparison of African and Asian elephant facial nuclei (Kaufmann et al. 2022). The quantitative differences between African and Asian elephant facial nuclei are similar in magnitude to what we observed here for the trigeminal nucleus, i.e. African elephants have about 10-15% more facial nucleus neurons than Asian elephants. The referee also notes that data on overall elephant brain size might be important for interpreting our data. We agree with this sentiment and we are preparing a ms on African and Asian elephant brain size. We find – unexpectedly given the larger body size of African elephants – that African elephants have smaller brains than Asian elephants. The finding might imply that African elephants, which have more facial nucleus neurons and more trigeminal nucleus trunk module neurons, are neurally more specialized in trunk control than Asian elephants.

      Change: We are preparing a further ms on African and Asian elephant brain size; a first version of this work has been submitted.

      Reviewer #4 (Public Review): 

      Summary: 

      The authors report a novel isomorphism in which the folds of the elephant trunk are recognizably mapped onto the principal sensory trigeminal nucleus in the brainstem. Further, they identify the enlarged nucleus as being situated in this species in an unusual ventral midline position.

      Comment: The referee summarizes our work.

      Change: None.

      Strengths: 

      The identity of the purported trigeminal nucleus and the isomorphic mapping with the trunk folds is supported by multiple lines of evidence: enhanced staining for cytochrome oxidase, an enzyme associated with high metabolic activity; dense vascularization, consistent with high metabolic activity; prominent myelinated bundles that partition the nucleus in a 1:1 mapping of the cutaneous folds in the trunk periphery; near absence of labeling for the anti-peripherin antibody, specific for climbing fibers, which can be seen as expected in the inferior olive; and a high density of glia.

      Comment: The referee again reviews some of our key findings.

      Change: None. 

      Weaknesses: 

      Despite the supporting evidence listed above, the identification of the gross anatomical bumps, conspicuous in the ventral midline, is problematic. This would be the standard location of the inferior olive, with the principal trigeminal nucleus occupying a more dorsal position. This presents an apparent contradiction which at a minimum needs further discussion. Major species-specific specializations and positional shifts are well-documented for cortical areas, but nuclear layouts in the brainstem have been considered as less malleable. 

      Comment: The referee notes that our discrepancy with referee 2 needs to be addressed with further evidence and discussion, given the unusual position of both inferior olive and trigeminal nucleus in the partitioning scheme and that the mammalian brainstem tends to be positionally conservative. We agree with the referee. We note that – based on the immense size of the elephant trigeminal ganglion (50 g), half the size of a monkey brain – it was expected that the elephant trigeminal nucleus ought to be exceptionally large.

      Change: We did additional experimental work to resolve this matter: (i) We ascertained that elephant climbing fibers are strongly peripherin-positive. (ii) Based on elephant climbing fiber peripherin-reactivity we delineated the elephant olivo-cerebellar tract. We find that the olivo-cerebellar tract connects the structure we refer to as inferior olive to the cerebellum. (iii) We also found that the trigeminal nucleus (the structure the referee refers to as inferior olive) appears to receive no climbing fibers. (iv) We provide indications that the tracing of the trigeminal nerve into the olivo-cerebellar tract by Maseko et al. 2023 was erroneous (Referee-Figure 1). These novel findings support our ideas.

      Reviewer #5 (Public Review): 

      After reading the manuscript and the concerns raised by reviewer 2 I see both sides of the argument - the relative location of trigeminal nucleus versus the inferior olive is quite different in elephants (and different from previous studies in elephants), but when there is a large disproportionate magnification of a behaviorally relevant body part at most levels of the nervous system (certainly in the cortex and thalamus), you can get major shifting in location of different structures. In the case of the elephant, it looks like there may be a lot of shifting. Something that is compelling is that the number of modules separated by the myelin bands corresponds to the number of trunk folds, which is different in the different elephants. This sort of modular division based on body parts is a general principle of mammalian brain organization (demonstrated beautifully for the cuneate and gracile nucleus in primates, VP in most species, S1 in a variety of mammals such as the star-nosed mole and duck-billed platypus). I don't think these relative changes in the brainstem would require major genetic programming - although some surely exists. Rodents and elephants have been independently evolving for over 60 million years so there is a substantial amount of time for changes in each lineage to occur.

      I agree that the authors have identified the trigeminal nucleus correctly, although comparisons with more out groups would be needed to confirm this (although I'm not suggesting that the authors do this). I also think the new figure (which shows previous divisions of the brainstem versus their own) allows the reader to consider these issues for themselves. When reviewing this paper, I actually took the time to go through atlases of other species and even look at some of my own data from highly derived species. Establishing homology across groups based only on relative location is tough especially when there appears to be large shifts in relative location of structures. My thoughts are that the authors did an extraordinary amount of work on obtaining, processing and analyzing this extremely valuable tissue. They document their work with images of the tissue and their arguments for their divisions are solid. I feel that they have earned the right to speculate - with qualifications - which they provide. 

      Comment: The referee summarizes our work and appears to be convinced by the line of our arguments. We are most grateful for this assessment. We add, again, that the skeptical assessment of referee 2 will be published as well and will give the interested reader the possibility to view another perspective on our work.

      Change: None. 

      Recommendations for the authors: 

      Reviewer #1 (Recommendations For The Authors):

      With this manuscript being virtually identical to the previous version, it is possible that some of the definitive conclusions about having identified the elephant trigeminal nucleus and trunk representation should be moderated in a more nuanced manner, especially given the careful and experienced perspective from reviewers with first-hand knowledge of elephant neuroanatomy.

      Comment: We agree that both our first and second revisions were very much centered on the debate of the correct identification of the trigeminal nucleus and that our ms did not evolve as much in other regards. This being said we agree with Referee 2 that we needed to have this debate. We also think we advanced important novel data in this context (the delineation of elephant olivo-cerebellar tract through the peripherin-antibody).

      Changes: Our revised Figure 2. 

      The peripherin staining adds another level of argument to the authors having identified the trigeminal brainstem instead of the inferior olive, if differential expression of peripherin is strong enough to distinguish one structure from the other.

      Comment: We think we showed too little peripherin-antibody staining in our previous revision. We have now addressed this problem.

      Changes: Our revised Figure 2, i.e. the delineation of the elephant olivo-cerebellar tract through the peripherin-antibody.

      There are some minor corrections to be made with the addition of Fig. 2., including renumbering the figures in the manuscript (e.g., 406, 521). 

      I continue to appreciate this novel investigation of the elephant brainstem and find it an interesting and thorough study, with the use of classical and modern neuroanatomical methods.

      Comment: We are thankful for this positive assessment.

      Reviewer #2 (Recommendations For The Authors):

      I do realise the authors are very unhappy with me and the reviews I have submitted. I do apologise if feelings have been hurt, and I do understand the authors put in a lot of hard work and thought to develop what they have; however, it is unfortunate that the work and thoughts are not correct. Science is about the search for the truth and sometimes we get it wrong. This is part of the scientific process and why most journals adhere to strict review processes of scientific manuscripts. As I said previously, the authors can use their data to write a paper describing and quantifying Golgi staining of neurons in the principal olivary nucleus of the elephant that should be published in a specialised journal and contextualised in terms of the motor control of the trunk and the large cerebellum of the elephant. 

      Comment: We appreciate the referee’s kind words. Also, no hard feelings from our side, this is just a scientific debate. In our experience, neuroanatomical debates are resolved by evidence and we note that we provide evidence strengthening our identification of the trigeminal nucleus and inferior olive. As far as we can tell from this effort and the substantial evidence accumulated, the referee is wrong.

      Reviewer #4 (Recommendations For The Authors):

      As a new reviewer, I have benefited from reading the previous reviews and Author response, even while having several new comments to add. 

      (1) The identification of the inferior olive and trigeminal nuclei is obviously center stage. An enlargement of the trigeminal nuclei is not necessarily problematic, given the published reports on the dramatic enlargement of the trigeminal nerve (Purkart et al., 2022). At issue is the conspicuous relocation of the trigeminal nuclei that is being promoted by Reveyaz et al. Conspicuous rearrangements are not uncommon; for example, primary sensory cortical fields in different species (fig. 1 in H.H.A. Oelschlager for dolphins; S. De Vreese et al. (2023) for cetaceans, L. Krubitzer on various species, in the context of evolution). The difficult point here concerns what looks like a rather conspicuous gross anatomical rearrangement, in BRAINSTEM - the assumption being that the brainstem bauplan is going to be specifically conservative and refractory to gross anatomical rearrangement. 

      Comment: We agree with the referee that the brainstem rearrangements are unexpected. We also think that the correct identification of nuclei needs to be at the center of our revision efforts.

      Change: Our revision provided further evidence (delineation of the olivo-cerebellar tract, characterization of the trigeminal nerve entry) about the identity of the nuclei we studied.

      Why would a major nucleus shift to such a different location? and how? Can ex vivo DTI provide further support of the correct identification? Is there other "disruption" in the brainstem? What occupies the traditional position of the trigeminal nuclei? An atlas-equivalent coronal view of the entire brainstem would be informative. The Authors have assembled multiple criteria to support their argument that the ventral "bumps" are in fact a translocated trigeminal principal nucleus: enhanced CO staining, enhanced vascularization, enhanced myelination (via Golgi stains and tomography), very scant labeling for a climbing fiber specific antibody (anti-peripherin), vs. dense staining of this in the alternative structure that they identify as IO; and a high density of glia. Admittedly, this should be sufficient, but the proposed translocation (in the BRAINSTEM) is sufficiently startling that this is arguably NOT sufficient. The terminology of "putative" is helpful, but a more cogent presentation of the results and more careful discussion might succeed in winning over at least some of a skeptical readership.

      Comment: We do not know what led to the elephant brainstem rearrangements we propose. If the trigeminal nuclei had expanded isometrically in elephants from the ancestral pattern, one would have expected a brain with big lateral bumps, not the elephant brain with its big ventromedial bumps. We note, however, that very likely the expansion of the elephant trigeminal nuclei did not occur isometrically. Instead, the neural representation of the elephant nose expanded dramatically and in rodents the nose is represented ventromedially in the brainstem face representation. Thus, we propose a ‘ventromedial outgrowth model’ according to which the elephant ventromedial trigeminal bumps result from a ventromedially directed outgrowth of the ancestral ventromedial nose representation.

      We advanced substantially more evidence to support our partitioning scheme, including the delineation of the olivo-cerebellar tract based on peripherin-reactivity. We also identified problems in previous partitioning schemes, such as the claim that the trigeminal nerve continues into the ~4x smaller olivocerebellar tract (Referee-Figure 1C, D); we think such a flow of fibers, (which is also at odds with peripherin-antibody-reactivity and the appearance of nerve and olivocerebellar tract), is highly unlikely if not physically impossible. With all that we do not think that we overstate our case in our cautiously presented ms.

      Change: We added evidence on the identification of elephant trigeminal nuclei and inferior olive.

      (2) Role of myelin. While the photos of myelin are convincing, it would be nice to have further documentation. Gallyas? Would antibodies to MBP work? What is the myelin distribution in the "standard" trigeminal nuclei (human? macaque or chimpanzee?). What are alternative sources of the bundles? Regardless, I think it would be beneficial to de-emphasize this point about the role of myelin in demarcating compartments. I would in fact suggest an alternative (more neutral) title that might highlight instead the isomorphic feature; for example, "An isomorphic representation of Trunk folds in the Elephant Trigeminal Nucleus." The present title stresses myelin, but figure 1 already focuses on CO. Additionally, the folds are actually mentioned almost in passing until later in the manuscript. I recommend a short section on these at the beginning of the Results to serve as a useful framework.

      Here I'm inclined to agree with the Reviewer, that the Authors' contention that the myelin stripes serve PRIMARILY to separate trunk-fold domains is not particularly compelling and arguably a distraction. The point can be made, but perhaps with less emphasis. After all, the fact that myelin has multiple roles is well-established, even if frequently overlooked. In addition, the Authors might make better use of an extensive relevant literature related to myelin as a compartmental marker; for example, results and discussion in D. Haenelt....N. Weiskopf (eLife, 2023), among others. Another example is the heavily myelinated stria of Gennari in primate visual cortex, consisting of intrinsic pyramidal cell axons, but where the role of the myelination has still not been elucidated.

      Comment: (1) Documentation of myelin. We note that we show further identification of myelinated fibers by the fluorescent dye fluomyelin in Figure 4B. We also performed additional myelin stains, such as the gold-myelin stain after the protocol of Schmued (Referee-Figure 2). In the end, nothing worked quite as well to visualize myelin-stripes as the bright-field images shown in Figure 4A, and only these images allowed us to match myelin-stripes to trunk folds. Hence, we focus our presentation on these images.

      (2) Title: We get why the referee envisions an alternative title. This being said, we would like to stick with our current title, because we feel it highlights the major novelty we discovered.

      (3) We agree with many of the other comments of the referee on myelin phenomenology. We missed the Haenelt reference pointed out by the referee and think it is highly relevant to our paper.

      Change: 1. Referee Figure. 2. Inclusion of the Haenelt-reference.

      Author response image 2.

      Myelin stripes of the elephant trunk module visualized by Gold-chloride staining according to Schmued

      A, Low magnification micrograph of the trunk module of African elephant Indra stained with AuCl according to Schmued. The putative finger is to the left, proximal is to the right. Myelin stripes can easily be recognized. The white box indicates the area shown in B.

      B, high magnification micrograph of two myelin stripes. Individual gold-stained (black) axons organized in myelin stripes can be recognized.

      Schmued, L. C. (1990). A rapid, sensitive histochemical stain for myelin in frozen brain sections. Journal of Histochemistry & Cytochemistry, 38(5), 717-720.

      Are the "bumps" in any way "analogous" to the "brain warts" seen in entorhinal areas of some human brains (G. W. van Hoesen and A. Solodkin, 1993)?

      Comment: We think this is a similar phenomenon.

      Change: We included the van Hoesen and Solodkin (1993) reference in our discussion.

      At least slightly more background (ie, a separate section or, if necessary, supplement) would be helpful, going into more detail on the several subdivisions of the ION and if these undergo major alterations in the elephant.

      Comment: The strength of the paper is the detailed delineation of the trunk module, based on myelin stripes and isomorphism. We don’t think we have strong evidence on ION subdivisions, because it appears the trigeminal tract cannot be easily traced in elephants. Accordingly, we find it difficult to add information here.

      Change: None.

      Is there evidence from the literature of other conspicuous gross anatomical translocations, in any species, especially in subcortical regions? 

      Comment: The best example that comes to mind is the star-nosed mole brainstem. There is a beautiful paper comparing the star-nosed mole brainstem to the normal mole brainstem (Catania et al 2011). The principal trigeminal nucleus in the star-nosed mole is far more rostral and also more medial than in the mole; still, such rearrangements are minor compared to what we propose in elephants.

      Catania, Kenneth C., Duncan B. Leitch, and Danielle Gauthier. "A star in the brainstem reveals the first step of cortical magnification." PloS one 6.7 (2011): e22406.

      Change: None.

      (3) A major point concerns the isomorphism between the putative trigeminal nuclei and the trunk specialization. I think this can be much better presented, at least with more discussion and other examples. The Authors mention about the rodent "barrels," but it seemed strange to me that they do not refer to their own results in pig (C. Ritter et al., 2023) nor the work from Ken Catania, 2002 (star-nosed mole; "fingerprints in the brain") or other that might be appropriate. I concur with the Reviewer that there should be more comparative data. 

      Comment: We agree.

      Change: We added a discussion of other isomorphisms, including the star-nosed mole, to our paper.

      (4) Textual organization could be improved. 

      The Abstract all-important Introduction is a longish, semi "run-on" paragraph. At a minimum this should be broken up. The last paragraph of the Introduction puts forth five issues, but these are only loosely followed in the Results section. I think clarity and good organization are of the utmost importance in this manuscript. I recommend that the Authors begin the Results with a section on the trunk folds (currently figure 5, and discussion), continue with the several points related to the identification of the trigeminal nuclei, and continue with a parallel description of ION with more parallel data on the putative trigeminal and IO structures (currently referee Table 1, but incorporate into the text and add higher magnification of nucleus-specific cell types in the IO and trigeminal nuclei). Relevant comparative data should be included in the Discussion.

      Comment: 1. We agree with the referee that our abstract needed to be revised. 2. We also think that our ms was heavily altered by the insertion of the new Figure 2, which complemented Figure 1 from our first submission and is concerned with the identification of the inferior olive. From a standpoint of textual flow such changes were not ideal, but the revisions massively added to the certainty with which we identify the trigeminal nuclei. Thus, although we are not as content as we were with the flow, we think the ms advanced in the revision process and we would like to keep the Figure sequence as is. 3. We already noted above that we included additional comparative evidence.

      Change: 1. We revised our abstract. 2. We added comparative evidence.

      Reviewer #5 (Recommendations For The Authors): 

      The data is invaluable and provides insights into some of the largest mammals on the planet. 

      Comment: We are incredibly thankful for this positive assessment.

    1. Author response:

      Reviewer #3 (Public Review):

      (1) Conditions on growth and interaction rates for feasibility and stability. The authors approach this using a mean field approximation, and it is important to note that there is no particular temperature dependence assumed here: as far as it goes, this analysis is completely general for arbitrary Lotka-Volterra interactions.

      However, the starting point for the authors' mean field analysis is the statement that "it is not possible to meaningfully link the structure of species interactions to the exact closed-form analytical solution for [equilibria] 𝑥^*_𝑖 in the Lotka-Volterra model."

      I may be misunderstanding, but I don't agree with this statement. The time-independent equilibrium solution with all species present (i.e. at non-zero abundances) takes the form

      x^* = A^{-1}r

      where A is the inverse of the community matrix, and r is the vector of growth rates. The exceptions to this would be when one or more species has abundance = 0, or A is not invertible. I don't think the authors intended to tackle either of these cases, but maybe I am misunderstanding that.

      So to me, the difficulty here is not in writing a closed-form solution for the equilibrium x^*, it is in writing the inverse matrix as a nice function of the entries of the matrix A itself, which is where the authors want to get to. In this light, it looks to me like the condition for feasibility (i.e. that all x^* are positive, which is necessary for an ecologically-interpretable solution) is maybe an approximation for the inverse of A---perhaps valid when off-diagonal entries are small. A weakness then for me was in understanding the range of validity of this approximation, and whether it still holds when off-diagonal entries of A (i.e. inter-specific interactions) are arbitrarily large. I could not tell from the simulation runs whether this full range of off-diagonal values was tested.

      We thank the reviewer for pointing this out and we agree that the language used is imprecise. The GLV model is solvable using the matrix inversion method but as they note, this does not give an interpretable expression in terms of the system parameters. This is important as we aim to build understanding of how these parameters (which in turn depend on temperature) affect the richness in communities. We have made this clearer in lines 372-379.
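
      For readers who want to see the matrix inversion route concretely, the following is a minimal sketch, not the code used in the manuscript. It assumes the sign convention dx_i/dt = x_i (r_i - sum_j A_ij x_j), under which the interior equilibrium is x^* = A^{-1} r as written by the reviewer; the function names and the two-species parameter values are hypothetical and chosen only for illustration.

      ```python
      import numpy as np


      def glv_equilibrium(A, r):
          """Interior equilibrium of dx_i/dt = x_i * (r_i - sum_j A_ij * x_j).

          Assumes A is invertible and that all species are present,
          so the equilibrium satisfies A @ x_star = r.
          """
          A = np.asarray(A, dtype=float)
          r = np.asarray(r, dtype=float)
          # Solving the linear system is numerically preferable to forming A^{-1}.
          return np.linalg.solve(A, r)


      def is_feasible(x_star):
          """Feasibility: every equilibrium abundance is strictly positive."""
          return bool(np.all(np.asarray(x_star) > 0))


      # Hypothetical two-species community with weak interspecific competition.
      A = np.array([[1.0, 0.2],
                    [0.3, 1.0]])
      r = np.array([1.0, 1.0])

      x_star = glv_equilibrium(A, r)
      print(x_star, is_feasible(x_star))
      ```

      The sketch returns numbers but no expression in terms of the entries of A, which is the interpretability gap that motivates the mean field approximation.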

      In regards to the validity of the approximation we have significantly increased the detail of the method in the manuscript, including the assumptions it makes (lines 384-393). In general the method assumes that any individual interaction has a weak effect on abundance. This will fail when the variation in interactions becomes too strong but should be robust to changes in the average interaction strength across the community.

      As a secondary issue here, it would have been helpful to understand whether the authors' feasible solutions are always stable to small perturbations. In general, I would expect this to be an additional criterion needed to understand diversity, though as the authors point out there are certain broad classes of solutions where feasibility implies stability.

      As the reviewer notes, previous work using the GLV model by ? has shown that feasibility almost surely implies stability in the GLV. Thus we expect that our richness estimates derived from feasibility will closely resemble those from stability. We have amended the main text to make this argument clear on lines 321-335.

      (2) I did not follow the precise rationale for selecting the temperature dependence of growth rate and interaction rates, or how the latter could be tested with empirical data, though I do think that in principle this could be a valuable way to understand the role of temperature dependence in the Lotka-Volterra equations.

      First, as the authors note, "the temperature dependence of resource supply will undoubtedly be an important factor in microbial communities"

      Even though resources aren't explicitly modeled here, this suggests to me that at some temperatures, resource supply will be sufficiently low for some species that their growth rates will become negative. For example, if temperature dependence is such that the limiting resource for a given species becomes too low to balance its maintenance costs (and hence mortality rate), it seems that the net growth rate will be negative. The alternative would be that temperature affects resource availability, but never such that a limiting resource leads to a negative growth rate when a taxon is rare.

      On the other hand, the functional form for the distribution of growth rates (eq 3) seems to imply that growth rates are always positive. I could imagine that this is a good description of microbial populations in a setting where the resource supply rate is controlled independently of temperature, but it wasn't clear how generally this would hold.

      We thank the reviewer for their comment. The assumption of positive growth rates is indeed a feature of the Boltzmann-Arrhenius model of temperature dependence. We use the Boltzmann-Arrhenius model due to the dependence of growth on metabolic rate. As metabolic rate is ultimately determined by biochemical kinetics, its temperature dependence is well described by the Boltzmann-Arrhenius model. In addition to this reasoning, there is a wealth of empirical evidence supporting the use of the Boltzmann-Arrhenius model to describe the temperature dependence of growth rate in microbes.

      Ultimately, the temperature dependence of resource supply is not something we can directly consider in our model. As such we have to assume that resource supply is sufficient to maintain positive growth rates in the community. Note that this assumption only requires that resource supply is sufficient to maintain positive growth rates (i.e. the maximal growth rate of species in isolation), not that resource supply is sufficient to maintain growth in the presence of intra- and interspecific competition. We have updated the manuscript in lines 156-159 to make these assumptions more clear.
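
      To illustrate why this functional form cannot produce negative growth rates, here is a minimal sketch of a Boltzmann-Arrhenius response. It assumes the common parameterization r(T) = r_ref * exp(-E/k * (1/T - 1/T_ref)); the exact form of eq 3 in the manuscript may differ, and the function name, reference temperature, and activation energy below are hypothetical illustration values.

      ```python
      import math

      # Boltzmann constant in eV per kelvin.
      BOLTZMANN_EV_PER_K = 8.617333262e-5


      def boltzmann_arrhenius(temp_k, rate_ref, activation_ev, temp_ref_k=293.15):
          """Rate at temperature temp_k (kelvin), normalized to rate_ref at temp_ref_k.

          The exponential is strictly positive, so the returned rate has the
          same sign as rate_ref for every temperature above absolute zero.
          """
          exponent = -activation_ev / BOLTZMANN_EV_PER_K * (1.0 / temp_k - 1.0 / temp_ref_k)
          return rate_ref * math.exp(exponent)


      # Hypothetical values: reference growth rate 1.0 and activation energy 0.65 eV.
      for temp_c in (5, 15, 25, 35):
          rate = boltzmann_arrhenius(temp_c + 273.15, rate_ref=1.0, activation_ev=0.65)
          print(temp_c, rate)
      ```

      With a positive reference rate and activation energy, the rate rises monotonically with temperature and never reaches zero, so negative net growth would need an explicit mortality or resource term that this form does not contain.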

      Secondly, while I understand that the growth rate in the exponential phase for a single population can be measured to high precision in the lab as a function of temperature, the assumption for the form of the interaction rates' dependence on temperature seems very hard to test using empirical data. In the section starting L193, the authors seem to fit the model parameters using growth rate dependence on temperature, but then assume that it is reasonable to "use the same thermal response for growth rates and interactions". I did not follow this, and I think a weakness here is in not providing clear evidence that the functional form assumed in Equation (4) actually holds.

      The reviewer is correct: it is very difficult to measure interaction coefficients experimentally, and to our knowledge there is little to no data available on their empirical temperature responses. As a best guess, we use the observed variation in thermal physiology parameters for growth rate as a proxy, assuming that interactions must also depend on the metabolic rates of the interacting species (see also response to comment 8).

    1. eLife assessment

      This important study builds on a previous publication, demonstrating that T. brucei has a continuous endomembrane system, which probably facilitates high rates of endocytosis. Using a range of cutting-edge approaches, the authors present compelling evidence that an actomyosin system, with the myosin TbMyo1 as an active molecular motor, is localized close to and can associate with the endosomal system in the bloodstream form of Trypanosoma brucei. It shows convincingly that both actin and Myo I play a role in the organization and integrity of the endosomal system: both RNAi-mediated depletion of Myo1, and treatment of the cells with latrunculin A resulted in endomembrane disruption. This work should be of interest to cell biologists and microbiologists working on the cytoskeleton, and unicellular eukaryotes.

    2. Reviewer #1 (Public Review):

      Using a combination of cutting-edge high-resolution technologies (expansion microscopy, SIM, and CLEM) and biochemical approaches (in vitro translocation of actin filaments, cargo uptake assays, and drug treatment), the authors revisit and update previous results about TbMyo1 and TbACT in the bloodstream form (BSF) of Trypanosoma brucei. They show that a great part of the myosin motor is cytoplasmic but the fraction associated with organelles is in proximity to the endosomal system and in glycosomes. In addition, they show that TbMyo1 can move actin filaments in vitro and visualize for the first time this actomyosin system using specific antibodies, a "classical" antibody for TbMyo1, and a chromobody for actin. Finally, using latrunculin A, which sequesters G-actin and prevents F-actin assembly, the authors show the delocalization and eventually the loss of the filamentous actin signal and the concomitant loss of the endosomal system integrity.

      Overall this well-conducted and convincing study paves the way toward the elucidation of the role of an actomyosin system in the maintenance of the endosomal network in T. brucei.

      Strengths:

      The work is of high quality and uses advanced technologies to determine the involvement of TbMyo1 and actin in the integrity of the endosomal system. The conclusions are not over-interpreted and are supported by the experimental results and their quantification.

      Weaknesses:

      Although disruption of the actomyosin system using either the actin-depolymerizing drug latrunculin A or the TbMyo1-RNAi cell line established an effect on the endosomal system integrity, it remains to be understood how this occurs mechanistically and which intracellular components are involved.

    3. Reviewer #2 (Public Review):

      The study by Link et al. advances our understanding of the actomyosin system in T. brucei, focusing on the role of TbMyo1, a class I myosin, within the parasite's endosomal system. Using a combination of biochemical fractionation, in vitro motility assays, and advanced imaging techniques such as correlative light and electron microscopy (CLEM), this paper demonstrates that TbMyo1 is dynamically distributed across early and late endosomes, the cytosol, is associated with the cytoskeleton, and a fraction has an unexpected association with glycosomes. Notably, the study shows that TbMyo1 can translocate actin filaments at velocities suggesting an active role in intracellular trafficking, potentially higher than those observed for similar myosins in other cell types. This work not only elucidates the spatial dynamics of TbMyo1 within T. brucei but also suggests its broader involvement in maintaining the complex architecture of the endosomal network, underscoring the critical role of the actomyosin system in a parasite that relies on high rates of endocytosis for immune evasion.

      A key strength of the study is its exceptional rigor and successful integration of a wide array of sophisticated techniques, such as in vitro motility assays, and advanced imaging methods, e.g. CLEM. This combination of approaches underscores the study's comprehensive approach to examining the ultrastructural organization of the trypanosome endomembrane system. The application of functional data using inhibitors, such as latrunculin A for actin depolymerization, further strengthens the study by providing insights into the dynamics and regulatory mechanisms of the endomembrane system. This demonstrates how the actomyosin system contributes to cellular morphology and trafficking processes. Furthermore, the discovery of TbMyo1 localization to glycosomes introduces a novel aspect to the potential roles of myosin I proteins within the cell, particularly in the context of organelles analogous to peroxisomes. This observation not only broadens our understanding of myosin I functionality but also opens up new avenues for research into the cell biology of trypanosomatids, marking a significant contribution to the field.

      A significant initial weakness was the reliance on spatial association data to infer functional relationships without direct demonstration of biochemical activities in vivo. The authors have since addressed this by including new evidence from TbMyo1 RNAi cell lines and EM data that show the effects of TbMyo1 depletion on cellular ultrastructure. The authors' responses and additional data reinforce their initial conclusions and address previous concerns. Several new, elegant hypotheses are proposed in the discussion that warrant further investigation to fully understand TbMyo1's interactions and regulatory mechanisms in vivo.

    4. Reviewer #3 (Public Review):

      Summary:

      In this work, Link and colleagues have investigated the localization and function of the actomyosin system in the parasite Trypanosoma brucei, which represents a highly divergent and streamlined version of this important cytoskeletal pathway. Using a variety of cutting-edge methods, the authors have shown that the T. brucei Myo1 homolog is a dynamic motor that can translocate actin, suggesting that it may not function as a more passive crosslinker. Using expansion microscopy, iEM, and CLEM, the authors show that MyoI localizes to the endosomal pathway, specifically the portion tasked with internalizing and targeting cargo for degradation, not the recycling endosomes. The glycosomes also appear to be associated with MyoI, which was previously not known. An actin chromobody was employed to determine the localization of filamentous actin in cells, which was correlated with the localization of Myo1. Interestingly, the pool of actomyosin was not always closely associated with the flagellar pocket region, suggesting that portions of the endolysosomal system may remain at a distance from the sole site of parasite endocytosis. Lastly, the authors used actin-perturbing drugs to show that disrupting actin causes a collapse of the endosomal system in T. brucei, which they have shown recently does not comprise distinct compartments but instead a single continuous membrane system with subdomains containing distinct Rab markers.

      Strengths:

      Overall, the quality of the work is extremely high. It contains a wide variety of methods, including biochemistry, biophysics, and advanced microscopy that are all well deployed to answer the central question. The data is also well quantitated to provide additional rigor to the results. The main premise, that actomyosin is essential for the overall structure of the T. brucei endocytic system, is well supported and is of general interest, considering how uniquely configured this pathway is in this divergent eukaryote and how important it is to the elevated rates of endocytosis that are necessary for this parasite to inhabit its host.

      Comments on revised version:

      The revised manuscript has addressed the main issue, the lack of TbMyo1 functional data that was brought up during the first round of review. I find it interesting that Myo1 depletion has what appears to be a limited effect on endocytosis while producing a similar fragmentation of the endocytic pathway to what is seen with the LatA treatments. As CCV remains in both LatA treatments and TbMyo1 RNAi, it seems apparent that the organization of the endocytic pathway is not required for at least basal levels of endocytosis.

      My other points were well addressed by the rebuttal. I am satisfied with the update.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment  

      This important study builds on a previous publication (with partially overlapping authors), demonstrating that T. brucei has a continuous endomembrane system, which probably facilitates high rates of endocytosis. Using a range of cutting-edge approaches, the authors present compelling evidence that an actomyosin system, with the myosin TbMyo1 as the molecular motor, is localized close to the endosomal system in the bloodstream form (BSF) of Trypanosoma brucei. It shows convincingly that actin is important for the organization and integrity of the endosomal system, and that the trypanosome Myo1 is an active motor that interacts with actin and transiently associates with endosomes, but a role of Myo1 in endomembrane function in vivo was not directly demonstrated. This work should be of interest to cell biologists and microbiologists working on the cytoskeleton, and unicellular eukaryotes.

      We were delighted at the editors’ positive assessment and the reviewers’ rigorous, courteous, and constructive responses to the paper. We agree that a direct functional role for TbMyo1 in endomembrane activity was not demonstrated in the original submission, but have incorporated some new data (see new supplemental Figure S5) using the TbMyo1 RNAi cell line which are consistent with our earlier observations and interpretations.  

      Public Reviews:   

      Reviewer #1 (Public Review):  

      Using a combination of cutting-edge high-resolution approaches (expansion microscopy, SIM, and CLEM) and biochemical approaches (in vitro translocation of actin filaments, cargo uptake assays, and drug treatment), the authors revisit previous results about TbMyo1 and TbACT in the bloodstream form (BSF) of Trypanosoma brucei. They show that a great part of the myosin motor is cytoplasmic but the fraction associated with organelles is in proximity to the endosomal system. In addition, they show that TbMyo1 can move actin filaments in vitro and visualize for the first time this actomyosin system using specific antibodies, a "classical" antibody for TbMyo1, and a chromobody for actin. Finally, using latrunculin A, which sequesters G-actin and prevents F-actin assembly, the authors show the delocalization and eventually the loss of the filamentous actin signal as well as the concomitant loss of the endosomal system integrity. However, they do not assess the localization of TbMyo1 in the same conditions.  

      Overall the work is well conducted and convincing. The conclusions are not over-interpreted and are supported by the experimental results. 

      We are very grateful to Reviewer1 for their balanced assessment. The reviewer is correct that we did not assess the localisation of TbMyo1 following latrunculin A treatment, but it is worth noting that Spitznagel et al. carried out this exact experiment in the earlier 2010 paper – we have mentioned this in the revised manuscript.  

      Reviewer #2 (Public Review):  

      Summary:  

      The study by Link et al. advances our understanding of the actomyosin system in T. brucei, focusing on the role of TbMyo1, a class I myosin, within the parasite's endosomal system. Using a combination of biochemical fractionation, in vitro motility assays, and advanced imaging techniques such as correlative light and electron microscopy (CLEM), this paper demonstrates that TbMyo1 is dynamically distributed across early and late endosomes, the cytosol, is associated with the cytoskeleton, and a fraction has an unexpected association with glycosomes. Notably, the study shows that TbMyo1 can translocate actin filaments at velocities suggesting an active role in intracellular trafficking, potentially higher than those observed for similar myosins in other cell types. This work not only elucidates the spatial dynamics of TbMyo1 within T. brucei but also suggests its broader involvement in maintaining the complex architecture of the endosomal network, underscoring the critical role of the actomyosin system in a parasite that relies on high rates of endocytosis for immune evasion.

      Strengths:  

      A key strength of the study is its exceptional rigor and successful integration of a wide array of sophisticated techniques, such as in vitro motility assays, and advanced imaging methods, including correlative light and electron microscopy (CLEM) and immuno-electron microscopy. This combination of approaches underscores the study's comprehensive approach to examining the ultrastructural organization of the trypanosome endomembrane system. The application of functional data using inhibitors, such as latrunculin A for actin depolymerization, further strengthens the study by providing insights into the dynamics and regulatory mechanisms of the endomembrane system. This demonstrates how the actomyosin system contributes to cellular morphology and trafficking processes. Furthermore, the discovery of TbMyo1 localization to glycosomes introduces a novel aspect to the potential roles of myosin I proteins within the cell, particularly in the context of organelles analogous to peroxisomes. This observation not only broadens our understanding of myosin I functionality but also opens up new avenues for research into the cellular biology of trypanosomatids, marking a significant contribution to the field. 

      We are very pleased that the Reviewer felt the work is a significant contribution to the state of the art.  

      Weaknesses:  

      Certain limitations inherent in the study's design and scope render the narrative incomplete and make it challenging to reach definitive conclusions. One significant limitation is the reliance on spatial association data, such as colocalization of TbMyo1 with various cellular components-or the absence thereof-to infer functional relationships. Although these data suggest potential interactions, the authors do not confirm functional or direct physical interactions.  

      While TbMyo1's localization is informative, the authors do not directly demonstrate its biochemical or mechanical activities in vivo, leaving its precise role in cellular processes speculative. Direct assays that manipulate TbMyo1 levels, activity, and/or function, coupled with observations of the outcomes on cellular processes, would provide more definitive evidence of the protein's specific roles in T. brucei. A multifaceted approach, including genetic manipulations, uptake assays, kinetic trafficking experiments, and imaging, would offer a more robust framework for understanding TbMyo1's roles. This comprehensive approach would elucidate not just the "what" and "where" of TbMyo1's function but also the "how" and "why," thereby deepening our mechanistic insights into T. brucei's biology.  

      The reviewer is absolutely correct that the study lacks data on direct or indirect interactions between TbMyo1 and its intracellular partners, and this is an obvious area for future investigation. Given the generally low affinities of motor-cargo interactions, a proximity labelling approach (such as has already been successfully used in studies of other myosins) would probably be the best way to proceed.

      The reviewer is also right to highlight that a detailed mechanistic understanding of TbMyo1 function in vivo is currently lacking. We feel that this would be beyond the scope of the present work, but have included some new data using the TbMyo1 RNAi cell line (Figure S5), which are consistent with our previous findings.  

      Reviewer #3 (Public Review):  

      Summary:  

      In this work, Link and colleagues have investigated the localization and function of the actomyosin system in the parasite Trypanosoma brucei, which represents a highly divergent and streamlined version of this important cytoskeletal pathway. Using a variety of cutting-edge methods, the authors have shown that the T. brucei Myo1 homolog is a dynamic motor that can translocate actin, suggesting that it may not function as a more passive crosslinker. Using expansion microscopy, iEM, and CLEM, the authors show that MyoI localizes to the endosomal pathway, specifically the portion tasked with internalizing and targeting cargo for degradation, not the recycling endosomes. The glycosomes also appear to be associated with MyoI, which was previously not known. An actin chromobody was employed to determine the localization of filamentous actin in cells, which was correlated with the localization of Myo1. Interestingly, the pool of actomyosin was not always closely associated with the flagellar pocket region, suggesting that portions of the endolysosomal system may remain at a distance from the sole site of parasite endocytosis. Lastly, the authors used actin-perturbing drugs to show that disrupting actin causes a collapse of the endosomal system in T. brucei, which they have shown recently does not comprise distinct compartments but instead a single continuous membrane system with subdomains containing distinct Rab markers.

      Strengths:  

      Overall, the quality of the work is extremely high. It contains a wide variety of methods, including biochemistry, biophysics, and advanced microscopy that are all well-deployed to answer the central question. The data is also well-quantitated to provide additional rigor to the results. The main premise, that actomyosin is essential for the overall structure of the T. brucei endocytic system, is well supported and is of general interest, considering how uniquely configured this pathway is in this divergent eukaryote and how important it is to the elevated rates of endocytosis that are necessary for this parasite to inhabit its host.  

      We are very pleased that the Reviewer formed such a positive impression of the work. 

      Weaknesses:  

      (1) Did the authors observe any negative effects on parasite growth or phenotypes like BigEye upon expression of the actin chromobody?  

      Excellent question! There did appear to be detrimental effects on cell morphology in some cells, and it would definitely be worth doing a time course of induction to determine how quickly chromobody levels reach their maximum. The overnight inductions used here are almost certainly excessive, and shorter induction times would be expected to minimise any detrimental effects. We have noted these points in the Discussion.  

      (2) The Garcia-Salcedo EMBO paper cited included the production of anti-actin polyclonal antibodies that appeared to work quite well. The localization pattern produced by the anti-actin polyclonals looks similar to the chromobody, with perhaps a slightly larger labeling profile that could be due to differences in imaging conditions. I feel that the anti-actin antibody labeling should be expressly mentioned in this manuscript, and perhaps could reflect differences in the F-actin vs total actin pool within cells.  

      Implemented. We have explicitly mentioned the use of the anti-actin antibody in the Garcia-Salcedo paper in the revised Results and Discussion sections.  

      (3) The authors showed that disruption of F-actin with LatA leads to disruption of the endomembrane system, which suggests that the unique configuration of this compartment in T. brucei relies on actin dynamics. What happens under conditions where endocytosis and endocytic traffic are blocked, such as 4 °C? Are there changes to the localization of the actomyosin components?

      Another excellent question! We did not analyse the localisation of TbMyo1 and actin under temperature block conditions, but this would definitely be a key experiment to do in follow-up work.

      (4) Along these lines, the authors suggest that their LatA treatments were able to disrupt the endosomal pathway without disrupting clathrin-mediated endocytosis at the flagellar pocket. Do they believe that actin is dispensable in this process? That seems like an important point that should be stated clearly or put in greater context.  

      Whether actin plays a direct or indirect role in endocytosis would be another fascinating question for future enquiry, and we do not have the data to do more than speculate on this point. Recent work in mammalian cells (Jin et al., 2022) has suggested that actin is primarily recruited when endocytosis stalls, and it could be that a similar role is at play here. We have noted this point in the Discussion. The observation of clathrin vesicles close to the flagellar pocket membrane and clathrin patches on the flagellar pocket membrane itself in the LatA-treated cells might suggest that some endocytic activity can occur in the absence of filamentous actin. 

      Recommendations for the authors:

      Note from the Reviewing Editor:  

      During discussion, all reviewers agreed that the role of TbMyo1 in vivo in endomembrane function had not been directly demonstrated. This could be done by testing the endocytic trafficking of (for example) fluorophore-conjugated TfR and BSA in the existing Myo1 RNAi line, using wide-field microscopy. Examining the organization of the endosomes/lysosomes by thin-section EM would be even better. The actin signal detected by the chromobody tends to occupy a larger region than the MyoI. It's therefore conceivable that actin filamentation and stabilization via other actin-interacting proteins create the continuous endosomal structure, while MyoI is necessary for transport or other related processes.

      These are all excellent points and very good suggestions. We have now incorporated new data (supplemental Figure S5) that includes BSA uptake assays in the TbMyo1 RNAi cell line and electron microscopy imaging after TbMyo1 depletion – the results are consistent with our earlier observations.   

      Reviewer #1 (Recommendations For The Authors):  

      -  Figure S2E. This panel is supposed to show the downregulation of TbMyo1 in the PCF compared to BSF but there is no loading control to support this claim. This is important because the authors mention in lines 381-383 that this finding conflicts with the previous study (Spitznagel et al., 2010). The authors also indicate in the figure legend that there is 50% less signal but there is no explanation about this quantification.   

      Good point. Equal numbers of cells were loaded in each lane, but we did not have an antibody against a protein known to be expressed at the same level in both PCF and BSF cells to use as a loading control. Using a total protein stain would have been similarly unhelpful in this context, as the proteomes of PCF and BSF cells are dissimilar. The quantification was made by direct measurement after background subtraction, but without normalisation owing to the lack of a loading control. This makes the conclusion somewhat tentative, but given the large difference in signal observed between the two samples (and the fact that this is consistent with the proteomic data obtained by Tinti and Ferguson) we feel that the conclusion is valid. We have clarified these points in the figure legend and Discussion.  

      -  It is mentioned in the discussion, as unpublished observations, that the predicted FYVE motif of TbMyo1 can bind specifically PI(3)P lipids. This is a very interesting point that would be new and would strengthen the suggested association with the endosomal system mainly based on imaging data. 

      We agree that this is – potentially – a very exciting observation and it is an obvious direction for future enquiry.  

      The data are preliminary at this stage and will form the basis of a future publication. Given that the predicted FYVE domain of TbMyo1 and known lipid-binding activity of other class I myosins makes this activity not wholly unexpected, we feel that it is acceptable at this stage to highlight these preliminary findings.  

      -  The authors use the correlation coefficient to estimate the colocalization (lines 223-226). Although they clearly explain the difference between the correlation coefficient and the co-occurrence of two signals, I wonder if it would not be clearer for the audience to have quantification of the overlapping signals. Also, it is not mentioned on which images the correlation coefficient was measured. It seems that it is from widefield images (Figures 3E and 6E), and likely from SIM images for Figure 3C but the resolution is different. Are widefield images sufficient to assess these measurements? 

      With hindsight, and given the different topological locations of TbMyo1 and the cargo proteins (cytosolic and lumenal, respectively) it would probably have been wiser to measure co-occurrence rather than correlation, but we would prefer not to repeat the entire analysis at this stage. The correlations were measured from widefield images using the procedure described in the Materials & Methods. These are obviously lower resolution than confocal or SIM images would be, but are still of value, we believe. One further point – upon re-examination of some of the TbMyo1 transferrin (Tf) and BSA data, we noticed that there are many pixels with a value of 0 for Tf/BSA and a nonzero value for TbMyo1 and vice-versa. The incidence of zero-versus-nonzero values in the two channels will have lowered the correlation coefficient, and in this sense, the correlation coefficients are giving us a hint of what the immuno-EM images later confirm: that the TbMyo1 and cargo are present in the same locations, but in different proportions. We have added this point to the discussion.  
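
      The difference between correlation and co-occurrence discussed in this response can be illustrated with a minimal sketch. The pixel values below are invented for illustration and are not taken from the manuscript; `pearson` and `manders` are hypothetical helper names, and Manders-style coefficients are used here only as one common co-occurrence measure, not necessarily the one the authors would choose. The toy data place both signals in largely the same pixels but in different proportions, with a few zero-versus-nonzero pixels.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two intensity arrays."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])


def manders(a, b, threshold=0.0):
    """Manders-style co-occurrence coefficients.

    m1: fraction of the total intensity of `a` found in pixels where `b`
        is above `threshold`; m2 is the same with the channels swapped.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    m1 = a[b > threshold].sum() / a.sum()
    m2 = b[a > threshold].sum() / b.sum()
    return float(m1), float(m2)


# Hypothetical background-subtracted pixel intensities (illustration only).
myo = [0, 0, 9, 1, 8, 2, 5, 0]    # stand-in for the TbMyo1 channel
cargo = [0, 4, 1, 9, 2, 8, 0, 0]  # stand-in for the Tf/BSA channel

r = pearson(myo, cargo)
m1, m2 = manders(myo, cargo)
print(f"Pearson r = {r:.2f}, Manders M1 = {m1:.2f}, M2 = {m2:.2f}")
```

      In this toy case most of the intensity in each channel lies in pixels where the other channel is also present, yet the Pearson coefficient is low, which is the pattern the response describes: signals in the same locations but in different proportions.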

      -  It would be good to know if the loss of the endosomal system integrity (using EBI) is the same upon TbMyo1 depletion as in the latrunculin A treated parasites.

      We agree! We have now included new data (Figure S5) that suggests endosomal system morphology is altered upon TbMyo1 depletion. We would predict that the effect upon TbMyo1 depletion is slower or less dramatic than upon LatA treatment (as LatA affects both actin and TbMyo1, given that TbMyo1 depends upon actin for its localisation).

      -  Conversely, it would be of interest to see how the localization of TbMyo1 changes upon latrunculin A treatment.

      This experiment was done in 2010 by Spitznagel et al., who observed a delocalisation of the TbMyo1 signal after LatA treatment. We have noted this in the Results and Discussion.

      Minor corrections:  

      -  Line 374: Figure S1 should be Figure S2. 

      Implemented (many thanks!).  

      -  Panel E of Figure S2 refers to TbMyo1 and should therefore be included in Figure S1 and not S2. 

      We would prefer not to implement this suggestion. We did struggle over the placing of this panel for exactly this reason, but as the samples were obtained as part of the experiments described in Figure S2, we felt that its placement here worked best in terms of the narrative of the manuscript.    

      -  Figure S2F: the population of TbMyo21 +Tet seems lost after 48 h although the authors mention that there is no growth defect. 

      Good eyes! We have re-added the panel, which shows that there was no growth defect in the tetracycline-treated population.  

      Reviewer #2 (Recommendations For The Authors):  

      Fig 1 vs. Figure 3: The biochemical fractionation experiments have been well-controlled, showing that 40% of TbMyo1 is found in both the cytosolic and cytoskeletal fractions, with only 20% in the organelle-associated fraction. The conclusion is supported by the experimental design, which includes controls to rule out cross-contamination between fractions. However, does this contrast with the widefield microscopy experiments, where the vast majority of the signal is in endocytic compartments and nowhere else?

      This is a good point. There are three factors that probably explain this. First, given that the actin cytoskeleton is associated with the endosomal system, a large proportion of the material partitioning into the cytoskeleton (P2) fraction is probably localised to the endosomal system (a fun experiment would be to repeat the fractionation with addition of ATP to the extraction buffer to make the myosin dissociate and see whether more appeared in the SN2 fraction as a result). Second, the 40% of the TbMyo1 that is cytosolic is distributed throughout the entire cellular volume, whereas the material localised to the endosomes is concentrated in a much smaller space and therefore produces a stronger signal. Third, the widefield microscopy images have had brightness and contrast adjusted in order to reduce “background” signal, though this will also include cytosolic molecules. We hope these explanations are satisfactory, but would welcome any additional thoughts from either the reviewer or the community.
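
      The second factor can be made concrete with a back-of-the-envelope sketch. The protein fractions (40% cytosolic, 20% organelle-associated) come from the fractionation result quoted above; the volume fraction occupied by the endosomal system is a purely hypothetical value chosen for illustration, and `relative_local_concentration` is an illustrative helper name.

```python
def relative_local_concentration(frac_a, vol_a, frac_b, vol_b):
    """Ratio of local concentrations of pool A to pool B.

    frac_*: fraction of total protein in each pool.
    vol_*:  fraction of the cell volume over which that pool is spread.
    """
    return (frac_a / vol_a) / (frac_b / vol_b)


ORGANELLE_FRACTION = 0.20  # from the fractionation data quoted above
CYTOSOLIC_FRACTION = 0.40  # from the fractionation data quoted above
ENDOSOME_VOLUME = 0.05     # ASSUMPTION: hypothetical share of cell volume
WHOLE_CELL_VOLUME = 1.0    # cytosolic pool treated as spread over the cell

ratio = relative_local_concentration(
    ORGANELLE_FRACTION, ENDOSOME_VOLUME,
    CYTOSOLIC_FRACTION, WHOLE_CELL_VOLUME,
)
print(f"Endosomal pool is ~{ratio:.0f}x more concentrated than the cytosolic pool")
```

      Under this assumed volume fraction, a pool holding half as much protein would still be locally about tenfold more concentrated, and including any endosome-associated material from the cytoskeletal fraction would raise the ratio further.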

      The section title 'TbMyo1 translocates filamentous actin at 130 nm/s' could mislead readers by not specifying that the findings are from an in vitro experiment with a recombinant protein, which may not fully reflect the cell's complex context. Although this detail is noted in the figure legend, incorporating it into the main text and considering a title revision would ensure clarity and accuracy.  

      Good point. Implemented – we have amended the section title to “TbMyo1 translocates filamentous actin at 130 nm/s in vitro” and the figure legend title to “TbMyo1 translocates filamentous actin in vitro”.  

      The discussion of the translocation experiment could be better phrased addressing certain limitations. The in vitro conditions might not fully capture the complexity and dynamic nature of cellular environments where multiple regulatory mechanisms, interacting partners, and cellular compartments come into play. 

      Good point, implemented. We have added a note on this to the Discussion.  

      It is puzzling that RNAi, which is widely used in T. brucei, was not used to further investigate the functional roles of TbMyo1 in Trypanosoma brucei. Given that the authors already had the cell line and used it to validate the specificity of the anti-TbMyo1 antibody, RNAi could have been employed to knock down TbMyo1 expression and observe the resultant effects on actin filament dynamics and organization within the cell. This would have directly tested TbMyo1's contribution to actin translocation observed in the in vitro experiments.

      It would obviously be interesting to carry out an in-depth characterisation of the phenotype following TbMyo1 depletion and whether this has an effect on actin dynamics. We have now included additional data (supplemental Figure S5) using the TbMyo1 RNAi cells and the results are consistent with our earlier observations and interpretations. It is worth noting too that at least for electron microscopy studies of intracellular morphology, the slower onset of an RNAi phenotype and the asynchronous replication of T. brucei populations make observation of direct (early) effects of depletion challenging – hence the preferential use of LatA here to depolymerise actin and trigger a faster phenotype.  

      I found that several declarative statements within the main text may not be fully supported by the overall evidence. I suggest modifications to present a more balanced view:

      Line 227: "The results here suggest that although the TbMyo1 distribution overlaps with that of endocytic cargo, the signals are not strongly correlated." This conclusion about the lack of strong correlation might mislead readers about the functional relationship between TbMyo1 and endocytic cargo, as colocalization does not directly imply functional interaction. 

      We would prefer not to alter this statement. It was our intention to phrase this cautiously, as we have not directly investigated the functional interplay between TbMyo1 and endocytic cargo and the subsequent sentence directs the reader to the Discussion for more consideration of this issue.    

      Line 397: "This relatively high velocity might indicate that TbMyo1 is participating in intracellular trafficking of BSF T. brucei and functioning as an active motor rather than a static tether." The statement directly infers TbMyo1's functional role from in vitro motility assay velocities without in vivo corroboration.

      We have amended the sentence in the Discussion to make it clear that it is speculative.  

      The hypothesis that cytosolic TbMyo1 adopts an auto-inhibited "foldback" configuration, drawn by analogy with findings from other studies, is intriguing. Yet, direct evidence linking this configuration to TbMyo1's function in T. brucei is absent from the data presented. 

      We have amended the sentence in the Discussion to make it clear that it is speculative. Future in vitro experiments will test this hypothesis directly.  

      The suggestion that a large cytosolic fraction of TbMyo1 indicates dynamic behavior, high turnover on organelles, and a low duty ratio is plausible but remains speculative without direct experimental evidence. Measurements of TbMyo1 turnover rates or duty ratios in T. brucei through kinetic studies would substantiate this claim with the necessary evidence.  

      We have amended the sentence in the Discussion to make it clear that it is speculative, and deleted the reference to a possible low duty ratio. Again, future in vitro experiments will measure the duty ratio of TbMyo1 using stopped-flow. 

      Reviewer #3 (Recommendations For The Authors):  

      Lines 171-172: The authors mention that MyoI could be functioning as a motor rather than a tether. The differences in myosin function have not been introduced prior to this. I would recommend explaining these differences and what it could mean for the function of the motor in the introduction to help a non-expert audience.

      Good point. Implemented.  

      Line 94-95: This phenotype only holds for the bloodstream form- the procyclic form are quite resistant to actin RNAi and MyoI RNAi. I would clarify. 

      Good point. Implemented.  

      Line 142-146: did the authors attempt to knock out the Myo21? 

      Good point. No, this was not attempted. Given the extremely low expression levels of TbMyo21 in the BSF cells we would not expect a strong phenotype, but this assumption would be worth testing. 

      Figure 3D: is there a reason why the authors chose to show the single-channel images in monochrome in this case?  

      Not especially. These panels are the only ones that show a significant overlap in the signals between the two channels (unlike the colabelling experiments with ER, Golgi), so greyscale images were used because of their higher contrast. 

      Line 397-398: I'm struggling a bit to understand how MyoI could be involved in intracellular trafficking in the endosomal compartments if the idea is that we have a continuous membrane? Some more detail as to the author's thinking here would be useful. 

      Implemented. We have noted that this statement is speculative, and emphasised that being an active motor does not automatically mean that it is involved in intracellular traffic – it could instead be involved in manipulating endosomal membranes. We have noted too that the close proximity between TbMyo1 and the lysosome (Figures 3-5) could be important in this regard. The lysosome is not contiguous with the endosomal system, and it is possible that TbMyo1 is working as a motor to transport material (class II clathrin-coated vesicles) from the endosomal system to the lysosome.

      Line 493-496: Does this mean that endocytosis from the FP does not require actin? This would be hard to explain considering the phenotypes observed in the original actin RNAi work. Is the BigEye phenotype observed in BSF actin RNAi and Myo1 RNAi cells due to some indirect effect?

      It seems possible that actin is not directly or essentially involved in endocytosis, and the characterisation of the actin RNAi phenotype would be worth revisiting in this respect – we have noted this in the Discussion. Although RNAi of actin was lethal, the phenotype appears less penetrant than that seen following depletion of the essential endocytic cofactor clathrin (based on the descriptions in Garcia-Salcedo et al., 2004 and Allen et al., 2003). BigEye phenotypes occur in BSF cells whenever there is some perturbation of endomembrane trafficking and are not necessarily a direct consequence of depletion – this is why careful investigation of early timepoints following RNAi induction is critical.

    1. Author response:

      Reviewer #3 (Public Review):

      The paper by Rai and colleagues examines the transcriptional response of Candida glabrata, a common human fungal pathogen, during interaction with macrophages. They use RNA PolII profiling to identify not just the total transcripts but instead focus on the actively transcribing genes. By examining the profile over time, they identify particular transcripts that are enriched at each timepoint, and build a hierarchical model for how a transcription factor, Xbp1, may regulate this response. Due to technical difficulties in identifying direct targets of Xbp1 during infection, the authors then turn to the targets of Xbp1 during cellular quiescence.

      The authors have generated a large and potentially impactful dataset, examining the responses of C. glabrata during an important host-pathogen interface. However, the conclusions that the authors make are not well supported by the data. The ChIP-seq is interesting, but the authors make conclusions about the biological processes that are differentially regulated without testing them experimentally. Because Candida glabrata has a significant percent of the genome without GO term annotation, the GO term enrichment analysis is less useful than in a model organism. To support these claims, the authors should test the specific phenotypes, and validate that the transcriptional signature is observed at the protein level.

      Additionally, the authors should include images of the infections, along with measurements of phagocytosis, to show that the time points are appropriate. At 30 minutes, are C. glabrata actually internalized or just associated? This may explain the difference in adherence genes at the early timepoint. For example, in Lines 123-132, the authors could measure the timing of ROS production by macrophages to determine when these attacks are deployed, instead of speculating based on the increased transcription of DNA damage response genes. Potentially, other factors could be influencing the expression of these proteins. At the late stage of infection, the authors should measure whether the C. glabrata cells are proliferating, or if they have escaped the macrophage, as other fungi can during infection. This may explain some of the increase in transcription of genes related to proliferation.

      An additional limitation to the interpretation of the data is that the authors should put their work in the context of the existing literature on C. albicans temporal adaptation to macrophages, including recent work from Munoz (doi: 10.1038/s41467-019-09599-8), Tucey (doi: 10.1016/j.cmet.2018.03.019), and Tierney (doi: 10.3389/fmicb.2012.00085), among others.

      When comparing the transcriptional profile between WT and xbp1 mutant, it is not clear whether the authors compared the strains under non-stress conditions. The authors should include an analysis of the wild-type to xbp1 mutants in the absence of macrophage stress, as the authors' claims of precocious transcription may be a function of overall decreased transcriptional repression, even in the absence of the macrophage stress. The different cut-offs used to call peaks in the two strain backgrounds are also somewhat concerning; it is not clear to me whether that will obscure the transcriptional signature of each of the strains. Additionally, the authors go on to show that the xbp1 mutant has a significant proliferation defect in macrophages, so potentially this could confound the PolII binding sites if the cells are dying.

      In the section on hierarchical analysis of transcription factors, at least one epistasis experiment should have been performed to validate the functional interaction between Xbp1 and a particular transcription factor. If the authors propose a specific motif, they should test this experimentally through EMSA assays to fully test that the motif is functional.

      The jump from macrophages to quiescent culture is also not well justified. If the transcriptional program is so dynamic during a timecourse of macrophage infection, it is hard to translate the findings from a quiescent culture to this host environment.

      Overall, there is a strong beginning and the focus on active transcription in the macrophage is an exciting approach. However, the conclusions need additional experimental evidence.

      We thank this reviewer for the critical analysis of our manuscript and for the comments.

      We fully agree that the jump from macrophages to quiescent culture was not well justified. We have successfully performed CgXbp1 ChIP-seq during macrophage infection and have rewritten the manuscript according to the new results. With the CgXbp1 ChIP-seq data during macrophage infection added, we have removed the data related to quiescence to focus the paper on the macrophage response. Because of this, we have also removed the DNA binding motif analysis from this work and will report the findings in a separate manuscript comparing CgXbp1 binding between macrophage response and quiescence.

      As mentioned above, the RNAPII ChIP-seq time course experiment compared RNAP occupancies at different times during infection to the first infection time point. We did not calculate occupancies relative to data obtained in the absence of stress (e.g. before infection), because Xbp1 is expressed at a low level and is induced by stress. Hence, its role under non-stress conditions is expected to be smaller than inside macrophages. In addition, up-regulation of its target genes depends on the presence of their transcriptional activators under the experimental conditions, which will be very different in normal growth media (RPMI or YPD; i.e. before infection) versus inside macrophages. Hence, a comparison to normal growth media would not show the real CgXbp1 effects, and/or the CgXbp1 effect might be different. In fact, this can be seen from the new RNAseq analysis of wildtype and Cgxbp1∆ C. glabrata cells in the presence and absence of fluconazole (which has been added to the revised manuscript to study CgXbp1’s role in fluconazole resistance). The result shows that CgXbp1 (which was expressed at a low level) has a very small effect on global expression and that the up-regulated genes are mainly related to transmembrane transport. More importantly, the effect of the Cgxbp1∆ mutant on the expression of TCA cycle and amino acid biosynthesis genes during macrophage infection is not observed when the mutant is grown under normal growth conditions (YPD without fluconazole). Therefore, the results show that CgXbp1 has condition-specific effects on global gene expression, which also depend on the transcriptional activators present in the cell. The result of the new RNAseq analysis of wildtype and Cgxbp1∆ C. glabrata cells in the absence of fluconazole is described in lines 329-339 as follows: “On the other hand, 135 genes were differentially expressed in the Cgxbp1∆ mutant during normal exponential growth (i.e. no fluconazole treatment) (Figure 6c) with up-regulated genes highly enriched with the “transmembrane transport” function and down-regulated genes associated with different metabolic processes (e.g. carbohydrate, glycogen and trehalose) (e.g. carbon metabolism, nucleotide metabolism, and transmembrane transport, etc.) (Supplementary Table 12). Interestingly, the TCA cycle and amino acid biosynthesis genes, whose expression was accelerated in the Cgxbp1∆ mutant during macrophage infection (Figure 3C, 3D), were not affected by the loss of CgXbp1 function under normal growth conditions (i.e. in YPD media without fluconazole) (Supplementary Figure 11, Supplementary Table 11), suggesting that the overall (direct and indirect) effects of CgXbp1 are condition-specific.”

      Regarding the comment about RNAPII binding being affected by dying cells, our observation of reduced proliferation does not mean that the cells were dying: we did observe an increase in cell numbers over time (i.e. the cells were proliferating), but the rate of proliferation was slower in the Cgxbp1∆ mutant compared to wildtype. Presumably, the reduced proliferation and/or growth within macrophages is due to poorer adaptation in and compromised response to macrophages.

      We have also discussed our findings in the context of the suggested (and other) literature in various parts of the Discussion.

      Reviewer #4 (Public Review):

      Macrophages are the first line of defense against invading pathogens. C. glabrata must interact with these cells as do all pathogens seeking to establish an infection. Here, a ChIP-seq approach is used to measure RNA polymerase II levels across Cg genes in a macrophage infection assay. Differential gene expression is analyzed with increasing time of infection. These differentially expressed genes are compared at the promoter level to identify potential transcription factors that may be involved in their regulation. A factor called CgXbp1 on the basis of its similarity to the S. cerevisiae Xbp1 protein is characterized. ChIP-seq is done on CgXbp1 using in vitro grown cells and a potential binding site identified. Evidence is provided that CgXbp1 affects virulence in a Galleria system and that this factor might impact azole resistance.

      As the authors point out, candidiasis associated with C. glabrata has dramatically increased in the recent past. Understanding the unique aspects of this Candida species would be a great value in trying to unravel the basis of the increasing fungal disease caused by C. glabrata. The use of ChIP-seq analysis to assess the time-dependent association of RNA polymerase II with Cg genes is a nice approach. Identification of CgXbp1 as a potential participant in the control of this gene expression program is also interesting. Unfortunately, this work suffers by comparison to a significant amount of previous effort that renders the progress detailed here incremental at best.

      I agree that their ChIP-seq time course of RNA polymerase II distribution across the Cg genome is both elegant and an improvement on previous microarray experiments. However, these microarray experiments were carried out 14 years ago and while the current work is certainly at higher resolution, little more can be gleaned from the current work. The authors argue that standard transcriptional analysis is compromised by transcript stability effects. I would suggest that, while no approach is without issues, quite a bit has been learned from approaches like RNA-seq and there are recent developments to this technique that allow for a focus on newly synthesized mRNA (thiouridine labeling).

      The CgXbp1 characterization relies heavily on work from S. cerevisiae. This is disappointing as conservation of functional links between C. glabrata and S. cerevisiae is not always predictable.

      The effects caused by loss of CgXBP1 on virulence (Figure 4) may be statistically significant but are modest. No comparison is shown for another gene that has already been accepted to have a role in virulence to allow determination of the biological importance of this effect.

      The phenotypic effects of the loss of XBP1 on azole resistance look rather odd (Figure 6). The appearance of fluconazole resistant colonies in the xbp1 null strain occurs at a very low frequency and seems to resemble the appearance of rho0 cells in the population. The vast majority of xbp1 null cells do not exhibit increased growth compared to wild-type in the presence of fluconazole.

      Irrespective of the precise explanation, more analysis should be performed to confirm that CgXbp1 is negatively regulating the genes suggested in Figure 6A to be responsible for the increased fluconazole resistance.

      Additionally, the entire analysis of CgXbp1 is based on ChIP-seq performed using cells grown under very different conditions than the RNA polymerase II study. Evidence should be provided that the presumptive CgXbp1 target genes actually impact the expression profiles established earlier.

      We thank this reviewer for the critical analysis of our manuscript. We have done the following to address the comments. As a result, the manuscript is significantly improved.

      • The ChIP-seq data of Xbp1 in macrophages have been successfully generated and the result is now presented in Figure 2C-2F, and lines 182-227 of the revised manuscript. With this addition, we have removed the ChIP-seq data related to quiescence from the revised manuscript and re-written the manuscript focusing on the role of Xbp1 in macrophages.

      • We agree that the conservation of functional links between C. glabrata and S. cerevisiae is not always predictable. That is the reason why we did not rely solely on the S. cerevisiae network for inferring Xbp1’s functions, and took several different approaches (e.g. ChIP-seq of Xbp1 and characterization of the Cgxbp1∆ mutant) to delineate its functions.

      • We also agree that the virulence effect is modest, but it is, nevertheless, an effect that may contribute to the overall virulence of C. glabrata. Since virulence is a pleiotropic trait involving many genes and every gene affects different aspects of the complex process, we feel that it is not fair to penalize a given gene based on its (weaker) effect relative to another gene. Therefore, we respectfully disagree that another gene should be included for benchmarking the effect.

      • We have measured C. glabrata cell numbers in a time course experiment. The result (presented in Figure 4A) showed that there was an increase in cell number at the end of the macrophage infection time course experiment (e.g. 8 hr). We have highlighted this information on lines 278-283.

      • Additional analysis of the fluconazole resistance phenotype of the Cgxbp1∆ mutant has been added, including standard MIC assays. The results are presented in Figure 5C-5E.

      • As suggested, and to understand the role of CgXbp1 in fluconazole resistance, we have now carried out RNAseq analysis of WT and the Cgxbp1∆ mutant in the presence and absence of fluconazole. The genes differentially controlled in the Cgxbp1∆ mutant have been identified and a proposed model of how CgXbp1 affects fluconazole resistance has been added to Figure 7 in the revised manuscript.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors conducted cross-species comparisons between the human brain and the macaque brain to disentangle the specific characteristics of structural development of the human brain. Although previous studies had revealed similarities and differences in brain anatomy between the two species by spatially aligning the brains, the authors made the comparison along the chronological axis by establishing models that predict chronological age from brain structural features. The rationale is clear given that brain development occurs over time in both. More interestingly, the model trained on macaque data was better able to predict the age of humans than the human-trained model was at predicting macaque age. This revealed a brain cross-species age gap (BCAP) that quantified the discrepancy in brain development between the two species, and the authors even found this BCAP measure was associated with performance on behavioral tests in humans. Overall, this study provides important and novel insights into the unique characteristics of human brain development. The authors have employed a rigorous scientific approach, reflecting diligent efforts to scrutinize the patterns of brain age models across species. The clarity of the rationale, the interpretability of the methods, and the quality of the presentation all contribute to the strength of this work.

      We are grateful for your helpful and thorough review and for being so positive about our manuscript. Following your recommendations, we have added more analytic details that have strengthened our paper. We would like to thank you for your input.

      Reviewer #2 (Public Review):

      In the current study, Li et al. developed a novel approach that aligns chronological age to a cross-species brain age prediction model to investigate the evolutionary effect. This method revealed some interesting findings, like the brain-age gap of the macaque model in predicting human age will increase as chronological age increases, suggesting an evolutionary alignment between the macaque brain and the human brain in the early stage of development. This study exhibits ample novelty and research significance. However, I still have some concerns regarding the reliability of the current findings.

      We thank you for the positive and appreciative feedback on our work and the insightful comments, which we have addressed below.

      Question 1: Although the authors named their new method a "cross-species" model, the current study only focused on the prediction between humans and macaques. It would be better to discuss whether their method can also generalize to cross-species examination of other species (e.g., C. elegans), which may provide more comprehensive evolutionary insights. Also, other future directions with their new method are worth discussing.

      We appreciate your insightful comment regarding the generalizability of our model to other species. As you noted, we performed only a human-macaque cross-species study and did not include other species. We focused on humans and macaques because the macaque is one of the closest primate relatives of humans apart from chimpanzees and is therefore considered to be the best model for studying human brain evolution. However, our proposed method has limitations that restrict its generalizability to other species, e.g., C. elegans. First, our model was trained using MRI data, which limits its applicability to species for which such data are unavailable. This technological requirement is a barrier to broader cross-species application. Second, our current model is based on homologous brain atlases that are available for both humans and macaques. The lack of comparable atlases for other species further restricts the model's generalizability. We have discussed this limitation in the revised manuscript and outlined potential future directions to overcome these challenges. This includes discussing the need for developing comparable imaging techniques and standardized brain atlases across a wider range of species to enhance the model's applicability and broaden our understanding of cross-species neurodevelopmental patterns.

      On page 15, lines 11-18

      “However, the existing limitation should be noted regarding the generalizability of our proposed approach for cross-species brain comparison. Our current model relies on homologous brain atlases, and the lack of comparable atlases for other species restricts its broader applicability. To address this limitation, future research should focus on developing prediction models that do not depend on atlases. For instance, 3D convolutional neural networks could be trained directly on raw MRI data for age prediction. These deep learning models may offer greater flexibility for cross-species applications once the training within species is complete. Such advancements would significantly enhance the model's adaptability and expand its potential for comparative neuroscience studies across a wider range of species.”

      Question 2: Algorithm of prediction model. In the method section, the authors only described how they chose features, but did not describe the algorithm (e.g., support vector regression) they used. Please add relevant descriptions to the methods.

      Thank you for your comment. We apologize for not providing sufficient details about the model training process in our initial submission. In our study, we used a linear regression model for prediction. We have provided more details regarding the algorithm of the prediction model in our response to Reviewer #1. For your convenience, we have attached them below.

      For details on the algorithm of the prediction model:

      “A linear regression model was adopted for intra- and inter-species age prediction. The model was built in three main steps. 1) Feature selection: the final features were extracted in two stages. In the first, preliminary stage, all human or macaque participants were divided into 10 folds, with 9 folds used for model training and 1 fold for model testing. Preliminary features were chosen as those significantly associated with age (p < 0.01), based on Pearson’s correlation coefficients between each of the 260 features and the actual ages of the subjects in the 9 training folds. This process was repeated 100 times. Because the preliminary features were not exactly the same in each repetition, we further analyzed them using two methods to determine the final features: common features and minimum mean absolute error (min MAE). Common features are the preliminary features that were selected in all 100 repetitions of preliminary model training. The min MAE features are the preliminary features that gave the smallest MAE for age prediction across the 100 model tests. After feature selection, we obtained two sets of features: 62 macaque features and 225 human features (common features), and 117 macaque features and 239 human features (min MAE). In addition, to exclude the influence of the unequal number of features in human and macaque, we also selected the first 62 features in human and macaque to test model prediction performance. 2) Model construction: we built linear age prediction models using 10-fold cross-validation based on the selected features for human and macaque separately. The linear model parameters were obtained from the training set and applied to the test set for prediction. This process was also repeated 100 times. 3) Prediction: with the above results, we obtained the optimal linear prediction models for human and macaque. Next, we performed intra-species and inter-species brain age prediction, i.e., human model predicting human age, human model predicting macaque age, macaque model predicting macaque age and macaque model predicting human age. Three sets of features (62 macaque features and 225 human features; 117 macaque features and 239 human features; 62 macaque features and 62 human features) were used to test the prediction models for cross-validation and to exclude effects of the different number of features in human and macaque. In the main text, we show the results of brain age prediction and of the brain developmental and evolutionary analyses based on common features; the results obtained using the other two types of features are shown in the supplementary materials. Prediction performance was evaluated by calculating the Pearson’s correlation and MAE between actual ages and predicted ages.”
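
      The "common features" stage of step 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`pearson_r`, `pearson_p_value`, `common_features`) and the toy data are hypothetical, the p-value uses a Fisher z normal approximation because the response does not say how p was computed, and each repetition is assumed to test every one of the 10 train/test splits.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_p_value(xs, ys):
    """Two-sided p-value for Pearson's r (Fisher z approximation, an assumption)."""
    r = max(min(pearson_r(xs, ys), 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(len(xs) - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def common_features(features, ages, n_repeats=100, n_folds=10, alpha=0.01, seed=0):
    """Indices of features that pass p < alpha on every training split."""
    rng = random.Random(seed)
    n_features = len(features[0])
    common = set(range(n_features))
    for _ in range(n_repeats):
        order = list(range(len(ages)))
        rng.shuffle(order)  # new random fold assignment per repetition
        for fold in range(n_folds):
            held_out = set(order[fold::n_folds])  # 1 fold held out for testing
            train = [i for i in order if i not in held_out]  # 9 folds for training
            train_ages = [ages[i] for i in train]
            selected = {
                j for j in range(n_features)
                if pearson_p_value([features[i][j] for i in train], train_ages) < alpha
            }
            common &= selected  # keep only features selected every time
    return sorted(common)


# Toy data (hypothetical): feature 0 tracks age, features 1 and 2 do not.
ages = [float(i) for i in range(100)]
features = [
    [a + (3.0 if i % 2 else -3.0), float(i % 2), float((4 * i) % 11)]
    for i, a in enumerate(ages)
]
print(common_features(features, ages))
```

      In a fuller pipeline, the surviving feature indices would then be passed to an ordinary least-squares fit on the training folds, and the held-out fold would be scored with Pearson's r and MAE as the response describes.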

      Question 3: Sex difference. The sex difference results are strange to me. For example, in the second row of Figure Supplement 3A, different models show different correlation patterns, but why are their Pearson's r values all equal to 0.3939? If these are only typos, please correct them. The authors claimed that they found no sex difference. However, the results in Figure Supplement 3 show that females seem to have poorer performance in predicting macaque age from the human model. Moreover, accumulated studies have reported sex differences in developing brains (Hines, 2011; Kurth et al., 2021). I think it is also worth discussing why sex differences can't be found in the evolutionary effect.

      Reference:

      Hines, M. (2011). Gender development and the human brain. Annual review of neuroscience, 34, 69-88.

      Kurth, F., Gaser, C., & Luders, E. (2021). Development of sex differences in the human brain. Cognitive Neuroscience, 12(3-4), 155-162.

      It is recommended that the authors explore different prediction models for different species. Maybe macaques are suitable for linear prediction models, and humans are suitable for nonlinear prediction models.

      Thank you for pointing out the typos and for your comments on sex differences. In Figure Supplement 3A, the Pearson’s r values contained typos, which we have corrected in the updated Figure 2-figure supplement 3. For details, please see the updated Figure 2-figure supplement 3 and the following figure.

      Regarding gender effects, we acknowledge your point about the importance of gender differences in understanding brain evolution and development. In our study, however, our primary goal was to develop a robust age prediction model by maximizing the number of training samples. To mitigate gender-related effects in our main results, we incorporated gender information as a covariate in the ComBat harmonization process. We conducted the supplementary analysis, in which the data were separated by gender, only to demonstrate the stability of our proposed cross-species age prediction model, not to investigate gender differences. Although our results demonstrated that gender-specific models could still significantly predict chronological age, we refrained from emphasizing these models' performance in gender-specific species comparisons because the predicted gender differences are difficult to interpret. For cross-species prediction, it is not convincing to conclude that a higher Pearson’s r value between actual age and predicted age reflects more conserved evolution in males or females. In addition, we adopted the same prediction model for human and macaque, rather than different ones, in order to establish a comparable model between species. Generally speaking, a nonlinear model could achieve better prediction accuracy than a linear model. If different species used different models, cross-species prediction would not be a fair comparison. Importantly, our study aimed to develop a new index based on the same prediction models to quantify differences in brain evolution, i.e., the brain cross-species age gap (BCAP), instead of relying on traditional statistical analyses. Different prediction models for different species may introduce bias caused by the prediction methods and thus affect the accuracy of BCAP. Thus, for cross-species prediction, we adopted the linear model that gave the best intra-species prediction performance in this study. Although our main goal in this study was to set up a stable cross-species prediction model, and the models built using either male or female subjects performed well in cross-species prediction, how to characterize evolutionary gender differences without bias using machine learning approaches needs further investigation, as you note, since there are many reports of gender differences in the developing human brain. In fact, whether macaque brains show the same gender differences as human brains is an interesting scientific question worth studying. Thus, we have included a discussion on how to use machine learning methods to study evolutionary gender differences in our revised manuscript.

      On page 15, lines 18-23 and page 16, lines 1-4

      “Many studies have reported sex differences in the developing human brain (Hines, 2011; Kurth, Gaser, & Luders, 2021); however, whether macaque brains show similar sex differences is still unknown. We used a machine learning method for cross-species prediction to quantify brain evolution, and the established prediction models are stable even when only male or only female data are used, which may indicate that the proposed cross-species prediction model has no evolutionary sex difference. Although a stable prediction model can be established in either male or female participants for cross-species prediction, this does not mean that there are no evolutionary sex differences, given the lack of quantitative comparative analysis. In the future, we need to develop more objective, quantifiable and stable indices for studying sex differences using machine learning methods to further identify sex differences in the evolved brain.”

      Reviewer #3 (Public Review):

      The authors identified a series of WM and GM features that correlated with age in human and macaque structural imaging data. The data was gathered from the HCP and WA studies, which was parcellated in order to yield a set of features. Features that correlated with age were used to train predictive intra and inter-species models of human and macaque age. Interestingly, while each model accurately predicted the corresponding species age, using the macaque model to predict human age was more accurate than the inverse (using the human model to predict macaque age). In addition, the prediction error of the macaque model in predicting human age increased with age, whereas the prediction error of the human model predicting macaque age decreased with age.

      After elaboration of the predictive models, the authors classified the features for prediction into human-specific, macaque-specific and common to human and macaque, where they most notably found that macaque-only and common human-macaque areas were located mainly in gray matter, with only a few human-specific features found in gray matter. Furthermore, the authors found significant correlations between BCAP and picture vocabulary (positive correlation) test and visual sensitivity (negative correlation) test. Several white matter tracts (AF, OR, SLFII) were also identified showing a correlation with BCAP.

      Thank you for providing this excellent summary. We appreciate your thorough review and concise overview of our work.

      STRENGTHS AND WEAKNESSES

      The paper brings an interesting perspective on the evolutionary trajectories of human and non-human primate brain structure, and its relation to behavior and cognition. Overall, the methods are robust and support the theoretical background of the paper. However, the overall clarity of the paper could be improved. There are many convoluted sentences and there seems to be both repetition across the different sections and unclear or missing information. For example, the Introduction does not clearly state the research questions, rather just briefly mentions research gaps existing in the literature and follows by describing the experimental method. It would be desirable to clearly state the theoretical background and research questions and leave out details on methodology. In addition, the results section repeats a lot of what is already stated in the methods. This could be further simplified and make the paper much easier to read.

      In the discussion, authors mention that "findings about cortex expansion are inconsistent and even contradictory", a more convincing argument could be made by elaborating on why the cortex expansion index is inadequate and how BCAP is more accurate.

      Thank you for highlighting the interesting aspects of our work. We are sorry for the lack of clarity in certain parts of our manuscript. Following your valuable suggestions, we have revised the manuscript to reduce unnecessary repetition and to state our research question more clearly in the Introduction. Specifically, unlike previous analyses of human and macaque evolution using comparative neuroscience, this study embeds a chronological axis into the cross-species evolutionary analysis. It constructs a linear prediction model of brain age for humans and macaques and quantitatively describes the degree of evolution. The brain-structure-based cross-species age prediction model and the cross-species brain age differences proposed in this study further remove the inherent developmental effects of humans and macaques from cross-species evolutionary comparisons, providing new perspectives and approaches for studying cross-species development. Regarding the repetition in the Results section, we have simplified the text for clarity. Regarding the comparison between the cortex expansion index and BCAP, we would like to emphasize that the cortex expansion index was derived without fully considering cross-species alignment along the chronological axis. Specifically, this index does not correspond to a specific developmental stage, but rather focuses on a direct comparison between the two species. In contrast, BCAP addresses this limitation by using a prediction model to establish alignment (or misalignment) between species at the individual level. Therefore, BCAP may serve as a more flexible and nuanced tool for cross-species brain comparison.

      STUDY AIMS AND STRENGTH OF CONCLUSIONS

      Overall, the methods are robust and support the theoretical background of the paper, but it would be good to state the specific research questions -even if exploratory in nature- more specifically. Nevertheless, the results provide support for the research aims.

      Thank you for the excellent suggestion. We have revised our Introduction to state the specific research question, as mentioned above.

      IMPACT OF THE WORK AND UTILITY OF METHODS AND DATA TO THE COMMUNITY

      This study is a good first step in providing a new insight into the neurodevelopmental trajectories of humans and non-human primates besides the existing cortical expansion theories.

      Thank you for your encouraging comment.

      ADDITIONAL CONTEXT:

      It should be clearly stated both in the abstract and methods that the data used for the experiment came from public databases.

      Thank you for your suggestion. We have added this information to both the Abstract and the Materials and Method section. For details, please see page 2, line 9 in the Abstract section; page 16, lines 10-11 and page 17, lines 6-10 in the Materials and Method section.

    1. Author response:

      Reviewer #1 (Public Review):

      Using structural analysis, Bonchuk and colleagues demonstrate that the TTK-like BTB/POZs of insects form stable hexameric assemblies composed of trimers of POZ dimers, a configuration observed consistently across both homomultimers and heteromultimers, which are known to be formed by TTK-like BTB/POZ domains. The structural data is comprehensive, unambiguous, and further supported by theoretical fold prediction analyses. In particular the judicious complementation of experiments and fold prediction is commendable. This study now adds an important cog that might help generalize the general principles of the evolution of multimerization in members of this fold family.

      I strongly feel that enhancing the inclusivity of the discussion would strengthen the paper. Below, I suggest some additional points for consideration for the same.

      Major points.

      1) It would be valuable to discuss alternative multimer assembly interfaces, considering the diverse ways POZs can multimerize. For instance, the Potassium channel POZ domains form tetramers. A comparison of their inter-subunit interface with that of TTK and non-TTK POZs could provide insightful contrasts.

      Thanks for the suggestion. We have added this important comparison, as well as a comparison with recently published structures of filament-forming BTB domains.

      2) The so-called TTK motif, despite its unique sequence signature, essentially corresponds to the N-terminal extension observed in other "non-TTK" proteins such as Miz-1. Given Miz-1's structure, it becomes evident that the utilization of the N-terminal extension for dimerization is shared with the TTK family, suggesting a common evolutionary origin in metazoan transcription factors. Early phylogenetic trees (e.g. in PMID: 9917379) support the grouping of the TTK-like POZs with other animal Transcription factors containing POZ domains such as those with Kelch repeats further suggesting that the extension might be ancestral. Structural investigations by modeling prominent examples or comparing known structures of similar POZ domains, could support this inference. Control comparisons with POZ domains from fungi, plants and amoebozoans like Dictyostelium could offer additional insights.

      We performed AlphaFold2-Multimer modeling of dimers of all BTB domains from the most ancestral metazoan clades, Placozoa and Porifera, along with BTBs from choanoflagellates, the unicellular eukaryotes closest to the first metazoans. The presence of the N-terminal beta-sheet was evaluated. KLHL-BTBs are present in all eukaryotes and are likely the predecessors of ZBTB domains. According to AlphaFold modeling of dimers, all KLHL-BTB domains of plants and basal metazoans have the alpha1 helix, but most of these domains do not possess the additional N-terminal beta-strand (beta1) characteristic of ZBTB domains. We found only one KLHL-BTB (Uniprot ID: AA9VCT1_MONBE) with such an N-terminal extension in the choanoflagellate proteome, one in the Dictyostelium proteome (Q54F31_DICDI), and 7 (out of 43 BTB domains in total) and 13 (out of 81) such domains in the Trichoplax and Amphimedon proteomes, respectively. There was no significant similarity of the beta1 element at the level of primary sequence. However, most of these domains bear a 3-box/BACK extension and represent typical KLHL-BTBs, which are members of E3 ubiquitin-ligase complexes and are often associated with a protein-protein interaction MATH domain or WD40 repeats. We found only one protein in the Trichoplax proteome with a beta1 strand and no 3-box/BACK (B3RQ74_TRIAD), thus resembling the ZBTB topology. BTB domains of this subtype therefore likely emerged early in metazoan evolution. At this point ZBTBs were not yet associated with zinc fingers. According to our survey, the actual fusion of the ZBTB domain with zinc-finger domains occurred in the evolution of early bilaterian organisms, since proteins with such a domain architecture are not found in Radiata but are present in basal Protostomia and Deuterostomia clades. The TTK-type sequence is characteristic only of Arthropoda and emerged early in their evolution. We added all these data to the article.

      3) Exploring the ancestral presence of the aforementioned extension in metazoan transcription factors could serve as a foundation for understanding the evolutionary pathway of hexamerization. This analysis could shed light on exposed structural regions that had the potential to interact post-dimerization with the N-terminal extension and also might provide insights into the evolution of multimer interfaces, as observed in the Potassium channel.

      We added this important comparison as well as comparison with recent structures of filament-forming BTB domains.

      4) Considering the role of conserved residues in the multimer interface is crucial. Reference to conserved residues involved in multimer formation, such as discussed in PMID: 9917379, would enrich the discussion.

      We updated our description of multimer interface with respect to conservation of residues.

      Reviewer #2 (Public Review):

      BTB domains are protein-protein interaction domains found in diverse eukaryotic proteins, including transcription factors. It was previously known that many of the Drosophila transcription factor BTB domains are of the TTK-type - these are defined as having a highly-conserved motif, FxLRWN, at their N-terminus, and they thereby differ from the mammalian BTB domains. Whereas the well-characterised mammalian BTB domains are dimeric, several Drosophila TTK-BTB domains notably form multimers and function as chromosome architectural proteins. The aims of this work were (i) to determine the structural basis of multimerisation of the Drosophila TTK-BTB domains, (ii) to determine how different Drosophila TTK-BTB domains interact with each other, and (iii) to investigate the evolution of this subtype of BTB domain.

      The work significantly advances our understanding of the biology of BTB domains. The conclusions of the paper are mostly well-supported, although some aspects need clarification:

      Hexameric organisation of the TTK-type BTB domains:

      Using cryo-EM, the authors showed that the CG6765 TTK-type BTB domain forms a hexameric assembly in which three "classic" BTB dimers interact via a beta-sheet interface involving the B3 strand. This is particularly interesting, as this region of the BTB domain has recently been implicated in protein-protein interactions in a mammalian BTB-transcription factor, MIZ1. SEC-MALS analysis indicated that the LOLA TTK-type BTB domain is also hexameric, and SAXS data was consistent with a hexameric assembly of the CG6765- and LOLA BTB domains.

      The data regarding the hexameric organisation is convincing. However, interpreting the role of specific regions of the BTB domain is difficult because the description of the molecular contacts lacks depth.

      Heteromeric interactions between TTK-type BTB domains:

      The authors use yeast two-hybrid assays to study heteromeric interactions between various Drosophila TTK-type BTB domains. Such assays are notorious for producing false positives, and this needs to be mentioned. Although the authors suggest that the heteromeric interactions are mediated via the newly identified B3 interaction interface, there is no evidence to support this, since mutation of B3 yielded insoluble proteins.

      We are aware that Y2H can give false-positive results in cases where the BTB domain fused to the DNA-binding domain can activate reporter genes. Therefore, all tested BTB domains were examined for their ability to activate transcription. Furthermore, in our study, assays with non-TTK-type BTB domains, which showed almost no interactions, provide an additional negative control. We have added a corresponding disclaimer to the text. We agree that our data do not explain the basis for heteromeric interactions. Designing mutations in the B3 beta-sheet proved to be complicated, and using biochemical methods to study the principles of heteromer assembly also does not seem feasible, since most TTK-type BTBs tend to form aggregates and are difficult to express and purify. The most important issue, however, is that heteromer assembly through B3 demonstrated in a few tested pairs could not be generalized to all pairs, as some of them may still use a different mechanism. We used AlphaFold to predict possible mechanisms of heteromer assembly. AlphaFold suggested that both the B3 and the conventional dimerization interfaces can be used for heteromeric interactions, with a preference for one over the other in different pairs. Thus, the presence of two potential heteromerization interfaces most likely extends the heteromerization capability of these domains. We changed the text accordingly.

      Evolution of the TTK-type BTB domains:

      The authors carried out a bioinformatics analysis of BTB proteins and showed that most of the Drosophila BTB transcription factors (24 out of 28) are of the TTK-type. They investigated how the TTK-type BTB domains emerged during evolution, and showed that these are only found in Arthropoda, and underwent lineage-specific expansion in the modern phylogenetic groups of insects. These findings are well-supported by the evidence.

    2. Reviewer #2 (Public Review):

      BTB domains are protein-protein interaction domains found in diverse eukaryotic proteins, including transcription factors. It was previously known that many of the Drosophila transcription factor BTB domains are of the TTK-type - these are defined as having a highly-conserved motif, FxLRWN, at their N-terminus, and they thereby differ from the mammalian BTB domains. Whereas the well-characterised mammalian BTB domains are dimeric, several Drosophila TTK-BTB domains notably form multimers and function as chromosome architectural proteins. The aims of this work were (i) to determine the structural basis of multimerisation of the Drosophila TTK-BTB domains, (ii) to determine how different Drosophila TTK-BTB domains interact with each other, and (iii) to investigate the evolution of this subtype of BTB domain.

      The work significantly advances our understanding of the biology of BTB domains. The conclusions of the paper are mostly well-supported, although some aspects need clarification:

      Hexameric organisation of the TTK-type BTB domains:

      Using cryo-EM, the authors showed that the CG6765 TTK-type BTB domain forms a hexameric assembly in which three "classic" BTB dimers interact via a beta-sheet interface involving the B3 strand. This is particularly interesting, as this region of the BTB domain has recently been implicated in protein-protein interactions in a mammalian BTB-transcription factor, MIZ1. SEC-MALS analysis indicated that the LOLA TTK-type BTB domain is also hexameric, and SAXS data was consistent with a hexameric assembly of the CG6765- and LOLA BTB domains.

      The data regarding the hexameric organisation is convincing. However, interpreting the role of specific regions of the BTB domain is difficult because the description of the molecular contacts lacks depth.

      Heteromeric interactions between TTK-type BTB domains:

      The authors use yeast two-hybrid assays to study heteromeric interactions between various Drosophila TTK-type BTB domains. Such assays are notorious for producing false positives, and this needs to be mentioned. Although the authors suggest that the heteromeric interactions are mediated via the newly identified B3 interaction interface, there is no evidence to support this, since mutation of B3 yielded insoluble proteins.

      Evolution of the TTK-type BTB domains:

      The authors carried out a bioinformatics analysis of BTB proteins and showed that most of the Drosophila BTB transcription factors (24 out of 28) are of the TTK-type. They investigated how the TTK-type BTB domains emerged during evolution, and showed that these are only found in Arthropoda, and underwent lineage-specific expansion in the modern phylogenetic groups of insects. These findings are well-supported by the evidence.

    1. Author response:

      Reviewer #1 - Public Review

      This report describes work aiming to delineate multi-modal MRI correlates of psychopathology from a large cohort of children of 9-11 years from the ABCD cohort. While uni-modal characterisations have been made, the authors rightly argue that multi-modal approaches in imaging are vital to comprehensively and robustly capture modes of large-scale brain variation that may be associated with pathology. The primary analysis integrates structural and resting-state functional data, while post-hoc analyses on subsamples incorporate task and diffusion data. Five latent components (LCs) are identified, with the first three, corresponding to p-factor, internal/externalising, and neurodevelopmental Michelini Factors, described in detail. In addition, associations of these components with primary and secondary RSFC functional gradients were identified, and LCs were validated in a replication sample via assessment of correlations of loadings.

      1.1) This work is clearly novel and a comprehensive study of associations within this dataset. Multi-modal analyses are challenging to perform, but this work is methodologically rigorous, with careful implementation of discovery and replication assessments, and primary and exploratory analyses. The ABCD dataset is large, and behavioural and MRI protocols seem appropriate and extensive enough for this study. The study lays out comprehensive associations between MRI brain measures and behaviour that appear to recapitulate the established hierarchical structure of psychopathology.

      We thank Reviewer 1 for appreciating our methods and findings, and we address their suggestions below:

      1.2) The work does have weaknesses, some of them acknowledged. There is limited focus on the strength of observed associations. While the latent component loadings seem reliably reproducible in the behavioural domain, this is considerably less the case in the imaging modalities. A considerable proportion of statistical results focuses on spatial associations in loadings between modalities - it seems likely that these reflect intrinsic correlations between modalities, rather than associations specific to any latent component.

      We appreciate the Reviewer’s comment and have minimized the reporting of correlations between the loadings from the different modalities in the revised Results (specifically the subsections on LC1, LC2, and LC3). We now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      For completeness, we report the intrinsic correlations between the different modalities in Supplementary file 1c (P.19):

      “Lastly, although the current work aimed to reduce intrinsic correlations between variables within a given modality through running a PCA before the PLS approach, intrinsic correlations between measures and modalities may potentially be a remaining factor influencing the PLS solution. We, thus, provided an additional overview of the intrinsic correlations between the different neuroimaging data modalities in the supporting results (Supplementary file 1c).”

      1.3) Assessment of associations with functional gradients is similarly a little hard to interpret. Thus, it is hard to judge the implications for our understanding of the neurophysiological basis of psychopathology and the ability of MRI to provide clinical tools for, say, stratification.

      We now provide additional context, including a rising body of theoretical and empirical work, that outlines the value of functional gradients and cortical hierarchies in the understanding of brain development and psychopathology. Please see P.26.

      “Initially demonstrated at the level of intrinsic functional connectivity (Margulies et al., 2016), follow up work confirmed a similar cortical patterning using microarchitectural in-vivo MRI indices related to cortical myelination (Burt et al., 2018; Huntenburg et al., 2017; Paquola et al., 2019), post-mortem cytoarchitecture (Goulas et al., 2018; Paquola et al., 2020, 2019), or post-mortem microarray gene expression (Burt et al., 2018). Spatiotemporal patterns in the formation and maturation of large-scale networks have been found to follow a similar sensory-to-association axis; moreover, there is the emerging view that this framework may offer key insights into brain plasticity and susceptibility to psychopathology (Sydnor et al., 2021). In particular, the increased vulnerability of transmodal association cortices in late childhood and early adolescence has been suggested to relate to prolonged maturation and potential for plastic reconfigurations of these systems (Paquola et al., 2019; Park et al., 2022b). Between mid-childhood and early adolescence, heteromodal association systems such as the default network become progressively more integrated among distant regions, while being more differentiated from spatially adjacent systems, paralleling the development of cognitive control, as well as increasingly abstract and logical thinking. [...] This suggests that neurodevelopmental difficulties might be related to alterations in various processes underpinned by sensory and association regions, as well as the macroscale balance and hierarchy of these systems, in line with previous findings in several neurodevelopmental conditions, including autism, schizophrenia, as well as epilepsy, showing a decreased differentiation between the two anchors of this gradient (Hong et al., 2019). In future work, it will be important to evaluate these tools for diagnostics and population stratification. 
In particular, the compact and low dimensional perspective of gradients may prove beneficial in terms of biomarker reliability as well as phenotypic prediction, as previously demonstrated using typically developing cohorts (Hong et al., 2020). On the other hand, it will be of interest to explore to what extent alterations in connectivity along sensory-to-transmodal hierarchies provide sufficient graduality to differentiate between specific psychopathologies, or whether they, as the current work suggests, mainly reflect risk for general psychopathology and atypical development.”

      1.4) The observation of a recapitulation of psychopathology hierarchy may be somewhat undermined by the relatively modest strength of the components in the imaging domain.

      We thank the Reviewer for this comment and have now expressed this limitation in the revised Discussion, P.23.

      “The p factor, internalizing, externalizing, and neurodevelopmental dimensions were each associated with distinct morphological and intrinsic functional connectivity signatures, although these relationships varied in strength.”

      1.5) The task fMRI was assessed with a fairly basic functional connectivity approach, not using task timings to more specifically extract network responses.

      In the revised Discussion on P.24, we acknowledge that more in-depth analyses of task-based fMRI may have offered additional insights into state-dependent changes in functional architecture.

      “While the current work derived main imaging signatures from resting-state fMRI as well as grey matter morphometry, we could nevertheless demonstrate associations to white matter architecture (derived from diffusion MRI tractography) and recover similar dimensions when using task-based fMRI connectivity. Despite subtle variations in the strength of observed associations, the latter finding provided additional support that the different behavioral dimensions of psychopathology more generally relate to alterations in functional connectivity. Given that task-based fMRI data offers numerous avenues for analytical exploration, our findings may motivate follow-up work assessing associations to network- and gradient-based response strength and timing with respect to external stimuli across different functional states.”

      1.6) Overall, the authors achieve their aim to provide a detailed multimodal characterisation of MRI correlations of psychopathology. Code and data are available and well organised and should provide a valuable resource for researchers wanting to understand MRI-based neural correlates of psychopathology-related behavioural traits in this important age group. It is largely a descriptive study, with comparisons to previous uni-modal work, but without particularly strong testing of neuroscience hypotheses.

      We thank the Reviewer for recognizing the detail and rigor of our data-driven study and the extensive code and data documentation.

      Reviewer #2 - Public Review

      In "Multi-modal Neural Correlates of Childhood Psychopathology" Kebets et al. integrate multi-modal neuroimaging data using machine learning to delineate dissociable links to diverse dimensions of psychopathology in the ABCD sample. This paper had numerous strengths including a superb use of a large resource dataset, appropriate analyses, beautiful visualizations, clear writing, and highly interpretable results from a data-driven analysis. Overall, I think it would certainly be of interest to a general readership. That being said, I do have several comments for the authors to consider.

      We thank Dr Satterthwaite for the positive evaluation and helpful comments.

      2.1) Out-of-sample testing: while the permutation testing procedure for the PLS is entirely appropriate, without out-of-sample testing the reported effect sizes are likely inflated.

      As discussed in the editorial summary of essential revisions, we agree that out-of-sample prediction indeed provides stronger estimates of generalizability. We assess this by applying the PCA coefficients derived from the discovery cohort imaging data to the replication cohort imaging data. The resulting PCA scores and behavioral data were then z-scored using the mean and standard deviation of the replication cohort. The SVD weights derived from the discovery cohort were applied to the normalized replication cohort data to derive imaging and behavioral composite scores, which were used to recover the contribution of each imaging and behavioral variable to the LCs (i.e., loadings). Out-of-sample replicability of imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings was generally high across LCs 1-5. This analysis is reported in the revised manuscript (P.18).

      “Generalizability of reported findings was also assessed by directly applying PCA coefficients and latent components weights from the PLS analysis performed in the discovery cohort to the replication sample data. Out-of-sample prediction was overall high across LCs1-5 for both imaging (mean r=0.681, S.D.=0.131) and behavioral (mean r=0.948, S.D.=0.022) loadings.”
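
      The out-of-sample procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_discovery`, `project`, `loading_replicability`) are hypothetical, PLS is implemented as an SVD of the imaging-behavior cross-covariance matrix, and loadings are taken to be Pearson correlations between the original variables and the composite scores.

```python
import numpy as np

def zscore(a):
    """Column-wise z-score using the mean and SD of the array it is given."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def loadings(data, scores):
    """Pearson correlation of every column of `data` with every column of `scores`."""
    return zscore(data).T @ zscore(scores) / (len(data) - 1)

def fit_discovery(X_img, Y_beh, n_pcs):
    """Fit PCA on discovery imaging data, then PLS via SVD (hypothetical helper)."""
    Xc = X_img - X_img.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    pca_coef = vt[:n_pcs].T                      # (n_features, n_pcs)
    pcs, beh = zscore(X_img @ pca_coef), zscore(Y_beh)
    # SVD of the cross-covariance between imaging PCs and behavior
    u, _, vt_b = np.linalg.svd(pcs.T @ beh / (len(pcs) - 1), full_matrices=False)
    return pca_coef, u, vt_b.T                   # PCA coefficients, imaging and behavioral weights

def project(X_img, Y_beh, pca_coef, U, V):
    """Apply discovery PCA coefficients and SVD weights to a cohort.

    PCA scores and behavior are z-scored with the cohort's own mean and SD,
    as described in the response for the replication cohort.
    """
    pcs, beh = zscore(X_img @ pca_coef), zscore(Y_beh)
    img_scores, beh_scores = pcs @ U, beh @ V    # composite scores per LC
    return loadings(X_img, img_scores), loadings(Y_beh, beh_scores)

def loading_replicability(disc, rep):
    """Pearson r between discovery and replication loadings, one value per LC."""
    return np.array([np.corrcoef(disc[:, k], rep[:, k])[0, 1]
                     for k in range(disc.shape[1])])
```

      Because the weights are estimated once in the discovery cohort and only applied to the replication cohort, the signs of the latent components are the same in both cohorts, so the loadings can be correlated directly without sign alignment.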

      2.2) Site/family structure: it was unclear how site/family structure were handled as covariates.

      Only unrelated participants were included in the discovery and replication samples (see P.6). The site variable was regressed out of the imaging and behavioral data prior to the PLS analysis using the residuals from a multiple linear model which also included age, age², sex, and ethnicity. This is now clarified on P.29:

      “Prior to the PLS analysis, effects of age, age², sex, site, and ethnicity were regressed out from the behavioral and imaging data using a multiple linear regression to ensure that the LCs would not be driven by possible confounders (Kebets et al., 2021, 2019; Xia et al., 2018). The imaging and behavioral residuals of this procedure were input to the PLS analysis.”
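
      As an illustration of this confound-regression step, the sketch below fits one multiple linear regression per variable and keeps the residuals. It is only a sketch under assumptions: the helper names (`one_hot`, `regress_out`) are hypothetical, and dummy coding of the categorical covariates with the first level as reference is our choice, since the coding scheme is not specified in the response.

```python
import numpy as np

def one_hot(labels):
    """Dummy-code a categorical variable, dropping the first level as reference."""
    labels = np.asarray(labels)
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, 1:]).astype(float)

def regress_out(data, age, sex, site, ethnicity):
    """Return residuals of `data` after removing age, age^2, sex, site and ethnicity.

    `data` is (n_subjects, n_variables); every column is regressed on the same
    design matrix by ordinary least squares.
    """
    age = np.asarray(age, dtype=float)
    design = np.column_stack([
        np.ones(len(age)),        # intercept
        age, np.square(age),      # linear and quadratic age terms
        one_hot(sex), one_hot(site), one_hot(ethnicity),
    ])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta   # residuals are what would enter the PLS
```

      The residuals are, by construction, uncorrelated with every column of the design matrix, which is what prevents the latent components from being driven by these covariates.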

      2.3) Anatomical features: I was a bit surprised to see volume, surface area, and thickness all evaluated - and that there were several comments on the correspondence between the SA and volume in the results section. Given that cortical volume is simply a product of SA and CT (and mainly driven by SA), this result may be pre-required.

      As suggested, we reduced the reporting of correlations between the loadings from the different modalities in the revised Results (specifically subsections on LC1, LC2, and LC3). Instead, we now refer to Table S4 in each subsection for this information: “Spatial correlations between modality-specific loadings are reported in Supplementary file 1c.”

      We also reran the PLS analysis while only including thickness and surface area as our structural metrics, to account for potential redundancy of these measures with volume. This analysis and associated findings are reported on P.36 and P.19:

      “As cortical volume is a result of both thickness and surface area, we repeated our main PLS analysis while excluding cortical volume from our imaging metrics and report the consistency of these findings with our main model.”

      “Third, to account for redundancy within structural imaging metrics included in our main PLS model (i.e., cortical volume is a result of both thickness and surface area), we also repeated our main analysis while excluding cortical volume from our imaging metrics. Findings were very similar to those in our main analysis, with an average absolute correlation of 0.898±0.114 across imaging composite scores of LCs 1-5.”
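
      The consistency check between the main and the reduced model can be illustrated with a short sketch. The function name `composite_score_agreement` is hypothetical, and the use of the sample standard deviation (`ddof=1`) for the reported spread is an assumption, since the response does not say how the ± value was computed.

```python
import numpy as np

def composite_score_agreement(scores_a, scores_b):
    """Mean and SD of |Pearson r| between matching LC composite scores of two models.

    `scores_a` and `scores_b` are (n_subjects, n_LCs) arrays from two PLS runs
    on the same subjects. The absolute value is used because the sign of an
    SVD-derived component is arbitrary and can flip between runs.
    """
    r = np.array([
        abs(np.corrcoef(scores_a[:, k], scores_b[:, k])[0, 1])
        for k in range(scores_a.shape[1])
    ])
    return r.mean(), r.std(ddof=1)
```

      A value close to 1 across LCs indicates that dropping a metric or covariate leaves the subject-level composite scores essentially unchanged.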

      2.4) Ethnicity: the rationale for regressing ethnicity from the data was unclear and may conflict with current best practices.

      We thank the Reviewer for this comment. In light of recent discussions on including this covariate in large datasets such as ABCD (e.g., Saragosa-Harris et al., 2022), we elaborate on our rationale for including this variable in our model in the revised manuscript on P.30:

      “Of note, the inclusion of ethnicity as a covariate in imaging studies has been recently called into question. In the present study, we included this variable in our main model as a proxy for social inequalities relating to race and ethnicity alongside biological factors (age, sex) with documented effects on brain organization and neurodevelopmental symptomatology queried in the CBCL.”

      We also assess the replicability of our analyses when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models. We report resulting correlations in the revised manuscript (P.37, 19, and 27):

      “We also assessed the replicability of our findings when removing race and ethnicity covariates prior to computing the PLS analysis and correlating imaging and behavioral composite scores across both models.”

      “Moreover, repeating the PLS analysis while excluding this variable as a model covariate yielded overall similar imaging and behavioral composite scores across LCs to our original analysis. Across LCs 1-5, the average absolute correlations reached r=0.636±0.248 for imaging composite scores, and r=0.715±0.269 for behavioral composite scores. Removing these covariates seemed to exert stronger effects on LC3 and LC4 for both imaging and behavior, as lower correlations across models were specifically observed for these components.”

      “Although we could consider some socio-demographic variables and proxies of social inequalities relating to race and ethnicity as covariates in our main model, the relationship of these social factors to structural and functional brain phenotypes remains to be established with more targeted analyses.”
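
      The cross-model comparison quoted above (an average absolute correlation, with its spread, across LCs 1-5) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the list-of-lists layout for composite scores are assumptions. The absolute value is taken because the sign of a PLS component is arbitrary, so the same component can come out flipped when the model is refit.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_abs_correlation(scores_a, scores_b):
    """Mean and sample S.D. of |r| across latent components.

    scores_a, scores_b: one list of subject-level composite scores per
    latent component, for model A and model B (hypothetical layout).
    """
    rs = [abs(pearson_r(a, b)) for a, b in zip(scores_a, scores_b)]
    mean = sum(rs) / len(rs)
    sd = sqrt(sum((r - mean) ** 2 for r in rs) / (len(rs) - 1))
    return mean, sd
```

      Reporting the per-component correlations alongside the mean makes it visible when specific components, such as LC3 and LC4 here, account for most of the discrepancy.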

      2.5) Data quality: the authors did an admirable job in controlling for data quality in the analyses of functional connectivity data. However, it is unclear if a comparable measure of data quality was used for the T1/dMRI analyses. This likely will result in inflated effect sizes in some cases; it has the potential to reduce sensitivity to real effects.

      We agree that data quality was not accounted for in our analysis of T1w- and diffusion-derived metrics. We now accounted for T1w image quality by adding manual quality control ratings to the regressors applied to all structural imaging metrics prior to performing the PLS analysis, and reported the consistency of this new model with original findings. See P.36, P.19:

      “We also considered manual quality control ratings as a measure of T1w scan quality. This metric was included as a covariate in a multiple linear regression model accounting for potential confounds in the structural imaging data, in addition to age, age², sex, site, ethnicity, ICV, and total surface area. Downstream PLS results were then benchmarked against those obtained from our main model.”

      “Considering scan quality in T1w-derived metrics (from manual quality control ratings) yielded similar results to our main analysis, with an average correlation of 0.986±0.014 across imaging composite scores.”

      As for diffusion imaging, we also regressed out effects of head motion in addition to age, age², sex, site, and ethnicity from FA and MD measures and reported the consistency with our original results (P.36, P.19):

      “We tested another model which additionally included head motion parameters as regressors in our analyses of FA and MD measures, and assessed the consistency of findings from both models.”

      “Additionally considering head motion parameters from diffusion imaging metrics in our model yielded consistent results to those in our main analyses (mean r=0.891, S.D.=0.103; r=0.733-0.998).”
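
      The confound-regression step described in this response (regressing covariates such as age, age², and a scan-quality rating out of each imaging metric before PLS) amounts to keeping the residuals of an ordinary least-squares fit. The sketch below illustrates that idea under stated assumptions: `solve` and `residualize` are hypothetical helper names, the covariate columns are placeholders, and categorical covariates such as site would need to be dummy-coded first. It is not the authors' pipeline.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def residualize(y, covariates):
    """Residuals of y after OLS regression on the covariates plus an intercept.

    y: one imaging metric across subjects.
    covariates: one row per subject, e.g. (age, age**2, qc_rating); the
    columns are illustrative and must not be collinear.
    """
    X = [[1.0] + list(row) for row in covariates]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(X, y)]
```

      Applying `residualize` to every imaging feature with and without the extra quality covariate yields the two feature sets whose downstream PLS composite scores are then compared.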

      Reviewer #3 - Public Review

      In this study, the authors utilized the Adolescent Brain Cognitive Development dataset to investigate the relationship between structural and functional brain network patterns and dimensions of psychopathology. They identified multiple components, including a general psychopathology (p) factor that exhibited a strong association with multimodal imaging features. The connectivity signatures associated with the p factor and neurodevelopmental dimensions aligned with the sensory-to-transmodal axis of cortical organization, which is linked to complex cognition and psychopathology risk. The findings were consistent across two separate subsamples and remained robust when accounting for variations in analytical parameters, thus contributing to a better understanding of the biological mechanisms underlying psychopathology dimensions and offering potential brain-based vulnerability markers.

      3.1) An intriguing aspect of this study is the integration of multiple neuroimaging modalities, combining structural and functional measures, to comprehensively assess the covariance with various symptom combinations. This approach provides a multidimensional understanding of the risk patterns associated with mental illness development.

      We thank the Reviewer for acknowledging the multimodal approach, and for the constructive suggestions.

      3.2) The paper delves deeper into established behavioral latent variables such as the p factor, internalizing, externalizing, and neurodevelopmental dimensions, revealing their distinct associations with morphological and intrinsic functional connectivity signatures. This sheds light on the neurobiological underpinnings of these dimensions.

      We are happy to hear the Reviewer appreciates the gain in understanding neural underpinnings of dimensions of psychopathology resulting from the current work.

      3.3) The robustness of the findings is a notable strength, as they were validated in a separate replication sample and remained consistent even when accounting for different parameter variations in the analysis methodology. This reinforces the generalizability and reliability of the results.

      We appreciate that the Reviewer found our robustness and generalizability assessment convincing.

      3.4) Based on their findings, the authors suggest that the observed variations in resting-state functional connectivity may indicate shared neurobiological substrates specific to certain symptoms. However, it should be noted that differences in resting-state connectivity between groups can stem from various factors, as highlighted in the existing literature. For instance, discrepancies in the interpretation of instructions during the resting state scan can influence the results. Hence, while their findings may indicate biological distinctions, they could also reflect differences in behavior.

      For the ABCD dataset, resting-state fMRI scans were based on eyes open and passive viewing of a crosshair, and are thus homogenized. We acknowledge, however, that there may still be state-to-state fluctuations contributing to the findings, and this is now discussed in the revised Discussion, on P.28. Note, however, that prior literature has generally also suggested rather modest impacts of cognitive and daily variation on resting-state functional networks, compared to much more dominating inter-individual and inter-group factors.

      “Finally, while prior research has shown that resting-state fMRI networks may be affected by differences in instructions and study paradigm (e.g., with respect to eyes open vs closed) (Agcaoglu et al., 2019), the resting-state fMRI paradigm is homogenized in the ABCD study to be passive viewing of a centrally presented fixation cross. It is nevertheless possible that there were slight variations in compliance and instructions that contributed to differences in associated functional architecture. Notably, however, there is a mounting literature based on high-definition fMRI acquisitions suggesting that functional networks are mainly dominated by common organizational principles and stable individual features, with substantially more modest contributions from task-state variability (Gratton et al. 2018). These findings, thus, suggest that resting-state fMRI markers can serve as powerful phenotypes of psychiatric conditions, and potential biomarkers (Abraham et al., 2017; Gratton et al., 2020; Parkes et al., 2020).”

      3.5) The authors conducted several analyses to investigate the relationship between imaging loadings associated with latent components and the principal functional gradient. They found several associations between principal gradient scores and both within- and between-network resting-state functional connectivity (RSFC) loadings. Assessing the analysis presented here proves challenging due to the nature of relating loadings, which are partly based on the RSFC, to gradients derived from RSFC. Consequently, a certain level of correlation between these two variables would be expected, making it difficult to determine the significance of the authors' findings. It would be more intriguing if a direct correlation between the composite scores reflecting behavior and the gradients were to yield statistically significant results.

      We thank the Reviewer for the comment, and agree that investigating gradient-behavior relationships could offer additional insights into the neural basis of psychiatric symptomatology. However, the current analysis pipeline precludes this direct comparison which is performed on a region-by-region basis across the span of the cortical gradient. Indeed, the behavioral loadings are provided for each CBCL item, and not cortical regions.

      The Reviewer also evokes concerns of potential circularity in our analysis, as we compared imaging loadings, which are partially based on RSFC, and gradient values generated from the same RSFC data. In response to this comment, we cross-validated our findings using an RSFC gradient derived from an independent dataset (HCP), showing highly consistent findings to those presented in the manuscript. This correlation is now reported in the Results section P.15.

      “A similar pattern of findings was observed when cross-validating between- and within-network RSFC loadings to a RSFC gradient derived from an independent dataset (HCP), with strongest correlations seen for between-network RSFC loadings for LC1 and LC3 (LC1: r=0.50, pspin<0.001; LC3: r=0.37, pspin<0.001).”

      We furthermore note similar correlations between imaging loadings and T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). These findings are now detailed in the revised Results, P.15-16:

      “Of note, we obtain similar correlations when using T1w/T2w ratio in the same participants, a proxy of intracortical microstructure and hierarchy (Glasser et al., 2011). Specifically, we observed the strongest association between this microstructural marker of the cortical hierarchy and between-network RSFC loadings related to LC1 (r=-0.43, pspin<0.001).”

      3.6) Lastly, regarding the interpretation of the first identified latent component, I have some reservations. Upon examining the loadings, it appears that LC1 primarily reflects impulse control issues rather than representing a comprehensive p-factor. Furthermore, it is worth noting that within the field, there is an ongoing debate concerning the interpretation and utilization of the p-factor. An insightful publication on this topic is "The p factor is the sum of its parts, for now" (Fried et al, 2021), which explains that the p-factor emerges as a result of a positive manifold, but it does not necessarily provide insights into the underlying mechanisms that generated the data.

      We thank the Reviewer for this comment, and added greater nuance into the discussion of the association to the p factor. We furthermore discuss some of the ongoing debate about the use of the p factor, and cite the recommended publication on P.27.

      “Other factors have also been suggested to impact the development of psychopathology, such as executive functioning deficits, earlier pubertal timing, negative life events (Brieant et al., 2021), maternal depression, or psychological factors (e.g., low effortful control, high neuroticism, negative affectivity). Inclusion of such data could also help to add further mechanistic insights into the rather synoptic proxy measure of the p factor itself (Fried et al., 2021), and to potentially assess shared and unique effects of the p factor vis-à-vis highly correlated measures of impulse control.”

    1. Author response:

      Reviewer #2 (Public Review):

      This is, to my knowledge, the most scalable method for phylogenetic placement that uses likelihoods. The tool has an interesting and innovative means of using gaps, which I haven’t seen before. In the validation the authors demonstrate superior performance to existing tools for taxonomic annotation (though there are questions about the setup of the validation as described below).

      The program is written in C with no library dependencies. This is great. However, I wasn’t able to try out the software because the linking failed on Debian 11, and the binary artifact made by the GitHub Actions pipeline was too recent for my GLIBC/kernel. It’d be nice to provide a binary for people stuck on older kernels (our cluster is still on Ubuntu 18.04). Also, would it be hard to publish your .zipped binaries as packages?

      We have provided a binary (and zipped package) that supports Ubuntu 18.04 in GitHub Actions (https://github.com/lpipes/tronko/actions/runs/9947708087). This should facilitate the use of our software on older systems like yours. We were not able to test the binary, however, since GitHub did not seem to find any nodes with Ubuntu 18.04. It is important to note that Ubuntu 18.04 is deprecated. The latest version of Ubuntu is 24.04, and we recommend that users upgrade to newer, supported versions of their operating systems to benefit from the latest security updates and features.

      Thank you for publishing your source files for the validation on zenodo. Please provide a script that would enable the user to rerun the analysis using those files, either on zenodo or on GitHub somewhere.

      We have posted all datasets as well as scripts to Zenodo.

      The validations need further attention as follows.

      First, the authors have chosen data sets that are not well-aligned with real-world use cases for this software, and as a result, its applicability is difficult to determine. The leave-one-species-out experiment made use of COI gene sequences representing 253 species from the order Charadriiformes, which includes bird species such as gulls and terns. What is the reasoning for selecting this data set given the objective of demonstrating the utility of Tronko for large scale community profiling experiments which by their nature tend to include microorganisms as subjects? If the authors are interested in evaluating COI (or another gene target) as a marker for characterizing the composition of eukaryotic populations, is the heterogeneity and species distribution of bird species within order Charadriiformes comparable to what one would expect in populations of organisms that might actually be the target of a metagenomic analysis?

      Our reasoning for selecting Charadriiformes is that these species are often mistaken for one another and there is a heavy reliance on COI for their species identification. This choice allows us to demonstrate Tronko’s ability to handle difficult and realistic identification challenges. Additionally, we aimed to simulate a challenging dataset to effectively differentiate between the methods used, showcasing Tronko’s robustness. Including more distantly related bird species would have simplified the identification process, which would not serve our objective of demonstrating the utility of Tronko for distinguishing closely related species. It is also important to note that all methods used the exact same reference database, which is not always the case in other comparative studies of species assignment.

      Furthermore, while our study uses bird species, the principles and techniques applied are broadly applicable to other taxa, including microorganisms. By selecting a dataset known for its identification difficulties, we underscore Tronko’s potential utility in a wide range of taxonomic profiling scenarios, including those involving high heterogeneity and closely related species, such as in microbial communities.

      Second, it appears that experiments evaluating performance for 16S were limited to reclassification of sequencing data from mock communities described in two publications, Schirmer (2015, 49 bacteria and 10 archaea, all environmental), and Gohl (2016; 20 bacteria - this is the widely used commercial mock community from BEI, all well-known human pathogens or commensals). The authors performed a comparison with kraken2, metaphlan2, and MEGAN using both the default database for each as well as the same database used for Tronko (kudos for including the latter). This pair of experiments provide a reasonable high-level indication of Tronko’s performance relative to other tools, but the total number of organisms is very limited, and particularly limited with respect to the human microbiome. It is also important to point out that these mock communities are composed primarily of type strains and provide limited species-level heterogeneity. The performance of these classification tools on type strains may not be representative of what one would find in natural samples. Thus, the leave-one-individual-out and leave-one-species-out experiments would have been more useful and informative had they been applied to extended 16S data sets representing more ecologically realistic populations.

      We thank the reviewer for this comment and we have included both an additional bacterial mock community dataset from Lluch et al. (2015) and an additional leave-one-species-out experiment. We describe how this leave-one-species-out dataset was constructed in our previous response to ’Essential Revisions’ #1. We also added Figure 5, S5, and S6.

      Finally, the authors should describe the composition of the databases used for classification as well as the strategy (and toolchain) used to select reference sequences. What databases were the reference sequences drawn from and by what criteria? Were the reference databases designed to reflect the composition of the mock communities (and if so, are they limited to species in those communities, or are additional related species included), or have the authors constructed general purpose reference databases? How many representatives of each species were included (on average), and were there efforts to represent a diversity of strains for each species? The methods should include a section detailing the construction of the data sets: as illustrated in this very study, the choice of reference database influences the quality of classification results, and the authors should explain the process and design considerations for database construction.

      To construct our databases, we used CRUX (Curd et al., 2018). This is described in the Methods section under ’Custom 16S and COI Tronko-build reference database construction’. All leave-out tests used downsampled versions of these two databases. It is beyond the scope of the manuscript to discuss how CRUX works. Additionally, we added the following text:

      To compare the new method (Tronko) to previous methods, we constructed reference databases for COI and 16S for common amplicon primer sets using CRUX (See Methods for exact primers used).

    1. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Perez-Lopez et al. examine the function of the chemokine CCL28, which is highly expressed in mucosal tissues during infection but whose role during infection is poorly understood. They find that CCL28 promotes neutrophil accumulation in the intestines of mice infected with Salmonella and in the lungs of mice infected with Acinetobacter. They find that Ccl28-/- mice are highly susceptible to Salmonella infection, and highly resistant and protected from lethality following Acinetobacter infection. They find that neutrophils express the CCL28 receptors CCR3 and CCR10. CCR3 was pre-formed and intracellular and translocated to the cell surface following phagocytosis or inflammatory stimuli. They also find that CCL28 stimulation of CCR3 promoted neutrophil antimicrobial activity, ROS production, and NET formation, using a combination of primary mouse and human neutrophils for their studies. Overall, the authors' findings provide new and fundamental insight into the role of the CCL28:CCR3 chemokine:chemokine receptor pair in regulating neutrophil recruitment and effector function during infection with the intestinal pathogen Salmonella Typhimurium and the lung pathogen Acinetobacter baumannii.

      We would like to thank the reviewer for their positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #2 (Public Review):

      In this manuscript by Perez-Lopez et al., the authors investigate the role of the chemokine CCL28 during bacterial infections in mucosal tissues. This is a well-written study with exciting results. They show a role for CCL28 in promoting neutrophil accumulation to the guts of Salmonella-infected mice and to the lung of mice infected with Acinetobacter. Interestingly, the functional consequences of CCL28 deficiency differ between infections with the two different pathogens, with CCL28-deficiency increasing susceptibility to Salmonella, but increasing resistance to Acinetobacter. The underlying mechanistic reasons for this suggest roles for CCL28 in enhanced neutrophil antimicrobial activity, production of reactive oxygen species, and formation of extracellular traps. However, additional experiments are required to shore up these mechanisms, including addressing the role of other CCL28-dependent cell types and further characterization of neutrophils from CCL28-deficient mice.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      Reviewer #3 (Public Review):

      The manuscript by Perez-Lopez and colleagues uses a combination of in vivo studies using knockout mice and elegant in vitro studies to explore the role of the chemokine CCL28 during bacterial infection on mucosal surfaces. Using the streptomycin model of Salmonella Typhimurium (S. Tm) infection, the authors demonstrate that CCL28 is required for neutrophil influx in the intestinal mucosa to control pathogen burden both locally and systemically. Interestingly, CCL28 plays the opposite role in a model lung infection by Acinetobacter baumannii, as Ccl28-/- mice are protected from Acinetobacter infection. Authors suggest that the mechanism by which CCL28 plays a role during bacterial infection is due to its role in modulating neutrophil recruitment and function.

      We would like to thank the reviewer for the positive assessment of our work and for providing us with constructive comments that have helped us to improve the manuscript.

      The major strengths of the manuscript are:

      The novelty of the findings that are described in the manuscript. The role of the chemokine CCL28 in modulating neutrophil function and recruitment in mucosal surfaces is intriguing and novel.

      Authors use Ccl28-/- mice in their studies, a mouse strain that has only recently been available. To assess the impact of CCL28 on mucosal surfaces during pathogen-induced inflammation, the authors choose not one but two models of bacterial infection (S. Tm and A. baumannii). This approach increases the rigor and impact of the data presented.

      Authors combine the elegant in vivo studies using Ccl28 -/- with in vitro experiments that explore the mechanisms by which CCL28 affects neutrophil function.

      The major weaknesses of the manuscript in its present form are:

      Authors use different time points in the S. Tm model to characterize the influx of immune cells and pathology. They do not provide a clear justification as to why distinct time points were chosen for their analysis.

      The reviewer raises a good point. As discussed in the detailed response to the reviewers, we have now generated extensive results at different time points and included these in the revised manuscript.

      Authors provide puzzling data that Ccl28-/- mice have the same numbers of CCR3- and CCR10-expressing neutrophils in the mucosa during infection. It is unclear why the lack of CCL28 expression would not affect the recruitment of neutrophils that express the receptors (CCR3 and CCR10) for this chemokine. Thus, these results need to be better explained.

      As discussed in the detailed response to the reviewers, we clarified that Ccl28-/- mice have reduced numbers of neutrophils in the mucosa during infection, but the percentage of CCR3+ and CCR10+ neutrophils does not change. We provide additional discussion of this point in the manuscript and in the response to the reviewers.

      The in vitro studies focus primarily on characterizing how CCL28 affects the function of neutrophils in response to S. Tm infection. There is a lack of data to demonstrate whether Acinetobacter affects CCR3 and CCR10 expression and recruitment to the cell surface and whether CCL28 plays any role in this process.

      We agree and have performed additional studies with Acinetobacter and CCL28, which we discuss in greater detail below in the response to the reviewers.

    1. eLife assessment

      This is a useful study on sex differences in gene expression across organs of four mice taxa, although there are some shortcomings in the data analyses and interpretations that should be better placed in the broader context of the current literature. Hence, the evidence in the current form is incomplete, with several overstated key conclusions.

    2. Reviewer #1 (Public Review):

      The authors describe a comprehensive analysis of sex-biased expression across multiple tissues and species of mouse. Their results are broadly consistent with previous work, and their methods are robust, as the large volume of work in this area has converged toward a standardized approach.

      I have a few quibbles with the findings, and the main novelty here is the rapid evolution of sex-biased expression over shorter evolutionary intervals than previously documented, although this is not statistically supported. The other main findings, detailed below, are somewhat overstated.

      (1) In the introduction, the authors conflate gametic sex, which is indeed largely binary (with small sperm, large eggs, no intermediate gametic form, and no overlap in size) with somatic sexual dimorphism, which can be bimodal (though sometimes is even more complicated), with a large variance in either sex and generally with a great deal of overlap between males and females. A good appraisal of this distinction is at https://doi.org/10.1093/icb/icad113. This distinction in gene expression has been recognized for at least 20 years, with observations that sex-biased expression in the soma is far less than in the gonad.

      For example, the authors frame their work with the following statement: "The different organs show a large individual variation in sex-biased gene expression, making it impossible to classify individuals in simple binary terms. Hence, the seemingly strong conservation of binary sex-states does not find an equivalent underpinning when one looks at the gene-expression makeup of the sexes"

      The authors use this conflation to set up a straw man argument, perhaps in part due to recent political discussions on this topic. They seem to be implying one of two things. a) That previous studies of sex-biased expression of the soma claim a binary classification. I know of no such claim, and many have clearly shown quite the opposite, particularly studies of intra-sexual variation, which are common - see https://doi.org/10.1093/molbev/msx293, https://doi.org/10.1371/journal.pgen.1003697, https://doi.org/10.1111/mec.14408, https://doi.org/10.1111/mec.13919, https://doi.org/10.1111/j.1558-5646.2010.01106.x for just a few examples. Or b) They are the first to observe this non-binary pattern for the soma, but again, many have observed this. For example, many have noted that reproductive or gonad transcriptome data cluster first by sex, but somatic tissue clusters first by species or tissue, then by sex (https://doi.org/10.1073/pnas.1501339112, https://doi.org/10.7554/eLife.67485).

      Figure 4 illustrates the conceptual difference between bimodal and binary sexual conceptions. This figure makes it clear that males and females have different means, but in all cases the distributions are bimodal.

      I would suggest that the authors heavily revise the paper with this more nuanced understanding of the literature and sex differences in their paper, and place their findings in the context of previous work.

      (2) The authors also claim that "sexual conflict is one of the major drivers of evolutionary divergence already at the early species divergence level." However, making the connection between sex-biased genes and sexual conflict remains fraught. Although it is tempting to use sex-biased gene expression (or any form of phenotypic dimorphism) as an indicator of sexual conflict, resolved or not, as many have pointed out, one needs measures of sex-specific selection, ideally fitness, to make this case (https://doi.org/10.1086/595841, 10.1101/cshperspect.a017632). In many cases, sexual dimorphism can arise in one sex only without conflict (e.g. 10.1098/rspb.2010.2220). As such, sex-biased genes alone are not sufficient to discriminate between ongoing and resolved conflict.

      (3) To make the case that sex-biased genes are under selection, the authors report alpha values in Figure 3B. Alpha value comparisons like this over large numbers of genes often have high variance. Are any of the values for male-biased, female-biased, and unbiased genes significantly different from one another? This is needed to make the claim of positive selection.
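
      One way the comparison requested in point (3) could be approached is sketched below, as an illustration only. It assumes the common pooled estimator alpha = 1 - (Ds·Pn)/(Dn·Ps), computed from counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps); the manuscript may use a different estimator. The function names, the tuple layout, and the gene-level bootstrap are assumptions.

```python
import random

def alpha(genes):
    """Pooled MK alpha for a gene set.

    genes: list of (Dn, Ds, Pn, Ps) count tuples, one per gene
    (hypothetical layout). Pooled Dn and Ps must be non-zero.
    """
    Dn = sum(g[0] for g in genes)
    Ds = sum(g[1] for g in genes)
    Pn = sum(g[2] for g in genes)
    Ps = sum(g[3] for g in genes)
    return 1 - (Ds * Pn) / (Dn * Ps)

def bootstrap_alpha_diff(genes_a, genes_b, n_boot=1000, seed=0):
    """Approximate 95% bootstrap interval for alpha(a) - alpha(b).

    Genes are resampled with replacement within each class; an interval
    that excludes zero suggests the two classes differ in alpha.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        ra = [rng.choice(genes_a) for _ in genes_a]
        rb = [rng.choice(genes_b) for _ in genes_b]
        diffs.append(alpha(ra) - alpha(rb))
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot)]
```

      Resampling whole genes rather than individual sites keeps the between-gene variance, which is the source of the high variance noted above, in the interval.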

    3. Reviewer #2 (Public Review):

      The manuscript by Xie and colleagues presents transcriptomic experiments that measure gene expression in eight different tissues taken from adult female and male mice from four species. These data are used to make inferences regarding the evolution of sex-biased gene expression across these taxa. The experimental methods and data analysis are appropriate; however, most of the conclusions drawn in the manuscript have either been previously reported in the literature or are not fully supported by the data.

      There are two ways the manuscript could be modified to better strengthen the conclusions.

      First, some of the observed differences in gene expression have very little to no effect on other phenotypes, and are not relevant to medicine or fitness. Selectively neutral gene expression differences have been inferred in previous studies, and consistent with that work, sex-biased and between-species expression differences in this study may also be enriched for selectively neutral expression differences. This idea is supported by the analysis of expression variance, which indicates that genes that show sex-biased expression also tend to show more inter-individual variation. This perspective is also supported by the MK analysis of molecular evolution, which suggests that positive selection is more prevalent among genes that are sex-biased in both mus and dom, and genes that switch sex-biased expression are under less selection at the level of both protein-coding sequence and gene expression.

      As an aside, I was confused by (line 176): "implying that the enhanced positive selection pressure is triggered by their status of being sex-biased in either taxon." - don't the MK values suggest an excess of positive selection on genes that are sex-biased in both taxa?

      Without an estimate of the proportion of differentially expressed genes that might be relevant for broader physiological or organismal phenotypes, it is difficult to assess the accuracy and relevance of the manuscript's conclusions. One (crude) approach would be to analyze subsets of genes stratified by the magnitude of expression differences; while there is a weak relationship between expression differences and fitness effects, on average large gene expression differences are more likely to affect additional phenotypes than small expression differences. Another perspective would be to compare the within-species variance to the between-species variance to identify genes with an excess of the latter relative to the former (similar logic to an MK test of amino acid substitutions).

      Second, the analysis could be more informative if it distinguished between genes that are expressed across multiple tissues in both sexes that may show greater expression in one sex than the other, versus genes with specialized function expressed solely in (usually) reproductive tissues of one sex (e.g. ovary-specific genes). One approach to quantify this distinction would be metrics like those defined by [Yanai I, et al. 2005. Genome-wide midrange transcription profiles reveal expression-level relationships in human tissue specification. Bioinformatics 21:650-659.] These approaches can be used to separate out groups of genes by the extent to which they are expressed in both sexes versus genes that are primarily expressed in sex-specific tissue such as testes or ovaries. This more fine-grained analysis would also potentially inform the section describing the evolution/conservation of sex-biased expression: I expect there must be genes with conserved expression specifically in ovaries or testes (these are ancient animal structures!) but these may have been excluded by the requirement that genes be sex-biased and expressed in at least two organs.
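
      One metric of this kind is the tissue-specificity index usually called tau, which is commonly attributed to the paper cited above. A minimal Python sketch follows, assuming non-negative expression values for one gene across at least two tissues; the function name and the toy profiles are hypothetical.

```python
def tau(expression):
    """Tissue-specificity index for one gene.

    expression: non-negative expression values, one per tissue (n >= 2).
    Returns 0 for uniform expression and 1 for expression in a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        raise ValueError("gene is not expressed in any tissue")
    # Sum of (1 - x_i / x_max) over tissues, normalised by n - 1.
    return sum(1 - x / x_max for x in expression) / (n - 1)

# Toy profiles, invented for illustration only.
broad = tau([10.0, 10.0, 10.0, 10.0])     # same level in every tissue
specific = tau([0.0, 0.0, 0.0, 50.0])     # expressed in one tissue only
```

      Values near 1 would flag genes expressed essentially in one tissue (for example a gonad), while values near 0 would flag broadly expressed genes, which is the separation proposed here.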

      There are at least three examples of statements in the discussion that at the moment misinterpret the experimental results.

      The discussion frames the results in the context of sexual selection and sexually antagonistic selection, but these concepts are not synonymous. Sexual selection can shape phenotypes that are specific to one sex, causing no antagonism; and fitness differences between males and females resulting from sexually antagonistic variation in somatic phenotypes may not be acted on by sexual selection. Furthermore, the conditions that promote each kind of selection, and their consequences, can differ, so they should be treated separately for the purposes of this discussion.

      The discussion claims that "Our data show that sex-biased gene expression evolves extremely fast" but a comparison or expectation for the rate of evolution is not provided. Many other studies have used comparative transcriptomics to estimate rates of gene expression evolution between species, including mice; are the results here substantially and significantly different from those previous studies? Furthermore, the experimental design does not distinguish between those gene expression phenotypes that are fixed between species as compared to those that are polymorphic within one or more species which prevents straightforward interpretation of differences in gene expression as interspecific differences.

      The conclusion that "Our results show that most of the genetic underpinnings of sex differences show no long-term evolutionary stability, which is in strong contrast to the perceived evolutionary stability of two sexes" - seems beyond the scope of this study. This manuscript does not address the genetic underpinnings of sex differences (this would involve eQTL or the like), rather it looks at sex differences in gene expression phenotypes. Simply addressing the question of phenotypic evolutionary stability would be more informative if genes expressed specifically in reproductive tissues were separated from somatic sex-biased genes to determine if they show similar patterns of expression evolution.

    4. Reviewer #3 (Public Review):

      This manuscript reports some interesting and important patterns. The results on sex-bias in different tissues and across four taxa would benefit from alternative (or additional) presentation styles. In my view, the most important results are with respect to alpha (fraction of beneficial amino acid changes) in relation to sex-bias (though the authors have made this as a somewhat minor point in this version).
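
      For readers less familiar with alpha, the standard McDonald-Kreitman estimator can be sketched in a few lines of Python. This is a generic illustration and not the authors' pipeline; the function name and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimated fraction of amino acid substitutions driven by positive selection.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    # alpha = 1 - (Ds * Pn) / (Dn * Ps)
    return 1 - (ds * pn) / (dn * ps)

# Toy counts, invented for illustration only.
alpha = mk_alpha(dn=40, ds=100, pn=10, ps=50)
```

      With these toy counts the estimate is 0.5; negative values can arise when slightly deleterious polymorphisms inflate pn.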

      The part that the authors emphasize I don't find very interesting (i.e., the sexes have overlapping expression profiles in many nongonadal tissues), nor do I believe they have the appropriate data necessary to convincingly demonstrate this (which would require multiple measures from the same individual).

      This study reports several interesting patterns with respect to sex differences in gene expression across organs of four mice taxa. An alternative presentation of the data would yield a clearer and more convincing case that the patterns the authors claim are legitimate.

      I recommend that the authors clarify what qualifies as "sex-bias".

    5. Author response:

      We appreciate the time of the reviewers and their detailed comments, which will help to improve the manuscript.

      We are sorry that at least one reviewer seems to have had the impression that we have conflated issues about gonadal and non-gonadal sex phenotypes. This referee suggests that we should use Sharpe et al. (2023) to develop our concepts. However, what is discussed in Sharpe et al. was already the guiding principle for our study (without knowing this paper before). In our paper, we introduce the gonadal binary sex (which is self-evidently also the basis for creating the dataset in the first place, because we needed to separate males from females) and go then on to the question of (adult) sex phenotypes for the rest of the paper. The gonadal data are included only as comparison for contrasting the patterns in the non-gonadal tissues.

      Our study presents the largest systematic dataset so far on the evolution of sex-biased gene expression. It is also the first that explores the patterns of individual variation in sex-biased gene expression and the SBI is an entirely new procedure to directly visualize these variance patterns in an intuitive way (note that the relative position of the distributions along the X-axis is indeed not relevant). The results are actually quite nuanced (e.g. the rather dynamic changes seen in mouse kidney and liver comparisons) and go certainly beyond what would have been predictable based on the current literature.

      Also, we should like to point out that our study contradicts recent conclusions that were published in high profile journals, that had suggested that a substantial set of sex-biased genes has conserved functions between humans and mice and that mice can therefore be informative for gender-specific medicine studies. Our data suggest that only a very small set of genes are conserved in their sex-biased expression. These are epigenetic regulator genes and it will therefore be interesting in the future to focus on their roles in generating the differences between sexual phenotypes in given species.

      We will be happy to use the referee comments to clarify all of these points in a revised version. But we do not think that our "evidence is incomplete" and that there are several "overstated key conclusions". We have used all canonical statistical analyses that are typically used in papers of sex-biased gene expression, as acknowledged by reviewers 1 and 2. The additional statistical analyses that are requested are not within the scope of such papers, but could be subject to separate general studies, independent of the sex-bias analysis (e.g. the role of highly expressed genes versus low expressed genes, or the analysis of the fraction of neutrally evolving loci).

      Finally, it is unclear why the overall rating of the paper is at the lowest possible category ("useful study"), given that it adds a substantial amount of data and new insights into the exploration of the non-binary nature of sexual phenotypes.

    1. eLife assessment

      This paper provides valuable findings related to the impact and timing of exogenous interleukin 2 on the balance of exhausted (Tex) versus effector (Teff) cells that differentiate from precursor T cells (Tpex) during chronic viral infection. While the data appear solid, the overall claims that IL-2 suppresses Tpex are only partially supported, with the rationale for the timing of IL-2 treatment and its underlying mechanisms remaining unclear.

    2. Reviewer #1 (Public Review):

      Summary:

      The title states "IL-2 enhances effector function but suppresses follicular localization of CD8+ T cells in chronic infection", which the data in the paper support, but this does not seem to be the authors' major goal. As stated in the short assessment above, the goal of this work seems to be to connect IL-2 signals, mostly given exogenously, to the differentiation of progenitor T cells (TPEX) that will help sustain effector T cell responses against chronic viral infection (TEX/TEFF). The authors mostly use chronic LCMV infection in mice as their model of choice, with flow cytometry, fluorescent microscopy, and some in vitro assays to explore how IL-2 regulates TPEX and TEX/TEFF differentiation. Gain- and loss-of-function experiments are also conducted to explore the roles of IL-2 signaling and BLIMP-1 in regulating these processes. Lastly, a loose connection of their mouse findings on TPEX/TEX cells to a clinical study using low-dose IL-2 treatment in SLE patients is attempted.

      Strengths:

      (1) The impact of IL-2 treatment of TPEX/TEX differentiation is very clear.

      (2) The flow cytometry data are convincing and state-of-the-art.

      Weaknesses:

      (1) The title appears disconnected from the major focus of the work.

      (2) The number of TPEX cells is not changed. IL2 treatment increases the number of TEFF and the proportion of TPEX is lower suggesting it does not target TPEX formation. The conclusion about an inhibitory role of IL2 treatment on TPEX formation seems therefore largely overstated.

      (3) Are the expanded TEX/TEFF cells really effectors? Only GrB and some cell surface markers are monitored (44, 62L). Other functions should be included, e.g., CD107a, IFNg, TNF, chemokines - Tbet?

      (4) The rationale for IL2 treatment timing is unclear. It seems that this is given at the T cell contraction time, and this is interesting compared to the early treatment that ablates TPEX generation. Maybe this should really be explored further?

      (5) The TGFb/IL6/IL2 in vitro experiment does not bring much to the paper.

      (6) The Figure 2 data try to provide an explanation for a prior lack of difference in viral titers after IL2 treatment. It is hard to be convinced by these tissue section data as presented. It also begs the question of how the host would benefit from the low dose IL-2 treatment if IL-2 TEFF are not contributing to viral control as a result of their inappropriate localization to viral reservoirs.

      (7) It is unclear what the STAT5CA and BLIMP-1 KO experiments in Figure 3 add to the story that is not already expected/known.

      (8) The connection to the low-dose IL2 treatment in SLE patients is very loose and weak. This version is likely not the ligand that preferentially signals to CD122 either. SLE is different from a chronic viral infection and the question of timing seems critical from all the data shown in this manuscript. So it is very difficult to make any robust link to the mechanistic data.

      (9) It is really unclear what the take-home message is. IL-2 is signaling via STAT5 and BLIMP1 is also a known target as published by many groups including this one, and these results are more than expected. The observation that TEFF may be differentially localized in the WP area is interesting but no mechanisms are really provided (guessing CXCR5 but again expected). Also, all these observations are highly dependent on the timing of IL2 administration which is fascinating but not explored at all. It also limits significance since underlying mechanisms are unknown and we do not know when such treatment would have to be given.

    3. Reviewer #2 (Public Review):

      This study utilized the LCMV Docile infection model, which induces chronic and persistent infection in mice, leading to T cell exhaustion and dysfunction. Through exogenous IL-2 fusion protein treatment during the late stage of infection, the researchers found that IL-2 treatment significantly enlarges the antigen-specific effector CD8 T cells, expanding the CXCR5-TCF1- exhausted population (Tex) while maintaining the size of the CXCR5+TCF1+ precursors of exhausted T cell population (Tpex). This preservation of the Tpex population's self-renewing capacity allows for sustained T cell proliferation and antiviral responses.

      The authors discovered a dual effect of IL-2 treatment: it decreases CXCR5 expression on Tpex cells, restricting their entry into the B cell follicle. This may explain why IL-2 treatment has little impact on overall viral control. However, this finding also suggests a potential application of IL-2 treatment for autoimmune diseases, as it can suppress specific immune responses within the B cell follicle. Using imaging-based approaches, the team provided direct evidence that IL-2 treatment shifts the viral load to concentrate within the B cell follicle, correlating with the observed decrease in CXCR5 expression.

      Further, the researchers showed that ectopic expression of constitutively active STAT5, downstream of IL-2 induced cytokine signaling, in P14 TCR transgenic T cells (specific for an LCMV epitope), drove the T cell population toward the CXCR5- Tex phenotype over the CXCR5+ Tpex cells in vivo. Additionally, abrogating Blimp1, upregulated by active IL-2-phosphorylated STAT5 signaling, restored the CXCR5+ Tpex population.

      Building on these results, the researchers used an engineered IL-2 fusion protein, ANV410, targeting the beta-chain of the IL-2 receptor CD122, which successfully replicated their earlier findings. Importantly, the Tpex-sustaining effect of IL-2 was only observed when treatment was administered during the late stage of infection, as early treatment suppressed Tpex cell generation. Immune profiling of SLE patients undergoing low-dose IL-2 treatment showed a similar reduction in the CXCR5+ Tpex cell population.

      This study provides compelling data on the physiological consequences of IL-2 treatment during chronic viral infection. By leveraging the chronic and persistent LCMV Docile infection model, the researchers identified the temporal effects of IL-2 fusion protein treatment, offering strategic insights for therapies targeting cancer and autoimmune diseases.

    1. eLife assessment

      This is a mechanistic study showing the effect of combining inhibition of autophagy (through ULK1/2) and KRAS (using sotorasib) on KRAS mutant NSCLC making the study valuable to cancer biologists and more broadly in a clinical setting. The evidence generated by GEM mouse models and cell lines is solid but could be further strengthened by increasing the mouse cohort size. This study holds translational relevance beyond NSCLC to other indications that carry KRAS mutations.

    2. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Ghazi et al. reported that inhibition of KRASG12C signaling increases autophagy in KRASG12C expressing lung cancer cells. Moreover, the combination of DCC 3116, a selective ULK1/2 inhibitor, plus sotorasib displays cooperative/synergistic suppression of human KRASG12C driven lung cancer cell proliferation in vitro and tumor growth in vivo. Additionally, in genetically engineered mouse models of KRASG12C driven NSCLC, inhibition of either KRASG12C or ULK1/2 decreases tumor burden and increases mouse survival. Additionally, this study found that LKB1 deficiency diminishes the sensitivity of KRASG12C/LKB1Null-driven lung cancer to the combination treatment, perhaps through the emergence of mixed adeno/squamous cell carcinomas and mucinous adenocarcinomas.

      Strengths:

      Both human cancer cells and mouse models were employed in this study to illustrate that inhibiting ULK1/2 could enhance the responsiveness of KRASG12C lung cancer to sotorasib. This research holds translational importance.

      Weaknesses:

      The revised manuscript has addressed most of my previous concerns. However, I still have one issue: the sample size (n) for the GEMM study in Figures 4E and 4F is too small, despite the authors' explanation. The data do not support the conclusion due to the lack of significant difference in tumor burden. Additionally, the significance labels in Figure 4E are not clearly explained.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Given that KRAS inhibition approaches are a relatively new innovation and that resistance is now being observed to such therapies in patients with NSCLC, investigation of combination therapies is valuable. The manuscript furthers our understanding of combination therapy for KRAS mutant non-small cell lung cancer by providing evidence that combined inhibition of ULK1/2 (and therefore autophagy) and KRAS can inhibit KRAS-mutant lung cancer growth. The manuscript will be of interest to the lung cancer community but also to researchers in other cancer types where KRAS inhibition is relevant.

      Strengths:

      The manuscript combines cell line, cell line-derived xenograft, and genetically-engineered mouse model data to provide solid evidence for the proposed combination therapy.  The manuscript is well written, and experiments are broadly well performed and presented.

      We thank Reviewer #1 (R1) for the generally favorable review of our manuscript, and also for the more detailed critique that identifies potential weaknesses in the research, which we address on a point-by-point basis below. 

      Weaknesses:

      With 3-4 mice per group in many experiments, experimental power is a concern and some comparisons (e.g. mono vs combination therapy) seem to be underpowered to detect a difference. Both male and female mice are used in experiments which may increase variability.

      We thank R1 for pointing out concerns regarding statistical power in our various mouse models of NSCLC experiments, and agree that more mice per group would certainly increase statistical power.  However, there are certain logistical considerations that impact the generation of cohorts of experimental KrasLSL-G12C mice.  Because mice homozygous for the KrasLSL-G12C allele display embryonic lethality, we are required to generate experimental mice by crossing heterozygous male and female KrasLSL-G12C mice.  Although 66% of the progeny of such crosses are predicted to be KrasLSL-G12C/+, experience tells us that we only obtain ~40-50% heterozygous KrasLSL-G12C/+ mice with litter sizes around 6-8 mice from such crosses.  Therefore, there are usually only about 4 heterozygous KrasLSL-G12C mice per litter, which presents a substantial challenge in generating larger cohorts of age-matched mice suitable for experiments, especially under conditions where we wish to euthanize mice at multiple time points for analysis.  For the GEM model experiments, Figure 3B is the only experiment that has n=3.  All other experiments contain 4-6 mice per experimental condition.  We rationalized using both male and female mice because both human males and females have high lung cancer rates.
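
      The breeding arithmetic above can be sketched as follows. This is a minimal Python illustration under the figures quoted in the response (Mendelian 1:2:1 segregation, loss of homozygous mutants, 40-50% observed heterozygotes, litters of 6-8); the function names are hypothetical.

```python
def viable_het_fraction(p_wt=0.25, p_het=0.5, p_hom=0.25):
    """Heterozygote fraction among surviving pups of a het x het cross
    when homozygous mutants die before birth."""
    return p_het / (p_wt + p_het)

def expected_hets_per_litter(het_fraction, litter_size):
    """Expected number of heterozygous pups in one litter."""
    return het_fraction * litter_size

predicted = viable_het_fraction()               # Mendelian expectation, 2/3
low = expected_hets_per_litter(0.40, 6)         # lower end of the observed range
high = expected_hets_per_litter(0.50, 8)        # upper end of the observed range
```

      Under these inputs the expected yield runs from about 2.4 to 4 heterozygous pups per litter, which illustrates why assembling larger age-matched cohorts requires many litters.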

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Ghazi et al. reported that inhibition of KRASG12C signaling increases autophagy in KRASG12C-expressing lung cancer cells. Moreover, the combination of DCC 3116, a selective ULK1/2 inhibitor, plus sotorasib displays cooperative/synergistic suppression of human KRASG12C-driven lung cancer cell proliferation in vitro and tumor growth in vivo. Additionally, in genetically engineered mouse models of KRASG12C-driven NSCLC, inhibition of either KRASG12C or ULK1/2 decreases tumor burden and increases mouse survival. Additionally, this study found that LKB1 deficiency diminishes the sensitivity of KRASG12C/LKB1Null-driven lung cancer to the combination treatment, perhaps through the emergence of mixed adeno/squamous cell carcinomas and mucinous adenocarcinomas.

      Strengths:

      Both human cancer cells and mouse models were employed in this study to illustrate that inhibiting ULK1/2 could enhance the responsiveness of KRASG12C lung cancer to sotorasib. This research holds translational importance.

      We thank Reviewer #2 (R2) for the generally favorable review of our manuscript, and also for the more detailed critique that identifies potential weaknesses in the research, which we address on a point-by-point basis below. 

      Weaknesses:

      Additional validation of certain data is necessary.

      (1) mCherry-EGFP-LC3 reporter was used to assess autophagy flux in Figure 1A. Please explain how autophagy status (high, medium, and low) was defined. It's also suggested to show WB of LC3 processing in different treatments as in Figure 1A at 48 hours.

      We thank the reviewer for this comment and agree that a more thorough description of how autophagy status is assessed using the Fluorescent Autophagy Reporter (FAR) would benefit the readers of our manuscript.  Cells engineered to express the FAR are analyzed by flow cytometry in which we defined autophagy status by gating viable (based on Sytox Blue staining), DMSO-treated control cells into three bins based on the ratio of EGFP:mCherry fluorescence.  We gate all live cells into the 33% highest EGFP-positive cells (autophagy low) and the 33% highest mCherry-positive cells (autophagy high), and therefore, the proportion in the middle is also approximately 33% and considered the medium autophagy status.  Again, these gates are based entirely on the DMSO-treated control cells, and all other treatments within the experiment are compared to settings on these gates.  In response to a specific manipulation (sotorasib, trametinib, DCC-3116 etc) we assess how the specific treatment changes the percentages of cells in each of the pre-specified gates to assess increased autophagy (decreased EGFP:mCherry ratio) or decreased autophagy (increased EGFP:mCherry ratio). 
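
      The gating logic described above can be sketched in Python. This is a simplified illustration that treats the gates as tertiles of a single per-cell EGFP:mCherry ratio in the DMSO control; it is not the authors' flow cytometry software, and the function names and toy ratios are hypothetical.

```python
from statistics import quantiles

def tertile_gates(control_ratios):
    """Two cut points that split the control EGFP:mCherry ratios into thirds."""
    low_cut, high_cut = quantiles(control_ratios, n=3, method="inclusive")
    return low_cut, high_cut

def classify(ratio, low_cut, high_cut):
    """Autophagy status of one cell: a low EGFP:mCherry ratio means high autophagy."""
    if ratio <= low_cut:
        return "high"
    if ratio >= high_cut:
        return "low"
    return "medium"

def gate_fractions(ratios, low_cut, high_cut):
    """Fraction of cells falling in each gate fixed from the control sample."""
    counts = {"high": 0, "medium": 0, "low": 0}
    for ratio in ratios:
        counts[classify(ratio, low_cut, high_cut)] += 1
    return {status: count / len(ratios) for status, count in counts.items()}

# Toy per-cell ratios, invented for illustration only.
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
treated = [1.0, 1.0, 2.0, 2.0, 3.0, 5.0]

gates = tertile_gates(control)
control_fractions = gate_fractions(control, *gates)
treated_fractions = gate_fractions(treated, *gates)
```

      Because the gates are fixed on the control, a treatment that shifts cells toward lower EGFP:mCherry ratios shows up as a larger "high" fraction than the control's one third.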

      Although LC3 processing and/or the expression of p62SQSTM1 are used by others as markers of autophagy, there is much debate in the literature as to how reliable immunoblotting analysis of LC3 processing or p62SQSTM1 expression are as measures of autophagy.  Certainly, in our hands, we find that the Fluorescent Autophagy Reporter is a much more sensitive measure of changes in autophagy in various different cancer cell lines as we have described in previous papers (Kinsey et al., PMID: 30833748, Truong et al., PMID: 32933997 and Silvis & Silva et al., PMID: 36719686).  Furthermore, in the omnibus publication that describes techniques for measuring autophagy (Klionsky et al., PMID: 33634751) the use of the FAR (or similarly configured reporters) is regarded as the gold standard for measuring autophagy status in cells.  We have amended the Materials & Methods section of our manuscript to better describe the use of the FAR in measuring autophagy. 

      (2) For Figures 1J, K, and L, please provide immunohistochemistry (IHC) images demonstrating RAS downstream signaling blockade by sotorasib and autophagy blockade by DCC 3116 in tumors.

      We thank the reviewer for the comment and have probed the tumors from the xenograft experiments in Figures 1J, K, and L for pERK1/2 and p62SQSTM1 to determine the biochemical activity of sotorasib or DCC-3116, respectively and have provided representative images below. We observed the expected decrease in pERK and p62 signal after sotorasib treatment in all three xenografted cell lines. We did observe the expected accumulation of p62 in the DCC-3116 treated tumors from the NCI-H2122 and NCI-H358 cell lines. There appears to be no difference between the vehicle and DCC-3116 treated tumors in the NCI-H358 cell line-derived tumors as detected by IHC.

      Author response image 1.

      (3) Given that both DCC 3116 and ULK1K46N exhibit the ability to inhibit autophagy and synergize with sotorasib in inhibiting cell proliferation, in addition to demonstrating decreased levels of pATG13 via ELISA assay, please include Western blot analyses of LC3 or p62 to confirm the blockade of autophagy by DCC 3116 and ULK1K46N in Figure 1 & Figure 2.

      We appreciate the reviewer's comment and have performed an immunoblot analysis of cells treated with DCC-3116 or expressing ULK1K46N and probed for p62SQSTM1 and LC3 expression.  We did observe the expected accumulation of p62 SQSTM1 in NCI-H2122 (ULK1K46N) cells treated with 1ug/ml doxycycline to induce expression of ULK1K46N compared to DMSO treatment.  Additionally, we treated the human cell lines from Figure 1 with sotorasib and/or DCC-3116 and tested for p62SQSTM1 expression after 48 hours of treatment. In the human cell lines NCI-H2122 and NCI-H358, there was a decrease in the p62 signal with increasing doses of sotorasib, as expected. There was no detectable change in p62 levels in the Calu-1 cells by immunoblot. For LC3-I/LC3-II, there was only one detectable band in the NCI-H2122 cells, which makes it difficult to interpret the results and further emphasizes why we use the fluorescent autophagy reporter which is more sensitive than immunoblotting. There is no detectable change in LC3-I/LC3-II in the Calu-1 cells treated with increasing doses of sotorasib, but the expected decrease in LC3-I is observed with sotorasib treatment in the NCI-H358 cells.

      Author response image 2.

      (4) Since adenocarcinomas, adenosquamous carcinomas (ASC), and mucinous adenocarcinomas were detected in KL lung tumors, please conduct immunohistochemistry (IHC) to detect these tumors, including markers such as p63, SOX2, Keratin 5.

      We have included IHC analysis of the adenosquamous carcinomas for the markers p63, SOX2, and Keratin 5 from the KL mouse in Figure 3 and the ASC tumors in Supplemental Figure 4, and thank the reviewer for this excellent suggestion. The staining for these markers is shown below. Of note, we tried two different SOX2 antibodies (Cell Signaling Technology #14962 and Cell Signaling Technology #3728) and could not detect any staining in any section.

      Author response image 3.

      (5) Please provide the sample size (n) for each treatment group in the survival study (Figure 4E). It appears that all mice were sacrificed for tumor burden analysis in Figure 4F. However, there doesn't seem to be a significant difference among the treatment groups in Figure 4F, which contrasts with the survival analysis in Figure 4E. It is suggested to increase the sample size in each treatment group to reduce variation.

      We have updated Figure 4E to indicate sample size for each treatment group and thank the reviewer for this suggestion.  Any mice that remained on study through the entire 8-week treatment regimen were sacrificed after the last day of treatment (Day 56).  Figure 4F indicates analysis of total tumor burden in all mice that remained on treatment for the full 8 weeks and mice that reached euthanasia criteria before the end of the 8-week treatment.  Therefore, it is important to note that the mice in Figure 4F were not all euthanized on the same day.  There is no statistically significant difference between the 3 treatment groups (sotorasib, DCC-3116, combination).  This may be due to a lower sample size as well as ending the treatment at 8 weeks as opposed to continuing the treatment for a longer period of time.  Although we agree that increasing the sample size would benefit the study, due to how long the GEMM model experiments take (12-16 weeks of breeding, 6 weeks for the mice to reach adulthood, 10 weeks of tumor formation post-initiation, 8 weeks of treatment= ~40 weeks) we would respectfully submit that the analysis of additional mice is outside the scope of the current revised manuscript.

      (6) In KP mice (Figure 5), it seems that a single treatment alone is sufficient to inhibit established KP lung tumor growth. Combination treatment does not further enhance anti-tumor efficacy. Therefore, this result doesn't support the conclusion generated from human cancer cell lines. Please discuss.

      We thank the reviewer for this observation.  Indeed, KP lung tumors were sensitive to single agent DCC-3116 treatment, which is reflected in the tumor burden analysis.  This was somewhat surprising to us as we have not previously detected much anti-tumor activity using 4-aminoquinolines (chloroquine or hydroxychloroquine) or other autophagy inhibitors.  It should be noted however that the KRASG12C/TP53R175H NSCLC model has a very low tumor burden overall (~4% in vehicle-treated mice).  Additionally, our microCT imager cannot detect AAH and small tumors at the settings/resolution used.  Therefore, we were limited in our ability to detect small tumors or hyperplasia by microCT imaging.  Although there was a decrease in overall tumor burden with single agent DCC-3116 treatment, we could not demonstrate using microCT imaging that KRASG12C/TP53R175H lung tumors were actually regressing with single agent DCC-3116 treatment.  The larger tumors that were detected appeared to show a cytostatic effect (i.e. no or slow growth) with DCC-3116 monotherapy.  This may reflect our inability to detect regression of AAH or small tumors with the microCT.  In all human cell lines tested, the only cell line that responded to single agent DCC-3116 treatment was NCI-H358 cells, which do have a complete heterozygous loss of the TRP53 gene and lack TP53 protein.  However, other cells that also have a loss of TP53 expression (Calu-1) are insensitive to single-agent DCC-3116 treatment. Due to the low mutational burden of the KP mouse model compared to human NSCLC cell lines driven by mutationally-activated KRASG12C and the loss of TP53 function, it is difficult to directly compare GEM models to the human cell line models.  Most of the human cell lines have alterations in other genes that are not altered in the KP mouse model which could affect the sensitivity of treatment.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      (1) Figure legends are currently not adequate - information about the number and nature of replicates, stats, and definitions of the labelling used for stats should be added throughout. In Figure 5B, only two lines of four are labelled with * or ns.

      We thank the reviewer for this comment and have included more details in the figure legends that describe replicates, statistical analysis and definitions of labeling.  We also note that the methods section has a detailed description of the statistical analysis used.

      (2) What statistical test is performed on Figure 5E to get a p < 0.05 between the vehicle and DCC group?

      We performed a one-way ANOVA for all statistical analyses with more than 2 experimental groups. We thank the reviewer for pointing out this typo. These data points (vehicle vs. DCC-3116) are not statistically significant, which has been revised in the figure.

      (3) The manuscript figures would be improved by the use of a colourblind-friendly palette.

      We have previously published multiple manuscripts using this color scheme for the fluorescent autophagy reporter experiments and chose to use red and green as the reporter uses EGFP and mCherry.  We wanted to keep this color scheme consistent across our publications and would prefer not to change the colors.  However, we agree with the reviewer that the data should be accessible to all people and, therefore, have updated these graphs to include slashes over the red color to make the red and green easier to distinguish.  Thank you to the reviewer for this excellent suggestion.

      (4) The manuscript should be fully checked for mouse (sentence case) and human (caps) gene (italics) and protein (non-italics).

      In this manuscript we are using the nomenclatures approved by the HUGO Gene Nomenclature Committee (https://en.wikipedia.org/wiki/HUGO_Gene_Nomenclature_Committee) in which:

      Human genes are written as KRAS, TP53 etc i.e. ITALICIZED CAPS

      Mouse genes are written as Kras, Trp53 etc:  i.e. Italicized and sentence case

      Human and mouse proteins are written as KRAS, TP53 etc:  i.e. NON-ITALICIZED CAPS

      In response to the reviewer’s suggestion, we have gone through the manuscript to check for this and make any appropriate changes.  Of note, we intentionally refer to the mouse protein changes as KRASG12C/LKB1null or KRASG12C/TP53R172H (capitalized), as this references the protein change and not the nucleotide change that occurs in the gene.

      (5) Adenosquamous is the correct term for the disease.  In parts, it's referred to as adeno/squamous or adeno-squamous.  The abbreviation ADC is also defined many times.

      Thank you to the reviewer for this comment.  We have corrected the manuscript text to only use adenosquamous and only define ADC in the first instance.

      (6) Line 434 - "as previously described" but no reference.

      Typos:

      (1) Line 117 – either

      (2) Line 314 – synergistic

      (3) Line 317 – therefore

      (4) Line 502 – medium

      We thank the reviewer for pointing out these typos and have modified the text appropriately.

      Reviewer #2 (Recommendations For The Authors):

      (1) The statement on Page 4, Lines 119-120, lacks clarity: 'Furthermore, LKB1 silencing diminishes the sensitivity of KRASG12C/LKB1Null-driven lung cancer perhaps through the emergence of mixed adeno/squamous cell carcinomas and mucinous adenocarcinomas.' It is unclear whether this refers to the sensitivity to the combination treatment or to the KRASc inhibitor alone.

      We thank the reviewer for this comment and agree that the statement lacks clarity.  The intent of this statement was to refer to both single agent sotorasib treatment as well as the combination with DCC-3116.  

      (2) Page 5 Line 147 "KRASG12X ". Please correct this typo.

      We thank the reviewer for this comment, but this is not a typo. We intended for this line to state KRASG12X to refer to cell lines with any KRASG12 alteration, e.g., KRASG12D, KRASG12C, KRASG12S, KRASG12R, etc.

      (3) The color of the dots in Figure 5B labeling does not match the dots in the graph.

      For all bar graphs in the manuscript, the dots representing individual mice are black, and the bar itself is color-coded based on treatment type. The dots in Figure 5B follow this pattern and are intended to be this way.

      (4) Figure 5C depicts lung weight rather than tumor growth, contrary to the text description "regression of pre-existing lung tumors was detected by microCT scanning (Figure 5C, Figure S5)".

      Figure 5C does not depict lung weight but the percent body weight change in treated mice, described in the figure legend.  We thank the reviewer for pointing this out because we referenced the wrong panel in the text.  The figures referenced should be Figure 5B, Figure S5.  We have corrected this in the text.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors profile gene expression, chromatin accessibility and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T cell activation.

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as explore the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation.

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes.

      I thank the authors for their revisions in response to my initial review. The revised version now includes a more comprehensive comparative analysis of different datasets and V2G approaches and discusses the potential sources of differences in the results. Most significantly, the authors have now included an interesting comparison of their methodology with the popular ABC technique and outlined the key limitations of ABC relative to their method and other (Capture) Hi-C-based V2G approaches.

    2. eLife assessment

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation.

    3. Reviewer #2 (Public Review):

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling was a deep dive into the regulatory elements near the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation.

      Strengths:

      - The work done characterizing elements at the IL2 locus is impressive.

      Weaknesses:

      - There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq. Several analyses performed in published studies were similarly performed in this study. I expected the authors to at least briefly mention published studies and whether their conclusions generally agree or disagree. Are the same dynamic regulatory regions or genes identified upon T cell activation? Are the same TF footprints enriched in these dynamic regulatory elements? In the revision, I appreciate that the authors now include additional data from several studies that I had initially suggested for the purposes of nominating disease genes in their precision-recall analysis.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for native CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors linked 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others were novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression.

      Another strength of this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis-acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease-relevant SNPs, which were shown to affect IL-2 transcription.

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation.

      Weaknesses:

      A weakness of this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for the low overlap of their data with the eQTL-based approach or the HIEI truth set.

      The authors explain that the low overlap is not due to few GWAS associations by HiC. The expanded discussion in the revised manuscript provides a framework to help explain inherent differences between these methods that may contribute to the low overlap.

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary:

      The authors profile gene expression, chromatin accessibility, and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T-cell activation. 

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation. 

      Suggestions for improvement:

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes. 

      The obvious question that the authors don't venture into is why the results are quite different. In principle, this could be due to the differences between: 

      (a) the cell stimulation procedure 

      (b) the GWAS datasets used 

      (c)  the types of assay (Hi-C vs Capture Hi-C) 

      (d) approaches for defining gene-linked regions (loops vs neighbourhoods) 

      (e) how the GWAS signals at gene-linked regions are aggregated (e.g., the flavours of COGS in Javierre and Burren vs the authors' approach)

      Re (a), I'm not sure the authors make it explicitly clear in the main text that the Capture Hi-C-based studies also use *stimulated* CD4 T cells, particularly in the section "Comparative predictive power...". So the cells used are pretty much the same, and the differences likely arise from points (b) to (e).

      It would be useful for the community to understand more clearly what is driving these differences, ideally with some added data. Could the authors, for example, take the PCHi-C data from Javierre/Burren and use their GWAS data and variant-to-gene assignment algorithms? 

      We greatly appreciate the referee’s expert assessment of our work and its value to the field, and we are glad that the referee was enthused by our comparison of the predictive power of the various V2G approaches. A point not emphasized enough in the original version of the manuscript is that we actually did harmonize the various datasets in the way the referee suggests for the precision/recall analysis. We took the contact maps presented from each paper, mapped genes using the same set of GWAS SNPs, and defined all gene-linked regions using our loop calling approach. This has been clarified in the revised version of the manuscript. We have now included a more thoughtful discussion of the possible sources of discrepancy between the different studies included in the comparison, and our thoughts on the potential sources raised by the referee are outlined below:

      (a) The modes of stimulation used are similar between studies, but timepoints and donors did vary, and ours was the only study that sorted naïve CD4+ T cells before stimulation. These aspects could represent a source of variability. 

      (b) The GWAS is not a source of variability because we re-ran the raw data from all the orthogonal studies through our V2G pipeline using the same GWAS as in the current manuscript. 

      (c) The use of HiC vs. Capture HiC is a likely source of variability. The Capture-HiC datasets included in our comparison are lower resolution (i.e. HindIII) but focus higher sequencing depth at promoters compared to our HiC datasets – i.e., Capture-HiC may mis-call loops to the wrong promoters due to lower resolution as we have shown in our previous study [Su, Human Genetics, 2021], and will miss distal SNP interactions at promoters not included in the capture set. While HiC is unbiased in this regard, HiC will fail to call some SNP-promoter loops called by CaptureHiC because the sequencing power is not specifically focused at promoters. 

      (d) For studies using neighborhood approaches, we re-ran the raw data through our loop calling algorithm to connect distal SNPs to gene promoters, and regarding (e) above, we ran the raw data through our V2G pipeline to allow a better comparison.

      In addition, given that the authors use Hi-C, a popular method for V2G prioritisation for this type of data is currently ABC (Nasser et al, Nature 2021). Could the authors provide a comparative analysis with respect to the V2G assignments in the paper and, if they see it appropriate, also run ABC-based GWAS integration on their own Hi-C data?

      This is an excellent suggestion, which we have followed in the revised version of our manuscript. It should be noted (and we do so in the text of the revision) that there is an important caveat to bringing in the ABC model. Chromosome conformation-based approaches are biologically constrained (i.e., informed) by the natural structure of chromatin in the nucleus that controls how gene transcription is regulated in cis, and it does so in a way that brings value to GWAS data. However, the ABC model further constrains the input data by imposing non-biological filters that allow the algorithm to be applied, but impose artifactual limitations that may negatively impact interpretation and discovery. In addition to filtering out pseudogenes, bidirectional RNA, antisense RNAs, and small RNAs, the ABC model gene set eliminates genes ubiquitously expressed across tissues (based on the assumption that these genes are driven primarily by elements adjacent to their promoters) and only allows annotation of one promoter per gene, even though the median number of promoters per gene in the human genome is three. In contrast, our chromatin-based V2G removes pseudogenes, but includes lincRNA and small RNAs, and includes all alternative transcription start sites annotated by gencode. 

      To apply the ABC GWAS gene nomination model to our CD4+ T cell chromatin-based V2G data, we used our ATAC-seq data and publicly available CD4+ T cell H3K27ac ChIP-seq data as input, and integrated this with GWAS and the average ENCODE-derived HiC dataset from the original ABC paper. The activity-by-contact model nominated 650 genes, compared to 1836 genes when using our cell type-matched HiC data and analysis pipeline. Only 357 of these genes were nominated by both approaches; 1479 genes nominated by our approach were not nominated by ABC, while 293 genes not implicated by our approach were newly implicated by ABC. To determine how the ABC-constrained approach performs against the HIEI gold standard set, we subjected all datasets used for the comparison depicted in the new Figure 5D to the same promoter filter used by the ABC model as part of the precision-recall re-analysis. First, we found that applying the restricted ABC model promoter annotation to all datasets did not have a large effect on recall; however, the precision of several of the datasets was affected. For example, using the restricted promoter set reduced the precision of our (Pahl) V2G approach and inflated the precision of the nearest gene to SNP metric. Second, the new precision-recall analysis shows that the ABC score-based approach is only half as sensitive at predicting HIEI genes as the chromatin-based V2G approaches. This indicates that constraining GWAS data with cell type- and state-specific 3D chromatin-based data brings more GWAS target gene predictive power than application of the multi-tissue-averaged HiC used by the ABC model. We thank the reviewer for helpful suggestions that have improved the quality of our study.
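
      As an illustration only, the overlap counts quoted above can be checked with simple set arithmetic, and the precision and recall of a nominated gene list against a truth set can be computed in a few lines. The following is a minimal Python sketch, not the authors' pipeline: the function names `overlap_summary` and `precision_recall` are hypothetical, and the small gene lists in the usage lines are toy placeholders rather than the HIEI set or any real V2G output.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarize the overlap of two gene lists from their sizes.

    n_a, n_b  : number of genes nominated by approach A and approach B
    n_shared  : number of genes nominated by both
    """
    union = n_a + n_b - n_shared
    return {
        "only_a": n_a - n_shared,      # nominated by A but not B
        "only_b": n_b - n_shared,      # nominated by B but not A
        "union": union,                # nominated by at least one approach
        "jaccard": n_shared / union,   # shared fraction of the union
    }


def precision_recall(nominated, truth):
    """Precision and recall of a nominated gene set against a truth set."""
    nominated, truth = set(nominated), set(truth)
    true_positives = len(nominated & truth)
    precision = true_positives / len(nominated) if nominated else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Counts taken from the response text: 1836 genes (chromatin-based V2G),
# 650 genes (ABC), 357 genes nominated by both.
summary = overlap_summary(1836, 650, 357)
print(summary["only_a"], summary["only_b"])

# Toy example with made-up gene lists (NOT the study's data).
toy_nominated = ["IL2", "IL7R", "GENE_X"]
toy_truth = ["IL2", "IL7R", "IL2RB", "CTLA4"]
toy_precision, toy_recall = precision_recall(toy_nominated, toy_truth)
```

      With the counts from the text, the two "unique" values come out as 1479 and 293, matching the figures reported above. A short list can score high precision with low recall, which is the pattern described for small nomination sets.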

      Reviewer #2 (Public Review): 

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling there was a deep dive into additional study of regulatory elements nearby the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation. 

      Strengths:

      The work done characterizing elements at the IL2 locus is impressive. 

      Weaknesses:

      Missing critical context to evaluate claims. There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq that have been ignored by this study. How do conclusions from previous studies compare to what the authors conclude here? It is impossible to evaluate the claims without this additional context. These are a few studies I am familiar with (the authors should perform a more comprehensive search to be sure they're not ignoring existing observations) that would be important to compare/contrast conclusions:

      - Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424-431 (2018).

      - Calderon, D., Nguyen, M.L.T., Mezger, A. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet 51, 1494-1505 (2019).

      - Gate, R.E., Cheng, C.S., Aiden, A.P. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50, 1140-1150 (2018).

      - Glinos, D.A., Soskic, B., Williams, C. et al. Genomic profiling of T-cell activation suggests increased sensitivity of memory T cells to CD28 costimulation. Genes Immun 21, 390-408 (2020).

      - Gutierrez-Arcelus, M., Baglaenko, Y., Arora, J. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat Genet 52, 247-253 (2020).

      - Kim-Hellmuth, S. et al. Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat. Commun. 8, 266 (2017).

      - Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).

      - As a general point, I appreciate it when each claim includes a corresponding effect size and p-value, which helps me evaluate the strength of significance of supporting evidence. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. Our precision-recall analyses were not meant to represent an exhaustive comparison of all prior GWAS gene nomination studies, although we agree that this could (and should) be done as part of a separate study in a future manuscript. Instead, we focused on gene nomination studies that 1) analyzed resting and activated human CD4+ T cells, 2) whose experimental design was most comparable to our own studies, and 3) had raw data readily available in the appropriate formats to allow re-analysis and harmonization before comparison. This is a point we did not make sufficiently clear in the original version of the manuscript, but have clarified in the revision. 

      Based on this rationale, we agree that the studies by Gate et al. and Ye et al. should be included in our comparative precision-recall analysis, and we have done so in the revised manuscript. The Gate study reported ATAC-seq peak co-accessibility, caQTL, eQTL, and HiC data, and we now include the resulting gene nominations from these datasets in the precision-recall analysis. These datasets performed poorly with respect to nomination of HIEI genes, likely due to small sample numbers and low sequencing depth compared to the other eQTL and chromatin capture-based studies. The eQTL reported by Ye et al. nominated 15 genes for autoimmune traits, two of which were in the ‘truth’ HIEI set (IL7R and IL2RB). This resulted in low predictive power but a high precision due to the low number of nominated genes compared to the other V2G datasets. As suggested by referee 1, we have also subjected our data to the ‘activity-by-contact’ (ABC) algorithm and have included this dataset in the comparison as well. Please see Figure 5 in the revised manuscript.

      We have elected not to include data from the other studies suggested by the referee for the following reasons: The stimulation paradigm used in the Glinos study is very different from that used in other studies. Also, this study and the study by Calderon did not nominate genes. The studies by Alasoo et al. and Kim-Hellmuth et al. analyzed macrophages, which are not a comparable cell type to CD4+ T cells. The allele-specific eQTL study by Gutierrez-Arcelus et al. included the relevant cell type and activation states, but included a relatively small number of samples (24) and variants (561), and the raw data in dbGaP does not readily allow for re-analysis and harmonization with the other studies. We thank the reviewer for helpful suggestions that have improved the quality of our study.

      Reviewer #3 (Public Review): 

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for native CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors linked 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others are novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis-acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression. 

      Another strength of this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis-acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease-relevant SNPs, which were shown to affect IL-2 transcription. 

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation. 

      Weaknesses:

      A weakness of this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for the low overlap of their data with the eQTL-based approach or the HIEI truth set. 

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation. 

      We greatly appreciate the referee’s expert assessment of our work and emphasis on the value of our functional follow-up studies. We have ensured that all sample sizes, effect sizes, p values and FDR statistics are included in the figures and figure legends. We agree that including more donors for the HiC studies would increase the number of implicated variants and genes; however, all the chromatin-based V2G approaches described in our manuscript use relatively small sample sizes, yet implicate more variants and genes than the comparable eQTL studies. That is, the low overlap is not driven by a paucity of GWAS-chromatin-based associations. An alternative explanation for the low overlap between GWAS-chromatin-based approaches and eQTL approaches was recently proposed by Pritchard and colleagues, who reported that GWAS and eQTL studies systematically implicate different types of variants (Mostafavi et al., Nature Genetics 2023). Among other differences, eQTL tend to implicate nearby genes while GWAS variants implicate distant genes, and our results support this contention. We referred to this study in the original version of the manuscript, but have included a more extensive discussion of potential explanations in the revised version. We thank the reviewer for helpful suggestions that have improved the quality of our study.

    1. eLife assessment

      This is a useful manuscript describing the competitive binding between Parkin domains to define the importance of dimerization in the mechanism of Parkin regulation and catalytic activity. The evidence supporting the importance of Parkin dimerization for an 'in trans' model of Parkin activity described in this manuscript is solid, but it lacks a more stringent biochemical characterization of competitive binding that could provide more direct evidence to support the authors' conclusions. This work will be of interest to those focused on defining the molecular mechanisms involved in ubiquitin ligase interactions, PINK-Parkin-mediated mitophagy, and mitochondrial organellar quality control.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors used structural and biophysical methods to provide insight into Parkin regulation. The breadth of data supporting their findings was impressive and generally well-orchestrated.

      Strengths:

      (1) They have done a better job explaining the rationale for their experiments throughout.

      (2) The use of molecular scissors in their construct represents a creative approach to examine inter-domain interactions. Appropriate controls were included.

      (3) From my assessment, the experiments are well-conceived and executed.

      (4) The authors do a better job of highlighting the question being addressed experimentally.

    3. Reviewer #2 (Public Review):

      In the revised manuscript, the authors tried to address some of my comments from the previous round of review. Notably, they have performed some additional ITC experiments where protein precipitation is not an issue to probe interactions between PARKIN and different domains. In addition, they have toned down some of the language in the text to better reflect their data and results. However, I still feel that the manuscript lacks some key answers regarding the relative interactions between p-PARKIN and different domains, as discussed in my previous review. A deeper dive into the underlying biophysical and biochemical features that drive these interactions is important to fully understand the importance of their work. However, this manuscript does provide some interesting potential insights into the mechanisms of PARKIN activation that could be useful for the field moving forward.

    4. Reviewer #3 (Public Review):

      Summary:

      In their manuscript, Lenka et al. present data that could suggest an "in trans" model of Parkin ubiquitination activity. Parkin is an intensely studied E3 ligase implicated in mitophagy, whereby missense mutations to the PARK2 gene are known to cause autosomal recessive juvenile parkinsonism. From a mechanistic point of view, Parkin is extremely complex. Its activity is tightly controlled by several modes of auto-inhibition that must be released by cues of mitochondrial damage. While the general overview of Parkin activation has been mapped out in recent years, several details have remained murky. In particular, whether Parkin dimerizes as part of its feed-forward signaling mechanism, and whether said dimerization can facilitate ligase activation, has remained unclear. Here, Lenka et al. use various truncation mutants of Parkin in an attempt to understand the likelihood of dimerization (in support of an "in trans" model for catalysis).

      Strengths:

      The results are bolstered by several distinct approaches including analytical SEC with cleavable Parkin constructs, ITC interaction studies, ubiquitination assays, protein crystallography, and cellular localization studies.

      Weaknesses:

      As presented, however, the storyline is very confusing to follow and several lines of experimentation felt like distractions from the primary message. Furthermore, many experiments could only indirectly support the authors' conclusions, and therefore the final picture of what new features can be firmly added to the model of Parkin activation and function is unclear.

      Following peer review and revision, the claims are still not fully supported by direct evidence. While the experimental system may be necessary and/or convenient given the unique challenges in studying Parkin, it does not directly speak toward the conclusions that the authors make, nor does it provide an accurate representation of biology.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors used structural and biophysical methods to provide insight into Parkin regulation. The breadth of data supporting their findings was impressive and generally well-orchestrated. Still, the impact of their results builds on recent structural studies and the stated impact is based on these prior works.

      Strengths:

      (1) After reading through the paper, the major findings are:

      - RING2 and pUbl compete for binding to RING0.

      - Parkin can dimerize.

      - ACT plays an important role in enzyme kinetics.

      (2) The use of molecular scissors in their construct represents a creative approach to examining inter-domain interactions.

      (3) From my assessment, the experiments are well-conceived and executed.

      We thank the reviewer for their positive remark and extremely helpful suggestions.

      Weaknesses:

      The manuscript, as written, is NOT for a general audience. Admittedly, I am not an expert on Parkin structure and function, but I had to do a lot of homework to try to understand the underlying rationale and impact. This reflects, I think, that the work generally represents an incremental advance on recent structural findings.

      To this point, it is hard to understand the impact of this work without more information highlighting the novelty. There are several structures of Parkin in various auto-inhibited states, and it was hard to delineate how this is different.

      For the sake of the general audience, we have included all the details of Parkin structures and conformations seen (Extended Fig. 1). The structures in the present study are to validate the biophysical/biochemical experiments, highlighting key findings. For example, we solved the phospho-Parkin (complex with pUb) structure after treatment with 3C protease (Fig. 2C), which washes off the pUbl-linker, as shown in Fig 2B. The structure of the pUbl-linker depleted phospho-Parkin-pUb complex showed that RING2 returned to the closed state (Fig. 2C), which is confirmation of the SEC assay in Fig. 2B. Similarly, the structure of the pUbl-linker depleted phospho-Parkin R163D/K211N-pUb complex (Fig. 3C), was done to validate the SEC data showing displacement of pUbl-linker is independent of pUbl interaction with the basic patch on RING0 (Fig. 3B). In addition, the latter structure also revealed a new donor ubiquitin binding pocket in the linker (connecting REP and RING2) region of Parkin (Fig. 9). Similarly, trans-complex structure of phospho-Parkin (Fig. 4D) was done to validate the biophysical data (Fig. 4A-C, Fig. 5A-D) showing trans-complex between phospho-Parkin and native Parkin. The latter also confirmed that the trans-complex was mediated by interactions between pUbl and the basic patch on RING0 (Fig. 4D). Furthermore, we noticed that the ACT region was disordered in the trans-complex between phospho-Parkin (1-140 + 141-382 + pUb) (Fig. 8A) which had ACT from the trans molecule, indicating ACT might be present in the cis molecule. The latter was validated from the structure of trans-complex between phospho-Parkin with cis ACT (1-76 + 77-382 + pUb) (Fig. 8C), showing the ordered ACT region. The structural finding was further validated by biochemical assays (Fig. 8 D-F, Extended Data Fig. 9C-E).

      The structure of TEV-treated R0RBR (TEV) (Extended Data Fig. 4C) was done to ensure that the inclusion of TEV and treatment with TEV protease did not perturb Parkin folding, an important control for our biophysical experiments.

      As noted, I appreciated the use of protease sites in the fusion protein construct. It is unclear how the loop region might affect the protein structure and function. The authors worked to demonstrate that this did not introduce artifacts, but the biological context is missing.

      We thank the reviewer for appreciating the use of protease sites in the fusion protein construct. Protease sites were used to overcome the competing mode of binding that makes interactions very transient and beyond the detection limit of methods such as ITC or SEC. While these interactions are quite transient in nature, they could still be useful for the activation of various Parkin isoforms that lack either the Ubl domain or RING2 domain (Extended Data Fig. 6, Fig. 10). Our Parkin localization assays also suggest an important role of these interactions in the recruitment of Parkin molecules to the damaged mitochondria (Fig. 6).

      While it is likely that the binding is competitive between the Ubl and RING2 domains, the data is not quantitative. Is it known whether the folding of the distinct domains is independent? Or are there interactions that alter folding? It seems plausible that conformational rearrangements may invoke an orientation of domains that would be incompatible. The biological context for the importance of this interaction was not clear to me.

      This is a great point. In the revised manuscript, we have included quantitative data between phospho-Parkin and untethered ∆Ubl-Parkin (TEV) (Fig. 5B) showing interactions similar to those between phospho-Parkin K211N and untethered ∆Ubl-Parkin (TEV) (Fig. 4B). The Ubl domain and various combinations of RING domains lacking Ubl appear to fold properly, as does the RING2 domain on its own. However, human Parkin lacking the RING2 domain seems to have folding issues, mainly due to exposure of the hydrophobic pocket on RING0, as also suggested by previous efforts (Gladkova et al., ref. 24; Sauve et al., ref. 29). This could be overcome by co-expressing the Parkin construct lacking RING2 with PINK1 (Sauve et al., ref. 29), as phospho-Ubl binds the same hydrophobic pocket on RING0 where RING2 binds. A drastic reduction in the melting temperature of phospho-Parkin (Gladkova et al., ref. 24), very likely due to exposure of the hydrophobic surface between RING0 and RING2, correlates with the folding issues of human Parkin constructs in which RING0 is exposed.

      In the biological context, the competition between the phospho-Ubl and RING2 domains could block non-specific interaction of phosphorylated ubiquitin-like proteins (phospho-Ub or phospho-NEDD8) with RING0 (Lenka et al., ref. 33) during Parkin activation.

      (5) What is the rationale for mutating Lys211 to Asn? Were other mutations tried? Glu? Ala? Just missing the rationale. I think this may have been identified previously in the field, but not clear what this mutation represents biologically.

      Lys211Asn is a Parkinson’s disease mutation; therefore, we decided to use the same mutation for biophysical studies.  

      I was confused about how the phospho-proteins were generated. After looking through the methods, there appear to be phosphorylation experiments, but it is unclear what the efficiency was for each protein (i.e. what % gets modified). In the text, the authors refer to phospho-Parkin (T270R, C431A), but not clear how these mutations might influence this process. I gather that these are catalytically inactive, but it is unclear to me how this is catalyzing the ubiquitination in the assay.

      This is an excellent question. Because different phosphorylation statuses would affect the analysis, we ensured complete phosphorylation status using Phos-Tag SDS-PAGE, as shown below.

      Author response image 1.

      Our biophysical experiments in Fig. 5C show that trans complex formation is mediated by interactions between the basic patch (comprising K161, R163, K211) on RING0 and the phospho-Ubl domain in trans. These interactions result in the displacement of RING2 (Fig. 5C). Parkin activation is mediated by displacement of RING2 and exposure of catalytic C431 on RING2. While phospho-Parkin T270R/C431A is catalytically dead, the phospho-Ubl domain of phospho-Parkin T270R/C431A would bind to the basic patch on RING0 of WT-Parkin, resulting in activation of WT-Parkin as shown in Fig. 5E. A schematic figure is shown below to explain the same.

      Author response image 2.

      (7) The authors note that "ACT can be complemented in trans; however, it is more efficient in cis", but it is unclear whether both would be important or if the favored interaction is dominant in a biological context.

      First, this is an excellent question about the biological context of ACT, and it needs further exploration. Because ACT is flexible, it can be complemented both in cis and in trans. We can only speculate that cis interactions between ACT and RING0 are more relevant in the biological context: during protein synthesis and folding, ACT would be translated before RING2 and would thus occupy the small hydrophobic patch on RING0 in cis. Unpublished data show the replacement of the ACT region by Biogen compounds to activate Parkin (https://doi.org/10.21203/rs.3.rs-4119143/v1). The latter finding further suggests flexibility in this region.

      (8) The authors repeatedly note that this study could aid in the development of small-molecule regulators against Parkin to treat PD, but this is a long way off. And it is not clear from their manuscript how this would be achieved. As stated, this is conjecture.

      As suggested by this reviewer, we have removed this point in the revised manuscript.

      Reviewer #2 (Public Review):

      This manuscript uses biochemistry and X-ray crystallography to further probe the molecular mechanism of Parkin regulation and activation. Using a construct that incorporates cleavage sites between different Parkin domains to increase the local concentration of specific domains (i.e., molecular scissors), the authors suggest that competitive binding between the p-Ubl and RING2 domains for the RING0 domain regulates Parkin activity. Further, they demonstrate that this competition can occur in trans, with a p-Ubl domain of one Parkin molecule binding the RING0 domain of a second monomer, thus activating the catalytic RING1 domain. In addition, they suggest that the ACT domain can similarly bind and activate Parkin in trans, albeit at a lower efficiency than that observed for p-Ubl. The authors also suggest from crystal structure analysis and some biochemical experiments that the linker region between RING2 and repressor elements interacts with the donor ubiquitin to enhance Parkin activity.

      Ultimately this manuscript challenges previous work suggesting that the p-Ubl domain does not bind to the Parkin core in the mechanism of Parkin activation. The use of the 'molecular scissors' approach to probe these effects is an interesting approach to probe this type of competitive binding. However, there are issues with the experimental approach that detract from the overall quality and potential impact of the work.

      We thank the reviewer for their positive remark and constructive suggestions.

      The competitive binding between p-Ubl and RING2 domains for the Parkin core could have been better defined using biophysical and biochemical approaches that explicitly define the relative affinities that dictate these interactions. A better understanding of these affinities could provide more insight into the relative bindings of these domains, especially as it relates to the in trans interactions.

      This is an excellent point regarding the relative affinities of pUbl and RING2 for the Parkin core (lacking Ubl and RING2). While we could purify p-Ubl, we failed to purify human Parkin lacking RING2 and phospho-Ubl. The folding issues with this construct were likely due to the exposure of a highly hydrophobic surface on RING0 (as shown below) in the absence of pUbl and RING2 in the R0RB construct. Also, RING2 with an exposed hydrophobic surface would be prone to folding issues, which makes it unsuitable for affinity measurements. A drastic reduction in the melting temperature of phospho-Parkin (Gladkova et al., ref. 24) also highlights the importance of the hydrophobic surface between RING0 and RING2 for Parkin folding/stability. A separate study would be required to try these Parkin constructs from different species and ensure proper folding before using them for affinity measurements.

      Author response image 3.

      I also have concerns about the results of using molecular scissors to 'increase local concentrations' and allow for binding to be observed. These experiments are done primarily using proteolytic cleavage of different domains followed by size exclusion chromatography. ITC experiments suggest that the binding constants for these interactions are in the µM range, although these experiments are problematic as the authors indicate in the text that protein precipitation was observed during these experiments. This type of binding could easily be measured in other assays. My issue relates to the ability of a protein complex (comprising the core and cleaved domains) with a Kd of 1 µM to be maintained in an SEC experiment. The off-rates for these complexes must be exceedingly slow, which doesn't really correspond to the low µM binding constants discussed in the text. How do the authors explain this? What is driving the Koff to levels sufficiently slow to prevent dissociation by SEC? Considering that the authors are challenging previous work describing the lack of binding between the p-Ubl domain and the core, these issues should be better resolved in this current manuscript. Further, it's important to have a more detailed understanding of relative affinities when considering the functional implications of this competition in the context of full-length Parkin. Similar comments could be made about the ACT experiments described in the text.

      This is a great point. In the revised manuscript, we repeated ITC measurements in a different buffer system, which gave nice ITC data. In the revised manuscript, we have also performed ITC measurements using native phospho-Parkin. Phospho-Parkin and untethered ∆Ubl-Parkin (TEV) (Fig. 5B) show similar affinities as seen between phospho-Parkin K211N and untethered ∆Ubl-Parkin (TEV) (Fig. 4B). However, Kd values were consistent in the range of 1.0 ± 0.4 µM which could not address the reviewer’s point regarding slow off-rate. The crystal structure of the trans-complex of phospho-Parkin shows several hydrophobic and ionic interactions between p-Ubl and Parkin core, suggesting a strong interaction and, thus, justifying the co-elution on SEC. Additionally, ITC measurements between E2-Ub and P-Parkin-pUb show similar affinity (Kd = 0.9 ± 0.2 µM) (Kumar et al., 2015, EMBO J.), and yet they co-elute on SEC (Kumar et al., 2015, EMBO J.).
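
      As a rough illustration of the off-rate question raised above, the relationship between Kd and the dissociation rate can be written out. The association rate constant used here is an assumed, typical order of magnitude for protein-protein binding, not a value measured in this study; only the Kd of about 1 µM comes from the ITC data.

      ```latex
      K_d = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}
      \quad\Rightarrow\quad
      k_{\mathrm{off}} = K_d \, k_{\mathrm{on}}

      % Assumed k_on = 10^5 to 10^6 M^-1 s^-1 (hypothetical, not measured here)
      k_{\mathrm{off}} \approx (10^{-6}\,\mathrm{M}) \times (10^{5}\text{--}10^{6}\,\mathrm{M^{-1}\,s^{-1}})
                       = 0.1\text{--}1\,\mathrm{s^{-1}}

      t_{1/2} = \frac{\ln 2}{k_{\mathrm{off}}} \approx 0.7\text{--}7\,\mathrm{s}
      ```

      Under that assumption, the half-life of an individual complex would be on the order of seconds. A slower on-rate would imply a proportionally slower off-rate at the same Kd, so the kinetics cannot be inferred from Kd alone.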

      Ultimately, this work does suggest additional insights into the mechanism of Parkin activation that could contribute to the field. There is a lot of information included in this manuscript, giving it breadth, albeit at the cost of depth for the study of specific interactions. Further, I felt that the authors oversold some of their data in the text, and I'd recommend being a bit more careful when claiming an experiment 'confirms' a specific model. In many cases, there are other models that could explain similar results. For example, in Figure 1C, the authors state that their crystal structure 'confirms' that "RING2 is transiently displaced from the RING0 domain and returns to its original position after washing off the p-Ubl linker". However, it isn't clear to me that RING2 ever dissociated when prepared this way. While there are issues with the work that I feel should be further addressed with additional experiments, there are interesting mechanistic details suggested by this work that could improve our understanding of Parkin activation. However, the full impact of this work won't be fully appreciated until there is a more thorough understanding of the regulation and competitive binding between p-Ubl and RING2 to R0RB both in cis and in trans.

      We thank the reviewer for their positive comment. In the revised manuscript, we have included the reviewer’s suggestion. The conformational changes in phospho-Parkin were established from the SEC assay (Fig. 2A and Fig. 2B), which show displacement/association of phospho-Ubl or RING2 after treatment of phospho-Parkin with 3C and TEV, respectively. For crystallization, we first phosphorylated Parkin, where RING2 is displaced due to phospho-Ubl (as shown in SEC), followed by treatment with 3C protease, which led to pUbl wash-off. The Parkin core separated from phospho-Ubl on SEC was used for crystallization and structure determination in Fig. 2C, where RING2 returned to the RING0 pocket, which confirms SEC data (Fig. 2B).

      Reviewer #3 (Public Review):

      Summary:

      In their manuscript "Additional feedforward mechanism of Parkin activation via binding of phospho-UBL and RING0 in trans", Lenka et al. present data that could suggest an "in trans" model of Parkin ubiquitination activity. Parkin is an intensely studied E3 ligase implicated in mitophagy, whereby missense mutations to the PARK2 gene are known to cause autosomal recessive juvenile parkinsonism. From a mechanistic point of view, Parkin is extremely complex. Its activity is tightly controlled by several modes of auto-inhibition that must be released by cues of mitochondrial damage. While the general overview of Parkin activation has been mapped out in recent years, several details have remained murky. In particular, whether Parkin dimerizes as part of its feed-forward signaling mechanism, and whether said dimerization can facilitate ligase activation, has remained unclear. Here, Lenka et al. use various truncation mutants of Parkin in an attempt to understand the likelihood of dimerization (in support of an "in trans" model for catalysis).

      Strengths:

      The results are bolstered by several distinct approaches including analytical SEC with cleavable Parkin constructs, ITC interaction studies, ubiquitination assays, protein crystallography, and cellular localization studies.

      We thank the reviewer for their positive remark.

      Weaknesses:

      As presented, however, the storyline is very confusing to follow and several lines of experimentation felt like distractions from the primary message. Furthermore, many experiments could only indirectly support the authors' conclusions, and therefore the final picture of what new features can be firmly added to the model of Parkin activation and function is unclear.

      We thank the reviewer for their constructive criticism, which has helped us to improve the quality of this manuscript.

      Major concerns:

      (1) This manuscript solves numerous crystal structures of various Parkin components to help support their idea of in trans transfer. The way these structures are presented more resemble models and it is unclear from the figures that these are new complexes solved in this work, and what new insights can be gleaned from them.

      The structures in the present study are to validate the biophysical/biochemical experiments highlighting key findings. For example, we solved the phospho-Parkin (complex with pUb) structure after treatment with 3C protease (Fig. 2C), which washes off the pUbl-linker, as shown in Fig. 2B. The structure of pUbl-linker depleted phospho-Parkin-pUb complex showed that RING2 returned to the closed state (Fig. 2C), which is confirmation of the SEC assay in Fig. 2B. Similarly, the structure of the pUbl-linker depleted phospho-Parkin R163D/K211N-pUb complex (Fig. 3C), was done to validate the SEC data showing displacement of pUbl-linker is independent of pUbl interaction with the basic patch on RING0 (Fig. 3B). In addition, the latter structure also revealed a new donor ubiquitin binding pocket in the linker (connecting REP and RING2) region of Parkin (Fig. 9). Similarly, trans-complex structure of phospho-Parkin (Fig. 4D) was done to validate the biophysical data (Fig. 4A-C, Fig. 5A-D) showing trans-complex between phospho-Parkin and native Parkin. The latter also confirmed that the trans-complex was mediated by interactions between pUbl and the basic patch on RING0 (Fig. 4D). Furthermore, we noticed that the ACT region was disordered in the trans-complex between phospho-Parkin (1-140 + 141-382 + pUb) (Fig. 8A) which had ACT from the trans molecule, indicating ACT might be present in the cis molecule. The latter was validated from the structure of trans-complex between phospho-Parkin with cis ACT (1-76 + 77-382 + pUb) (Fig. 8C), showing the ordered ACT region. The structural finding was further validated by biochemical assays (Fig. 8 D-F, Extended Data Fig. 9C-E).

      The structure of TEV-treated R0RBR (TEV) (Extended Data Fig. 4C) was done to ensure that the inclusion of TEV and treatment with TEV protease did not perturb Parkin folding, an important control for our biophysical experiments.

      (2) There are no experiments that definitively show the in trans activation of Parkin. The binding experiments and size exclusion chromatography are a good start, but the way these experiments are performed, they'd be better suited as support for a stronger experiment showing Parkin dimerization. In addition, the rationale for an in trans activation model is not convincingly explained until the concept of Parkin isoforms is introduced in the Discussion. The authors should consider expanding this concept into other parts of the manuscript.

      We thank the reviewer for appreciating the Parkin dimerization. Our biophysical data in Fig. 5C show that Parkin dimerization is mediated by interactions between phospho-Ubl and RING0 in trans, leading to the displacement of RING2. However, the Parkin K211N mutation (on RING0) perturbs interaction with phospho-Parkin and leads to loss of Parkin dimerization and loss of RING2 displacement (Fig. 5C). The interaction between pUbl and the K211 pocket on RING0 leads to the displacement of RING2, resulting in Parkin activation as catalytic residue C431 on RING2 is exposed for catalysis. The biophysical experiment is further confirmed by a biochemical experiment where the addition of catalytically inactive phospho-Parkin T270R/C431A activates autoinhibited WT-Parkin in trans using the mechanism as discussed (a schematic representation is also shown in Author response image 2).

      We thank this reviewer regarding Parkin isoforms. In the revised manuscript, we have included Parkin isoforms in the results section, too.

      (2a) For the in trans activation experiment using wt Parkin and pParkin (T270R/C431A) (Figure 3D), there needs to be a large excess of pParkin to stimulate the catalytic activity of wt Parkin. This experiment has low cellular relevance as these point mutations are unlikely to occur together to create this nonfunctional pParkin protein. In the case of pParkin activating wt Parkin (regardless of artificial point mutations inserted to study specifically the in trans activation), if there needs to be much more pParkin around to fully activate wt Parkin, isn't it just more likely that the pParkin would activate in cis?

      To test phospho-Parkin as an activator of Parkin in trans, we wanted to use the catalytically inactive version of phospho-Parkin to avoid the background activity of p-Parkin. While it is true that a large excess of pParkin (T270R/C431A) is required to activate WT-Parkin in the in vitro set-up, it is not very surprising as in WT-Parkin, the unphosphorylated Ubl domain would block the E2 binding site on RING1. Also, due to interactions between pParkin (T270R/C431A) molecules, the net concentration of pParkin (T270R/C431A) as an activator would be much lower. However, the Ubl blocking E2 binding site on RING1 won’t be an issue between phospho-Parkin molecules or between Parkin isoforms (lacking Ubl domain or RING2).

      (2ai) Another underlying issue with this experiment is that the authors do not consider the possibility that the increased activity observed is a result of increased "substrate" for auto-ubiquitination, as opposed to any role in catalytic activation. Have the authors considered looking at Miro as a substrate in order to control for this?

      This is quite an interesting point. However, this will be only possible if Parkin is ubiquitinated in trans, as auto-ubiquitination is possible with active Parkin and not with catalytically dead (phospho-Parkin T270R, C431A) or autoinhibited (WT-Parkin). Also, in the previous version of the manuscript, where we used only phospho-Ubl as an activator of Parkin in trans, we tested Miro1 ubiquitination and auto-ubiquitination, and the results were the same (Author response image 4).

      Author response image 4.

      (2b) The authors mention a "higher net concentration" of the "fused domains" with RING0, and use this to justify artificially cleaving the Ubl or RING2 domains from the Parkin core. This fact should be moot. In cells, it is expected there will only be a 1:1 ratio of the Parkin core with the Ubl or RING2 domains. To date, there is no evidence suggesting multiple pUbls or multiple RING2s can bind the RING0 binding site. In fact, the authors here even show that either the RING2 or pUbl needs to be displaced to permit the binding of the other domain. That being said, there would be no "higher net concentration" because there would always be the same molar equivalents of Ubl, RING2, and the Parkin core.

      We apologize for the confusion. “Higher net concentration” is with respect to fused domains versus the domain provided in trans. Due to the competing nature of the interactions between pUbl/RING2 and RING0, the interactions are too transient and beyond the detection limit of the biophysical techniques. When domains are fused in the same polypeptide (for example, RING0-RING2), their effective concentrations are much higher than those of a domain provided in trans (for example, pUbl); thus, biophysical methods fail to detect the trans interaction. Protease treatment removes this effective-concentration advantage of the fused domain, so trans interactions can be measured using biophysical techniques. However, these interactions and conformational changes are very transient, as the data also suggest. Therefore, Parkin molecules will not remain stably associated; rather, Parkin will transiently interact with and activate other Parkin molecules in trans.

      (2c) A larger issue remaining in terms of Parkin activation is the lack of clarity surrounding the role of the linker (77-140); particularly whether its primary role is to tether the Ubl to the cis Parkin molecule versus a role in permitting distal interactions to a trans molecule. The way the authors have conducted the experiments presented in Figure 2 limits the possible interactions that the activated pUbl could have by (a) ablating the binding site in the cis molecule with the K211N mutation; (b) further blocking the binding site in the cis molecule by keeping the RING2 domain intact. These restrictions to the cis parkin molecule effectively force the pUbl to bind in trans. A competition experiment to demonstrate the likelihood of cis or trans activation in direct comparison with each other would provide stronger evidence for trans activation.

      This is an excellent point. In the revised manuscript, we have performed experiments using native phospho-Parkin (Revised Figure 5), and the results are consistent with those in Figure 2 (Revised Figure 4), where we used the K211N mutation.

      (3) A major limitation of this study is that the authors interpret structural flexibility from experiments that do not report directly on flexibility. The analytical SEC experiments report on binding affinity and more specifically off-rates. By removing the interdomain linkages, the accompanying on-rate would be drastically impacted, and thus the observations are disconnected from a native scenario. Likewise, observations from protein crystallography can be consistent with flexibility, but certainly should not be directly interpreted in this manner. Rigorous determination of linker and/or domain flexibility would require alternative methods that measure this directly.

      We agree with the reviewer that these methods do not directly capture structural flexibility, and that rigorous determination of linker flexibility would require alternative methods that measure it directly. However, due to the complex nature of the interactions and technical limitations, breaking the interdomain linkages was the best possible way to capture interactions in trans. Interestingly, all previous methods that report cis interactions between pUbl and RING0 also used a similar approach (Gladkova et al., ref. 24; Sauve et al., ref. 29).

      (4) The analysis of the ACT element comes across as incomplete. The authors make a point of a competing interaction with Lys48 of the Ubl domain, but the significance of this is unclear. It is possible that this observation could be an overinterpretation of the crystal structures. Additionally, the rationale for why the ACT element should or shouldn't contribute to in trans activation of different Parkin constructs is not clear. Lastly, the conclusion that this work explains the evolutionary nature of this element in chordates is highly overstated.

      We agree with the reviewer that the significance of Lys48 is unclear. We have presented this just as one of the observations from the crystal structure. As the reviewer suggested, we have removed the sentence about the evolutionary nature of this element from the revised manuscript.

      (5) The analysis of the REP linker element also seems incomplete. The authors identify contacts to a neighboring pUb molecule in their crystal structure, but the connection between this interface (which could be a crystallization artifact) and their biochemical activity data is not straightforward. The analysis of flexibility within this region using crystallographic and AlphaFold modeling observations is very indirect. The authors also draw parallels with linker regions in other RBR ligases that are involved in recognizing the E2-loaded Ub. Firstly, it is not clear from the text or figures whether the "conserved" hydrophobic within the linker region is involved in these alternative Ub interfaces. And secondly, the authors appear to jump to the conclusion that the Parkin linker region also binds an E2-loaded Ub, even though their original observation from the crystal structure seems inconsistent with this. The entire analysis feels very preliminary and also comes across as tangential to the primary storyline of in trans Parkin activation.

      We agree with the reviewer that crystal structure data and biochemical data are not directly linked. In the revised manuscript, we have also highlighted the conserved hydrophobic in the linker region at the ubiquitin interface (Fig. 9C and Extended Data Fig. 11A), which was somehow missed in the original manuscript. We want to add that a very similar analysis and supporting experiments identified donor ubiquitin-binding sites on the IBR and helix connecting RING1-IBR (Kumar et al., Nature Str. and Mol. Biol., 2017), which several other groups later confirmed. In the mentioned study, the Ubl domain of Parkin from the symmetry mate Parkin molecule was identified as a mimic of “donor ubiquitin” on IBR and helix connecting RING1-IBR.

      In the present study, a neighboring pUb molecule in the crystal structure is identified as a donor ubiquitin mimic (Fig. 9C) by supporting biophysical/biochemical experiments. First, we show that mutation of I411A in the REP linker of Parkin perturbs Parkin interaction with E2~Ub (donor) (Fig. 9F). Another supporting experiment was performed using a Ubiquitin-VS probe assay, which is independent of E2. Assays using Ubiquitin-VS show that I411A mutation in the REP-RING2 linker perturbs Parkin charging with Ubiquitin-VS (Extended Data Fig. 11 B). Furthermore, the biophysical data showing loss of Parkin interaction with donor ubiquitin is further supported by ubiquitination assays. Mutations in the REP-RING2 linker perturb the Parkin activity (Fig. 9E), confirming biophysical data. This is further confirmed by mutations (L71A or L73A) on ubiquitin (Extended Data Fig. 11C), resulting in loss of Parkin activity. The above experiments nicely establish the role of the REP-RING2 linker in interaction with donor ubiquitin, which is consistent with other RBRs (Extended Data Fig. 11A).

      While we agree with the reviewer that this appears tangential to the primary storyline in trans-Parkin activation, we decided to include this data because it could be of interest to the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) For clarity, a schematic of the domain architecture of Parkin would be helpful at the outset in the main figures. This will help with the introduction to better understand the protein organization. This is lost in the Extended Figure in my opinion.

      We thank the reviewer for suggesting this, which we have included in Figure 1 of the revised manuscript.

      (2) Related to the competition between the Ubl and RING2 domains, can competition be shown through another method? SPR, ITC, etc? ITC was used in other experiments, but only in the context of mutations (Lys211Asn)? Can this be done with WT sequence?

      This is an excellent suggestion. In the revised Figure 5, we have performed an ITC experiment using WT Parkin, and the results are consistent with what we observed using Lys211Asn Parkin.

      (3) The authors also note that "the AlphaFold model shows a helical structure in the linker region of Parkin (Extended Data Figure 10C), further confirming the flexible nature of this region"... but the secondary structure would not be inherently flexible. This is confusing.

      The flexibility is in terms of the conformation of this linker region observed under the open or closed state of Parkin. In the revised manuscript, we have explained this point more clearly.

      (4) The manuscript needs extensive revision to improve its readability. Minor grammatical mistakes were prevalent throughout.

      We thank the reviewer for pointing this out, and we have corrected these mistakes in the revised manuscript.

      (5) The confocal images are nice, but inset panels may help highlight the regions of interest (ROIs).

      This is corrected in the revised manuscript.

      (6) Trans is misspelled ("tans") towards the end of the second paragraph on page 16.

      This is corrected in the revised manuscript.

      (7) The schematics are helpful, but some of the lettering in Figure 2 is very small.

      This is corrected in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      (1) A significant portion of the results section refers to the supplement, making the overall readability very difficult.

      We accept this issue: a lot of relevant data could not be added to the main figures and thus ended up in the supplement. In the revised manuscript, we have moved some of the supplementary figures to the main figures.

      (2) Interpretation of the experiments utilizing many different Parkin constructs and cleavage scenarios (particularly the SEC and crystallography experiments) is extremely difficult. The work would benefit from a layout of the Parkin model system, highlighting cleavage sites, key domain terminology, and mutations used in the study, presented together and early on in the manuscript. Using this to identify a simpler system of referencing Parkin constructs would also be a large improvement.

      This is a great suggestion. We have included these points in the revised manuscript, which has improved the readability.

      (3) Lines 81-83; the authors say they "demonstrate the conformational changes in Parkin during the activation process", but fail to show any actual conformational changes. Further, much of what is demonstrated in this work (in terms of crystal structures) corroborates existing literature. The authors should use caution not to overstate their original conclusions in light of the large body of work in this area.

      We thank the reviewer for pointing this out. We have corrected the above statement in the revised manuscript to indicate that we meant it in the context of trans conformational changes.

      (4) Line 446 and 434; there is a discrepancy about which amino acid is present at residue 409. Is this a K408 typo? The authors also present mutational work on K416, but this residue is not shown in the structure panel.

      We thank the reviewer for pointing this out. In the revised manuscript, we have corrected these typos.

    1. eLife assessment

      This study presents important findings on the different polymorphs of alpha-synuclein filaments that form at various pH's during in vitro assembly reactions with purified recombinant protein. Of particular note is the discovery of two new polymorphs (1M and 5A) that form in PBS buffer at pH 7. The strength of the evidence presented is convincing. The work will be of interest to biochemists and biophysicists working on protein aggregation and amyloids.

    2. Reviewer #2 (Public Review):

      Summary:

      This is an exciting paper that explores the in vitro assembly of recombinant alpha-synuclein into amyloid filaments. The authors changed the pH and the composition of the assembly buffers, as well as the presence of different types of seeds, and analysed the resulting structures by cryo-EM.

      Strengths:

      By doing experiments at different pHs, the authors found that so-called type-2 and type-3 polymorphs form in a pH-dependent manner. In addition, they find that type-1 filaments form in the presence of phosphate ions. One of their in vitro assembled type-1 polymorphs is similar to the alpha-synuclein filaments that were extracted from the brain of an individual with juvenile-onset synucleinopathy (JOS). They hypothesize that additional densities in a similar place as additional densities in the JOS fold correspond to phosphate ions.

      Comments on the revised version:

      This is OK now. I thank the authors for their constructive engagement with my comments.

    3. Reviewer #3 (Public Review):

      Summary

      The highly heterogeneous nature of α-synuclein (α-syn) fibrils has posed significant challenges for structural reconstruction of the ex vivo conformation. A deeper understanding of the factors influencing the formation of various α-syn polymorphs remains elusive. The manuscript by Frey et al. provides a comprehensive exploration of how pH variations (ranging from 5.8 to 7.4) affect the selection of α-syn polymorphs (specifically, Type 1, 2 and 3) in vitro by using cryo-electron microscopy (cryo-EM) and helical reconstruction techniques. Crucially, the authors identify two novel polymorphs at pH 7.0 in PBS. These polymorphs bear resemblance to the structure of the patient-derived juvenile-onset synucleinopathy (JOS) polymorph and diseased tissue amplified α-syn fibrils. The revised manuscript more strongly supports the notion that seeding is non-polymorph-specific in the context of secondary nucleation-dominated aggregation, underscoring the irreplaceable role of pH in polymorph formation.

      Strengths

      This study systematically investigates the effects of environmental conditions and seeding on the structure of α-syn fibrils. It emphasizes the significant influence of environmental factors, especially pH, in determining the selection of α-syn polymorphs. The high-resolution structures obtained through cryo-EM enable a clear characterization of the composition and proportion of each polymorph in the sample. Collectively, this work provides strong support for the pronounced sensitivity of α-syn fibril structures to the environmental conditions and systematically categorizes previously reported α-syn fibril structures. Furthermore, the identification of a JOS-like polymorph also demonstrates the possibility of in vitro reconstruction of brain-derived α-syn fibril structures.

      Weaknesses

      All my previous concerns have been resolved to my satisfaction.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Revisions Round 1

      Reviewer #1

      We thank the reviewer for their careful reading of our manuscript and have taken all of their grammatical corrections into account.

      Reviewer #2 (Public Review): 

      Weaknesses: 

      The paper contains multiple instances of non-scientific language, as indicated below. It would also benefit from additional details on the cryo-EM structure determination in the Methods and inclusion of commonly accepted requirements for cryo-EM structures, like examples of 2D class averages, raw micrographs, and FSC curves (between half-maps as well as between rigid-body fitted (or refined) atomic models of the different polymorphs and their corresponding maps). In addition, cryo-EM maps for the control experiments F1 and F2 should be presented in Figure 9.

      We tried to correct the non-scientific language and have included the suggested data on the cryo-EM analyses, including new Figures 11-17. We did not collect data on the sample used for the seeds in the cross-seeding experiments because we had already confirmed in multiple datasets that the conditions in F1 and F2 reproducibly produce fibrils of Type 1 and Type 3, respectively. We have now analyzed cryo-EM data for 6 more samples at pH 7.0 and found that several kinds of polymorphs (Types 1A, 1M, 2A, 2B and 5) are accessible at this pH; however, the Type 3 polymorphs are not formed at pH 7.0 under the conditions that we used for aggregation.

      Reviewer #2 (Recommendations For The Authors): 

      - remove unscientific language: "it seems that there are about as many unique atomic-resolution structures of these aggregates as there are publications describing them"   

      We have rephrased this sentence.

      - for same reason, remove "Obviously, " 

      Done

      - What does this mean? “polymorph-unspecific” 

      Rephrased as non-polymorph-specific

      - What does this mean? "shallow amyloid energy hypersurface"  

      By “shallow hypersurface” we mean that the minimum of the multi-dimensional function that describes the energy of the amyloid is not so deep that subtle changes to the environment will not favor another fold/energy minimum. We have left the sentence because while it may not be perfect, it is concise and seems to get the point across.

      - "The results also confirm the possibility of producing disease-relevant structure in vitro." -> This is incorrect as no disease-relevant structure was replicated in this work. Use another word like “suggest”.

      We have changed to “suggest” as suggested.

      - Remove "historically" 

      Done

      - Rephrase “It has long been understood that all amyloids contain a common structural scaffold” 

      Changed to “It has long been established that all amyloids contain a common structural scaffold..” 

      - "Amyloid polymorphs whose differences lie in both their tertiary structure (the arrangement of the beta-strands) and the quaternary structure (protofilament-protofilament assembly) have been found to display distinct biological activities [8]" -> I don't think this is true, different biological activities of amyloids have never been linked to their distinct structures.

      We have added 5 new references (8-12) to support this sentence.

      - Reference 10 is a comment on reference 9; it should be removed. Instead, as for alpha-synuclein, all papers describing the tau structures should be included.  

      We have removed the reference, but feel that the addition of all Tau structure references is not merited in this manuscript since we are not comparing them.

      - Rephrase: "is not always 100% faithful"

      Removed “100%”

      - What is pseudo-C2 symmetry? Do the authors mean pseudo 2_1 symmetry (ie a 2-start helical symmetry)?

      Thank you for pointing this out. We did indeed mean pseudo-2_1 helical symmetry.

      - Re-phrase: "alpha-Syn's chameleon-like behavior" 

      We have removed this phrase.

      - "In the case of alpha-Syn, the secondary nucleation mechanism is based on the interaction of the positively charged N-terminal region of monomeric alpha-Syn and the disordered, negatively charged C-terminal region of the alpha-Syn amyloid fibrils [54]" -> I would say the mechanisms of secondary nucleation are not that well understood yet, so one may want to tune this down a bit. 

      We have changed this to “mechanism has been proposed to be”

      - The paragraphs describing experiments by others are better suited for a Discussion rather than a Results section. Perhaps re-organize this part? 

      We have left the text intact as we are using a Results and Discussion format.

      - A lot of information about Image processing seems to be missing: what steps were performed after initial model generation? 

      We have added more details in the methods section on the EM data processing and model analysis.

      - Figure 1: Where is Type 4 on the pH scale?

      We have adjusted the Fig 1 legend to clarify that the pH scale is only applicable to the structures presented in this manuscript.

      - Figure 2: This might be better incorporated as a subpanel of Figure 1.

      We agree that this figure is somewhat of a loner on its own and we only added it in order to avoid confusion with the somewhat inconsistent naming scheme used for the Type 1B structure. However, we prefer to leave it as a separate figure so that it does not dilute the impact of Figure 1.

      - Figure 3: What is the extra density at the bottom of Type 3B from pH 5.8 samples 1 and 2. pH 5.8 + 50mM NaCl (but not pH 5.8 + 100 mM NaCl)? Could this be an indication of a local minimum and the pH 5.8 + 100 mM NaCl structure is correct? Or is this a real difference between 0/50mM NaCl and 100 mM NaCl? 

      We did not see the extra density to which the reviewer is referring; however, the images used in this panel are based on the output of 3D classification, which is more likely to produce artifacts than a 3D refinement. With this in mind, we did not see any significant differences in the refined structures and therefore only deposited the better quality map and model for each of the polymorph types.

      - Figure 3: To what extent is Type 3B of pH 6.5 still a mixture of different types? The density looks poor. In general, in the absence of more details about the cryo-EM maps, it is hard to assess the quality of the structures presented.  

      In order to improve the quality of the images in this panel, a more complete separation of the particles from each polymorph was achieved via the filament subset selection tool in RELION 5. In each case, an unbiased initial model could be created from the 2D classes via the relion_helix_inimodel2D program, further supporting the coexistence of 4 polymorphs in the pH 6.5 sample. The particles were individually refined to produce the respective maps that are now used in this figure.

      - Many references are incorrect, containing "Preprint at (20xx)" statements.

      This has been corrected.  

      Reviewer #3 (Public Review): 

      Weaknesses: 

      (1) The authors reveal that the Type 1 monofilament fibril polymorph (reminiscent of the JOS-like polymorph) and the Type 5 polymorph (akin to the tissue-amplified-like polymorph) can both form under the same condition. Additionally, this condition also fosters the formation of flat ribbon-like fibrils across different batches. Notably, at pH 5.8, variations in experimental groups yield disparate abundance ratios between polymorph 3B and 3C, indicating a degree of instability in fibrillar formation. The variability would potentially pose challenges for replicability in subsequent research. In light of these situations, I propose the following recommendations:

      (a) An explicit elucidation of the factors contributing to these divergent outcomes under similar experimental conditions is warranted. This should include an exploration of whether variations in purified protein batches are contributing factors to the observed heterogeneity.

      We are in complete agreement that understanding the factors that lead to polymorph variability is of utmost importance (and was the impetus for the manuscript itself). However, the number of variables to explore is overwhelming and we will continue to investigate this in our future research. Regarding the variability between batches of purified protein, we also think that this could be a factor in the polymorph variability observed for otherwise “identical” aggregation conditions, particularly at pH 7 where the largest variety of polymorphs has been observed. However, even variation between identical replicates (samples created from the same protein solution and simply aggregated simultaneously in separate tubes) can lead to different outcomes (see datasets 15 and 16 in the revised Table 1), suggesting that there are stochastic processes that can determine the outcome of an individual aggregation experiment. While our data still indicate that Type 1, 2 and 3 polymorphs are strongly selected by pH, the selection between interface variants 3B vs. 3C and 2A vs. 2B might also be affected by protein purity. Our standard purification protocol produces a single band by Coomassie-stained SDS-PAGE; however, minor truncations and other impurities below a few percent would go undetected and, given the proposed roles of the N- and C-termini in secondary nucleation, could have a large effect on polymorph selection and seeding. In line with the reviewer’s comments we now include a batch number for each EM dataset. While no new conclusions can be drawn from the inclusion of this additional data, we feel that it is important to acknowledge the possible role of batch-to-batch variability.

      (b) To enhance the robustness of the conclusions, additional replicates of the experiments under the same condition should be conducted, ideally a minimum of three times.  

      The pH 5.8 conditions that yield Type 3 fibrils have already been repeated several times in the original manuscript. Since the pH 7.4 conditions produce the most common α-Syn polymorph (Type 1A) and were produced twice in this manuscript (once as an unseeded and once as a cross-seeded fibrilization) we decided to focus on the intermediate condition where the most variability had been seen (pH 7.0). The revised Table 1 now has 6 new datasets (11-16) representing 6 independent aggregations at pH 7.0 starting from two different protein purification batches. The result is that we now produce the Type 2A/B polymorphs in three samples, and in two of these samples we once again observed the Type 1M polymorph. The other samples produced Type 1A or non-twisted fibrils.

      (c) Further investigation into whether different polymorphs formed under the same buffer condition could lead to distinct toxicological and pathology effects would be a valuable addition to the study.  

      The correlation of toxicity with structure would in principle be interesting. However, the Type 1 and Type 3 polymorphs formed at pH 5.8 and 7.4 are not likely to be biologically relevant. The pH 7 polymorphs (Type 5 and 1M) would be more interesting because they form under the same conditions and might be related to some disease-relevant structures. Still, it is rare that a single polymorph appears at pH 7.0 (the Type 5 represented only 10-20% of the fibrils in the sample and the Type 1M also had unidentified double-filament fibrils in the sample). We plan to pursue this line of research and hope to include it in a future publication.

      (2) The cross-seeding study presented in the manuscript demonstrates the pivotal role of pH conditions in dictating conformation. However, an intriguing aspect that emerges is the potential role of seed concentration in determining the resultant product structure. This raises a critical question: at what specific seed concentration does the determining factor for polymorph selection shift from pH condition to seed concentration? A methodologically robust way to address this would be a series of experiments across a range of seed concentrations. Such an approach could delineate a clear boundary at which seed concentration begins to predominantly dictate the conformation, as opposed to pH conditions. Incorporating this aspect into the study would not only clarify the interplay between seed concentration and pH conditions, but also add a fascinating dimension to the understanding of polymorph selection mechanisms.

      A more complete analysis of the mechanisms of aggregation, including the effect of seed concentration and the resulting polymorph specificity of the process, is very important for our understanding of the aggregation pathways of alpha-synuclein and is currently the topic of ongoing investigations in our lab.

      Furthermore, the study prompts additional queries regarding the behavior of cross-seeding production under the same pH conditions when employing seeds of distinct conformation. Evidence from various studies, such as those involving E46K and G51D cross-seeding, suggests that seed structure plays a crucial role in dictating polymorph selection. A key question is whether these products consistently mirror the structure of their respective seeds. 

      We thank the reviewer for reminding us to cite these studies as a clear example of polymorph selection by cross-seeding. Unfortunately, it is not 100% clear from the G51D cross-seeding manuscript (https://doi.org/10.1038/s41467-021-26433-2) what conditions were used in the cross-seeding since different conditions were used for the seedless wild-type and mutant aggregations… however it appears that the wild-type without seeds was Tris pH 7.5 (although at 37C the pH could have dropped to 7ish) and the cross-seeded wild-type was in phosphate buffer at pH 7.0. In the E46K cross-seeding manuscript, it appears that pH 7.5 Tris was used for all fibrilizations (https://doi.org/10.1073/pnas.2012435118). In any event, both results point to the fact that at pH 7.0-7.5 under low-seed conditions (0.5%) the Type 4 polymorph can propagate in a seed-specific manner.

      (3) In the Results section of "The buffer environment can dictate polymorph during seeded nucleation", the authors reference previous cell biological and biochemical assays to support the polymorph-specific seeding of MSA and PD patients under the same buffer conditions. This discussion is juxtaposed with recent research that compares the in vivo biological activities of hPFF, ampLB as well as LB, particularly in terms of seeding activity and pathology. Notably, this research suggests that ampLB, rather than hPFF, can accurately model the key aspects of Lewy Body Diseases (LBD) (refer to: https://doi.org/10.1038/s41467-023-42705-5). The critical issue here is the need to reconcile the phenomena observed in vitro with those in in-vivo or in-cell models. Given the low seed concentration reported in these studies, it is imperative for the authors to provide a more detailed explanation as to why the possible similar conformation could lead to divergent pathologies, including differences in cell-type preference and seeding capability.  

      We thank the reviewer for bringing this recent report to our attention. The findings that ampLB and hPFF have different PK digestion patterns and that only the former is able to model key aspects of Lewy Body disease are in support of the seed-specific nature of some types of alpha-synuclein aggregation. We have added this to the discussion regarding the significant role that seed type and seed conditions likely play in polymorph selection.

      (4) In the Method section of "Image processing", the authors describe the helical reconstruction procedure, without mentioning much detail about the 3D reconstruction and refinement process. For the benefit of reproducibility and to facilitate a deeper understanding among readers, the authors should enrich this part to include more comprehensive information, akin to the level of detail found in similar studies (refer to: https://doi.org/10.1038/nature23002).

      As also suggested by reviewer #2, we have now added more comprehensive information on the 3D reconstruction and refinement process.

      (5) The abbreviation of amino acids should be unified. In the Results section "On the structural heterogeneity of Type 1 polymorphs", the amino acids are denoted using three-letter abbreviation. Conversely, in the same section under "On the structural heterogeneity of Type 2 and 3 structures", amino acids are abbreviated using the one-letter format. For clarity and consistency, it is essential that a standardized format for amino acid abbreviations be adopted throughout the manuscript.

      That makes perfect sense and has been corrected.

      Reviewing Editor: 

      After discussion among the reviewers, it was decided that point 2 in Reviewer #3's Public Review (about the experiments with different concentrations of seeds) would probably lie outside the scope of a reasonable revision for this work. 

      We agree as stated above and will continue to work on this important point.

      Revisions Round 2

      Reviewer #2 (Public Review): 

      I do worry that the FSC values of model-vs-map appear to be higher than expected from the corresponding FSCs between the half-maps (e.g. see Fig 13). The implication of this observation is that the atomic models may have been overfitted in the maps, which would have led to a deterioration of their geometry. A table with rmsd on bond lengths, angles, etc would probably show this. In addition, to check for overfitting, the atomic model for each data set could be refined in one of the half-maps, and then that same model could be used to calculate 2 FSC model-vs-map curves: one against the half-map it was refined in and one against the other half-map. Deviations between these two curves are an indication of overfitting. 

      Thank you for the recommendations for model validation. We have added the suggested statistics to Table 2, performed the suggested model fitting to one of the half-maps, and plotted 3 FSC model-vs-map curves: one for each half-map versus the model fit against only one half-map, and one for the model fit against the full map. We feel that the degree of overfitting is reasonable and does not significantly impact the quality of the models.
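
      The half-map cross-validation requested by the reviewer can be sketched as follows. This is a minimal illustration in Python with NumPy, not the procedure or software used by the authors; the function names `fsc` and `overfitting_gap`, and the assumption that all inputs are cubic density arrays of equal size (with the model already converted to a density map), are ours.

```python
import numpy as np

def fsc(vol_a, vol_b):
    """Fourier shell correlation between two cubic volumes of equal shape.

    Returns one correlation value per integer-radius shell, from the
    origin up to (but not including) the Nyquist shell.
    """
    if vol_a.shape != vol_b.shape:
        raise ValueError("volumes must have the same shape")
    n = vol_a.shape[0]
    fa = np.fft.fftn(vol_a)
    fb = np.fft.fftn(vol_b)

    # Integer shell index of every voxel in Fourier space.
    freqs = np.fft.fftfreq(n) * n
    kz, ky, kx = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2
    mask = shell < n_shells
    idx = shell[mask]

    # Accumulate cross term and both power terms shell by shell.
    num = np.bincount(idx, weights=(fa * np.conj(fb)).real[mask], minlength=n_shells)
    pow_a = np.bincount(idx, weights=(np.abs(fa) ** 2)[mask], minlength=n_shells)
    pow_b = np.bincount(idx, weights=(np.abs(fb) ** 2)[mask], minlength=n_shells)

    denom = np.sqrt(pow_a * pow_b)
    return np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)

def overfitting_gap(model_map, half_work, half_free):
    """FSC against the half-map used for refinement minus FSC against the other.

    A gap that grows at high resolution suggests the model was fitted to noise.
    """
    return fsc(model_map, half_work) - fsc(model_map, half_free)
```

      In such a check, `half_work` would be the half-map the model was refined against and `half_free` the untouched one; how large a gap is acceptable remains a judgment call.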

      In addition, the sudden drop in the FSC curves in Figure 16 shows that something unexpected has happened to this refinement. Are the authors sure that only the procedures outlined in the Methods were used to create these curves? The unexpected nature of the FSC curve for this type (2A) raises doubts about the correctness of the reconstruction. 

      We thank the reviewer for the attention to detail. We should have caught this mistake. It turns out that in the last round of 3D refinement, the two half-maps became shifted with respect to each other in the z direction. We realigned the two maps using Chimera and then re-ran the postprocessing. The new maps have been deposited in EMD-50850. This mistake motivated us to inspect all of the maps and we found that the same problem had occurred in the Type 3B maps. This was not noticed by the reviewer because we accidentally plotted the FSC curves from postprocessing from one refinement round before the one deposited in the EMD. We performed the same half-map shifting procedure for the Type 3B data and performed a final round of real-space refinement to produce new maps and models that have been deposited as EMD-50888 and 9FYP (superseding the previous entries).
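
      A relative z shift between half-maps of this kind can in principle be detected numerically. The sketch below is only an illustration under our own assumptions (the authors realigned their maps in Chimera): the function name `estimate_z_shift` is hypothetical, the volumes are assumed to be NumPy arrays with z as axis 0, and the shift is assumed to be a whole number of voxels. Because amyloid maps are nearly periodic along the helical axis, the correlation peak may be ambiguous by multiples of the helical rise.

```python
import numpy as np

def estimate_z_shift(half1, half2):
    """Integer shift (voxels, axis 0) that best aligns half2 onto half1.

    Uses circular cross-correlation along z, summed over x and y.
    Applying np.roll(half2, shift, axis=0) brings half2 into register.
    """
    if half1.shape != half2.shape:
        raise ValueError("half-maps must have the same shape")
    f1 = np.fft.fft(half1, axis=0)
    f2 = np.fft.fft(half2, axis=0)
    cc = np.fft.ifft(f1 * np.conj(f2), axis=0).real.sum(axis=(1, 2))
    shift = int(np.argmax(cc))
    n = half1.shape[0]
    # Report shifts past the midpoint as negative displacements.
    return shift - n if shift > n // 2 else shift
```

      A sudden drop in the half-map FSC, as flagged by the reviewer, is one symptom that would motivate running such a check before postprocessing.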

      Reviewer #3 (Public Review): 

      There are two minor points I recommend the authors to address: 

      (1) In the response to Weakness 1, point (3), the authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      We aim to be as transparent as possible and this information was included in the main text; however, we did not label the percentage of Type 5 fibrils in Figure 4 because that would make the other percentages ambiguous. The percentages in Figure 4 represent the ratio of helical segments used for each type of refined structure in the dataset (always adding up to 100%), not the percent of all fibrils in the dataset. That is, there are sometimes untwisted or unidentifiable fibrils in datasets and these were not accounted for in the listed percentages. We have added a sentence to the Figure 4 legend to explain what the percentages refer to.
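
      The distinction drawn here, percentages over refined segments versus percentages over all fibrils, can be made concrete with a small sketch. The function name and the segment counts below are invented for illustration and are not taken from the manuscript.

```python
def polymorph_percentages(segment_counts):
    """Share of helical segments per refined polymorph, normalized to 100%.

    Only segments that entered a refined structure are counted, so
    untwisted or unidentifiable fibrils are excluded from the total.
    """
    total = sum(segment_counts.values())
    if total == 0:
        raise ValueError("no refined segments to normalize")
    return {name: 100.0 * count / total for name, count in segment_counts.items()}

# Hypothetical counts of segments used in each refinement of one dataset.
refined = {"Type 2A": 600, "Type 2B": 300, "Type 1M": 100}
shares = polymorph_percentages(refined)
```

      With these made-up counts the shares sum to 100% regardless of how many unclassified fibrils the micrographs contained, which is why a polymorph's share among refined segments can differ from its share among all fibrils.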

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Thank you for reminding us to add the scale bars. This is now done for the 2D classes in Figures 11-17.

      Recommendations for the authors: 

      Reviewer #2 (Recommendations For The Authors): 

      A critical look at the maps and models of the various structures at this stage may prevent the authors from entering suboptimal structures into the databases.  

      We agree. Thank you for suggesting this.

      Reviewer #3 (Recommendations For The Authors): 

      The authors have responded adequately to these critiques in the revised version of the manuscript. There are two minor points. 

      (1) The authors state that "the Type 5 represented only 10-20% of the fibrils in the sample." However, this information is not labeled in the corresponding Figure 4. I suggest the authors verify and label all relevant percentages in the figures to prevent misunderstandings. 

      (2) While the authors have detailed the helical reconstruction procedure in the Methods section, it is necessary to indicate the scale bar or box size in the figure legend of the 2D representative classes to ensure clarity and reproducibility. 

      Answered in public comments

    1. eLife assessment

      Using multiple public datasets, this study investigates associations between retrotransposon element expression and methylation with age and inflammation. The study is valuable because a systematic analysis of retrotransposon element expression during human aging has been lacking, but the provided data must be considered incomplete due to the sole reliance on microarray expression data for the core analyses.

    2. Reviewer #1 (Public Review):

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. Compared to the previous round of review, the text of the manuscript has been polished and the phrasing of several findings has been made clearer and more precise. The authors also provided ample discussion to the prior reviewer comments in their rebuttal, including new analyses. All these changes are in the correct direction, however, I believe that part of the content of the rebuttal should be incorporated in the main text, for reasons that I will outline below.

      Both reviewers found the reliance on microarray expression data to detract from the study. The authors argued that their choices are supported by existing publications which performed a similar quantification of TE expression using microarray data. It could still be argued that (as far as I can tell) Reichmann et al. used a substantially larger number of probes than this study, as a consequence of starting from different arrays; however, this is a minor point which the authors do not need to address. It is still undeniable that including the validation with RNA-seq data performed in the rebuttal would strengthen the manuscript. I especially believe that many readers would want to see this analysis be prominent in the manuscript, considering that both reviewers independently converged on the issue with microarray expression data. Personally, I would have included an RNA-seq dataset next to the microarray data in the main figures; however, I understand that this would require considerable restructuring and that placing RNA-seq data beside array data might be misleading. Instead, I would ask that the authors include their rebuttal figures R1 and R2 as supplementary figures.

      I would suggest introducing a new paragraph, between the section dedicated to expression data and the one dedicated to DNA methylation, mentioning the issues with microarray data (some of which were mentioned by the reviewers and others by the authors in the discussion and introduction) and then introducing the validation with RNA-seq data.

      Figure R3 is also a good addition and should be expanded to include the GTP and MESA study and possibly mentioned in the paragraph titled "RTE expression positively correlates with BAR gene signature scores except for SINEs."

      "In this study, we did not compare MESA with GTP etc. We have analysed each dataset separately based on the available data for that dataset. Therefore, sacrificing one analysis because of the lack of information from the other does not make sense. We would do that if we were after comparing different datasets. Moreover, the datasets are not comparable because they were collected from different types of blood samples."

      Indeed, the datasets are not compared directly, but the associations between age, BAR and TE expression for each dataset are plotted and discussed right next to each other. It is therefore natural to wonder whether the differences between datasets are due to differences in the type of blood sample or are a consequence of the different probe sets. Using a common set of probes would help answer that question.
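
      The common-probe idea raised here can be sketched in a few lines. This is only an illustration under stated assumptions: the probe identifiers below are hypothetical placeholders, and `common_probes` is an illustrative helper, not part of the authors' pipeline.

```python
def common_probes(probes_by_dataset):
    """Return the sorted probe IDs that are present in every dataset."""
    probe_sets = [set(ids) for ids in probes_by_dataset.values()]
    if not probe_sets:
        return []
    return sorted(set.intersection(*probe_sets))


# Hypothetical probe IDs, for illustration only (not real array probes).
toy_probes = {
    "MESA": ["probe_a", "probe_b", "probe_c", "probe_d"],
    "GTP": ["probe_b", "probe_d"],
    "GARP": ["probe_b", "probe_c", "probe_d"],
}

shared = common_probes(toy_probes)
print(shared)
```

      Restricting every dataset to `shared` before quantification would separate probe-set effects from sample-type effects, at the cost of using fewer probes per dataset.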

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This study investigates associations between retrotransposon element expression and methylation with age and inflammation, using multiple public datasets. The study is valuable because a systematic analysis of retrotransposon element expression during human aging has been lacking. However, the data provided are incomplete due to the sole reliance on microarray expression data for the core analysis of the paper. 

      Both reviewers found this study to be important. We have selected the microarray datasets of human blood adopted by a comprehensive study of ageing published in a Nature Communications manuscript (DOI: 10.1038/ncomms9570). We only included the datasets specifically collected for ageing studies. Therefore, the large RNA-seq cohorts for cancer, cardiovascular, and neurological diseases were not relevant to this study and cannot be included.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. The concept of the study is in principle interesting, as a systematic analysis of RTE expression during human aging is lacking. 

      We thank the reviewer for the positive comment. 

      Unfortunately, the reliance on expression microarray data, used to perform the core analysis of the paper places much of the study on shaky ground. The findings of the study would not be sufficiently supported until the authors validate them with more suitable methods. 

      In our discussion section in the manuscript, we have clarified that “we are aware of the limitations imposed by using microarray in this study, particularly the low number of intergenic probes in the expression microarray data. Our study can be enriched with the advent of large  RNA-seq cohorts for aging studies in the future.”  However, the application of microarray for RTE expression analysis was introduced previously (DOI: 10.1371/journal.pcbi.1002486) and applied in some highly cited and important publications before (DOI: 10.1038/ncomms1180, DOI: 10.1093/jnci/djr540). In fact, in a manuscript published by Reichmann et al.  (DOI: 10.1371/journal.pcbi.1002486) which was cited 76 times, the authors showed and experimentally verified that cryptic repetitive element probes present in Illumina and Affymetrix gene expression microarray platforms can accurately and sensitively monitor repetitive element expression data. Inspired by this methodological manuscript with reasonable acceptance by other researchers, we trusted that the RTE microarray probes could accurately quantify RTE expression at class and family levels.

      Strengths: 

      This is a very important biological problem. 

      Weaknesses: 

      RNA microarray probes are obviously biased to genes, and thus quantifying transposon analysis based on them seems dubious. Based on how arrays are designed there should at least be partial (perhaps outdated evidence) that the probe sites overlap a protein-coding or non-coding RNA. 

      We disagree with the reviewer that quantifying transposon analysis based on microarray data is dubious. As previously shown by Reichmann et al., the quantification is reliable as long as the probes do not overlap with annotated genes and they are in the correct orientation to detect sense repetitive element transcripts. Reichmann et al. identified 1,400 repetitive element probes in version 1.0, version 1.1 and version 2.0 of the Illumina Mouse WG-6 Beadchips by comparing the genomic locations of the probes with the RepeatMasked regions of the mouse genome. We applied the same criteria for Illumina Human HT-12 V3 (29431 probes) and V4 (33963 probes) to identify the RTE-specific probes.
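
      The selection criteria described in this response can be sketched as a simple interval filter. This is a minimal sketch, not the authors' or Reichmann et al.'s actual code: the `Interval` type, the coordinates, and the rule that "correct orientation" means the probe lies on the same strand as the repeat annotation are all assumptions made for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def overlaps(a, b):
    """True if two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def select_rte_probes(probes, repeats, exons):
    """Keep probes that overlap a same-strand repeat and overlap no exon."""
    selected = []
    for name, probe in probes.items():
        hits_exon = any(overlaps(probe, exon) for exon in exons)
        sense_repeat = any(
            overlaps(probe, rep) and probe.strand == rep.strand for rep in repeats
        )
        if sense_repeat and not hits_exon:
            selected.append(name)
    return selected


# Hypothetical coordinates, for illustration only.
probes = {
    "p1": Interval("chr1", 100, 150, "+"),  # in a repeat, same strand, no exon
    "p2": Interval("chr1", 500, 550, "+"),  # in a repeat but also overlaps an exon
    "p3": Interval("chr1", 900, 950, "-"),  # in a repeat but on the opposite strand
}
repeats = [
    Interval("chr1", 90, 200, "+"),
    Interval("chr1", 480, 600, "+"),
    Interval("chr1", 880, 1000, "+"),
]
exons = [Interval("chr1", 540, 700, "+")]

kept = select_rte_probes(probes, repeats, exons)
print(kept)
```

      Whether intronic probes count as "overlapping annotated genes" depends on which gene features are passed as `exons`; the sketch filters on exons only, which is one possible reading of the criteria.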

      The authors state they only used intergenic probes, but based on supplementary files, almost half of RTE probes are not intergenic but intronic (n=106 out of 264). 

      All our identified RTE probes overlap with intergenic regions. However, due to their repetitive nature, some probes overlap with intronic regions, too. We have replaced "intergenic" with "non-coding" in our resubmission to show that they do not overlap with the exons of protein-coding genes. However, we do not rule out the possibility that some of our detected RTE probes might overlap non-coding RNAs. In fact, the border between coding and non-coding genomes has recently become very fuzzy with new annotations of the genome. RTE RNAs can be easily considered as non-coding RNAs if we challenge our traditional junk DNA view.

      This is further complicated by the fact that not all this small subset of probes is available in all analyzed datasets. For example, 232 probes were used for the MESA dataset but only 80 for the GTP dataset. Thus, RTE expression is quantified with a set of probes which is extremely likely to be highly affected by non-RTE transcripts and that is also different across the studied datasets. Differences in the subsets of probes could very well explain the large differences between datasets in multiple of the analyses performed by the authors, such as in Figure 2a, or 3a. It is nonetheless possible that the quantification of RTE expression performed by the authors is truly interpretable as RTE expression, but this must be validated with more data from RNA-seq. Above all, microarray data should not be the main type of data used in the type of analysis performed by the authors. 

      In this study, we did not compare MESA with GTP etc. We have analysed each dataset separately based on the available data for that dataset. Therefore, sacrificing one analysis because of the lack of information from the other does not make sense. We would do that if we were after comparing different datasets. Moreover, the datasets are not comparable because they were collected from different types of blood samples. 

      Reviewer #2 (Public Review): 

      Summary: 

      Yi-Ting Tsai and colleagues conducted a systematic analysis of the correlation between the expression of retrotransposable elements (RTEs) and aging, using publicly available transcriptional and methylome microarray datasets of blood cells from large human cohorts, as well as single-cell transcriptomics. Although DNA hypomethylation was associated with chronological age across all RTE biotypes, the authors did not find a correlation between the levels of RTE expression and chronological age. However, expression levels of LINEs and LTRs positively correlated with DNA demethylation, and inflammatory and senescence gene signatures, indicative of "biological age". Gene set variation analysis showed that the inflammatory response is enriched in the samples expressing high levels of LINEs and LTRs. In summary, the study demonstrates that RTE expression correlates with "biological" rather than "chronological" aging. 

      Strengths: 

      The question the authors address is both relevant and important to the fields of aging and transposon biology. 

      We thank the reviewer for finding this study relevant and important.

      Weaknesses: 

      The choice of methodology does not fully support the primary claims. Although microarrays can detect certain intergenic transposon sequences, the authors themselves acknowledge in the Discussion section that this method's resolution is limited. More critical considerations, however, should be addressed when interpreting the results. The coverage of transposon sequences by microarrays is not only very limited (232 unique probes) but also predetermined. This implies that any potential age-related overexpression of RTEs located outside of the microarray-associated regions, or of polymorphic intact transposons, may go undetected. Therefore, the authors should be more careful while generalising their conclusions. 

      This is a bioinformatics study, and we have already admitted and discussed the limitations in the discussion section of this manuscript. All technologies have their own limitations, and this should not stop us from shedding light on scientific facts because of inadequate information. In the manuscript, we have discussed that all large and proper ageing studies were performed using microarray technology. Peters et al. (DOI: 10.1038/ncomms9570) adopted all these datasets in their transcriptional landscape of ageing manuscript, which was used in previous studies of ageing as well. Our study essentially applies the Reichmann et al. method to the peripheral blood-related data from the Peters et al. manuscript. Since hypomethylation due to ageing is a well-established and broad epigenetic reprogramming, it is unlikely that only a fraction of RTEs is affected by this phenomenon. Therefore, the subsampling of RTEs should not affect the result much. Indeed, this is supported in our study by the inverse correlation between DNA methylation and RTE expression for LINE and SINE classes despite having limited numbers of probes for LINE and SINE expression.

      Additionally, for some analyses, the authors pool signals from RTEs by class or family, despite the fact that these groups include subfamilies and members with very different properties and harmful potentials. For example, while sequences of older subfamilies might be passively expressed through readthrough transcription, intact members of younger groups could be autonomously reactivated and cause inflammation. The aggregation of signals by the largest group may obscure the potential reactivation of smaller subgroups. I recommend grouping by subfamily or, if not possible due to the low expression scores, by subgroup. For example, all HERV subfamilies are from the ERVL family. 

      We agree with the reviewer that different subfamilies of RTEs play different roles through their activation. However, we will lose our statistical power if we study RTE subfamilies with a few probes. Global epigenetic alteration and derepression of RTEs by ageing have been observed to be genome-wide. While our systematic analysis across RTE classes and families cannot capture alterations in subfamilies due to statistical power, it is still relevant to the research question we are addressing.

      Next, Illumina arrays might not accurately represent the true abundance of TEs due to nonspecific hybridization of genomic transposons. Standard RNA preparations always contain traces of abundant genomic SINEs unless DNA elimination is specifically thorough. The problem of such noise should be addressed. 

      We have checked the RNA isolation step from the MESA, GTP, and GARP manuscripts. The total RNA was isolated using the Qiagen mini kit following the manufacturer’s recommendations. The authors of these manuscripts did not mention whether they eliminated genomic DNA, but we assumed they were aware of the DNA contamination and eliminated it based on the manufacturer’s recommendations. We have looked up the literature about nonspecific hybridization of RTEs but could not find any evidence to support this observation. We would appreciate the reviewers providing more evidence about such RTE contaminations.

      Lastly, scRNAseq was conducted using 10x Genomics technology. However, quantifying transposons in 10x sequencing datasets presents major challenges due to sparse signals. 

      Applying the scTE pipeline (https://www.nature.com/articles/s41467-021-21808-x), we have found that the statistical power of quantifying RTE classes (LINE, SINE, and LTR) or RTE families (L1, L2, Alu, ERVK, etc.) is as good as that for each individual gene. However, our proposed method cannot analyse RTE subfamilies, and we did not do that.

      Smart-seq single-cell technology is better suited to this particular purpose. 

      We agree with the reviewer that Smart-seq provides higher yield than 10x, but there is no Smart-seq data available for ageing studies.

      Anyway, it would be more convincing if the authors demonstrated TE expression across different clusters of immune cells using standard scRNAseq UMAP plots instead of boxplots. 

      Since the number of RTE reads per cell is low, showing the expression of RTEs per cell in a UMAP may not be the best statistical approach to show the difference between the aged and young groups. This is why we chose a pseudobulk analysis and displayed differential expression using boxplots rather than UMAPs for each immune cell type.
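
      The pseudobulk idea mentioned in this response, summing sparse per-cell counts within each donor and cell type before comparing groups, can be sketched as follows. This is an illustrative sketch only: the donors, cell types, and counts are hypothetical, and the function is not the scTE pipeline or the authors' code.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum feature counts over all cells sharing a (donor, cell_type) pair.

    `cells` is an iterable of (donor, cell_type, counts) tuples, where
    `counts` maps a feature name (e.g. an RTE family) to a read count.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for donor, cell_type, counts in cells:
        for feature, n in counts.items():
            totals[(donor, cell_type)][feature] += n
    return {key: dict(value) for key, value in totals.items()}


# Hypothetical per-cell counts, for illustration only.
toy_cells = [
    ("donor1", "T cell", {"L1": 2, "Alu": 1}),
    ("donor1", "T cell", {"L1": 3}),
    ("donor1", "B cell", {"Alu": 4}),
    ("donor2", "T cell", {"L1": 1}),
]

bulk = pseudobulk(toy_cells)
print(bulk)
```

      Each (donor, cell type) total can then be treated as one observation in a boxplot or a differential-expression test, which avoids drawing conclusions from near-empty single cells.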

      I recommend validating the data by RNAseq, even on small cohorts. Given that the connection between RTE overexpression and inflammation has been previously established, the authors should consider better integrating their observations into the existing knowledge. 

      Please see below. We have analysed RNA-seq data suggested by Reviewer 1 in the Recommendations for the Authors section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      I can recommend two sizeable human PBMC RNA-seq datasets that the authors could use:

      Marquez et al. 2020 (phs001934.v1.p1, controlled access) and Morandini et al. 2023 (GSE193141, public access). There are likely other suitable datasets that I am not aware of. I would also recommend using identical sets of probes to quantify RTE expression across studies. If certain datasets have too few probes and would thus limit the number of probes available across all studies it might be a good idea to exclude the dataset, especially if the analysis has been supplemented by the additional RNA-seq datasets. 

      Until recently, there was no publicly available, non-cancerous, large cohort of RNA-seq data for ageing studies. We tried to gain access to the two RNA-seq datasets suggested by Reviewer 1: Marquez et al. 2020 (phs001934.v1.p1, controlled access) and Morandini et al. 2023 (GSE193141, public access).

      Unfortunately, the Marquez et al. 2020 data are not accessible because the authors only provide the data for projects related to cardiovascular diseases. However, we did analyse the Morandini et al. 2023 data, and we can confirm that no association was observed between any class or family of RTEs and chronological ageing (Author response image 1), which is the second strong piece of evidence supporting the statement in the manuscript. However, as expected, we found a positive correlation between RTE expression and IFN-I signature score (Author response image 2).

      Author response image 1.

      Linear analysis of RTE expression and chronological age.

      Author response image 2.

      Linear analysis of RTE expression and IFN gene signature expression.

      The authors use "biological age" and inflammation as interchangeable concepts, including in the title. Please correct this wording. 

      We have now added a new term to the manuscript, “biological age-related (BAR)”, which clearly addresses this distinction. We do not think it is necessary to change the title.

      The authors find correlations between RTE expression and age-associated gene signatures but not chronological age itself. This is puzzling because, as the wording suggests, the expression of these inflammatory pathways is age-associated. If RTE expression correlates with inflammation which itself correlates with age, one might expect RTE expression to also correlate with age. Do the authors see a correlation between various inflammatory gene signatures and chronological age, in the analyzed datasets? If yes, then how would you explain that discrepancy? Moreover, in this case, I would recommend using a linear model, rather than correlation, to separate the effects of chronological age and RTE expression on inflammation (Inflammation et al ~ Age + RTE expression), or equivalent designs.

      As described above, we have now introduced the BAR terminology, which resolves this confusion. We did not find a correlation between RTE expression and chronological age. However, we did identify the correlation between BAR gene signatures and RTE expression.

      To separate the effects of chronological age and RTE expression on BAR gene signature scores, we performed a generalized linear model (GLM) analysis using BAR gene signature scores as response variables and RTE expression and chronological age as predictors (BAR gene signature scores ~ RTE expression + chronological age). A significant association was observed between BAR gene signature scores and RTE expression in the GARP cohort (Author response image 3). However, when chronological age is considered as a predictor, we did not identify a correlation between chronological age and BAR gene signatures, indicating that BAR events are not correlated with chronological age (Author response image 3).

      Author response image 3.

      Generalized linear models (GLM) analysis (BAR gene signature scores ~ RTE expression + chronological age). For each RTE family, we separately performed GLM. Age (RTE family) indicates the chronological age when used in the design formula for that specific RTE family. 
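
      The design formula above can be illustrated with a small worked example. This is a sketch under stated assumptions, not the authors' analysis: it assumes a Gaussian family with identity link (in which case the GLM reduces to ordinary least squares), uses made-up numbers, and solves the normal equations in plain Python; significance testing is omitted.

```python
def fit_ols(y, predictors):
    """Least-squares fit of y ~ intercept + predictors via the normal equations.

    `predictors` is a list of equal-length columns. Returns the coefficients
    [intercept, b1, b2, ...] in the order the columns were given.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    M = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(k):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] for i in range(k)]


# Made-up data: the score depends on RTE expression but not on age.
rte_expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
chronological_age = [50.0, 30.0, 60.0, 40.0, 70.0, 20.0]
bar_score = [0.5 + 2.0 * r for r in rte_expression]

intercept, beta_rte, beta_age = fit_ols(bar_score, [rte_expression, chronological_age])
print(intercept, beta_rte, beta_age)
```

      In this toy setting the fitted age coefficient is approximately zero while the RTE coefficient recovers the simulated slope, which is the pattern the response describes. A real analysis would also need standard errors and p-values from a statistics package.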

      Some of the gene sets used by the authors have considerable overlap with others and are also not particularly comprehensive. I can recommend this very comprehensive gene set: https://www.gsea-msigdb.org/gsea/msigdb/human/geneset/SAUL_SEN_MAYO.  

      We chose not to use large gene lists such as the suggested SEN_MAYO list, as we found that Singscore struggles to generate reliable scores with sufficient variance when the number of genes increases to more than twenty. Although there is some overlap between inflammation-related genes and cellular senescence genes (e.g., IL6, IL1A, IL1B), it is important to note that each gene list focuses on different aspects of biological aging and should not be dismissed as redundant.

      Minor comments: 

      Overall, several sentences in the manuscript feel somewhat unnatural. I would recommend further proofreading. I will mention some examples:  

      Thank you for your feedback. We have fixed all these issues in the new submission.  

      • On line 34, "like the retroviruses" should be "like retroviruses". There are several other places in the text where "the" is not required. 

      Fixed.

      • On line 86, "to generate the RTE expression". "the" is again not necessary and I would replace "generate" with "quantify". 

      Fixed.

      • On line 86, "we mapped the probe locations to RepeatMasker". RepeatMasker is not a genome. Do you mean you mapped the probe location to a genome annotated by RepeatMasker? The same applies to line 99.  

      Fixed. We changed the sentence to: “To quantify RTE expression, we mapped the microarray probe locations to RTE locations in RepeatMasker to extract the list of noncoding (intergenic or intronic) probes that cover the RTE regions.”

      • Figure 1 contains a typo in the aims section: "evetns" instead of "events".  

      Fixed.

      • On line 495 "filtered out" seems to imply you removed intergenic probes. I assume you mean that you specifically selected intergenic probes. 

      Fixed.

      • Figure 1 nicely summarizes your datasets. Could you add a Figure 1b panel showing how you used RNA arrays to quantify RTE expression? This should include the number of probes for each RTE family, so I suggest merging this with Figure S1.  

      We disagree with the reviewer's suggestion to merge Figure 1 and Figure S1 because they address two different concepts.

      Reviewer #2 (Recommendations For The Authors): 

      In Figure 2c, it is unclear what colour scale has been used for age. 

      Thank you for the comment. We have added a legend for age in this figure.

      There are no figure legends for Supplementary Figures 1 to 5 and all figures after Supplementary Figure 8. 

      A new version with legends has been submitted.

      For the different datasets used, the choice of "healthy" patients should be clearer and more explicit.

      Are asymptomatic patients with autoimmune inflammatory disorders considered "healthy"? If not only healthy patients' blood is analysed (such as PBMCs from primary osteoarthrosis), how might the inflammatory signature enrichment discovered in this study be associated not just with "biological age" but with the disease itself? 

      In our analysis, we did not exclusively study "healthy" individuals, as none of our datasets were initially collected from strictly healthy populations. While the microarray datasets were not specifically collected from people with particular diseases, they were also not screened for asymptomatic conditions. To demonstrate the same pattern in healthier cohorts, we added scRNA-seq analysis of confirmed healthy individuals to our study. However, the focus of this study is not on healthy aging. Instead, it is on biological ageing that includes both healthy and non-healthy ageing.

      We included the GARP (primary osteoarthritis) dataset as it is a cohort of age-related diseases (ARD). While we cannot definitively attribute the inflammatory signature enrichment to biological aging or disease, the observation of such enrichment in a cohort of ARD is worth considering. To make this clearer, we have replaced the term “healthy” with “non-cancerous” for microarray analysis throughout the paper.

    1. Reviewer #1 (Public Review):

      In this study, the authors describe an essential role of AARS2 in maintaining cardiac function. They also investigated the underlying mechanism, in which alanine and PKM2 translation are regulated by AARS2. On this basis, a therapeutic strategy for cardiomyopathy and MI is proposed. Several points need to be addressed to make this article more comprehensive:

      (1) Include apoptotic caspases in Figure 2B, and Figure 4 B and E as well.

      (2) It would be better to show the change of apoptosis-related proteins upon the knocking down of AARS2 by small interfering RNA (siRNA).

      (3) In Figure 5, the authors performed Mass Spectrometry to assess metabolites of homogenates. I was wondering if the change of other metabolites could be provided in the form of a heatmap.

      (4) The amounts of lactate should be assessed using a lactate assay kit to validate the Mass Spectrometry results.

      (5) What is the expression pattern of PKM2 before and after MI in mice? Furthermore, what is the correlation between AARS2 and PKM2?

      (6) In Figure 5, how do apoptosis-related proteins change after administration of the PKM2 activator TEPP-46?

    2. eLife assessment

      This valuable study demonstrates that AARS2 is crucial for protecting cardiomyocytes from ischemic stress by shifting energy metabolism towards glycolysis through PKM2, presenting a novel therapeutic target for myocardial infarction. The findings are supported by solid evidence, including cardiomyocyte-specific genetic modifications, functional assays, and ribosome profiling, which together robustly validate the AARS2-PKM2 signaling pathway's role in cardiac protection.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to elucidate the role of AARS2, an alanyl-tRNA synthase, in mouse hearts, specifically its impact on cardiac function, fibrosis, apoptosis, and metabolic pathways under conditions of myocardial infarction (MI). By investigating the effects of both deletion and overexpression of AARS2 in cardiomyocytes, the study aims to determine how AARS2 influences cardiac health and survival during ischemic stress.

      The authors successfully achieved their aims by demonstrating the critical role of AARS2 in maintaining cardiomyocyte function under ischemic conditions. The evidence presented, including genetic manipulation results, functional assays, and mechanistic studies, robustly supports the conclusion that AARS2 facilitates cardiomyocyte survival through PKM2-mediated metabolic reprogramming. The study convincingly links AARS2 overexpression to improved cardiac outcomes post-MI, validating the proposed protective AARS2-PKM2 signaling pathway.

      This work may have a significant impact on the field of cardiac biology and ischemia research. By identifying AARS2 as a key player in cardiomyocyte survival and metabolic regulation, the study opens new avenues for therapeutic interventions targeting this pathway. The methods used, particularly the cardiomyocyte-specific genetic models and ribosome profiling, are valuable tools that can be employed by other researchers to investigate similar questions in cardiac physiology and pathology.

      Understanding the metabolic adaptations in cardiomyocytes during ischemia is crucial for developing effective treatments for MI. This study highlights the importance of metabolic flexibility and the role of specific enzymes like AARS2 in facilitating such adaptations. The identification of the AARS2-PKM2 axis adds a new layer to our understanding of cardiac metabolism, suggesting that enhancing glycolysis can be a viable strategy to protect the heart from ischemic damage.

      Strengths:

      (1) Comprehensive Genetic Models: The use of cardiomyocyte-specific AARS2 knockout and overexpression mouse models allowed for precise assessment of AARS2's role in cardiac cells.

      (2) Functional Assays: Detailed phenotypic analyses, including measurements of cardiac function, fibrosis, and apoptosis, provided evidence for the physiological impact of AARS2 manipulation.

      (3) Mechanistic Insights: This study used ribosome profiling (Ribo-Seq) to uncover changes in protein translation, specifically highlighting the role of PKM2 in metabolic reprogramming.

      (4) Therapeutic Relevance: The use of the PKM2 activator TEPP-46 to reverse the effects of AARS2 deficiency presents a potential therapeutic avenue, underscoring the practical implications of the findings.

      Weaknesses:

      (1) Species Limitation: The study is limited to mouse and rat models, and while these are highly informative, further validation in human cells or tissues would strengthen the translational relevance.

      (2) Temporal Dynamics: The study does not extensively address the temporal dynamics of AARS2 expression and PKM2 activity during the progression of MI and recovery, which could offer deeper insights into the timing and regulation of these processes.

    4. Reviewer #3 (Public Review):

      In the present study, the authors revealed that cardiomyocyte-specific deletion of mouse AARS2 caused evident cardiomyopathy with impaired cardiac function, notable cardiac fibrosis, and cardiomyocyte apoptosis. Cardiomyocyte-specific AARS2 overexpression in mice improved cardiac function and reduced cardiac fibrosis after myocardial infarction (MI), without affecting cardiomyocyte proliferation and coronary angiogenesis. Mechanistically, AARS2 overexpression suppressed cardiomyocyte apoptosis and mitochondrial reactive oxygen species production, and changed cellular metabolism from oxidative phosphorylation toward glycolysis in cardiomyocytes, thus leading to cardiomyocyte survival from ischemia and hypoxia stress. Ribo-Seq revealed that AARS2 overexpression increased pyruvate kinase M2 (PKM2) protein translation and the ratio of PKM2 dimers to tetramers that promote glycolysis. Additionally, PKM2 activator TEPP-46 reversed cardiomyocyte apoptosis and cardiac fibrosis caused by AARS2 deficiency. Thus, this study demonstrates that AARS2 plays an essential role in protecting cardiomyocytes from ischemic stress via fine-tuning PKM2-mediated energy metabolism, and presents a novel cardiac protective AARS2-PKM2 signaling during the pathogenesis of MI. This study provides some new knowledge in the field, but some questions still need to be addressed in order to better support the authors' views.

      (1) WGA staining showed obvious cardiomyocyte hypertrophy in the AARS2 cKO heart. Whether AARS2 affects cardiac hypertrophy needs to be further tested.

      (2) The authors observed that AARS2 can improve outcomes after myocardial infarction. Does AARS2 also have an effect on other heart diseases?

      (3) Studies have shown that hypoxic conditions can lead to mitochondrial dysfunction, including abnormal fission and fusion. AARS2 also affects mitochondrial fission and fusion and interacts with mitochondrial proteins, including FIS and DRP1; the authors are encouraged to verify this.

      (4) The authors only examined the role of AARS2 in cardiomyocytes, but fibroblasts are also an important cell type in the heart. The authors should examine the expression and function of AARS2 in fibroblasts.

      (5) Overexpression of AARS2 can inhibit the production of mtROS and has a protective effect against myocardial ischemia and H/R-induced injury. Since ferroptosis ("iron death") is also closely related to ROS, does AARS2 protect the myocardium by regulating ferroptosis?

      (6) Please revise the English grammar and writing style of the manuscript; spelling and grammatical errors should be corrected.

      (7) Recent studies have shown that a decrease in oxygen levels leads to an increase in AARS2, while lactate rises rapidly without being oxidized. Both of these factors inhibit oxidative phosphorylation and muscle ATP production by increasing mitochondrial protein lactylation (lactate acylation), thereby limiting exercise capacity and preventing the accumulation of reactive oxygen species (ROS). These findings highlight the key role of protein lactylation in regulating mitochondrial oxidative phosphorylation, and the importance of metabolites such as lactate in regulating cell function through feedback mechanisms; that is, cells adapt to low oxygen through metabolic regulation to reduce ROS production and oxidative damage. Does AARS2 in the heart also act in this way?

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Little is known about the local circuit mechanisms in the preoptic area (POA) that regulate body temperature. This carefully executed study investigates the role of GABAergic interneurons in the POA that express neurotensin (NTS). The principal finding is that GABA-release from these cells inhibits neighboring neurons, including warm-activated PACAP neurons, thereby promoting hyperthermia, whereas NTS released from these cells has the opposite effect, causing a delayed activation and hypothermia. This is shown through an elegant series of experiments that include slice recordings alongside matched in vivo functional manipulations. The roles of the two neurotransmitters are distinguished using a cell-type-specific knockout of Vgat as well as pharmacology to block GABA and NTS receptors. Overall, this is an excellent study that is noteworthy for revealing local circuit mechanisms in the POA that control body temperature and also for highlighting how amino acid neurotransmitters and neuropeptides released from the same cell can have opposing physiologic effects. I have only minor suggestions for revision.

      Reviewer #2 (Public Review):

      Summary:

      The study has demonstrated how two neurotransmitters and neuromodulators from the same neurons can be regulated and utilized in thermoregulation.

      The study utilized electrophysiological methods to examine the characteristics and thermoregulation of Neurotensin (Nts)-expressing neurons in the medial preoptic area (MPO). It was discovered that GABA and Nts may be co-released by neurons in MPO when communicating with their target neurons.

      Strengths:

      The study has leveraged optogenetic, chemogenetic, knockout, and pharmacological inhibitors to investigate the release process of Nts and GABA in controlling body temperature.

      The findings are relevant to those interested in the various functions of specific neuron populations and their distinct regulatory mechanisms on neurotransmitter/neuromodulator activities.

      Weaknesses:

      Key points for consideration include:

      (1) The co-release of GABA and Nts is primarily inferred rather than directly proven. Providing more direct evidence for the release of GABA and the co-release of GABA and Nts would strengthen the argument. Further in vitro analysis could strengthen the conclusion regarding this co-releasing process.

      Measurement of Nts concentrations in various brain regions during thermoregulatory responses is part of a future study.

      (2) The differences between optogenetic and chemogenetic methods were not thoroughly investigated. A comparison of in vitro results and direct observation of release patterns could clarify the mechanisms of GABA release alone or in conjunction with Nts under different stimulation techniques.

      A comparison of chemogenetic and optogenetic stimulation methods is not within the scope of this study.

      (3) Neuronal transcripts were mainly identified through PCR, and alternative methods like single-cell sequencing could be explored.

      Single cell transcriptomics of preoptic neurotensinergic neurons will be part of a different study.

      (4) In Figure 6, the impact of GABA released from Nts neurons in MPO on CBT regulation appears to vary with ambient temperatures, requiring a more detailed explanation for better comprehension.

      The different possible roles of GABA in different thermoregulatory circumstances are discussed on lines 555-581.

      (5) The model should emphasize the key findings of the study.

      The model is presented in Fig 8.

      Reviewer #3 (Public Review):

      Summary:

      Understanding the central neural circuits regulating body temperature is critical for improving health outcomes in many disease conditions and in combating heat stress in an ever-warming environment. The authors present important and detailed new data that characterizes a specific population of POA neurons with a relationship to thermoregulation. The new insights provided in this manuscript are exactly what is needed to assemble a neural network model of the central thermoregulatory circuitry that will contribute significantly to our understanding of regulating the critical homeostatic variable of body temperature. These experiments were conducted with the expertise of an investigator with career-long experience in intracellular recordings from POA neurons. They were interpreted conservatively in the appropriate context of current literature.

      The Introduction begins with "Homeotherms, including mammals, maintain core body temperature (CBT) within a narrow range", but this ignores the frequent hypothermic episodes of torpor that mice undergo, triggered by cold exposure. Although the author does mention torpor briefly in the Discussion, since these experiments were carried out exclusively in mice, greater consideration (albeit speculative) of the potential for a role of MPO Nts neurons in torpor initiation or recovery is warranted. This is especially the case since some 'torpor neurons' have been characterized as PACAP-expressing and a population of PACAP neurons represents the target of MPO Nts neurons.

      Additional discussion of a possible role of neurotensinergic neurons in the initiation or recovery from torpor is included (lines 593-597).

    1. eLife assessment

      This study provides compelling data that defines the structure of the S. cerevisiae APC/C. The structure reveals overall conservation of its mechanism of action compared to the human APC/C but some important differences that indicate that activation by co-activator binding and phosphorylation are not identical to the human APC/C. Thus this study will be of considerable value to the field, although the conclusions regarding the effect of phosphorylation would be strengthened by quantification of the phosphopeptides. Recent work on the role of APC7 in APC/C activity in neurones should also be discussed with respect to the mode of action of the APC/C in human versus budding yeast cells.

    2. Reviewer #1 (Public Review):

      Summary:

      This work focuses on the structure and regulation of the Anaphase-Promoting Complex/Cyclosome (APC/C), a large multi-subunit ubiquitin ligase that controls the onset of chromosome segregation in mitosis. Previous high-resolution structural studies have uncovered numerous structural features and regulatory mechanisms of the human APC/C, but it has remained unclear if these mechanisms are conserved in other model eukaryotes. To address this gap in our understanding, the authors employed cryo-electron microscopy to generate structural models of APC/C from the budding yeast S. cerevisiae, a key model organism in cell cycle analysis. In their comparison of the human and yeast complexes, the authors uncover many conserved structural features that are documented here in detail, revealing widespread similarities in the fundamental structural features of the enzyme. Interestingly, the authors also find evidence that two of the key mechanisms of human APC/C regulation are not conserved in the yeast enzyme. Specifically:

      (1) The ubiquitin ligase activity of the APC/C depends on its association with a co-activator subunit such as CDH1 or CDC20, which serves both as a substrate-binding adaptor and as an activator of interactions with the E2 co-enzyme. Previous studies of the human APC/C revealed that co-activator binding induces a conformational change that enables E2 binding. In contrast, the current work shows that this E2-binding conformation already exists in the absence of a co-activator in the yeast enzyme, suggesting that the enhancement of E2 binding in yeast depends on other, as yet undiscovered, mechanisms.

      (2) APC/C phosphorylation on multiple subunits is known to enhance APC/C activation by the CDC20 co-activator in mitosis. Previous studies showed that phosphorylation acts by promoting the displacement of an autoinhibitory loop that occupies part of the CDC20-binding site. In the yeast enzyme, however, there is no autoinhibitory loop in the CDC20-binding site, and there is no apparent effect of APC/C phosphorylation on co-activator binding sites. Thus, phosphorylation activates the yeast CDC20-APC/C by unknown mechanisms.

      Strengths:

      The strength of this paper is that it provides a comprehensive analysis of yeast APC/C structure and how it compares to previously determined human structures. The article systematically unwraps the key features of the structure in a subunit-by-subunit fashion, carefully revealing the key features that are the same or different in the two species. These descriptions are based on a thorough overview of past work in the field; indeed, this article serves as a concise review of the key features, conserved or otherwise, of APC/C structure and regulation.

      Weaknesses:

      No significant weaknesses were identified.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper from the Barford lab describes medium/high-resolution cryo-EM structures of three versions of the S. cerevisiae anaphase-promoting complex/cyclosome (APC/C):

      (1) the recombinant apo complex purified from insect cells,

      (2) the apo complex phosphorylated in vitro by cyclin-dependent kinase, and

      (3) an active APC/C-Cdh1-substrate ternary complex.

      The focus of the paper is on comparing similarities and differences between S. cerevisiae and human APC/C structures, mechanisms of activation by coactivator, and regulation by phosphorylation. The authors find that the overall structures of S. cerevisiae and human APC/C are remarkably similar, including the binding sites and orientation for the substrate-recruiting coactivator, Cdh1. In addition, the mechanism of Cdh1 inhibition by phosphorylation appears conserved across kingdoms. However, key differences were also observed that reveal divergence in APC/C mechanisms that are important for researchers in this field to know. Specifically, the mechanism of APC/C-Cdc20 activation by mitotic phosphorylation appears to be different, due to the absence of the key Apc1 autoinhibition loop in the S. cerevisiae complex. In addition, the conformational activation of human APC/C by coactivator binding was not observed in the S. cerevisiae complex, implying that stimulation of E2 binding must occur via a different mechanism in this species.

      Strengths:

      Consistent with the numerous prior cryo-EM structures of human APC/C from the Barford lab, the technical quality of the structure models is a major strength of this work. In addition, the detailed comparison of similarities and differences between the two species will be a very valuable resource for the scientific community. The manuscript is written very well and allows readers lacking expertise in cryo-EM to understand the important aspects of the conservation of APC/C structure and mechanism across kingdoms.

      Weaknesses:

      The lack of experimentation in this work to test some of the putative differences in APC/C mechanism (e.g. stimulation of E2 binding by coactivator and stimulation of activity by mitotic phosphorylation) could be considered a weakness. Nonetheless, the authors do a nice job explaining how the structure interpretations imply these differences likely exist, and this work sets the stage nicely for future studies to understand these differences at a mechanistic level. There is enough value in having the S. cerevisiae structure models and the comparison to the human structures, without any additional experimentation.

      The validation of APC/C phosphorylation in the unphosphorylated and hyperphosphorylated states is not very robust. Given the lack of significant effects of phosphorylation on APC/C structure observed here (compared to the human complex), this becomes important. A list of phosphorylation sites identified by mass spec before and after in vitro phosphorylation is provided but lacks quantitative information. This list indicates that a significant number of phosphorylation sites are detected in the purified APC/C prior to reaction with purified kinases. Many more sites are detected after in vitro kinase reaction, but it isn't clear how extensively any of the sites are modified. There is reason for caution then, in accepting the conclusions that structures of unphosphorylated and hyperphosphorylated APC/C from S. cerevisiae are nearly identical.

    4. Reviewer #3 (Public Review):

      Vazquez-Fernandez et al. present a comprehensive and detailed analysis of the S. cerevisiae APC/C complex, providing new insights into its structure and function. The authors determined the medium-resolution structures of three recombinant S. cerevisiae APC/C complexes, including unphosphorylated apo-APC/C (4.9 Å), the ternary APC/C^CDH1-substrate complex (APC/C^CDH1:Hsl1, 4.0 Å), and phosphorylated apo-APC/C (4.4 Å). Prior structures of human, E. cuniculi, S. cerevisiae, and S. pombe APC/C subunits, as well as AlphaFold2 predictions, were used to guide model building. Although the determined structures are not sufficient to fully explain the molecular mechanism of APC/C activation and regulation in S. cerevisiae, they provide valuable insights into the similarities and differences with the human complex, shedding light on the conserved and divergent features of APC/C function.

      The manuscript synthesizes the structural analysis of the APC/C complex in S. cerevisiae with the literature into a cohesive and clear picture of the complex's structure and function. It is well-written and clear, making the complex biology of the APC/C complex accessible to a wide range of readers. The complex forms a triangular shape, with a central cavity surrounded by two modules: the TPR lobe and the platform module. The TPR lobe consists of three TPR proteins (APC3, APC6, and APC8), which stack on top of each other to form a quasi-symmetric structure. The platform module is composed of the large APC1 subunit, together with APC4 and APC5. The authors also analyzed the structure of several smaller subunits that are involved in regulating the activity of the APC/C complex and showed their structural similarities to and discrepancies from their human counterparts. These subunits, including CDC26/APC12, SWM1/APC13, APC9, and MND2/APC15, form extended, irregular structures that simultaneously contact multiple large globular APC/C subunits.

      While the authors report the similarity between the overall structure of S. cerevisiae and human APC/C complexes, they also found two unexpected differences. First, in the S. cerevisiae apo-complex, the E2 binding site on APC11^RING is accessible, whereas in humans, accessibility requires CDH1 binding. Second, a structural element similar to the human APC1 auto-inhibitory segment is missing in S. cerevisiae. In humans, the phosphorylation-dependent displacement of this segment allows CDC20 binding to APC/C. In S. cerevisiae, CDC20 binding requires phosphorylation; however, the structures reported here suggest that this could involve a different (presently unknown) mechanism. These structural insights highlight the importance of understanding the species-specific features of APC/C function.

      Strengths:

      The manuscript does a great job of revealing new structures.

      Opportunity for increasing impact: It would have been nice if some functional differences had been demonstrated, for example regarding the mechanism of CDC20 binding. In addition, the comparison between apo-APC/C and the ternary APC/C^CDH1:Hsl1 complex does not explain the molecular activation mechanism of S. cerevisiae APC/C. Nonetheless, the authors nicely integrate their data with well-established literature on the similarities and differences between yeast and human systems.

      Weaknesses:

      The authors should cite and discuss Cole Ferguson et al., Mol Cell 2022. This study describes the loss of APC7 in human disease and provides a detailed structural and biochemical examination of the effects of APC7 loss on human APC/C. Given that much of our understanding of APC7 comes from this work, it should be highlighted in the introduction and discussed in depth in light of the new work on S. cerevisiae APC/C.

    1. eLife assessment

      This study assessed the virulence and immune responses of different M. tuberculosis lineages using a 3D in vitro granuloma model. The useful findings support the functional impact of M. tuberculosis natural diversity on host-pathogen interactions but are incomplete and only partially support the claims. The study will interest researchers working on mycobacteria and understanding how genetic diversity influences virulence and immunity outcomes.

    2. Reviewer #1 (Public Review):

      Summary:

      This study further develops the potential of in vitro granulomas to study host-pathogen interactions in tuberculosis. It uses a human-based cellular model and a collection of M. tuberculosis isolates representative of the pathogen's diversity. It provides important methodologic information and some findings that help in defining protective responses in TB.

      Strengths:

      A strength of the study is the multitude of parameters addressed across different M. tuberculosis strains and donors. The inclusion of several strains of the same lineage shows that intra-lineage diversity is also relevant, illustrating how complex it is to model the immune response to M. tuberculosis.

      Weaknesses:

      A weakness of the study is that although several interesting findings are reported and a hypothesis proposed, the work is mainly descriptive and correlative. Some functional data based on the current observations would strengthen the findings.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript reports a comparison of microbial traits and host response traits in a laboratory model of infected granuloma using Mtb strains from different lineages. The authors report increased bacillary growth and granuloma formation, inversely associated with T cell activation that is characterized by CXCL9, granzyme B, and TNF expression. They therefore infer that these T cell responses are likely to be host-protective and that the greater virulence of modern Mtb lineages may be driven by their ability to avoid triggering these responses.

      Strengths:

      The comparison of multiple Mtb lineages in a granuloma model that enables evaluation of the potential role of multiple host cells in Mtb control offers a valuable experimental approach to studying the biological mechanisms that underpin differential virulence of Mtb lineages that have been previously reported in clinical and epidemiological studies.

      Weaknesses:

      The study is rather limited to descriptive observations and lacks experiments to test causal relationships between host and pathogen traits. Some of the presentation of the data is difficult to interpret, and some conclusions are not adequately supported by the data.

    4. Reviewer #3 (Public Review):

      Summary:

      In "CXCL9, granzyme B, and TNF-α orchestrate protective in vitro granulomatous responses across Mycobacterium tuberculosis complex lineages", Arbués and colleagues describe the impact of mycobacterial genetic diversity on host-infection phenotypes. The authors evaluate Mtb infection and contextualize host responses, bacterial growth, and metabolic transitioning in vitro using their previously established model of blood-derived, primary human cells cultured on a collagen/fibronectin matrix. They seek to demonstrate the effectiveness of the model in determining mycobacterial strain-specific granuloma-dependent host-pathogen interactions.

      Strengths and weaknesses:

      Understanding the way mycobacterial genetic diversity impacts granuloma biology in tuberculosis is an important goal. One of this work's strengths is the use of primary human cells and two constituents of the pulmonary extracellular matrix to model Mtb infection. The authors and others have previously shown that Mtb-infected PBMC aggregates share important characteristics with early pulmonary TB granulomas (Arbues et al., Bio Protoc, 2020, PMID: 3659472; Guirado et al., mBio, 2015, PMID: 25691598; Kapoor et al., PloS One, 2013, PMID: 23308269). The use of multiple genetically distinct strains of Mtb defines this work and further bolsters its potential impact. However, the study is not comprehensive as lineages 6 and 7 are not tested. Experiments are primarily descriptive, and the methodologies are conventional. Correlative relationships are the manuscript's focus and functional validation is not conducted. Convoluted data presentation hampers the readers' ability to effectively evaluate many of the findings for significance. The effect sizes are generally small and most quantitative data are unitless. A further weakness of the study is a lack of any in vivo modeling.

      Achievement of aims, and support for conclusions:

      The main aim of this work is to extend an in vitro granuloma model to the study of a large collection of well-characterized, genetically diverse representatives of the Mycobacterium tuberculosis complex (MTBC). I believe that the authors accomplish that aim. The work does investigate MTBC infection of aggregated PBMCs using three strains each of Mtb lineages 1-5 and H37Rv, which is not a trivial undertaking. The experimental aims are to show that MTBC genetic diversity impacts the growth and dormancy of granuloma-bound bacteria and the host responses of granulomatous aggregation, as well as macrophage apoptosis, lymphocyte activation, and soluble mediator release within granulomas. A lack of basic descriptive statistics for raw data makes it difficult to determine if benchmarks for most of the experimental aims have been reached, although the methodologies employed should have been able to test most of these aims. The title's conclusion that CXCL9, granzyme B, and TNF orchestrate a protective granulomatous response is not tested and is not supported by the findings. Those molecules are not a focus of the work, their effects are not investigated effectively, and their relationship to the granulomatous response is not determined. The authors' conclusions regarding their results are a mixed bag. Their conclusion that lineage impacts growth within granulomas is likely true and the data as presented reflect such a relationship. However, even without the basic descriptive statistics needed to evaluate the data supporting that claim, the methods employed for bacterial collection call into question whether all Mtb plated for CFU assay resided within granulomatous aggregates. Their conclusions regarding lineage's impact on dormancy are not supported, as their findings demonstrate that assays for dormancy identify replicating bacteria as being dormant. Their conclusion that strain diversity results in a spectrum of granulomatous responses in their model system is strongly supported by the results. Their conclusion that strain diversity impacts macrophage apoptosis is supported by the data, but a relationship of apoptosis to the granulomatous response is not effectively evaluated. Their conclusion that lymphocyte activation is associated with reduced mycobacterial growth as an aspect of granulomas is well supported in the literature, and a negative correlation between T cell activation and growth is supported by their results.

      Impact on the field:

      The authors contribute some valuable insights, particularly in Figure 3 and supplementary Figures 1 and 2, where data is more accessible to critique. Their identification of donor-dependent aggregation phenotypes by mycobacterial strain has the potential to enable future reverse-genetic screens for human and Mtb loci that contribute to granulomatous inflammation. Their model is a higher echelon relative to others in the field, but I don't believe that it possesses all of the necessary tissue and cellular components to effectively replicate the formation of granulomas in nature. The bulk of the data in its current form is not of high value to the community, but I think it has the potential to contribute additional novel insights if panels that display descriptive statistics are added to the figures.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Ever-improving techniques allow the detailed capture of brain morphology and function to the point where individual brain anatomy becomes an important factor. This study investigated detailed sulcal morphology in the parieto-occipital junction. Using cutting-edge methods, it provides important insights into local anatomy, individual variability, and local brain function. The presented work advances the field and will stimulate future research into this important area.

      Strengths:

      Detailed, very thorough methodology. Multiple raters mapped detailed sulci in a large cohort. The identified sulcal features and their functional and behavioural relevance are then studied using various complementary methods. The results provide compelling evidence for the importance of the described sulcal features and their proposed relationship to cortical brain function.

      We thank the Reviewer for highlighting the strengths of our methods and findings.

      Weaknesses:

      A detailed description/depiction of the various sulcal patterns is missing.

      We agree that adding these details for the newly described sulci is necessary and have now done so. These details are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      And in four new Supplementary Tables.

      A possible relationship between sulcal morphology and individual demographics might provide more insight into anatomical variability.

      We have conducted additional analyses to relate sulcal incidence to demographic features (age and gender). These results are included on Pages 5-6:

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      The unique dataset offers an opportunity to provide insights into laterality effects that should be explored.

      We included hemisphere as a factor in all models for this exact reason. Throughout the paper, we have edited the text to ensure that these laterality effects are more apparent to readers.

      Further, we have a Supplementary Results section on hemispheric effects regarding the slocs-v, cSTS3, and lTOS:

      “Hemispheric asymmetries in morphological, architectural, and functional features with regards to the slocs-v, cSTS3, and lTOS comparison

      We observed a sulcus x metric x hemisphere interaction on the morphological and architectural features of the slocs-v (F(4.20, 289.81) = 4.16, η2 = 0.01, p = .002; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by the slocs-v being cortically thinner in the left than the right hemisphere (p < .001; Fig. 2a).

      There was also a sulcus x network x hemisphere interaction on the functional connectivity profiles (using functional connectivity parcellations from (Kong et al., 2019)) of the slocs-v and lTOS (F(32, 2144) = 3.99, η2 = 0.06, p < .001; the cSTS3 is discussed in the next section). Post hoc tests showed that this interaction was driven by three effects: (i) the slocs-v overlapped more with the Default C subnetwork in the left than the right hemisphere (p = .013), (ii) the lTOS overlapped more with Visual A subnetwork in the right than the left hemisphere (p = .002), and (iii) the lTOS overlapped more with the Visual B subnetwork in the left than the right hemisphere (p = .002; Fig. 2b).”

      As well as the other STS rami on morphology:

      “It is also worth noting that there was a sulcus x metric x hemisphere interaction (F(4, 284.12) = 6.60, η2 = 0.08, p < .001). Post hoc tests showed that: (i) the cSTS3 was smaller (p < .001) and thinner (p = .025) in the left than the right hemisphere (Supplementary Fig. 8a), (ii) the cSTS2 was shallower (p = .004) and thicker (p < .001) in the right than left hemisphere (Supplementary Fig. 8a), and (iii) the cSTS1 was shallower (p < .001), smaller (p = .002), thinner (p = .001), and less myelinated (p < .001) in the left than the right hemisphere (Supplementary Fig. 8a).”

      And functional connectivity of the STS rami:

      “There was also a sulcus x network x hemisphere interaction (F(32, 2208) = 12.26, η2 = 0.15, p < .001). Post hoc tests showed differences for each cSTS component. Here, the cSTS1 overlapped more with the Auditory network (p < .001), less with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), more with the Default C subnetwork (p < .001), more with the Ventral Attention B subnetwork (p < .001), and more with the Visual A subnetwork (p = .024) in the right than in the left hemisphere (Supplementary Fig. 8b). In addition, the cSTS2 overlapped more with the Control B subnetwork (p < .001), more with the Control C subnetwork (p < .001), less with the Default B subnetwork (p < .001), and less with the Temporal-Parietal network (p = .011) in the right than in the left hemisphere (Supplementary Fig. 8b). Finally, the cSTS3 overlapped more with the Control B subnetwork (p = .002), less with the Default B subnetwork (p = .014), more with the Default C subnetwork (p = .022), less with the Ventral Attention B subnetwork (p = .029) in the right than in the left hemisphere (Supplementary Fig. 8b).”

      Reviewer #2 (Public Review):

      Summary: After manually labeling 144 human adult hemispheres in the lateral parieto-occipital junction (LPOJ), the authors 1) propose a nomenclature for 4 previously unnamed highly variable sulci located between the temporal and parietal or occipital lobes, 2) focus on one of these newly named sulci, namely the ventral supralateral occipital sulcus (slocs-v) and compare it to neighboring sulci to demonstrate its specificity (in terms of depth, surface area, gray matter thickness, myelination, and connectivity), 3) relate the morphology of a subgroup of sulci from the region including the slocs-v to the performance in a spatial orientation task, demonstrating behavioral and morphological specificity. In addition to these results, the authors propose an extended reflection on the relationship between these newly named landmarks and previous anatomical studies, a reflection about the slocs-v related to functional and cytoarchitectonic parcellations as well as anatomic connectivity and an insight about potential anatomical mechanisms relating sulcation and behavior.

      Strengths:

      - To my knowledge, this is the first study addressing the variable tertiary sulci located between the superior temporal sulcus (STS) and intraparietal sulcus (IPS).

      - This is a very comprehensive study addressing altogether anatomical, architectural, functional and cognitive aspects.

      - The definition of highly variable yet highly reproducible sulci such as the slocs-v feeds the community with new anatomo-functional landmarks (which is emphasized by the provision of a probability map in supp. mat., which in my opinion should be proposed in the main body).

      - The comparison of different features between the slocs-v and similar sulci is useful to demonstrate their difference.

      - The detailed comparison of the present study with state of the art contextualizes and strengthens the novel findings.

      - The functional study complements the anatomical description and points towards cognitive specificity related to a subset of sulci from the LPOJ.

      - The discussion offers a theoretical interpretation of the findings.

      - The data and code are mostly available online (raw data made available upon request).

      We thank the Reviewer for highlighting the strengths of our methods, analyses, and applications of our findings.

      Weaknesses:

      - While three independent raters labeled all hemispheres, one single expert finalized the decision. Because no information is reported on the inter-rater variability, this somehow equates to a single expert labeling the whole cohort, which could result in biased labellings and therefore affect the reproducibility of the new labels.

      To expedite the process of defining thousands of sulci at the individual level in multiple regions, our group does not use an approach amenable to calculating inter-rater agreement. Instead, our method consists of a two-tiered procedure. Here, authors YT and TG defined sulci, which were then checked by a trained expert (EHW). These were then checked again by the senior author (KSW). We emphasize that this process has produced reproducible anatomical results in other regions such as posteromedial cortex (Willbrand et al., 2023 Science Advances; Willbrand et al., 2023 Communications Biology; Maboudian et al., 2024 The Journal of Neuroscience), ventral temporal cortex (Weiner et al., 2014 NeuroImage; Miller et al., 2020 Scientific Reports; Parker et al., 2023 Brain Structure and Function), and lateral prefrontal cortex (Miller et al., 2021 The Journal of Neuroscience; Voorhies et al., 2021 Nature Communications; Yao et al., 2022 Cerebral Cortex; Willbrand et al., 2022 Brain Structure and Function; Willbrand et al., 2023 The Journal of Neuroscience) across age groups, species, and clinical populations. Further, in the Supplemental Materials we provide post mortem images showing that these sulci exist outside of cortical reconstructions, supporting this updated sulcal schematic of the lateral parieto-occipital junction. For the present study, we emphasize that by the time the final tier of our method was reached, only a very small percentage (~2%) of sulcal definitions were actually modified. We will include an exact percentage in future publications in LPC/LPOJ.

      - 3 out of the 4 newly labeled sulci are only described in the very first part and never reused. This should be emphasized as it is far from obvious at first glance of the article.

      We have edited the Abstract (shown below, on Page 1) and the paper throughout to emphasize our focus on the slocs-v over the other three sulci.

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      It is worth noting that we have added additional analyses that include the other three newly-characterized sulci in response to Reviewer 1. We first described the relationship between these sulci and demographic features, alongside analyses on the patterning of these sulci, which are included in the Results (Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001). Though we characterize these sulci in this paper for the first time, the location of these four sulci is consistent with the presence of variable “accessory sulci” in this cortical expanse mentioned in prior modern and classic studies (Supplementary Methods). We could also identify these sulci in post-mortem hemispheres (Supplementary Figs. 2, 3), ensuring that these sulci were not an artifact of the cortical reconstruction process.

      Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05). Finally, to help guide future research on these newly- and previously-classified LPC/LPOJ sulci, we generated probabilistic maps of each of these 17 sulci and share them with the field with the publication of this paper (Supplementary Fig. 6; Data availability).”

      - The tone of the article suggests a discovery of these 4 sulci when some of them have already been reported (as rightfully highlighted in the article), though not named nor studied specifically. This is slightly misleading as I interpret the first part of the article as a proposition of nomenclature rather than a discovery of sulci.

      We have toned down our language throughout the paper, emphasizing that this paper updates the sulcal landscape of LPC/LPOJ by taking into account sulci that have not been comprehensively described previously. For example, in the Abstract (Page 1), we now write:

      “After defining thousands of sulci in a young adult cohort, we revised the previous LPC/LPOJ sulcal landscape to include four previously overlooked, small, shallow, and variable sulci. One of these sulci (ventral supralateral occipital sulcus, slocs-v) is present in nearly every hemisphere and is morphologically, architecturally, and functionally dissociable from neighboring sulci. A data-driven, model-based approach, relating sulcal depth to behavior further revealed that the morphology of only a subset of LPC/LPOJ sulci, including the slocs-v, is related to performance on a spatial orientation task.”

      - The article never mentions the concept of merging of sulcal elements and the potential effect it could have on the labeling of the newly named variable sulci.

      We emphasize that we use multiple surfaces (pial, inflated, smoothwm) to help distinguish intersecting sulci from one another. We include extra text in the Methods (Page 21):

      “We defined LPC/LPOJ sulci for each participant based on the most recent schematics of sulcal patterning by Petrides (2019) as well as pial, inflated, and smoothed white matter (smoothwm) FreeSurfer cortical surface reconstructions of each individual. In some cases, the precise start or end point of a sulcus can be difficult to determine on a surface (Borne et al., 2020); however, examining consensus across multiple surfaces allowed us to clearly determine each sulcal boundary in each individual.”

      Further, upon quantifying the patterning of these variable sulci, a majority of the time they are independent (described in the Results on Page 6):

      “Beyond characterizing the incidence of sulci, it is also common in the neuroanatomical literature to qualitatively characterize sulci on the basis of fractionation and intersection with surrounding sulci (termed “sulcal types”; for examples in other cortical expanses, see Chiavaras & Petrides, 2000; Drudik et al., 2023; Miller et al., 2021; Paus et al., 1996; Weiner et al., 2014; Willbrand, Parker, et al., 2022). All four sulci most commonly did not intersect with other sulci (see Supplementary Tables 1-4 for a summary of the sulcal types of the slocs and pAngs dorsal and ventral components). The sulcal types were also highly comparable between hemispheres (rs > .99, ps < .001).”

      Thus, merging sulcal elements likely had a minimal impact on the present definitions.

      - The definition of the new sulci is solely based on their localization relative to other sulci which are themselves variable (e.g. the 3rd branch of the STS can show different locations and different orientation, potentially affecting the definition of the slocs-v). This is not addressed in the discussion.

      As displayed in our probabilistic maps of these sulci (Supplementary Fig. 6), the cSTS components (2-4) are actually relatively consistent between individuals, and thus, future investigators can utilize these maps to help define these sulci in new hemispheres.

      Nevertheless, there is, of course, individual variability in the location of these sulci, and we do agree that this point brought up by the Reviewer is important. We have now added text to the Limitations section of the Discussion (Pages 15-16):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci, let alone PTS, without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and hopefully, expedite the identification of these sulci in new participants. This method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg and Finn, 2022).”

      - The new sulci are only defined in terms of localization relative to other sulci, and no other property is described (general length, depth, orientation, shape...), making it hard for a new observer to take labeling decisions in case of conflict.

      To help guide future investigators, we now show these metrics for all sulci in Supplemental Figure 7 to help future groups identify these sulci with the assistance of their general morphology.

      - The very assertive tone of the article conveys the idea that these sulci are identifiable certainly in most cases, when by definition these highly variable tertiary sulci are sometimes very difficult to take decisions on.

      The highly variable nature of three of the four putative tertiary sulci (slocs-v, slocs-d, pAngs-v, pAngs-d) described here is why we focused on the slocs-v (as it is identifiable in nearly all hemispheres). However, we have edited our language throughout the text to also emphasize the variability of these sulci. For example, in the Results (Page 5), we now write:

      “In previous research in small sample sizes, neuroanatomists noticed shallow sulci in this cortical expanse (Supplementary Methods and Supplementary Figs. 1-4 for historical details). In the present study, we fully update this sulcal landscape considering these overlooked indentations. In addition to defining the 13 sulci previously described within the LPC/LPOJ, as well as the posterior superior temporal cortex (Methods) (Petrides, 2019) in individual participants, we could also identify as many as four small and shallow PTS situated within the LPC/LPOJ that were highly variable across individuals and uncharted until now (Supplementary Methods and Supplementary Figs. 1-4). Macroanatomically, we could identify two sulci between the cSTS3 and the IPS-PO/lTOS ventrally and two sulci between the cSTS2 and the pips/IPS dorsally. We focus our analyses on the slocs-v since it was identifiable in nearly every hemisphere.”

      - I am not absolutely convinced with the labeling proposed of a previously reported sulcus, namely the posterior intermediate parietal sulcus.

      In defining previously-identified LPC sulci, we followed the previous labeling procedure by Petrides (2019) alongside historical definitions (detailed in Supplementary Figures 1-4). Nevertheless, future deep learning algorithms using these and other data can be used to rectify discrepancies in labeling (e.g., Borne et al., 2020 Medical Image Analysis; Lyu et al., 2021 NeuroImage). We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). Finally, the time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      Assuming that the labelling of all sulci reported in the article is reproducible, the different results are convincing and in general, this study achieves its aims in defining more precisely the sulcation of the LPOJ and looking into its functional/cognitive value. This work clearly offers a finer understanding of sulcal pattern in this region, and lacks only little for the new markers to be convincingly demonstrated. An overall coherence of the labelling can still be inferred from the supplementary material which support the results and therefore the conclusions, yet, addressing some of the weaknesses listed above would greatly enhance the impact of this work. This work is important to the understanding of sulcal variability and its implications on functional and cognitive aspects.

      We thank the Reviewer for their positive remarks on the implications of this work.

      Reviewer #3 (Public Review):

      Summary: 72 subjects, and 144 hemispheres, from the Human Connectome Project had their parietal sulci manually traced. This identified the presence of previously undescribed shallow sulci. One of these sulci, the ventral supralateral occipital sulcus (slocs-v), was then demonstrated to have functional specificity in spatial orientation. The discussion furthermore provides an eloquent overview of our understanding of the anatomy of the parietal cortex, situating their new work into the broader field. Finally, this paper stimulates further debate about the relative value of detailed manual anatomy, inherently limited in participant numbers and areas of the brain covered, against fully automated processing that can cover thousands of participants but easily misses the kinds of anatomical details described here.

      Strengths:

      - This is the first paper describing the tertiary sulci of the parietal cortex with this level of detail, identifying novel shallow sulci and mapping them to behaviour and function.

      - It is a very elegantly written paper, situating the current work into the broader field.

      - The combination of detailed anatomy and function and behaviour is superb.

      We thank the Reviewer for their positive remarks on our paper and findings.

      Weaknesses:

      - The sample is inherently limited both in size and in scope, as it consists only of typically developing young adults.

      We emphasize that the sample size is limited due to the arduous nature of manually defining sulci; however, we provide probabilistic maps with the publication of this work to help expedite this process for future investigators. Further, with improved deep learning algorithms, the sample sizes in future neuroanatomical studies should be enhanced. We discuss these points in the Limitations section of the Discussion (Pages 16-17):

      “The main limitation of our study is that presently, the most accurate methodology to define sulci, especially the small, shallow, and variable PTS, requires researchers to manually trace each structure on the cortical surface reconstructions. This method is limited due to the individual variability of cortical sulcal patterning (Fig. 1, Supplementary Fig. 5), which makes it challenging to identify sulci without extensive experience and practice. However, we anticipate that our probabilistic maps will provide a starting point and hopefully, expedite the identification of these sulci in new participants. This should accelerate the process of subsequent studies confirming the accuracy of our updated schematic of LPC/LPOJ. This manual method is also arduous and time-consuming, which, on the one hand, limits the sample size in terms of number of participants, while on the other, results in thousands of precisely defined sulci. This push-pull relationship reflects a broader conversation in the human brain mapping and cognitive neuroscience fields between a balance of large N studies and “precision imaging” studies in individual participants (Allen et al., 2022; Gratton et al., 2022; Naselaris et al., 2021; Rosenberg & Finn, 2022). Though our sample size is comparable to other studies that produced reliable results relating sulcal morphology to brain function and cognition (e.g., Cachia et al., 2021; Garrison et al., 2015; Lopez-Persem et al., 2019; Miller et al., 2021; Roell et al., 2021; Voorhies et al., 2021; Weiner, 2019; Willbrand, Parker, et al., 2022; Willbrand, Voorhies, et al., 2022; Yao et al., 2022), ongoing work that uses deep learning algorithms to automatically define sulci should result in much larger sample sizes in future studies (Borne et al., 2020; Lyu et al., 2021). The time-consuming manual definitions of primary, secondary, and PTS also limit the cortical expanse explored in each study, thus restricting the present study to LPC/LPOJ.”

      - While the paper begins by describing four new sulci, only one is explored further in greater detail.

      Due to the increased variability of three of the four newly-classified sulci, we chose to only focus on the slocs-v given that it was present in nearly all hemispheres. In response to other reviewers, we have conducted additional analyses that also describe these new sulci and potential factors related to their incidence (Page 6):

      “Given that sulcal incidence and patterning is also sometimes related to demographic features (Cachia et al., 2021; Leonard et al., 2009; Wei et al., 2017), subsequent GLMs relating the incidence and patterning of the three more variable sulci (slocs-d, pAngs-v, and pAngs-d) to demographic features (age and gender) revealed no associations for any sulcus (ps > .05).”

      In addition, given that sulcal variability is cognitively (e.g., Amiez et al., 2018 Scientific Reports; Cachia et al., 2021 Frontiers in Neuroanatomy; Garrison et al., 2015 Nature Communications; Willbrand et al., 2022, 2023 Brain Structure & Function), anatomically (e.g., Amiez et al., 2021 Communications Biology; Vogt et al., 1995 Journal of Comparative Neurology), functionally (e.g., Lopez Persem et al., 2019 The Journal of Neuroscience), and translationally (e.g., Yucel et al., 2002 Biological Psychiatry) relevant, future research can investigate these relationships regarding the slocs-d and pAngs components. We have added text to the Limitations section of the Discussion (Pages 17-18) to discuss this:

      “Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      - There is some tension between calling the discovered sulci new vs acknowledging they have already been reported, but not named.

      We have edited the manuscript throughout to emphasize our primary focus on revising the LPC/LPOJ sulcal landscape to include these often overlooked small, shallow, and variable putative tertiary sulci, rather than using the terms “discovered sulci” and “new.”

      - The anatomy of the sulci, as opposed to their relation to other sulci, could be described in greater detail.

      Beyond the radar plots in the main text which compare specific groupings of sulci, we now show the morphological metrics for all sulci investigated in the present work in Supplemental Figure 7.

      Overall, to summarize, I greatly enjoyed this paper and believe it to be a highly valued contribution to the field.

      We are glad the Reviewer enjoyed reading our paper and thank them for their positive thoughts on the potential impact of this work on the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The slocs-v is found in 71 subjects left and right. Is that the same subject?

      No, these are different subjects.

      (2) How were the 72 subjects chosen?

      The subjects were randomly selected from the HCP database, as described in the Methods (Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (3) Are there effects of laterality on sulcal pattern? Table?

      We now include sulcal pattern results in the Results section and Supplementary Materials; there were no laterality effects regarding the sulcal pattern.

      (4) Depiction/description of common sulcal patterns

      We now include sulcal pattern results in the Results section and Supplementary Materials.

      (5) Is there a relationship between sulcal patterns and demographic features?

      We now include analyses on this in the Results section. There is no relationship between sulcal patterns and demographic features.

      (6) Just for clarity, the sulcal features are studied and extracted in native space?

      Yes, sulcal features are studied and extracted in native space, as described in the Methods section (Page 19):

      “Anatomical T1-weighted (T1-w) MRI scans (0.8 mm voxel resolution) were obtained in native space from the HCP database. Reconstructions of the cortical surfaces of each participant were generated using FreeSurfer (v6.0.0), a software package used for processing and analyzing human brain MRI images (surfer.nmr.mgh.harvard.edu) (Dale et al., 1999; Fischl et al., 1999). All subsequent sulcal labeling and extraction of anatomical metrics were calculated from these native space reconstructions generated through the HCP’s version of the FreeSurfer pipeline (Glasser et al., 2013).”

      (7) The authors use "Gender". Are they referring to biological sex (female/male) or socially defined characteristics (man/woman etc.)?

      The term gender refers to socially defined characteristics, as used by the HCP data dictionary (Methods, Page 18):

      “Here, we used 72 randomly-selected participants, balanced for gender (following the terminology of the HCP data dictionary), from the HCP database (50% female, 22-36 years old, and 90% right-handed; there was no effect of handedness on our behavioral tasks; Supplementary materials) that were also analyzed in several prior studies (Hathaway et al., 2023; Miller et al., 2021, 2020; Willbrand et al., 2023b, 2023c, 2022a).”

      (8) Fig 2. Grey is poorly visible compared to green and blue.

      The shade of gray has been edited to be more distinguishable.

      (9) The relationship between behavior and sulcal features is significant but weak.

      We acknowledge that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of the finding is that multiple sulci identified in the model are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. We have added text to the Limitations section of the Discussion (Pages 17-18) to emphasize this point:

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified.”

      (10) The Limitation section could be expanded.

      We have added additional text to flesh out the Limitations section of the Discussion (Pages 17-18):

      “It is also worth noting that the morphological-behavioral relationship identified in the present study explains a modest amount of variance; however, the more important aspect of our findings is that multiple sulci identified in our model-based approach are recently-characterized sulci in LPC/LPOJ identified by our group and others (Petrides, 2019), and thus, the relationship would have been overlooked or lost if these sulci were not identified. Finally, although we did not focus on the relationship between the other three PTS (slocs-d, pAngs-v, and pAngs-d) to anatomical and functional features of LPC and cognition, given that variability in sulcal incidence is cognitively (Amiez et al., 2018; Cachia et al., 2021; Garrison et al., 2015; Willbrand, Jackson, et al., 2023; Willbrand, Voorhies, et al., 2022), anatomically (Amiez et al., 2021; Vogt et al., 1995), functionally (Lopez-Persem et al., 2019), and translationally (Clark et al., 2010; Le Provost et al., 2003; Meredith et al., 2012; Nakamura et al., 2020; Yücel et al., 2002, 2003) relevant, future work can also examine the relationship between the more variable slocs-d, pAngs-v, and pAngs-d and these features.”

      Reviewer #2 (Recommendations For The Authors):

      First, I would like to thank the authors for their important contribution to the field of sulcal studies and anatomo-functional correlates. My main comments about the work are treated in the public review, and I will only address details in this section. I have detected a number of typos which are harder to report from a document in which lines are not numbered. Could you please submit a numbered document for the next iteration?

      - p2. "hominoid-specific, shallow indentations, or sulci" - can lead to misunderstanding that sulci are hominoid-specific and shallow

      Sentence has been rewritten:

      “Of all the neuroanatomical features to target, recent work shows that morphological features of the shallower, later developing, hominoid-specific indentations of the cerebral cortex (also known as putative tertiary sulci, PTS) are not only functionally and cognitively meaningful, but also are particularly impacted by multiple brain-related disorders and aging (Amiez et al., 2019, 2018; Ammons et al., 2021; Cachia et al., 2021; Fornito et al., 2004; Garrison et al., 2015; Harper et al., 2022; Hathaway et al., 2023; Lopez-Persem et al., 2019; Miller et al., 2021, 2020; Nakamura et al., 2020; Parker et al., 2023; Voorhies et al., 2021; Weiner, 2019; Willbrand et al., 2023b, 2023c, 2022a, 2022b; Yao et al., 2022).”

      - p2. next sentence (starting with "The combination [...]"): not clear that you are addressing tertiary sulci here, maybe introduce the concept beforehand?

      The previous sentence (just above) has been edited to introduce putative tertiary sulci beforehand.

      - p5. error in numbering of sulci relative to Fig1. (5,6,7,8 -> 6,7,8,9)

      Sulcal numbering has been fixed.

      - p5. reference to supp mat -> I would have expected the nomenclature used in Borne et al. 2020 to be discussed alongside the state of the art. How would you relate F.I.P.r.int.1 and F.I.P.r.int.2 to the sulci you describe?

      We thank the Reviewer for bringing up this relevant literature. The F.I.P.r.int.1 and F.I.P.r.int.2 are described as rami of the IPS, whereas the slocs and pAngs are independent, small indentations near the IPS, but not part of the complex. Nevertheless, future work should integrate these two schematics to establish the most comprehensive sulcal map of LPC/LPOJ. We have added text to the Supplementary Methods detailing the differences between the F.I.P.r.int.1 and F.I.P.r.int.2 and the slocs/pAngs:

      “slocs/pAngs vs. F.I.P.r.int.1 and F.I.P.r.int.2

      Recent work (Borne et al., 2020; Perrot et al., 2011) identified two intermediate rami of the IPS (F.I.P.r.int.1 and F.I.P.r.int.2) that were not defined in the present investigation. Crucially, the newly classified sulci here (slocs and pAngs) are distinguishable from the two F.I.P.r.int. in that the F.I.P.r.int. are branches coming off the main body of the IPS (Borne et al., 2020; Perrot et al., 2011), whereas the slocs/pAngs are predominantly non-intersecting (“free”) structures that never intersected with the IPS (Supplementary Tables 1-4).”

      - p6. Fig 1.a. labelling discrepancy between line 1 and 2, column 4: the labels 10 and 11 from the inflated hemisphere do not match the labels 10 and 11 in the pial surface. Fig 1.b. swapped label 2 and 3 in the 4th hemisphere

      These aspects of Figure 1 have been edited accordingly.

      - p7. "(iii) the slocs-v was thicker than both the cSTS3 and lTOS" -> the slocs-v showed thicker gray matter?

      The sentence has been adjusted (Page 7):

      “(iii) the slocs-v showed thicker gray matter than both the cSTS3 and lTOS (ps < .001),”

      - p9. Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance -> missing

      Fixed (Page 9):

      “Six left hemisphere LPC/LPOJ sulci were related to spatial orientation task performance (Fig. 3a, b).”

      - p14. "Steel and colleagues" -> missing space

      Fixed (Page 14):

      “Furthermore, the slocs-v appears to lie at the junction of scene-perception and place-memory activity (a transition that also consistently co-localizes with the HCP-MMP area PGp) as identified by Steel and colleagues (2021).”

      - p20. Probability maps "we share these maps with the field" -> specify link to data availability

      The link to data availability has been added (Page 21):

      “To aid future studies interested in investigating LPC/LPOJ sulci, we share these maps with the field (Data availability).”

      Reviewer #3 (Recommendations For The Authors):

      No detailed recommendations not already present in the rest of the review.

    1. Reviewer #1 (Public Review):

      Summary:

      This study investigates how the human brain flexibly adjusts its representations of the world as the environment continually changes. The authors identified regions where the representation continuously drifted across multiple months. They also found that the representation in the parahippocampal cortex could be rapidly influenced by recent environmental inputs.

      Strengths:

      (1) This study touches upon a crucial but less-explored issue: the relationship between semantic knowledge updating and representation drift in the brain.

      (2) This study addresses this issue with a unique dataset in which participants viewed objects embedded in thousands of natural scenes across many fMRI sessions over eight months.

      (3) The method for investigating whether the recent inputs could change the neural representation is compelling (i.e., subtracting the backward correlation value from the forward correlation value).

      Weaknesses:

      (1) Statistical Inference.

      (a) Statistical inference is across eight subjects. Low statistical power means high false positive rates.

      (b) Multiple comparisons across brain regions were not corrected.

      (2) Object Encoding

      It is unclear whether the identified brain regions represent the objects (as declared in the manuscript) or the visual features shared by pictures of similar items. Such visual features could be those of the background (e.g., spatial layout or the color tone of the scene), not the objects.

      (3) Semantic Content in the MTL

      Items with higher levels of semantic association tend to co-occur in the same picture. The results could be driven by the number of pictures shared between each pair of items, not semantic similarity (as declared in the manuscript).

      (4) Long-term Drift of Item Representations in the MTL

      (a) The results show a long-term representational drift in the brain but provide no evidence suggesting that this long-term neural representational drift reflects the drift in semantic representation. Although the authors used the "semantic" mask defined in the previous step, it does not mean the representation drift in the semantic mask is semantic, and there is doubt whether the "semantic" mask defined in the previous step is really semantic (see the third point).

      (b) The beta value of the drift cannot be directly compared across regions. Different regions have different sizes and signal-to-noise ratios in the BOLD signal. Their within-item similarity cannot be compared directly in the first place.

      (5) Recent Structure Rapidly Influences Item Representations in PHC

      (a) It is unclear why the authors implement additional modularity analysis instead of directly using the pairwise co-occurrence frequencies among the 80 items, which is more straightforward.

      (b) It does not make sense to compare the recent structure to the long-term structure across all 30 sessions because the structure of later sessions cannot influence the current structure updating.

      (c) It is unclear how the authors calculate the structure-induced change in the PHC in Figure 7.

    2. eLife assessment

      This valuable study investigates how the human brain flexibly adjusts its representations of the world as the environment continually changes. It utilizes a unique dataset in which participants view thousands of natural scenes across many fMRI sessions over multiple months. The evidence supporting the claims of the authors is incomplete, with statistical inference not always warranted. The study would interest a broad readership in cognitive neuroscience.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors set out to uncover which brain regions might support the continuous updating of semantic associations, thereby showing a system of semantic plasticity. Using fMRI data from participants viewing thousands of natural scene images over 30 recording sessions, they hoped to establish how the co-occurrence of objects within images influences the semantic representations in the human brain that relate to those object concepts.

      Strengths:

      There is a lot to like about the paper. A major strength is the convincing demonstration of many of the results. This includes showing item representations in the ventral visual pathway and medial temporal lobes (MTL), as we would expect. They also show semantic effects - defined using the word co-occurrence vectors from word2vec, along the posterior and anterior ventral visual pathway and MTL - replicating various past studies. The authors use a creative approach to show that the item representations measured within each session are modulated by the co-occurrence structure in previous trials, becoming more closely related, and that item representations seem to subtly change over the course of the 30 sessions, in that they become less related to each other with increasing distance between sessions. However, the semantic effects within each session itself are claimed to remain unchanged.

      Weaknesses:

      This leads to what I see as a weakness in the study. The conclusions relate to semantic plasticity and the changes in semantic (associative) representations. The drift analyses do appear to show representational changes across the sessions, but this is based on the item representations. The inference is that this is due to an updating of knowledge about the associations each item has had with other items. Yet, in the same regions, the authors suggest that semantic associative effects, as tested using word2vec for each session, remain stable. Doesn't this seem to contradict the claims about semantic plasticity?

      Some of this is difficult to unpick as the semantic stability analysis using word2vec in each session is only very briefly mentioned, and the data is not shown (I would include it). So, at present, I feel they show evidence of representational changes but do not show evidence of what the nature of the change is. If the neural representations consistently reflect the long-term semantic associations (which is what word2vec captures), then how does this combine with the drift effects of item representations?

      Does it mean that the changes in item representations do not reflect semantic associative knowledge, and instead reflect some other non-specified type of information (perhaps as the participants are doing an image memory test)?

      Another potential weakness is the robustness of the drift analysis itself. For the drift analysis, item representations in each session are compared to all other sessions and then averaged according to the number of intervening sessions. This means the data for item representation with a session difference of 1 will be based on 29 data points, a session difference of 2 on 28 data points ... and a session difference of 29 based on 1 data point. So there is a huge imbalance in the amount of data that goes into the analysis for the different numbers of intervening sessions. This leads me to wonder if it could impact the validity of the results. An alternative might be to use 1 datapoint for each session (or a suitable value, I imagine 5 would still give enough data to analyse drifts) and calculate drift, and then repeat this with different partitions of the data to see how stable it is, and if drift is reliably occurring. Alternatively, the analyses they use might have been used and validated previously.
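
      The imbalance described above can be made concrete with a short sketch. It assumes 30 sessions and that every pair of sessions contributes one data point per lag; the function names and the choice of five pairs per lag are illustrative and are not taken from the manuscript. The second function is one possible reading of the suggested alternative, drawing a fixed number of session pairs per lag so that the draw can be repeated with different seeds.

```python
import random
from collections import Counter
from itertools import combinations


def pairs_per_lag(n_sessions):
    """Count how many session pairs exist at each session difference (lag)."""
    counts = Counter(b - a for a, b in combinations(range(n_sessions), 2))
    return dict(sorted(counts.items()))


def balanced_pairs(n_sessions, per_lag, seed=0):
    """Draw at most `per_lag` session pairs for every lag (hypothetical resampling scheme)."""
    rng = random.Random(seed)
    by_lag = {}
    for a, b in combinations(range(n_sessions), 2):
        by_lag.setdefault(b - a, []).append((a, b))
    # Long lags have fewer pairs available than requested, so cap the draw.
    return {
        lag: rng.sample(pairs, min(per_lag, len(pairs)))
        for lag, pairs in sorted(by_lag.items())
    }


counts = pairs_per_lag(30)
print(counts[1], counts[2], counts[29])  # 29 28 1

subset = balanced_pairs(30, per_lag=5, seed=1)
print(len(subset[1]), len(subset[29]))  # 5 1
```

      Repeating `balanced_pairs` with different seeds and refitting the drift slope each time would give a rough sense of how stable the estimate is; lags longer than 25 sessions still have fewer than five pairs available, so they remain under-sampled under this scheme.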

      To be clear, I do think this is a very nice study and will have a positive impact on researchers interested in object processing, semantic knowledge, statistical learning, and schemas. But I think there are some gaps between what the data show evidence for and the ultimate inferences made.

    4. Reviewer #3 (Public Review):

      Summary:

      This study characterizes the relative stability of semantic representations in the human brain using functional magnetic resonance imaging (fMRI) data. The authors suggest that representations in the early stages of processing within the visual system are stable over hours, weeks, and months, while representations in later stages of processing - within the medial temporal lobe - change more rapidly, sometimes within the span of a single fMRI session.

      To make this claim, the authors conduct a series of analyses using a well-established fMRI dataset. This begins with a decoding analysis to identify regions that contain reliable object-specific information. This approach identifies early stages within canonical visual cortices (e.g., primary visual cortex, V1), as well as downstream regions within the medial temporal lobe (MTL); this includes perirhinal cortex (PRC), parahippocampal cortex (PHC), and several subfields within the hippocampal cortex (e.g., CA1). Next, they identify regions that are correlated with "semantic features" associated with these objects, determined using word2vec embeddings of each of these object names. Several regions within the MTL (CA1, PRC, PHC) were significantly correlated with these word2vec embeddings. The authors then turn their analyses to representational change across two different timescales. Between scan sessions, regions at early stages of visual processing (e.g., V1) contain relatively stable representations, while regions within the MTL decreased their auto-correlation across sessions, suggesting that there is increased representational change/drift in the MTL. Finally, the authors demonstrate that there is representational change within PHC within a single scan session - changes that reflect the statistics of visual experiences.

      Strengths:

      The analyses conducted in this study are solid and creative and they yield compelling theoretical results. Beyond the paper's central claims, this study also highlights the utility of publicly available datasets (i.e., NSD) in exploring and evaluating novel theoretical ideas.

      Especially compelling is the combined analysis used to estimate reliable item-level representations, first, and then the long-term drift of item representations (i.e., between sessions). The design choices for modeling the fMRI data (e.g., the cross-validated approach to predicting voxel-level responses) reflect state-of-the-art analysis methods, while the control regions used in these analyses (e.g., V1) provide compelling contrasts to the experimental effects. This makes it clear that the observed representational drift/instability is not present throughout the visual system. These results indicate that this effect is worthy of future experiments, while also providing auxiliary information related to effect size, etc.

      Weaknesses:

      The concerns outlined here do not challenge the central claim within this study, relating to the relative instability of representations within the MTL as compared to V1. Instead, these concerns focus on whether these representations should be described as "semantic," the importance we should give to the distinction between PHC and other MTL structures, and the lack of systematic analysis in relation to the "gradient" from posterior to anterior regions. In each case, I have provided suggestions as to how these concerns might be addressed. Finally, I've made a note about whether these data should be interpreted in terms of neural "plasticity" given the lack of behavioral change in relation to these fMRI data.

      (1) No reason to believe that representations within the MTL are necessarily 'semantic.'

      The authors suggest "evoked object representations in CA1, PHC, and PRC are semantic in nature." However, the correlation between fMRI responses and word2vec embeddings, the only evidence for "semantic" representations, is ambiguous. These structures might contain high-dimensional features that are associated with these objects for other reasons; concretely, there might be visual information that is not semantic but relates to the reliable visual properties of these objects (e.g., texture, shape, location in the image). Yet there are no analyses to disambiguate between these alternative accounts. As such, labeling these as "semantic" representations is suggestive but premature. Nonetheless, developing such a control analysis should be relatively straightforward. I outline one possible approach below.

      While "semantic" information is a relatively nebulous term in the cognitive neurosciences, contemporary deep-learning methods might offer unambiguous ways to characterize such representations. If we assume that "semantics" relate to the meaning of an object/entity and not the "low-level" sensory attributes related to encoding this information, this leads to a straightforward implementation of object semantics: the reliable variance that can be isolated within the residuals of a sensory encoder. For example, do word2vec embeddings explain variance within the medial temporal lobe above and beyond the variance explained by a vision-only image encoder? Of course, care must be taken to use a visual encoder which is not itself a crystallization of object semantics (e.g., encoders optimized using a classification objective), but this is all very feasible given contemporary computer vision methods. Adding such a control analysis would offer a significant improvement over the current approach, clarifying the nature of the stimulus-driven representations within the medial temporal lobe by disentangling "semantic" properties of reliable visual features.
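
      The residual-variance logic of this control can be sketched as a comparison of held-out variance explained by visual features alone versus visual plus semantic features. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the feature matrices stand in for a vision-only encoder and word2vec embeddings, and the function names are hypothetical.

```python
import numpy as np


def heldout_r2(X_train, y_train, X_test, y_test):
    """Fit ordinary least squares on the training split and return R^2 on the test split."""
    # Prepend an intercept column to both splits.
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    resid = y_test - A_test @ beta
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def unique_semantic_r2(visual, semantic, y, n_train):
    """Held-out R^2 gained by adding semantic features to a visual-only model."""
    full = np.column_stack([visual, semantic])
    r2_visual = heldout_r2(visual[:n_train], y[:n_train], visual[n_train:], y[n_train:])
    r2_full = heldout_r2(full[:n_train], y[:n_train], full[n_train:], y[n_train:])
    return r2_full - r2_visual


# Synthetic stand-ins: 5 "visual" features, 3 "semantic" features, one response.
rng = np.random.default_rng(0)
n = 400
visual = rng.normal(size=(n, 5))
semantic = rng.normal(size=(n, 3))
y = visual @ np.ones(5) + 0.8 * semantic[:, 0] + rng.normal(scale=0.5, size=n)

gain = unique_semantic_r2(visual, semantic, y, n_train=300)
print(f"held-out R^2 gained from semantic features: {gain:.3f}")
```

      A gain reliably above zero on held-out data would indicate variance explained by the semantic features beyond the visual ones; with real data, a regularized and cross-validated model would likely be needed given the dimensionality of both feature spaces.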

      Additionally, it is not clear whether results from the current "object encoding" analysis and "semantic detection" analysis differ because of underlying differences in representational content in these regions or because of design choices in these analyses themselves. That is, while the object encoding analysis learns a linear projection from a one-hot 80-dimensional vector to hemodynamic responses in each brain region, the semantic detection analysis correlates these predicted hemodynamic responses with word2vec embeddings associated with each of these 80 objects. These different analysis methods result in different outcomes: not all regions identified by the object encoding analysis are also identified in the semantic detection analysis (e.g., hippocampal subfields). It is not clear to what degree these different outcomes are a function of "semantic" information, or are simply a consequence of differences in analytic approaches. It would be useful to see the results of repeating the logic from the object encoding analysis with the word2vec embeddings in place of the one-hot vectors for each object.

      (2) Unclear if the differences between PHC and other MTL structures are driven by SNR.

      Parahippocampal cortex (PHC) is a region reliably identified by the analyses in this study: PHC is identified in the analysis of item encoding, semantic content, and representational drift across long (between-session) and short (within-session) timescales. Control regions here provide a convincing contrast to PHC in each of these analyses, and so the role of PHC appears clear in these analyses. However, it is unclear how to interpret the difference between PHC and other structures within the MTL - namely, the observation that PHC alone is influenced by representational drift across shorter timescales. It's possible that these effects are common throughout the MTL, but are only evident in PHC because of increased SNR. This concern seems plausible when observing PHC's "encoding success" and "semantic content," both visually and statistically, relative to other MTL structures: the magnitude of PHC's effect appears greater, which could simply be an artifact of PHC's relatively high SNR. In fMRI data, for example, PRC typically has relatively low SNR due to dropout from field inhomogeneities caused by its proximity to the ear canal, a problem that is exacerbated in 7T (vs 3T) scanners such as the one used for the data in this study.

      Addressing this concern could be relatively straightforward. For example, including information about the SNR in each respective brain region would be very helpful. If the SNR across brain regions within the MTL is relatively uniform, then this already addresses the concern above. Regardless, it would be useful to report the experimental effects in relation to, for example, the split-half reliability of signal in each brain region. That is, instead of simply reporting that results are significant across brain regions, the authors might estimate how reliable the variance is across brain regions, and use this reliable variance as a ceiling which can be used to normalize the amount of variance explained in each analysis. By providing an account of the differences in the reliability/SNR of different regions, we would have a much better estimate of the relative importance of differences in the results reported for different regions within the MTL.
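
      One way to read the normalization suggested above is sketched below. It assumes that each region's response can be estimated from two independent halves of the data, uses the Spearman-Brown correction to estimate full-data reliability, and divides the variance explained by that reliability. The function names and the synthetic example are hypothetical and not drawn from the manuscript.

```python
import numpy as np


def split_half_reliability(half_a, half_b):
    """Spearman-Brown corrected reliability from two independent response estimates.

    Assumes the correlation between halves is positive.
    """
    r = np.corrcoef(half_a, half_b)[0, 1]
    return 2 * r / (1 + r)


def ceiling_normalized_r2(r2_observed, reliability):
    """Express variance explained as a fraction of the reliable variance in a region."""
    return r2_observed / reliability


# Synthetic example: the same underlying signal measured twice with independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=500)
half_a = signal + rng.normal(scale=1.0, size=500)
half_b = signal + rng.normal(scale=1.0, size=500)

reliability = split_half_reliability(half_a, half_b)
print(f"estimated reliability: {reliability:.2f}")
print(f"normalized R^2 for an observed R^2 of 0.20: {ceiling_normalized_r2(0.20, reliability):.2f}")
```

      Reporting effects on this normalized scale would let a region with low SNR, such as PRC, be compared with PHC on more equal footing, although the estimate becomes unstable when reliability is close to zero.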

      (3) Need for more systematic analysis/visualization of "posterior" vs. "anterior" regions.

      The authors report that "Whole-brain analyses revealed a gradient of plasticity in the temporal lobe, with drift more evident in anterior than posterior areas." However, the only contrast provided in the main text is between MTL structures and V1; there is no "gradient" in any of these analyses. There are other regions visualized in Supplemental Figure 3, but there is not a systematic evaluation of the gradient along a "posterior/anterior" axis. It would be helpful for the results in Figures 3A, 4A, 5A, and 6A to include other posterior visual regions (e.g., V4, LOC, PPA, FFA) beyond V1.

      (4) Without behavioral data, not a direct relationship with "stability-plasticity tradeoff"

      The results from this study are framed in relation to a "stability-plasticity tradeoff." As argued in this manuscript, this tradeoff is central to animal behavior - our ability to rapidly deploy prior knowledge to respond to the world around us. Given that there are no behavioral measures used in the current study, however, no claims can be made about how these fMRI data might relate to learning, or behavior more generally. As such, framing these results in terms of a stability-plasticity tradeoff is tenuous. "Representational drift," on the other hand, is a term that is relatively agnostic in its relationship with behavior, and aptly describes the results presented here. The authors refer to this term as well. Considering the lack of behavioral evidence, alongside the core findings from these neuroimaging data, "representational stability" or "representational drift" seems to be a more direct description of the available data than "neural plasticity" or a "stability-plasticity tradeoff."

    5. Author response:

      We are very appreciative of the reviewers’ assessment that we used “solid and creative” methods to provide a “convincing demonstration” of “compelling theoretical results” on a “crucial but less-explored issue” in cognitive neuroscience. We are also grateful for their thoughtful suggestions for analyses and for pointing out areas where our analysis descriptions need more clarity. While we will respond to all comments in a future response and revision, here we provide information and clarification on a few central points.

      Localization of semantic content:

      Regarding our semantic analysis, one reviewer rightly pointed out that items with a high degree of semantic association, as captured by word2vec, tend to occur in the same images, and they expressed concern that this could drive our similarity results. We wish to clarify here (and will revise the manuscript accordingly) that we excluded all pairs of co-occurring items in our word2vec semantic analysis in order to avoid this issue. Thus, our results cannot be driven by the number of images within which items co-occurred. We also agree with the reviewer who stated that “semantic information” is a nebulous term in the cognitive neurosciences, and it appears to have led to some confusion as to the nature of our claims. We take a broad view of this term, with the perspective that visual features (e.g., color, shape) can contribute to semantic content rather than necessarily competing with it. In our work, we use word2vec to identify neural representations that reflect the kind of semantic content present in word embedding models—but the conclusions we draw do not depend on these representations being devoid of visual content. That is, we do not use word2vec to examine semantic versus visual representations, but rather to narrow down the set of representations to be considered in subsequent analyses. While there are a range of legitimate views on what should be considered a “semantic” representation, our broad view, which is inclusive of visual content, along with our strategy for localizing semantic content are both standardly used in the visual neuroscience literature. 
      Prior work in this literature has compared the ability of word2vec and low-level visual models to predict neural responses to natural images and found that the brain regions in which activity is accurately predicted by the models are considerably distinct: whereas a low-level visual model best predicts activity in V1, V2, and V4, word2vec performs better in more anterior regions, including in visual areas such as lateral occipital cortex (Güçlü & van Gerven, 2015, arXiv). This suggests that our effects are unlikely to be explained by overlap in the kinds of low-level visual features mentioned by the reviewers. However, the semantic content we localize and the representation of high-level visual features may indeed overlap, and this is compatible with our claims. We will do more in our revision to be explicit about our intended meaning in our use of the word “semantic” and how our approach relates to and builds on prior work in this literature.

      Long-term representational drift:

      We want to clarify our claims regarding the representational drift analysis. One reviewer stated that, while we show evidence of representational drift, we “provide no evidence suggesting that this long-term neural representational drift reflects a drift in semantic representation.” Another reviewer said: “The inference is that this [drift] is due to an updating of knowledge about the associations each item has had with other items,” and that our finding that semantic structure remains stable within these regions seems “to contradict the claims about semantic plasticity.” The claim we intended to make, which will be unpacked more clearly in our revision, is that the neural representations underlying semantic content drift over time, even if the semantic content itself is unchanging. In other words, we do not claim that our across-session drift analyses show changes in knowledge about object associations. Indeed, one of the reasons that representational drift has recently captured the attention of neuroscientists is that the neural representations underlying certain behaviors or cognitive content appear to drift over time even when the behaviors or cognitive content remain fixed. The relational structure of the neural representations can remain stable, even if the particular neurons recruited to represent each stimulus change over time (see, e.g., the T-maze in Rule, O’Leary, & Harvey, 2019, Curr Opin Neurobiol). Here we are translating these ideas, which were developed using animal models and/or primarily focused on low-level vision, to the semantic system in humans. The neural representations we identify in our paper capture semantic information because they share a similarity structure with word2vec, and the level of similarity to word2vec remains stable over time.
      Thus, our findings provide a simple demonstration of long-term representational drift in the human semantic system akin to that reported in animals: drift in the neural semantic representations of items even as the relations between these item representations appear stable.

      Signal-to-noise variability across the MTL:

      A reviewer raised the possibility that differences between our ROIs could be driven by variability in signal-to-noise ratio (SNR) across regions, particularly within the medial temporal lobe (MTL). We looked at noise ceiling SNR brain maps for each participant, which reflect the reliability of neural responses across repetitions of the same image. Preliminary analyses indicate that SNR differences do not account for our object encoding, semantic content, representational drift, or short-term plasticity measures across the MTL.

    1. eLife assessment

      This valuable paper presents convincing evidence that changing the constraint of how long to stop at an intermediate target significantly influences the degree of coarticulation of two sequential reaching movements, as well as their response to mechanical perturbations. Using an optimal-control framework, the authors offer a normative explanation of how both coarticulated and separated sequential movements can be understood as an optimal solution to the task requirements.

    2. Reviewer #1 (Public Review):

      In their paper, Kalidindi et al. investigate why the motor system sometimes coarticulates movements within a sequence. They begin by examining this phenomenon in an optimal feedback controller (OFC) that performs reaching movements to two targets (T1 and T2). They show that coarticulation occurs only when the controller is not required to slow down at T1. When the controller must decelerate at T1, coarticulation does not occur. This observation holds true even though the controller has information about both targets in both scenarios. They run the same experiment on human participants and show that humans also coarticulate the reaches only when they are instructed to treat the first target as a via point. Both in human participants and OFC simulations, whenever the coarticulation is present, the long-latency response to perturbations during the first reach is also informed by the second target, suggesting that the information about the second target is already present in the circuitry that controls the long-latency reflex.

      All experiments and analyses are standard and clearly explained. Their analysis of the long-latency response as a measure of coarticulation of sequence items is highly interesting and broadly useful for future experiment design. They successfully demonstrate that one reason the motor system sometimes coarticulates movements is due to high-level instructions on how to execute the sequence. These high-level instructions can, in turn, determine how and to what extent information about future sequence items is utilized by the low-level controller that governs muscle activity. However, the precise interaction between high-level task demands and low-level controllers at the neural tissue level remains an open question.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors examine the question of whether discrete action sequences and coarticulated continuous sequential actions can be produced from the same controller, without having to derive separate control policies for each sequential movement. Using modeling and behavioral experiments, the authors demonstrate that this is indeed possible if the constraints of the policy are appropriately specified. These results are of interest to those interested in motor sequences, but it is unclear whether these findings can be interpreted to apply to the control of sequences more broadly (see weaknesses below).

      Strengths:

      The authors provide an interesting and novel extension of the stochastic optimal control model to demonstrate how different temporal constraints can lead to either individual or coarticulated movements. The authors use this model to make predictions about patterns of behavior (e.g., in response to perturbations), which they then demonstrate in human participants both by measuring movement kinematics as well as EMG. Together this work supports the authors' primary claims regarding how changes in task instructions (i.e., task constraints) can result in coarticulated or separated movement sequences and the extent to which the subsequent movement goal affects the planning and control of the previous movement.

      Weaknesses:

      Although this work is quite interesting, it remains unknown whether there is a fundamental distinction between a coarticulated sequence and a single movement passing through a via point (or equivalently, avoiding an obstacle). The notion of a coarticulated sequence brings with it the notion of sequential (sub)movements and temporal structure, whereas the latter can really be treated as more of a constraint on the production of a single continuous movement. The authors suggest that these are not truly different kinds of movements at the level of a control policy, but this remains to be tested experimentally.

      It also remains unclear for the theory of optimal feedback control as a whole where and how the cost function and constraints are specified to guide the optimization process. That is, presumably there is the ability for higher-level or explicit description of these constraints, but how they then become incorporated into a control policy remains unclear. With regard to the kind of multi-target constraints proposed here, in typical sequence tasks, while some movements become coarticulated, people also tend to form chunks with distinct chunk boundaries. This presumably means that there is at least some specification of the sequential ordering of these chunks that must exist beyond the control policy and that multiple control policies may still be warranted to execute an entire sequence (otherwise the authors' model might suggest that people can coarticulate forever without needing to exhibit any chunk boundaries). Hence, while the authors fairly convincingly show that a single control policy can lead to separated or coarticulated movements given an appropriate set of constraints, their work does not speak to where or how those constraints are specified, nor to how longer sequences are controlled.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: 

      In this paper, Kalidindi and Crevecoeur ask why sequential movements are sometimes coarticulated. To answer this question, first, they modified a standard optimal controller to perform consecutive reaches to two targets (T1 and T2). They investigated the optimal solution with and without a constraint on the endpoint's velocity in the via target (T1). They observed that the controller coarticulates the movements only when there is no constraint on the speed at the via-point. They characterized coarticulation in two ways: First, T2 affected the curvature of the first reach in unperturbed reaches. Second, T2 affected corrective movements in response to a mechanical perturbation of the first reach. 
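
      The effect of the via-point velocity constraint on the feedback gains can be illustrated with a minimal sketch. It is not the authors' model: it assumes a one-dimensional point mass with acceleration as the control signal, no noise or state estimation, and hypothetical parameter values (`effort`, `via_step`, and all cost weights). The state is augmented with the two target positions, so the gain on the T2 entry indicates how strongly the second target shapes commands during the first reach.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def target_cost(target_idx, w_pos, w_vel):
    """Quadratic cost w_pos * (p - T)^2 + w_vel * v^2 on state [p, v, T1, T2]."""
    e = [1.0, 0.0, 0.0, 0.0]
    e[target_idx] = -1.0  # selects the position error p - T
    Q = [[w_pos * e[i] * e[j] for j in range(4)] for i in range(4)]
    Q[1][1] += w_vel
    return Q


def feedback_gains(w_via_vel, n_steps=60, via_step=30, dt=0.01, effort=1e-6):
    """Backward Riccati recursion; returns gains K[t] with u_t = -K[t] . x_t.

    All numerical values are illustrative assumptions, not fitted parameters.
    """
    A = [[1.0, dt, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    B = [[0.0], [dt], [0.0], [0.0]]
    S = target_cost(3, 100.0, 100.0)           # terminal cost: stop at T2
    Q_via = target_cost(2, 100.0, w_via_vel)   # via cost: be at T1, optionally slow
    gains = [None] * n_steps
    for t in reversed(range(n_steps)):
        BtS = matmul(transpose(B), S)              # 1 x 4 row vector
        denom = effort + matmul(BtS, B)[0][0]      # scalar, because u is scalar
        K = [x / denom for x in matmul(BtS, A)[0]]
        A_cl = [[A[i][j] - B[i][0] * K[j] for j in range(4)] for i in range(4)]
        S = matmul(matmul(transpose(A_cl), S), A_cl)
        S = [[S[i][j] + effort * K[i] * K[j] for j in range(4)] for i in range(4)]
        if t == via_step:
            S = [[S[i][j] + Q_via[i][j] for j in range(4)] for i in range(4)]
        gains[t] = K
    return gains


K_go = feedback_gains(w_via_vel=0.0)      # "go": no velocity cost at T1
K_stop = feedback_gains(w_via_vel=100.0)  # "stop": velocity penalized at T1

print("gain on T2 at movement onset, go:  ", K_go[0][3])
print("gain on T2 at movement onset, stop:", K_stop[0][3])
```

      In this toy setting the same recursion produces both behaviors: only the weight on via-point velocity differs between the two calls, and the gains after the via point are identical in the two conditions.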

      Parallel to the modeling work, they ran the same experiment on human participants. The participants were instructed to either consider T1 as via point (go task) or to slow down in T1 and then continue to T2 (stop task). Mirroring the simulation results, they observed coarticulation only in the go task. Interestingly, in the go task, when the initial reach was occasionally perturbed, the long-latency feedback responses differed for different T2 targets, suggesting that the information about the final target was already present in the motor circuits that mediate the long-latency response. In summary, they conclude that coarticulation in sequential tasks depends on instruction, and when coarticulation happens, the corrections in earlier segments of movement reflect the entirety of the coarticulated sequence.

      Evaluation 

      Among many strengths of this paper, most notably, the results and the experiment design are grounded in, and guided by the optimal control simulation. The methods and procedures are appropriate and standard. The results and methods are explained sufficiently and the paper is written clearly. The results on modulation of long-latency response based on future goals are interesting and of broad interest for future experiments on motor control in sequential movement. However, I find the authors' framing of these results, mostly in the introduction section, somewhat complicated.

      The current version of the introduction motivates the study by suggesting that "coarticulation and separation of sub-movement [in sequential movements] have been formulated as distinct hypotheses" and this apparent distinction, which led to contradictory results, can be resolved by Optimal Feedback Control (OFC) framework in which task-optimized control gains control coarticulation. This framing seems complicated for two main reasons. First, the authors use chunking and coarticulation interchangeably. However, as originally proposed by (Miller 1956), the chunking of the sequence items may fully occur at an abstract level like working memory, with no motoric coarticulation of sequence elements at the level of motor execution. In this scenario, sequence production will be faster due to the proactive preparation of sequence elements. This simple dissociation between chunking and coarticulation may already explain the apparent contradiction between the previous works mentioned in the introduction section. Second, the authors propose the OFC as a novel approach for studying neural correlates of sequence production. While I agree that OFC simulations can be highly insightful as a normative model for understanding the importance of sequence elements, it is unclear to me how OFCs can generate new hypotheses regarding the neural implementation of sequential movements. For instance, if the control gains are summarizing the instruction of the task and the relevance of future targets, it is unclear in which brain areas, or how these control gains are implemented. I believe the manuscript will benefit from making points more clear in the introduction and the discussion sections. 

      We agree that chunking may occur at different levels that do not necessarily involve motor coarticulation. We clarified that our contribution is towards answering why sequence movements sometimes coarticulate, and how the way sequences are executed influences the representation of future goals in the sensorimotor system.

      To address this point, we made the following modifications in the introduction:

      Line 44:

      “It remains unclear how future goals are integrated in the sensorimotor system. For rapid execution of a sequence, one possible solution is to represent multiple goals within low-level control circuits (3, 16), enabling the execution of several elements as a single entity, called “motor chunk”. Note that chunking can also occur at a higher level such as in working memory-guided sequences, which in this case may or may not involve the production of a movement (17, 18).”

      Lines 50:

      “Recent neural recordings in the primary motor cortex (M1) have shown no specific influence of future goals on the population responses governing ongoing action (19, 20). Specifically, Zimnik and Churchland (20) observed in a two-reach sequence task that, there was no coarticulation in sub-movement kinematics although the execution got faster with practice. Notably, M1 displayed separate phases of execution related activity for each sub-movement. Using a neural network model, they interpreted that sequence goals could be separated and serially specified to the controller from regions upstream of M1 (Figure 1A). These findings contrast with earlier studies showing coarticulation of sub-movements and whole sequence representations in M1 (21–23). As a result, it has been suggested that coarticulation and separation in rapid sequences may involve distinct computations: coarticulation possibly involves replacing sub-movements with a motor chunk, while separation possibly indicates independent control of each sub-movement with chunking at a higher-level (4, 20).  Thus, there are unresolved questions regarding why sequential movements sometimes coarticulate, and how the representation of future goals in the sensorimotor system influences the way sequences are executed.”

      With respect to the second part of your concern about OFC, we agree that this framework does not make direct predictions about neural implementation and that our statements required clarification. The first link between the model and predictions about neural data follows from the observation that long-latency circuits participate in task-dependent sequence production, which indicates that transcortical pathways must express this task dependency. The second link between our work and neural activity is that it provides a counterargument to a previous interpretation: Zimnik and Churchland argued that independent or “holistic” sequence production should be associated with different representations in the monkey brain. In contrast, we suggest that the same controller can flexibly generate both kinds of sequences, without implying a different structure in the controller, only a different cost function. We thus refine the expectation about neural correlates of sequence representations by showing that it potentially relates to the encoding of task constraints.

      To address this point, we added the following changes in the introduction and discussion:

      Line 69 in Introduction: 

      “The theory of optimal feedback control (OFC) has been particularly useful in predicting the influence of numerous task parameters on the controller (27–34), thus reproducing goal-directed motor commands during both unperturbed movements and feedback responses to disturbances (30). OFC has been used in numerous studies to interpret flexible feedback responses occurring in the long-latency response period (30, 35).” 

      Line 454 in Discussion:

      “Although OFC has been predominantly used as a behavioral level framework agnostic to neural activity patterns, it can shed light on the planning, state estimation and execution related computations in the transcortical feedback pathway (Takei et al.,). Using OFC, our study proposes a novel and precise definition of the difference to expect in neural activities in order to identify coarticulated versus independent sequence representations from a computational point of view. Because each condition (i.e., overlapping versus non-overlapping controllers as in Figure 2) was associated with different cost-functions and time-varying control gains, it is the process of deriving these control gains, using the internal representation of the task structure, that may differ across coarticulated and separated sequence conditions. To our knowledge, how and where this operation is performed is unknown. A corollary of this definition is that the preparatory activity (20, 50) may not discern independently planned or coarticulated sequences because these situations imply different control policies (and cost functions), as opposed to different initial states. Moreover, the nature of the sequence representation is potentially not dissociable from its execution for the same reason.”

      Reviewer #2 (Public Review):

      Summary: 

      In this manuscript, the authors examine the question of whether discrete action sequences and coarticulated continuous sequential actions can be produced from the same controller, without having to derive separate control policies for each sequential movement. Using modeling and behavioral experiments, the authors demonstrate that this is indeed possible if the constraints of the policy are appropriately specified. These results are of interest to those interested in motor sequences, but it is unclear whether these findings can be interpreted to apply to the control of sequences more broadly (see weaknesses below). 

      Strengths: 

      The authors provide an interesting and novel extension of the stochastic optimal control model to demonstrate how different temporal constraints can lead to either individual or coarticulated movements. The authors use this model to make predictions about patterns of behavior (e.g., in response to perturbations), which they then demonstrate in human participants both by measuring movement kinematics as well as EMG. Together this work supports the authors' primary claims regarding how changes in task instructions (i.e., task constraints) can result in coarticulated or separated movement sequences and the extent to which the subsequent movement goal affects the planning and control of the previous movement. 

      Weaknesses: 

      I reviewed a prior version of this manuscript, and appreciate the authors addressing many of my previous comments. However, there are some concerns, particularly with regard to how the authors interpret their findings. 

      We thank the reviewer for their continued assessment of our work and for helping us to improve the paper. We are convinced that this and the previous review helped us clarify our work considerably.

      (1) It would be helpful for the authors to discuss whether they think there is a fundamental distinction between a coarticulated sequence and a single movement passing through a via point (or equivalently, avoiding an obstacle). The notion of a coarticulated sequence brings with it the notion of sequential (sub)movements and temporal structure, whereas the latter can be treated as more of a constraint on the production of a single continuous movement. If I am interpreting the authors' findings correctly it seems they are suggesting that these are not truly different kinds of movements at the level of a control policy, but it would be helpful for the authors to clarify this claim. 

      Indeed, this is our interpretation of the results and simulations. This suggestion can also be found in the article by Ramkumar et al. on chunking. To clarify this, we added the following statement to the discussion:

      Line 449: 

      “Notably, in the framework of optimal feedback control, an intermediate goal is equivalent to a via-point that constrains the execution of the sequence (similar to (13)). It is thus possible that coarticulation in motor systems be processed similarly as other kinds of movement constraints, such as via-points, avoiding obstacles, or changes in control policies.”

      (2) The authors' model clearly shows that each subsequent target only influences the movement of one target back, but not earlier ones (page 7 lines 199-204). This stands in contrast to the paper they cite from Kashefi 2023, in which those authors clearly show that people account for at least 2 targets in the future when planning/executing the current movement. It would be useful to know whether this distinction arises because of a difference in experimental methodology, or because the model is not capturing something about human behavior.  

      Thank you for raising this point. There are some differences between the study of Kashefi and colleagues (2023) and ours. Both studies looked into the planning of more than one reach. In the study of Kashefi et al., the results in Figure 6 showed that in the H2 condition there was no significant curvature, and that the curvature increased in the H3 and H4 conditions (only in the 75 ms dwell-time scenario). Note that the H2 condition in their work meant the presentation of the +2 target after the initiation of the +1 reach. Hence, we think the GO task in our case should be compared to the H3 condition, which resulted in curvature similar to that in our study. These authors also showed that curvature increased even in the H4 condition (75 ms dwell). OFC also accommodates this observation if we consider the relationship between the cost of intermediate goals and the spatial location of the targets (see figure below, also added to Supplementary Figure 4). To see this, we performed additional three-target simulations in which the constraint on intermediate goal velocity (at T1 and T2) was varied to achieve similar dwell velocity at the intermediate targets (Supplementary Figure 4C). In this case, the hand curvature of the first reach differed while the dwell velocity was similar across the T3 up and T3 down conditions, as may be instructed experimentally. Again, the task instructions and the spatial location of the future goals together determine how much the first reach components are influenced by the next ones, and this may impact several reaches ahead.

      We added the following clarification in the result to describe this. 

      Line 199:

      “It is worth noting that the OFC model can be generalized to longer sequences (10) through the incorporation of additional cost terms (in Equation 10 of Methods) and targets, enabling simultaneous planning for more than two targets. Simulations of a sample three-reach sequence (Supplementary Figure S4) revealed that, varying the cost of dwell velocity at intermediate targets (w2 and w3 parameters in Methods) caused a variation in control gains. Different amount of change in control gains can be expected for intermediate versus late targets (Supplementary Figure 4A). Notably, even when we used the same dwell velocity cost (w2 = w3 = 0), the observed velocity profiles were different between the two sequences towards different final targets (T3 up and T3 down) (Supplementary Figure 4B). We tested a condition in which both sequence reaches were forced to have similar dwell velocity profiles by increasing the dwell velocity costs in the sequence towards one of the targets (T3 down), while leaving this parameter unchanged for the other target (T3 up). In this scenario, T3 up sequence had the parameters (w2, w3) = (0, 0), while T3 down sequence had the parameters (0.8, 0.8). In this case, the curvature of the first reach was different, and predominantly occurred due to differences in K2 between the two sequence reaches (Supplementary Figure S4C). These simulations highlight that, planning for a longer horizon sequence can indirectly influence the curvature of early reaches, due to the interaction between intermediate dwell constraints, spatial arrangement of targets, and sequence horizon in a task dependent manner.”

      (3) In my prior review I raised a concern that the authors seem to be claiming that because they can use a single control policy for both coarticulated and separated movement sequences, there need not be any higher-level or explicit specification of whether the movements are sequential. While much of that language has been removed, it still appears in a few places (e.g., p. 13, lines 403-404). As previously noted, the authors' control policy can generate both types of movements as long as the proper constraints are provided to the model. However, these constraints must be specified somewhere (potentially explicitly, as the authors do by providing them as task instructions). Moreover, in typical sequence tasks, although some movements become coarticulated, people also tend to form chunks with distinct chunk boundaries, which presumably means that there is at least some specification of the sequential ordering of these chunks that must exist (otherwise the authors' model might suggest that people can coarticulate forever without needing to exhibit any chunk boundaries). Hence the authors should limit themselves to the narrow claim that a single control policy can lead to separated or coarticulated movements given an appropriate set of constraints, but acknowledge that their work cannot speak to where or how those constraints are specified in humans (i.e., that there could still be an explicit sequence representation guiding coarticulation). 

      We thank the reviewer for raising this point. We do not dispute the statement that the controller needs to be set dependent on the constraints of the task that must be specified somewhere. In our view, this problem is similar to the question of how a cost-function (or a task representation) is transformed into a control policy in the brain, which is unknown in general. In the earlier version, our intention was to stress that separation can occur without necessarily implying that the goals be processed independently (as in Figure 1A and Zimnik 2021). To avoid confusion on this point, we modified this statement in the new version as follows:

      Line 405: 

      “A straightforward interpretation could be that the stopping at the first target invoked a completely different strategy in which the control of the two reaches was performed independently (Figure 1A), effectively separating the two movements, whereas executing them rapidly could produce the merging of the two sub-movements into a coarticulated sequence. While this is conceptually valid, it is not necessary and the model provides a more nuanced view: both apparent separation or coarticulation of the two motor patterns can be explained within the same framework of flexible feedback control. These different modes of sequence execution still require proper specification of the task constraints in the model, such as number of intermediate steps, dwell-time, or velocity limit. Such specifications must be considered as input to the controller.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Line 57: Distinct hypotheses. 

      Line 209, The term "planned holistically" is confusing here. Seems like the authors suggest that the sequence is "planned holistically" as long as all sequence elements are given during the optimization process. 

      We changed the sentence as follows.

      Line 218: 

      “Overall, the model predicted that even if a feedback control policy was computed by optimizing the whole sequence over a long time-horizon, the requirements associated with intermediate goals determine how early in the sequence the second (future) target can influence the feedback controller”

      Line 336, It was not clear to me why the authors explained "the weak significant" results of PEC shortening in R0 given the nonsignificant values in R1. 

      We wanted to be transparent about whether changing the statistical analysis would lead to different interpretations, such as sequence encoding occurring even before the long-latency epochs. However, we realized that this could lead to confusion, and we deleted the sentence in the updated manuscript.

      Reviewer #2 (Recommendations For The Authors): 

      About Weakness #2, to clarify this point the authors should either model and discuss what it would take for their model to account for multiple targets ahead, or else run a study to show that in this task people indeed only ever plan 1 target ahead.  

      Please see our response above (in Weakness #2).

      I am still puzzled by why people would resist the perturbation more when they eventually have to move in the direction of the perturbation (e.g., p 10 lines 313-314). Perhaps this is simply due to the geometry of the task, but it could also depend on what participants were trying to accomplish in the experiment. To help clarify this, the authors should report exactly what instructions were given to participants in each task condition.  

      The simulations suggest that the observed perturbation movements are an optimal way to perform the task given the task constraints on accuracy, control effort and constraints at intermediate goals. The intuition is that modulating the acceleration at the intermediate goal is preferred rather than missing it. This however depends on the cost parameter. 

      Below, in Author response figure 1, we show the simulations by varying the accuracy requirements at intermediate goal and the total motor cost parameters. Clearly, as expected, increasing the cost on accuracy of the intermediate reach, or decreasing the cost on motor output modulated the hand deviation (simulations not included in the article).

      Author response image 1.

      Impact of movement costs (motor effort and intermediate goal reach errors) on the hand path following a mechanical perturbation   

      Our observation suggests that participants’ behaviour agreed with the interpretation that can result from the model. We clarified the exact instructions in the methods section. Note that the instructions were given at the beginning of the task and did not differ across the different conditions involving changes in the location of T2 or perturbation direction:

      Line 594:

      Participants were given the following instructions verbally: “Wait in the starting circle until you receive a GO signal, where the target circles turn red and you will simultaneously hear a beep sound. When the circles turn red, react quickly, move as soon, and as straight as possible to target 1 and then move to target 2. You will get two points at the end of the trial if you reach T1 in the prescribed time window and then move to T2, and in all other cases you will not receive any points. Importantly, once you reach T1 you should try to come out of it quickly. If you stay in T1 for more than 150 ms then T2 will disappear and you will receive only one point. Additionally, in some trials, a force will perturb your hand towards the right or left direction randomly while moving towards T1. The instructions remain the same in the presence of perturbations. Try to score as many points as you can.”

      Additionally, we added the following lines in the results description:

      Line 284:

      “The influence of second target on the lateral hand deviation was qualitatively similar to that observed in model simulations, and counterintuitive to what we might expect without the help of the model simulations. As observed in the model simulations (see also Supplementary Figure S2), lateral hand deviation was smaller when the perturbation was in the direction of the second target (T2) and vice-versa. This was consistent for both rightward and leftward perturbation conditions. Both the model and humans expressed this strategy that can be seen as an emergent feature of efficient feedback control during production of movement sequences. Additionally, even though behavior was reproduced in simulations, changing the cost on control effort and/or accuracy of intermediate reaches could modulate the sequence-dependent changes in curvature.”

      I am not sure if "the data and code for simulations can be provided by the corresponding author" satisfies the eLife/PLoS software guidelines (i.e., that it be deposited in a public repository).

      Thank you for pointing this out. This sentence was added by mistake.

      We modified this statement in the updated manuscript. 

      “The data and code from simulations and experiments is available in the public repository ‘figshare’ in the following link (https://figshare.com/s/865a8b77c264ef17a181).”

    1. Reviewer #1 (Public Review):

      Summary:

      The authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction (MI) and heart failure (HF). They employed a secondary data analysis of the UK Biobank. They used descriptive and inferential analysis, including Cox proportional hazards models, to investigate the associations. Propensity score matching was also used. They found that among participants with breast cancer, younger onset age was significantly associated with elevated risks of MI (HR=1.36, 95% CI: 1.19 to 1.56, P<0.001) and HF (HR=1.31, 95% CI: 1.18 to 1.46, P<0.001). They reported similar findings after propensity matching.

      Strengths:

      The use of a large dataset is a strength of the study as the study is well-powered to detect differences. Reporting both the unmatched and the propensity-matched estimates was also important for statistical inference.

      Weaknesses:

      The authors have addressed all my previous comments. I have no further comments.

    2. eLife assessment

      In this valuable study, the authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction and heart failure. Based on results from a series of compelling statistical analyses, the authors conclude that a younger onset age of breast cancer is associated with myocardial infarction and heart failure, highlighting the need to carefully monitor the cardiovascular status of women who have been diagnosed with breast cancer.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer #1:

      Comment 1:

      Summary:

      The authors sought to investigate the associations of age at breast cancer onset with the incidence of myocardial infarction (MI) and heart failure (HF). They employed a secondary data analysis of the UK Biobank. They used descriptive and inferential analysis, including Cox proportional hazards models, to investigate the associations. Propensity score matching was also used. They found that among participants with breast cancer, younger onset age was significantly associated with elevated risks of MI (HR=1.36, 95% CI: 1.19 to 1.56, P<0.001) and HF (HR=1.31, 95% CI: 1.18 to 1.46, P<0.001). They reported similar findings after propensity matching.

      Strengths:

      The use of a large dataset is a strength of the study as the study is well-powered to detect differences. Reporting both the unmatched and the propensity-matched estimates was also important for statistical inference.

      Weaknesses:

      Despite the merits of the paper, readers may get confused as to whether authors are referring to “age at breast cancer onset” or “age at breast cancer diagnosis”. I suppose the title refers to the latter, in which case it will be best to be consistent in using “age at breast cancer diagnosis” throughout the manuscripts. I would recommend a revision to the title to make it explicit that the authors are referring to “age at breast cancer diagnosis”.

      Thank you for your nice comments and suggestions. Yes, as you mentioned, in this study, we focused on age at breast cancer diagnosis, which was obtained from the cancer registry data in the UK Biobank and was used in all the analyses. We agree with you that it would be better to consistently use “age at diagnosis of breast cancer” throughout the manuscript for a better understanding; therefore, we have replaced “age at breast cancer onset” with “age at diagnosis of breast cancer”.

      Change in the manuscript:

      “Age at breast cancer onset” was replaced with “age at diagnosis of breast cancer” in the title and throughout the manuscript.

      Recommendations For The Authors:

      Kindly review the references for the location of the full stop. Putting the full stop at the end of the parenthesis makes reading smoother than the current form, in which it is difficult to tell where a new sentence begins.

      Thank you for your suggestion. We have made revisions to the location of the full stop next to a reference.

      Change in the manuscript:

      The full stop was put at the end of the parenthesis of a reference throughout the manuscript.

      Response to Reviewer #2:

      Comment 1:

      This is a well-presented large analysis from the UK Biobank of nearly 250,000 female adults. The authors examined the associations of breast cancer diagnosis with incident myocardial infarction and heart failure by different onset age groups. Based on results from a series of statistical analyses, the authors concluded that younger onset age of breast cancer was associated with myocardial infarction and heart failure, highlighting the necessity of careful monitoring of cardiovascular status in women diagnosed with breast cancer, especially those younger ones.

      Comments to consider:

      It’s thoughtful for the authors to have included and adjusted for menopausal status, breast cancer surgery, and hormone replacement therapy in their sensitivity analysis. It would be informative if the authors presented the number and percentages of menopause and cancer treatments.

      Thank you for your comments. As suggested, we have provided more detailed information on the number and percentage of menopausal status and breast cancer treatments.

      Change in the manuscript:

      Page 11, Lines 208 to 211: added “Among participants with breast cancer, 11 460 (70.6%) participants were postmenopausal, 14 255 (87.6%) participants had undergone breast cancer surgery, and 6 784 (41.8%) participants had received hormone replacement therapy.”

      Change in the supplementary material:

      The number and percentage of menopausal status, breast cancer surgery, and hormone replacement therapy were added to Table S13.

      (a) Adjusted for age, ethnicity, education, current smoking, current drinking, obesity, exercise, low-density lipoprotein cholesterol, depressed mood, hypertension, diabetes, antihypertensive drug use, antidiabetic drug use, statin use, menopausal status, breast cancer surgery, and hormone replacement therapy.

      HR, hazard ratio; CI, confidence interval.

      Comment 2:

      The analytical baseline used for follow-up should be pointed out in the methods section. It’s confusing whether the analytic baseline was defined as the study baseline or the time at breast cancer diagnosis.

      We apologize for the confusion. In this study, the analytical baseline used for follow-up was defined as the baseline of UK Biobank (2006-2010) and we have pointed it out in the methods section as suggested.

      Change in the manuscript:

      Page 9, Lines 165 to 166: added: “The analytical baseline used for follow-up was defined as the baseline of UK Biobank (2006-2010).”

      Comment 3:

      Did the older onset age group have a longer follow-up duration? Could the authors provide information on the length of follow-up by age of onset in Supplementary Table S4? It would give the readers more information regarding different age groups.

      Thank you for your question. We compared the time of follow-up among the three diagnosis age groups and found that although the durations of follow-up among the three groups were quite similar (as shown in Table S4), statistical analysis revealed a significant difference with the older diagnosis age group demonstrating a longer follow-up duration (P for Kruskal-Wallis test <0.001). This is understandable as with large sample sizes, even a slight difference could lead to statistical significance. According to your suggestion, we have added information on the length of follow-up by age of diagnosis in Supplementary Table S4.
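
      The point that a large sample can make a small difference statistically significant can be checked with a short sketch. The data below are synthetic and hypothetical (group size, centers, and spread are assumptions, not UK Biobank values). The Kruskal-Wallis statistic is computed directly, without tie correction, and the p-value uses the closed form of the chi-square survival function for two degrees of freedom, which applies because there are exactly three groups.

```python
import math
import random


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H and p-value for exactly three groups (no tie correction)."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    rank_sums = [0.0, 0.0, 0.0]
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    n_total = len(pooled)
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(values) for r, values in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)
    p = math.exp(-h / 2.0)  # chi-square survival function with 2 degrees of freedom
    return h, p


def median(values):
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


rng = random.Random(0)
# Hypothetical follow-up durations in years; the centers differ by at most 0.3 years.
centers = [12.0, 12.15, 12.3]
groups = [[rng.gauss(c, 1.0) for _ in range(5000)] for c in centers]

h_stat, p_value = kruskal_wallis_3(groups)
medians = [median(g) for g in groups]
print("medians:", [round(m, 2) for m in medians])
print("H = %.1f, p = %.3g" % (h_stat, p_value))
```

      With thousands of participants per group, a shift of a few months in the median is enough to push the p-value far below 0.001, which is why the magnitude of the difference should be reported alongside the test.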

      Change in the supplementary material:

      Added the median and interquartile range of follow-up in Supplementary Table S4.

      The results are presented as the mean ± standard deviation, or No. (%).

      a The effect sizes are standardized mean differences for continuous outcomes and the Phi coefficient for dichotomous outcomes.

      LDL-C, low-density lipoprotein cholesterol.

    1. Reviewer #3 (Public Review):

      Summary:

      Bell and colleagues studied how different splice isoforms of voltage-gated CaV2 calcium channels affect channel expression, localization, function, synaptic transmission, and locomotor behavior at the larval Drosophila neuromuscular junction. They reveal that one mutually exclusive exon located in the fourth transmembrane domain encoding the voltage sensor is essential for calcium channel expression, function, active zone localization, and synaptic transmission. Furthermore, a second mutually exclusive exon residing in an intracellular loop containing the binding sites for Caβ and G-protein βγ subunits promotes the expression and synaptic localization of ~50% of CaV2 channels, thereby contributing to ~50% of synaptic transmission. This isoform enhances release probability, as evident from increased short-term depression, is vital for homeostatic potentiation of neurotransmitter release induced by glutamate receptor impairment, and promotes locomotion. The roles of the two other tested isoforms remain less clear.

      Strengths:

      The study is based on solid data that was obtained with a diverse set of approaches. Moreover, it generated valuable transgenic flies that will facilitate future research on the role of calcium channel splice isoforms in neural function.

      Weaknesses:

      (1) Based on the data shown in Figures 2A-C, and 2H, it is difficult to judge the localization of the cac isoforms. Could they analyze cac localization with regard to Brp localization (similar to Figure 3; the term "co-localization" should be avoided for confocal data), as well as cac and Brp fluorescence intensity in the different genotypes for the experiments shown in Figure 2 and 3 (Brp intensity appears lower in the dI-IIA example shown in Figure 3G)? Furthermore, heterozygous dIS4B imaging data (Figure 2C) should be quantified and compared to heterozygous cacsfGFP/+.

      (2) They conclude that I-II splicing is not required for cac localization (p. 13). However, cac channel number is reduced in dI-IIB. Could the channels be mis-localized (e.g., in the soma/axon)? What is their definition of localization? Could cac be also mis-localized in dIS4B? Furthermore, the Western Blots indicate a prominent decrease in cac levels in dIS4B/+ and dI-IIB (Figure 1D). How do the decreased protein levels seen in both genotypes fit to a "localization" defect? Could decreased cac expression levels explain the phenotypes alone?

      (3) Cac-IS4B is required for Cav2 expression, active zone localization, and synaptic transmission. Similarly, loss of cac-I-IIB reduces calcium channel expression and number. Hence, the major phenotype of the tested splice isoforms is the loss of/a reduction in Cav2 channel number. What is the physiological role of these isoforms? Is the idea that channel numbers can be regulated by splicing? Is there any data from other systems relating channel number regulation to splicing (vs. transcription or post-transcriptional regulation)?

      (4) Although not supported by statistics, and as appreciated by the authors (p. 14), there is a slight increase in PSC amplitude in dIS4A mutants (Figure 2). Similarly, PSC amplitudes appear slightly larger (Figure 3J), and cac fluorescence intensity is slightly higher (Figure 3H) in dI-IIA mutants. Furthermore, cac intensity and PSC amplitude distributions appear larger in dI-IIA mutants (Figures 3H, J), suggesting a correlation between cac levels and release. Can they exclude that IS4A and/or I-IIA negatively regulate release? I suggest increasing the sample size for Canton S to assess whether dIS4A mutant PSCs differ from controls (Figure 2E). Experiments at lower extracellular calcium may help reveal potential increases in PSC amplitude in the two genotypes (but are not required). A potential increase in PSC amplitude in either isoform would be very interesting because it would suggest that cac splicing could negatively regulate release.

      (5) They provide compelling evidence that IS4A is required for the amplitude of somatic sustained HVA calcium currents. However, the evidence for effects on biophysical properties and activation voltage (p. 13) is less convincing. Is the phenotype confined to the sustained phase, or are other aspects of the current also affected (Figure 2J)? Could they also show the quantification of further parameters, such as CaV2 peak current density, charge density, as well as inactivation kinetics for the two genotypes? I also suggest plotting peak-normalized HVA current density and conductance (G/Gmax) as a function of Vm. Could a decrease in current density due to decreased channel expression be the only phenotype? How would changes in the sustained phase translate into altered synaptic transmission in response to AP stimulation?

      (6) Why was the STED data analysis confined to the same optical section, and not to max. intensity z-projections? How many and which optical sections were considered for each active zone? What were the criteria for choosing the optical sections? Was synapse orientation considered for the nearest neighbor Cac - Brp cluster distance analysis? How do the nearest-neighbor distances compare between "planar" and "side-view" Brp puncta?

      (7) Cac clusters localize to the Brp center (e.g., Liu et al., 2011). They conclude that Cav2 localization within Brp is not affected in the cac variants (p. 8). However, their analysis is not informative regarding a potential offset between the central cac cluster and the Brp "ring". Did they/could they analyze cac localization with regard to Brp ring center localization of planar synapses, as well as Brp-ring dimensions?

      (8) Given the accelerated PSC decay/ decreased half width in dI-IIA (Fig. 5Q), I recommend reporting PSC charge in Figure 3, and PPR charge in Figures 5A-D. The charge-based PPRs of dI-IIA mutants likely resemble WT more closely than the amplitude-based PPR. In addition, miniature PSC decay kinetics should be reported, as they may contribute to altered decay kinetics. How could faster cac inactivation kinetics in response to single AP stimulation result in a decreased PSC half-width? Is there any evidence for an effect of calcium current inactivation on PSC kinetics? On a similar note, is there any evidence that AP waveform changes accelerate PSC kinetics? PSC decay kinetics are mainly determined by GluR decay kinetics/desensitization. The arguments supporting the role of cac splice isoforms in PSC kinetics outlined in the discussion section are not convincing and should be revised.

      (9) Paired-pulse ratios (PPRs): On how many sweeps are the PPRs based? In which sequence were the intervals applied? Are PPR values based on the average of the second over the first PSC amplitudes of all sweeps, or on the PPRs of each sweep and then averaged? The latter calculation may result in spurious facilitation, and thus to the large PPRs seen in dI-IIB mutants (Kim & Alger, 2001; doi: 10.1523/JNEUROSCI.21-24-09608.2001).

      (10) Could the dI-IIB phenotype be simply explained by a decrease in channel number/ release probability? To test this, I propose investigating PPRs and short-term dynamics during train stimulation at lower extracellular Ca2+ concentration in WT. The Ca2+ concentration could be titrated such that the first PSC amplitude is similar between WT and dI-IIB mutants. This experiment would test if the increased PPR/depression variability is a secondary consequence of a decrease in Ca2+ influx, or specific to the splice isoform.

      (11) How were the depression kinetics analyzed? How many trains were used for each cell, and how do the tau values depend on the first PSC amplitude? Time constants in the range of a few (5-10) milliseconds are not informative for train stimulations with a frequency of 1 or 10 Hz (the unit is missing in Figure 5H). Also, the data shown in Figures 5E-K suggest slower time constants than 5-10 ms. Together, are the data indeed consistent with the idea that dI-IIB does not only affect cac channel number, but also PPR/depression variability (p. 9)?

      (12) The GFP-tagged I-IIA and mEOS4b-tagged I-IIB cac puncta shown in Figure 6N appear larger than the Brp puncta. Endogenously tagged cac puncta are typically smaller than Brp puncta (Gratz et al., 2019). Also, the I-IIA and I-IIB fluorescence sometimes appear to be partially non-overlapping. First, I suggest adding panels that show all three channels merged. Second, could they analyze the area and area overlap of I-IIA and I-IIB with regard to each other and to Brp, and compare it to cac-GFP? Any speculation as to how the different tags could affect localization? Finally, I recommend moving the dI-IIA and dI-IIB localization data shown in Figure 6N to an earlier figure (Figure 1 or Figure 3).

    2. Author response:

      eLife assessment

      Cav2 voltage-gated calcium channels play key roles in regulating synaptic strength and plasticity. In contrast to mammals, invertebrates like Drosophila encode a single Cav2 channel, raising questions on how diversity in Cav2 is achieved from a single gene. Here, the authors present convincing evidence that two alternatively spliced isoforms of the Cac gene (cacophony, also known as Dmca1A and nightblindA) enable diverse changes in Cav2 expression, localization, and function in synaptic transmission and plasticity. These valuable findings will be of interest to a variety of researchers.

      We suggest replacing “two alternatively spliced isoforms of the Cac gene” by “two alternatively spliced mutually exclusive exon pairs of the Cac gene”. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Bell et al. describes an analysis of the effects of removing one of two mutually exclusive splice exons at two distinct sites in the Drosophila CaV2 calcium channel Cacophony (Cac). The authors perform imaging and electrophysiology, along with some behavioral analysis of larval locomotion, to determine whether these alternatively spliced variants have the potential to diversify Cac function in presynaptic output at larval neuromuscular junctions. The authors provide valuable insights into how alternative splicing at two sites in the calcium channel alters its function.

      Strengths:

      The authors find that both of the second alternatively spliced exons (I-IIA and I-IIB) that are found in the intracellular loop between the 1st and 2nd set of transmembrane domains can support Cac function. However, loss of the I-IIB isoform (predicted to alter potential beta subunit interactions) results in 50% fewer channels at active zones and a decrease in neurotransmitter release and the ability to support presynaptic homeostatic potentiation. Overall, the study provides new insights into Cac diversity at two alternatively spliced sites within the protein, adding to our understanding of how presynaptic calcium channel function can be regulated by splicing.

      Weaknesses:

      The authors find that one splice isoform (IS4B) in the first S4 voltage sensor is essential for the protein's function in promoting neurotransmitter release, while the other isoform (IS4A) is dispensable. The authors conclude that IS4B is required to localize Cac channels to active zones. However, I find it more likely that IS4B is required for channel stability and leads to the protein being degraded, rather than any effect on active zone localization. More analysis would be required to establish that as the mechanism for the unique requirement for IS4B.

      We agree that we need to explain more clearly why IS4B is unlikely to be required for channel stability and instead likely has a unique function at the presynaptic active zone of fast synapses. We will address this by revising text and by providing additional data. If IS4B were required for evoked release because it supported channel protein stability, then the removal of IS4B should cause protein degradation throughout all sub-neuronal compartments and throughout the CNS, but this is not the case. First, upon removal of IS4B in adult motoneurons (which use cac channels at the presynapse and somatodendritically, Ryglewski et al., 2012) evoked release from axon terminals is abolished (as at the larval NMJ), but somatodendritic cac inward current is present. If IS4B were required for cac channel stability, somatodendritic current should also be abolished. We will add these data to the manuscript. Second, immunohistochemistry for tagged IS4B channels reveals that these are present not only at presynaptic active zones at the NMJ but also throughout the VNC motor neuropils. Excision of IS4B causes the absence of cac channels from the presynaptic active zones at the NMJ and throughout the VNC neuropils (and accordingly this is lethal). By contrast, tagged IS4A channels (with IS4B excised) are not found at the presynaptic terminals of fast synapses, but instead in other distinct parts of the CNS. We will also provide data to show this. Together these data are in line with a unique requirement of IS4B at presynaptic active zones (not excluding additional functions of IS4B), whereas IS4A containing cac isoforms mediate different functions.

      We appreciate the additional reviewer suggestions to the authors that we will address point by point when revising the manuscript.

      Reviewer #2 (Public Review):

      This study by Bell et al. focuses on understanding the roles of two alternatively spliced exons in the single Drosophila Cav2 gene cac. The authors generate a series of cac alleles in which one or the other mutually exclusive exons are deleted to determine the functional consequences at the neuromuscular junction. They find alternative splicing at one exon encoding part of the voltage sensor impacts the activation voltage as well as localization to the active zone. In contrast, splicing at the second exon pair does not impact Cav2 channel localization, but it appears to determine the abundance of the channel at active zones. Together, the authors propose that alternative splicing at the Cac locus enables diversity in Cav2 function through isoform diversity generated at the single Cav2 alpha subunit gene encoded in Drosophila.

      Overall this is an excellent, rigorously validated study that defines unanticipated functions for alternative splicing in Cav2 channels. The authors have generated an important toolkit of mutually exclusive Cac splice isoforms that will be of broad utility for the field, and show convincing evidence for distinct consequences of alternative splicing of this single Cav2 channel at synapses. Importantly, the authors use electrophysiology and quantitative live sptPALM imaging to determine the impacts of Cac alternative splicing on synaptic function. There are some outstanding questions regarding the mechanisms underlying the changes in Cac localization and function, and some additional suggestions are listed below for the authors to consider in strengthening this study. Nonetheless, this is a compelling investigation of alternative splicing in Cav2 channels that should be of interest to many researchers.

      We agree that some additional information on cac isoform localization (in particular for splicing at the IS4 site) will strengthen the manuscript. We will address this by providing additional data and revising text (see responses to reviewers 1 and 3). We are also grateful for the additional reviewer suggestions which we will address point by point when revising the manuscript.

      Reviewer #3 (Public Review):

      Summary:

      Bell and colleagues studied how different splice isoforms of voltage-gated CaV2 calcium channels affect channel expression, localization, function, synaptic transmission, and locomotor behavior at the larval Drosophila neuromuscular junction. They reveal that one mutually exclusive exon located in the fourth transmembrane domain encoding the voltage sensor is essential for calcium channel expression, function, active zone localization, and synaptic transmission. Furthermore, a second mutually exclusive exon residing in an intracellular loop containing the binding sites for Caβ and G-protein βγ subunits promotes the expression and synaptic localization of ~50% of CaV2 channels, thereby contributing to ~50% of synaptic transmission. This isoform enhances release probability, as evident from increased short-term depression, is vital for homeostatic potentiation of neurotransmitter release induced by glutamate receptor impairment, and promotes locomotion. The roles of the two other tested isoforms remain less clear.

      Strengths:

      The study is based on solid data that was obtained with a diverse set of approaches. Moreover, it generated valuable transgenic flies that will facilitate future research on the role of calcium channel splice isoforms in neural function.

      Weaknesses:

      (1) Based on the data shown in Figures 2A-C, and 2H, it is difficult to judge the localization of the cac isoforms. Could they analyze cac localization with regard to Brp localization (similar to Figure 3; the term "co-localization" should be avoided for confocal data), as well as cac and Brp fluorescence intensity in the different genotypes for the experiments shown in Figure 2 and 3 (Brp intensity appears lower in the dI-IIA example shown in Figure 3G)? Furthermore, heterozygous dIS4B imaging data (Figure 2C) should be quantified and compared to heterozygous cacsfGFP/+.

      We understand the reviewer’s comment and will do the following to convincingly demonstrate absence of cac from presynaptic active zones upon IS4B excision. First, we will show selective enlargements of IS4A and IS4B with Brp in presynaptic active zones to show distinct cac label in active zones following excision of IS4A but not following excision of IS4B. Second, we will provide Pearson’s co-localization coefficients of Brp with IS4B and with IS4A, respectively. Third, we will reduce the intensity of the green channels in figures 2C and 2H to the same levels as in figures 2A, 2B, and the 2H control to allow a fair comparison of cac intensities following excision of IS4B versus excision of IS4A and control. We had increased intensity to show that following excision of IS4B, no distinct cac label is found in active zones, even at exaggerated image brightness. However, we agree with the reviewer that the bright background hampers interpretation and thus will show the same intensity in all images that need to be compared.

      (2) They conclude that I-II splicing is not required for cac localization (p. 13). However, cac channel number is reduced in dI-IIB. Could the channels be mis-localized (e.g., in the soma/axon)? What is their definition of localization? Could cac be also mis-localized in dIS4B? Furthermore, the Western Blots indicate a prominent decrease in cac levels in dIS4B/+ and dI-IIB (Figure 1D). How do the decreased protein levels seen in both genotypes fit to a "localization" defect? Could decreased cac expression levels explain the phenotypes alone?

      We will precisely define channel localization, and we will explain why it is highly unlikely that the absence of IS4B channels as well as the lower number of I-IIA channels are simply a consequence of reduced expression, but instead of splice variant specific channel function and localization. For example, upon excision of IS4B no cac channels are found at the presynaptic active zones and these synapses are thus non-functional. The isoforms containing the mutually exclusive IS4A exon are expressed and mediate other functions (see also response to reviewer 1) but cannot substitute IS4B containing isoforms at the presynapse. In fact, our Western blots are in line with reduced cac expression if all isoforms that mediate evoked release are missing, again indicating that the presynapse specific cac isoforms cannot be replaced by other cac isoforms (see also below, response to (3)). Feedback mechanisms that regulate cac expression in the absence of presynapse specific cac isoforms are beyond the scope of this study.

      (3) Cac-IS4B is required for Cav2 expression, active zone localization, and synaptic transmission. Similarly, loss of cac-I-IIB reduces calcium channel expression and number. Hence, the major phenotype of the tested splice isoforms is the loss of/a reduction in Cav2 channel number. What is the physiological role of these isoforms? Is the idea that channel numbers can be regulated by splicing? Is there any data from other systems relating channel number regulation to splicing (vs. transcription or post-transcriptional regulation)?

      We will provide additional evidence that mutually exclusive splicing at the IS4 site results in cac channels that localize to the presynaptic active zone (IS4B) versus cac channels that localize to other brain parts and/or other subneuronal compartments (see response to reviewer 1). In addition, we already show in figure 2J that IS4B is required for normal cac HVA current, and we can add data showing that IS4A is not essential for cac HVA current. Similarly, for I-II we find it unlikely that differential splicing regulates channel numbers; rather, we think it confers splice variant specific functions in different brain parts and different sub-neuronal compartments. To substantiate this interpretation, we will add data from developing adult motoneurons showing that excision of I-IIA causes reduced activity induced calcium influx into dendrites (new data), but it does not reduce channel number at the larval NMJ (figure 4). In our opinion these data are not in line with the idea that splicing regulates cac expression levels and that this, in turn, results in specific defects in distinct neuronal compartments. However, we agree that the lack of isoforms with specific functions results in altered overall cac expression levels as indicated by our Western data. If isoforms normally abundantly expressed throughout most neuropils are missing due to exon excision, we indeed find less cac protein in Westerns. By contrast, the lack of isoforms with little abundance has little effect on cac expression levels. This may be the result of unknown feedback mechanisms, which are beyond the scope of this study.

      (4) Although not supported by statistics, and as appreciated by the authors (p. 14), there is a slight increase in PSC amplitude in dIS4A mutants (Figure 2). Similarly, PSC amplitudes appear slightly larger (Figure 3J), and cac fluorescence intensity is slightly higher (Figure 3H) in dI-IIA mutants. Furthermore, cac intensity and PSC amplitude distributions appear larger in dI-IIA mutants (Figures 3H, J), suggesting a correlation between cac levels and release. Can they exclude that IS4A and/or I-IIA negatively regulate release? I suggest increasing the sample size for Canton S to assess whether dIS4A mutant PSCs differ from controls (Figure 2E). Experiments at lower extracellular calcium may help reveal potential increases in PSC amplitude in the two genotypes (but are not required). A potential increase in PSC amplitude in either isoform would be very interesting because it would suggest that cac splicing could negatively regulate release.

      There are several possibilities to explain this, but as none of the effects are statistically significant, we prefer to not investigate this in depth. However, given that we cannot find IS4A at the presynaptic active zone, IS4A is unlikely to have a direct negative effect on release probability. Nonetheless, given that IS4A containing cac isoforms mediate functions in other neuronal compartments it may regulate release indirectly by affecting action potential shape. We will provide data in response to the more detailed suggestions to authors that will provide additional insight.

      (5) They provide compelling evidence that IS4A is required for the amplitude of somatic sustained HVA calcium currents. However, the evidence for effects on biophysical properties and activation voltage (p. 13) is less convincing. Is the phenotype confined to the sustained phase, or are other aspects of the current also affected (Figure 2J)? Could they also show the quantification of further parameters, such as CaV2 peak current density, charge density, as well as inactivation kinetics for the two genotypes? I also suggest plotting peak-normalized HVA current density and conductance (G/Gmax) as a function of Vm. Could a decrease in current density due to decreased channel expression be the only phenotype? How would changes in the sustained phase translate into altered synaptic transmission in response to AP stimulation?

      Most importantly, HVA current is mostly abolished upon excision of IS4B (not IS4A; we think the reviewer accidentally mixed up the genotype). This indicates that the cac isoforms that mediate evoked release encode HVA channels. However, the somatodendritic current shown in figure 2J that remains upon excision of IS4B is mediated by IS4A containing cac isoforms. Please note that these never localize to the presynaptic active zone, thus the small inactivating HVA that remains in figure 2J does not normally mediate evoked release. Therefore, the interpretation is that specifically HVA current encoded by IS4B cac isoforms is required for synaptic transmission. Reduced cac current density is not the cause for this phenotype because a specific current component is absent.

      We agree with the reviewer that a deeper electrophysiological analysis of cac currents mediated by IS4B containing isoforms will be instructive. However, a precise analysis of activation and inactivation voltages and kinetics suffers from space clamp issues in recordings from the soma of such complex neurons (DLM motoneurons of the adult fly). Therefore, we will analyze the currents in a heterologous expression system and present these data to the scientific community as a separate study at a later time point.

      (6) Why was the STED data analysis confined to the same optical section, and not to max. intensity z-projections? How many and which optical sections were considered for each active zone? What were the criteria for choosing the optical sections? Was synapse orientation considered for the nearest neighbor Cac - Brp cluster distance analysis? How do the nearest-neighbor distances compare between "planar" and "side-view" Brp puncta?

      Max. z-projections would be imprecise because they can artificially suggest close proximity of label that is close in x and y but far away in z. Therefore, the analysis was executed in xy-direction of various planes of entire 3D image stacks. We considered active zones of different orientations (Fig. 4C, D). In fact, we searched the entire z-stacks until we found active zones of all orientations shown in figures 4C1-C6 within the same boutons. The same active zone orientations were analyzed for all exon-out mutants with cac localization in active zones. The distance between cac and brp did not change if viewed from the side.
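
      The concern about maximum intensity z-projections can be made concrete with a small worked example. This is a sketch only: the coordinates are hypothetical values in micrometers, not measurements from the study, and the function names are illustrative.

```python
import math


def distance_3d(p, q):
    """Euclidean distance between two (x, y, z) positions."""
    return math.dist(p, q)


def distance_after_z_projection(p, q):
    """Distance that remains once z is discarded, as in a z-projection."""
    return math.dist(p[:2], q[:2])


# Hypothetical punctum centers (x, y, z) in micrometers: nearly the same
# xy position, but located in different optical sections.
cac_punctum = (1.00, 1.00, 0.20)
brp_punctum = (1.05, 1.00, 1.40)

true_distance = distance_3d(cac_punctum, brp_punctum)
projected_distance = distance_after_z_projection(cac_punctum, brp_punctum)
print(f"3D distance: {true_distance:.2f} um, projected distance: {projected_distance:.2f} um")
```

      In this example the projection reports a separation of about 0.05 µm for two puncta that are about 1.2 µm apart, which is the kind of artificial proximity that a plane-by-plane analysis avoids.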

      (7) Cac clusters localize to the Brp center (e.g., Liu et al., 2011). They conclude that Cav2 localization within Brp is not affected in the cac variants (p. 8). However, their analysis is not informative regarding a potential offset between the central cac cluster and the Brp "ring". Did they/could they analyze cac localization with regard to Brp ring center localization of planar synapses, as well as Brp-ring dimensions?

      In the top views (planar) we did not find any clear offset in cac orientation to brp between genotypes. This study focuses on cac splice isoform specific localization and function. Possible effects of different cac isoforms on Brp-ring dimensions or other aspects of scaffold structure are not central to our study, in particular given that Brp puncta are clearly present even if cac is absent from the synapse (Fig. 2H), indicating that cac is not instructive for the formation of the Brp scaffold.  

      (8) Given the accelerated PSC decay/ decreased half width in dI-IIA (Fig. 5Q), I recommend reporting PSC charge in Figure 3, and PPR charge in Figures 5A-D. The charge-based PPRs of dI-IIA mutants likely resemble WT more closely than the amplitude-based PPR. In addition, miniature PSC decay kinetics should be reported, as they may contribute to altered decay kinetics. How could faster cac inactivation kinetics in response to single AP stimulation result in a decreased PSC half-width? Is there any evidence for an effect of calcium current inactivation on PSC kinetics? On a similar note, is there any evidence that AP waveform changes accelerate PSC kinetics? PSC decay kinetics are mainly determined by GluR decay kinetics/desensitization. The arguments supporting the role of cac splice isoforms in PSC kinetics outlined in the discussion section are not convincing and should be revised.

      We agree that reporting charge in figure 3 will be informative and will do so. We also understand the reviewer’s concern attributing altered PSC kinetics to presynaptic cac channel properties. We will tone down our interpretation in the discussion and list possible alterations in presynaptic AP shape or Cav2 channel kinetics as alternative explanations (not conclusions). Moreover, we will quantify postsynaptic GluRIIA abundance to test whether altered PSC kinetics are caused by altered GluRIIA expression. In our opinion, the latter is more instructive than mini decay kinetic analysis because this depends strongly on the distance of the recording electrode to the actual site of transmission in these large muscle cells.

      (9) Paired-pulse ratios (PPRs): On how many sweeps are the PPRs based? In which sequence were the intervals applied? Are PPR values based on the average of the second over the first PSC amplitudes of all sweeps, or on the PPRs of each sweep and then averaged? The latter calculation may result in spurious facilitation, and thus to the large PPRs seen in dI-IIB mutants (Kim & Alger, 2001; doi: 10.1523/JNEUROSCI.21-24-09608.2001).

      We agree that the PP protocol and analyses have to be described more precisely in the methods, and we will do so. PPR values are based on the PPRs of each sweep and then averaged. We are aware of the study of Kim and Alger 2001, but it does not affect our data interpretation because all genotypes were analyzed identically, but only the I-IIB excision resulted in the large data spread shown in figure 5.
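
      The two averaging schemes discussed in comment 9 can be compared with a minimal sketch. The amplitudes below are hypothetical and were chosen so that the second response does not depend on the first; the function names are illustrative and not taken from the study.

```python
from statistics import mean


def ppr_mean_of_ratios(first_amplitudes, second_amplitudes):
    """Compute the ratio A2/A1 for every sweep, then average the ratios."""
    return mean(a2 / a1 for a1, a2 in zip(first_amplitudes, second_amplitudes))


def ppr_ratio_of_means(first_amplitudes, second_amplitudes):
    """Average A1 and A2 across sweeps first, then take one ratio."""
    return mean(second_amplitudes) / mean(first_amplitudes)


# Hypothetical PSC amplitudes (arbitrary units) from three sweeps.
# The first response fluctuates; the second is constant and equals the
# mean of the first, so there is no facilitation on average.
first = [0.5, 1.0, 1.5]
second = [1.0, 1.0, 1.0]

per_sweep = ppr_mean_of_ratios(first, second)
averaged = ppr_ratio_of_means(first, second)
print(f"mean of per-sweep ratios: {per_sweep:.3f}, ratio of means: {averaged:.3f}")
```

      In this example the ratio of means is 1.0, whereas the mean of per-sweep ratios is about 1.22, because sweeps with a small first amplitude contribute disproportionately large ratios. The size of this bias grows with the variability of the first response, so it need not be equal across genotypes even when all genotypes are analyzed identically.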

      (10) Could the dI-IIB phenotype be simply explained by a decrease in channel number/ release probability? To test this, I propose investigating PPRs and short-term dynamics during train stimulation at lower extracellular Ca2+ concentration in WT. The Ca2+ concentration could be titrated such that the first PSC amplitude is similar between WT and dI-IIB mutants. This experiment would test if the increased PPR/depression variability is a secondary consequence of a decrease in Ca2+ influx, or specific to the splice isoform.

      In fact, that decreased PSC amplitude upon I-IIB excision is caused mainly by reduced channel number is precisely our interpretation (see discussion page 14, last paragraph to page 15, first paragraph). In addition, we are grateful for the reviewer’s suggestion to titrate the external calcium such that the first PSC amplitude matches the one in ΔI-IIB to test whether altered short term plasticity is solely a function of altered channel number or whether additional causes, such as altered channel properties, also play into this. We will conduct these experiments and include them in the revised manuscript.

      (11) How were the depression kinetics analyzed? How many trains were used for each cell, and how do the tau values depend on the first PSC amplitude? Time constants in the range of a few (5-10) milliseconds are not informative for train stimulations with a frequency of 1 or 10 Hz (the unit is missing in Figure 5H). Also, the data shown in Figures 5E-K suggest slower time constants than 5-10 ms. Together, are the data indeed consistent with the idea that dI-IIB does not only affect cac channel number, but also PPR/depression variability (p. 9)?

      For each animal, the amplitudes of each PSC were plotted over time and fitted with a single exponential. For depression at 1 and 10 Hz, we used one train per animal, and 5-6 animals per genotype (as reflected in the data points in Figs 5H and 5L). Given that the tau values are highly similar between control and excision of I-IIA, but ΔI-IIA tends to have larger single PSC amplitudes, differences in first PSC amplitude do not seem to skew the data (but see also response to comment 10 above). We thank the reviewer for pointing out that tau values in the range of ms are not informative at 1 and 10 Hz stimulations (Figs 5H and 5L). We mis-labeled (or did not label) the axes. The label should read seconds, not milliseconds. We apologize, and this will be corrected accordingly.
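
      A single-exponential fit of PSC amplitude against time can be sketched as follows. This is only an illustration under stated assumptions: it assumes a decay towards zero, A(t) = A0 * exp(-t / tau), and fits it by linear least squares on log-transformed amplitudes. The fitting routine actually used in the study is not specified here, and the train below is synthetic.

```python
import math

def fit_single_exponential(times_s, amplitudes):
    """Fit A(t) = A0 * exp(-t / tau) by least squares on log(A).

    Assumes all amplitudes are positive and that the decay approaches zero
    (no steady-state plateau). Returns (A0, tau) with tau in seconds.
    """
    logs = [math.log(a) for a in amplitudes]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, logs))
    slope = sxy / sxx                      # equals -1 / tau
    intercept = y_mean - slope * t_mean    # equals log(A0)
    return math.exp(intercept), -1.0 / slope

# Synthetic 10 Hz train: 20 pulses, 0.1 s apart, true tau = 0.5 s.
times = [i * 0.1 for i in range(20)]
amps = [2.0 * math.exp(-t / 0.5) for t in times]

a0, tau = fit_single_exponential(times, amps)
print(a0, tau)
```

      Because the time axis is given in seconds, the returned time constant is also in seconds, which is the unit the corrected axis labels should carry.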

      In sum, pending the outcome of additional important control experiments for GluRIIA abundance (see response to comment 8) and titration of control PSC amplitude for the first pulse of paired pulses in ΔI-IIB (see response to comment 10), we will either modify or further support that interpretation.

      (12) The GFP-tagged I-IIA and mEOS4b-tagged I-IIB cac puncta shown in Figure 6N appear larger than the Brp puncta. Endogenously tagged cac puncta are typically smaller than Brp puncta (Gratz et al., 2019). Also, the I-IIA and I-IIB fluorescence sometimes appear to be partially non-overlapping. First, I suggest adding panels that show all three channels merged. Second, could they analyze the area and area overlap of I-IIA and I-IIB with regard to each other and to Brp, and compare it to cac-GFP? Any speculation as to how the different tags could affect localization? Finally, I recommend moving the dI-IIA and dI-IIB localization data shown in Figure 6N to an earlier figure (Figure 1 or Figure 3).

      We will show panels with all three labels merged, as suggested by the reviewer. The apparent size of the puncta could reflect different numbers and types of fluorophores on the different antibodies used, and thus differences in point spread, chromatic aberration, laser and detector intensities, etc. We will re-analyze the data to test whether there are systematic differences in size. We do not want to speculate on whether the different tags affect localization, for the reasons mentioned above and because different antibodies can suggest artificial differences in localization precision. We prefer not to move the figure because we believe it is informative to show our finding that active zones usually contain both splice variants together with the finding that only one splice variant is required for PHP.

    3. eLife assessment

      Cav2 voltage-gated calcium channels play key roles in regulating synaptic strength and plasticity. In contrast to mammals, invertebrates like Drosophila encode a single Cav2 channel, raising questions on how diversity in Cav2 is achieved from a single gene. Here, the authors present convincing evidence that two alternatively spliced isoforms of the Cac gene (cacophony, also known as Dmca1A and nightblindA) enable diverse changes in Cav2 expression, localization, and function in synaptic transmission and plasticity. These valuable findings will be of interest to a variety of researchers.

    4. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Bell et al. describes an analysis of the effects of removing one of two mutually exclusive splice exons at two distinct sites in the Drosophila CaV2 calcium channel Cacophony (Cac). The authors perform imaging and electrophysiology, along with some behavioral analysis of larval locomotion, to determine whether these alternatively spliced variants have the potential to diversify Cac function in presynaptic output at larval neuromuscular junctions. The authors provide valuable insights into how alternative splicing at two sites in the calcium channel alters its function.

      Strengths:

      The authors find that both of the second alternatively spliced exons (I-IIA and I-IIB) that are found in the intracellular loop between the 1st and 2nd set of transmembrane domains can support Cac function. However, loss of the I-IIB isoform (predicted to alter potential beta subunit interactions) results in 50% fewer channels at active zones, a decrease in neurotransmitter release, and a reduced ability to support presynaptic homeostatic potentiation. Overall, the study provides new insights into Cac diversity at two alternatively spliced sites within the protein, adding to our understanding of how presynaptic calcium channel function can be regulated by splicing.

      Weaknesses:

      The authors find that one splice isoform (IS4B) in the first S4 voltage sensor is essential for the protein's function in promoting neurotransmitter release, while the other isoform (IS4A) is dispensable. The authors conclude that IS4B is required to localize Cac channels to active zones. However, I find it more likely that IS4B is required for channel stability and that its absence leads to the protein being degraded, rather than any effect on active zone localization. More analysis would be required to establish that as the mechanism for the unique requirement for IS4B.

    5. Reviewer #2 (Public Review):

      This study by Bell et al. focuses on understanding the roles of two alternatively spliced exons in the single Drosophila Cav2 gene cac. The authors generate a series of cac alleles in which one or the other mutually exclusive exon is deleted to determine the functional consequences at the neuromuscular junction. They find that alternative splicing at one exon encoding part of the voltage sensor impacts the activation voltage as well as localization to the active zone. In contrast, splicing at the second exon pair does not impact Cav2 channel localization, but it appears to determine the abundance of the channel at active zones. Together, the authors propose that alternative splicing at the Cac locus enables diversity in Cav2 function through isoform diversity generated at the single Cav2 alpha subunit gene encoded in Drosophila.

      Overall this is an excellent, rigorously validated study that defines unanticipated functions for alternative splicing in Cav2 channels. The authors have generated an important toolkit of mutually exclusive Cac splice isoforms that will be of broad utility for the field, and show convincing evidence for distinct consequences of alternative splicing of this single Cav2 channel at synapses. Importantly, the authors use electrophysiology and quantitative live sptPALM imaging to determine the impacts of Cac alternative splicing on synaptic function. There are some outstanding questions regarding the mechanisms underlying the changes in Cac localization and function, and some additional suggestions are listed below for the authors to consider in strengthening this study. Nonetheless, this is a compelling investigation of alternative splicing in Cav2 channels that should be of interest to many researchers.

    1. eLife assessment

      This is an important study that describes an elegant modelling-driven approach to the design of allosteric antagonists for CXCR4 that have a selective effect on receptor nanocluster formation, cell polarisation and chemotaxis, but spare binding of CXCL12 to the receptor and inhibition of adenylate cyclase. This enables selective targeting of processes dependent upon cell polarisation and chemotaxis without impacting signalling effects and may avoid some of the toxicity associated with antagonists that target CXCL12 binding and thus block all CXCR4 signalling. The revised manuscript offers convincing evidence to support the claims. The modelling work is better described and additional data have been presented that better illustrate the unique features of the new antagonist. The in vivo studies in the zebrafish model open a path to studies in mammalian models.

    2. Reviewer #2 (Public Review):

      Summary:

      This work describes a new pharmacological targeting approach to inhibit selective functions of the ubiquitously expressed chemokine receptor CXCR4, a potential target of immunomodulatory or anti-cancer treatments. Overall, the results build a strong case for the potential of this new compound to target specific functions of CXCR4, particularly linked to tumorigenesis. However, a more thorough evaluation of the function of the compound as well as future studies in mammalian model systems are needed to better assess the promise of the compound.

      Strengths:

      The work elegantly utilizes in silico drug modelling to propose new small molecule compounds with specific features. This way, the authors designed compound AGR1.137, which abolishes ligand-induced CXCR4 receptor nanoclustering and the subsequent directed cell migration without affecting ligand-binding itself or some other ligand-induced signaling pathways. The authors have used a relatively broad set of experiments to validate and demonstrate the effects of the drug. Importantly, the authors also test AGR1.137 in vivo, using a zebrafish model of tumorigenesis and metastasis. A relatively strong inhibitory effect of the compound is reported.

      Weaknesses:

      The authors have been able to significantly strengthen their data from the first submission. The content of this manuscript is pretty solid, although studies in mammalian model systems are naturally needed in the future to better assess the promise of the compound.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1

      (1) In the "Introduction" section, an important aspect that requires attention pertains to the discussion surrounding the heterodimerization of CXCR4 and CCR5. Notably, the manuscript overlooks a recent study (https://doi.org/10.1038/s41467-023-42082-z) elucidating the mechanism underlying the formation of functional dimers within these G protein-coupled receptors (GPCRs)…The inclusion of this study within the manuscript would significantly enrich the contextual framework of the work, offering readers a comprehensive understanding of the current knowledge surrounding the structural dynamics and functional implications of CXCR4 and CCR5 heterodimerization.

      We thank the reviewer for his/her recommendation to enrich the contextual framework of our study. The Nature Communications paper by Di Marino et al. was published after we sent the first version of our manuscript to eLife, and therefore was not included in the discussion. As the reviewer rightly indicates, this paper elucidates the mechanism underlying the formation of functional dimers within CCR5 and CXCR4. Using metadynamics approaches, the authors emphasize the importance of distinct transmembrane regions for dimerization of the two receptors. In particular, CXCR4 shows two low energy dimer structures and the TMVI-TMVII helices are the preferred interfaces involved in the protomer interactions in both cases. Although the study uses in silico techniques, it also includes the molecular binding mechanism of CCR5 and CXCR4 in the membrane environment, as the authors generate a model in which the receptors are immersed in a 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) phospholipid bilayer with 10% cholesterol. This is an important point in this study, as membrane lipids also interact with membrane proteins, and the lipid composition affects CXCR4 oligomerization (Gardeta S.R. et al. Front. Immunol. 2023). In particular, Di Marino et al. find a cholesterol molecule placed in-between the two CXCR4 protomers where it engages in a series of hydrophobic interactions with residues including Leu132, Val214, Leu216 and Phe249. Then, the polar head of cholesterol forms an H-bond with Tyr135 that further stabilizes protomer binding. In our hands, the F249L mutation in CXCR4 reverted the antagonism of AGR1.137, suggesting that the compound binds, among others, this residue. We should, nonetheless, indicate that we analyzed receptor oligomerization and not CXCR4 dimerization, which was the main object of the Di Marino et al. study. It is therefore also plausible that residues other than those described as essential for CXCR4 dimerization might participate in receptor oligomerization. We can speculate that AGR1.137 might affect cholesterol binding to CXCR4 and, therefore, alter dimerization/oligomerization. Additionally, the CXCR4 X-ray structure with PDB code 3ODU (Wu B. et al. Science, 2010) experimentally shows the presence of two fatty acid molecules in contact with both TMV and TMVI. These molecules closely interact with hydrophobic residues in the protein, thereby stabilizing it in a hydrophobic environment. Although more experiments will be needed to clarify the mechanism involved, our results suggest that cholesterol and/or other lipids also play an important role in CXCR4 oligomerization and function, as seen for other GPCRs (Jakubik J. & El-Fakahany E.E. Int J Mol Sci. 2021). However, we should also consider that other factors not included in the analysis by Di Marino et al. can also affect CXCR4 oligomerization; for instance, the co-expression of other chemokine receptors and/or other GPCRs that heterodimerize with CXCR4 might affect CXCR4 dynamics at the cell membrane, similar to other membrane proteins such as CD4, which also forms complexes with CXCR4 (Martinez-Muñoz L. et al. Mol. Cell 2018).

      The revised discussion contains references to the study by Di Marino et al. to enrich the contextual framework of our data.

      (2) In "various sections" of the manuscript, there appears to be confusion surrounding the terminology used to refer to antagonists. It is recommended to provide a clearer distinction between allosteric and orthosteric antagonists to enhance reader comprehension. An orthosteric antagonist typically binds to the same site as the endogenous ligand, directly blocking its interaction with the receptor. On the other hand, an allosteric antagonist binds to a site distinct from the orthosteric site, inducing a conformational change in the receptor that inhibits the binding of the endogenous ligand. By explicitly defining the terms "allosteric antagonist" and "orthosteric antagonist" within the manuscript, readers will be better equipped to discern the specific mechanisms discussed in the context of the study.

      The behavior of the compounds described in our manuscript (AGR1.135 and AGR1.137) fits with the definition of allosteric antagonists in that they bind at a site distinct from the orthosteric site, but they block only some ligand-mediated functions and not others. This means that they are not formally antagonists and should not be considered allosteric antagonists, as their binding to CXCR4 does not prevent CXCL12 binding, although they might affect its affinity. In this sense, our compounds fit much better the concept of negative allosteric modulators (Gao Z.-G. & Jacobson K.A. Drug Discov. Today Technol. 2013). They act by binding at a site distinct from the orthosteric site and selectively block some, but not all, of the downstream signaling pathways induced by the same endogenous agonist.

      To avoid confusion and to clarify the role of the compounds described in this study, we now refer to them as negative allosteric modulators throughout the manuscript.

      (3) In the Results section, the computational approach employed for "screening small compounds targeting CXCR4, particularly focusing on the inhibition of CXCL12-induced CXCR4 nanoclustering", requires clarification due to several points of incomprehension. The following recommendations aim to address these concerns and enhance the overall clarity of the section:

      (1) Computational Approach and Binding Mode Description: 

      -Explicitly describe the methodology for identifying the pocket/cleft area in angstroms (Å) on the CXCR4 protein structure. Include details on how the volume of the cleft enclosed by TMV and TMVI was determined, as this information is not readily apparent in the provided reference (https://doi.org/10.1073/pnas.1601278113).

      The identification of the cleft was based on the observations by Wu et al. (Wu B. et al. Science 2010), who described the presence of bound lipids in the area formed by TMV and TMVI, and those of Wescott et al. (Wescott M.P. et al. Proc. Natl. Acad. Sci. 2016) on the importance of TMVI in transmitting the conformational changes promoted by CXCL12 on CXCR4 towards the cytoplasmic surface of the receptor, linking the binding site with signaling activation. Collectively, these results, and our previous data on the critical role of the N-terminal region of TMVI for CXCR4 oligomerization (Martinez-Muñoz L. et al. Mol. Cell 2018), focused our in silico screening on this region. Once we detected that several compounds bound CXCR4 in this region, the cleft properties were calculated by subtracting the compound structure. The resulting PDB was analyzed using the PDBsum server (Laskowski R.A. et al. Protein Sci. 2018). Volume calculations were obtained using the server's analysis of surface clefts by SURFNET (Laskowski R. A. J. Mol. Graph. 1995). The theoretical interaction surface between the selected compounds and CXCR4 and the atomic distances between the protein residues and the compounds were calculated using the PISA server (Krissinel E. & Henrick K. J. Mol. Biol. 2007) (Fig. I, only for review purposes). The analysis of the cleft occupied by AGR1.135 showed two independent cavities of 434 Å³ and 1,381 Å³ that were not connected to the orthosteric site. In the case of AGR1.137, the data revealed two distinct clefts of 790 Å³ and 580 Å³ (Fig. I, only for review purposes). These details have been included in the revised manuscript (New Fig. 1A, Supplementary Fig 8A, B).

      (4) Clarify the statement regarding the cleft being "surface exposed for interactions with the plasma membrane," particularly in the context of its embedding within the membrane.

      For GPCRs, transmembrane domains represent binding sites for bioactive lipids that play important functional and physiological roles (Huwiler A. & Zangemeister-Wittke U. Pharmacol. Ther. 2018). The channel between TMV and TMVI connects the orthosteric chemokine binding pocket to the lipid bilayer and is occupied by an oleic acid molecule, according to the CXCR4 structure published in 2010 (Wu B. et al. Science 2010). In addition, the target region contains residues involved in cholesterol (and perhaps other lipids) engagement (Di Marino et al. Nat. Commun. 2023). Taken together, these data support our statement that the cleft supports interactions between CXCR4 molecules and the plasma membrane. 

      Moreover, the data of Di Marino et al. also support that CCR5 and CXCR4 have a symmetric and an asymmetric binding mode. Therefore, either dimeric structure has the possibility to form trimers, tetramers, and even oligomers by using the free binding interface to complex with another protomer. This hypothesis suggests that the interaction of dimers to form oligomers should involve residues distinct from those included in the dimeric conformation.

      The sentence has been modified in the revised manuscript to improve clarity.

      (5) Discuss the rationale behind targeting the allosteric binding pocket instead of the orthosteric pocket, outlining potential advantages and disadvantages.

      The advantages and disadvantages of using negative allosteric modulators vs orthosteric antagonists have now been included in the revised discussion.

      The majority of GPCR-targeted drugs function by binding to the orthosteric site of the receptor, and are agonists, partial agonists, antagonists or inverse agonists. These orthosteric compounds can have off-target effects and poor selectivity due to highly homologous receptor orthosteric sites and to abrogation of spatial and/or temporal endogenous signaling patterns. 

      The alternative is to use allosteric modulators, which can tune the functions associated with the receptors without affecting the orthosteric site. They can be positive, negative or neutral modulators, depending on their effect on the functionality of the receptor (Foster D.J. & Conn P.J. Neuron 2017). For example, the use of a negative allosteric modulator of a chemokine receptor to dampen pathological signaling events, while retaining full signaling for non-pathological activities, might limit adverse effects (Kohout T.A. et al. J. Biol. Chem. 2004). In this case, the negative allosteric modulator 873140 blocks CCL3 binding on CCR5 but does not alter CCL5 binding (Watson C. et al. Mol. Pharmacol. 2005). In other cases, allosteric modulators can stabilize a particular receptor conformation and block others. The mechanism of action of the FDA-approved anti-HIV-1 CCR5 allosteric modulator maraviroc is attributed to its ability to modulate CCR5 dimer populations and their subsequent subcellular trafficking and localization to the cell membrane (Jin J. et al. Sci. Signal. 2018). Two CCR5 dimeric conformations that are imperative for membrane localization were present in the absence of maraviroc; however, an additional CCR5 dimer conformation was discovered after the addition of maraviroc, and all homodimeric conformations were further stabilized. This finding is consistent with the observation that CCR5 dimers and oligomers inhibit HIV host-cell entry, likely by preventing the HIV-1 co-receptor formation.

      It is well known that GPCRs activate G proteins, but they also recruit additional proteins (e.g., β-arrestins) that induce signaling cascades which, in turn, can direct specific subsets of cellular responses independent of G protein activation (Eichel K. et al. Nature 2018) and are responsible for either therapeutic or adverse effects. Allosteric modulators can thus be used to block these adverse effects without influencing the therapeutic benefits. This was the case in the design of G protein-biased agonists for the kappa opioid receptor, which maintain the desirable antinociceptive and antipruritic effects and eliminate the sedative and dissociative effects in rodent models (Brust T.F. et al. Sci. Signal 2016).

      (6) Provide the PDB ID of the CXCR4 structure used as a template for modeling with SwissModel. Explain the decision to model the structure from the amino acid sequence and suggest an alternative approach, such as utilizing AlphaFold structures and performing classical molecular dynamics with subsequent clustering for the best representative structure.

      The PDB used as a template for modeling CXCR4 was 3ODU. This information was already included in the material and methods section. At the time we performed these analyses, there were several crystallographic structures of CXCR4 in complex with different molecules and peptides deposited at the PDB. None of them included a full construct containing the complete receptor sequence to provide a suitable sample for X-ray structure resolution, as the N- and C-terminal ends of CXCR4 are very flexible loops. In addition, the CXCR4 constructs contained T4 lysozyme inserted between helices TMV and TMVI to increase the stability of the protein, a common strategy used to facilitate crystallogenesis of GPCRs (Zou Y. et al. PLoS One 2012). Therefore, we generated a CXCR4 homology model using the SWISS-MODEL server (Waterhouse A. et al. Nucleic Acids Res. 2018). This program reconstructed the loop between TMV and TMVI, a domain particularly important in this study that was not present in any of the crystal structures available in PDB. The model structure was, nonetheless, still incomplete, as it began at P27 and ended at S319 because the terminal ends were not resolved in the crystal structure used as a template. Nevertheless, we considered that these terminal ends were not involved in CXCR4 oligomerization.

      As AlphaFold was not available at the time we initiated this project, we did not use it. However, we have now updated our workflow to current methods and predicted the structure of the target using AlphaFold (Jumper J. et al. Nature 2021) and the sequence available under UniProt entry P61073. We prepared the ligands using OpenBabel (O’Boyle N.M. et al., J. Cheminformatics 2011), with Gasteiger charge assignment, and generated 10 conformers for each input ligand using the OpenBabel genetic algorithm. We then prepared the target structure with OpenMM, removing all waters and possible heteroatoms, and adding all missing atoms. We next predicted the target binding pockets with fpocket (Le Guilloux V. et al. BMC Bioinformatics 2009), P2Rank (Krivak R. & Hoksza D. J. Cheminformatics 2018), and AutoDock AutoSite (Ravindranath P.A. & Sanner M.F. Bioinformatics 2016). We retained only those pockets between TMV and TMVI (see answer to point 3). We merged the results of the three programs into so-called consensus pockets, where two pockets are considered sufficiently similar if at least 75% of their surfaces are shared (del Hoyo D. et al. J. Chem. Inform. Model. 2023). Of the consensus pockets, one was significantly larger than the others and was therefore selected. We then docked the ligand conformers in this pocket using AutoDock GPU (Santos-Martins D. et al. J. Chem. Theory Comput. 2021), LeDock (Liu N & Xu Z., IOP Conf. Ser. Earth Environ. Sci. 2019), and Vina (Eberhardt J. et al. J. Chem. Inf. Model. 2021). The number of dockings varied from 210 to 287 poses. We scored each pose with the Vina score using ODDT (Wójcikowski M. et al. J. Cheminform. 2015). Then, we clustered the different solutions into groups whose maximum RMSD was 1 Å. This resulted in 40 clusters; the representative of each cluster was the pose with the maximum Vina score. This analysis confirmed that the selected compounds bound this pocket (Author response image 1). When required, we calculated the binding affinity using Schrödinger’s MM-GBSA procedure (Greenidge P.A. et al. J. Chem. Inf. Model. 2013), in two ways: first, assuming that the ligand and target are fixed; second, with an energy minimization of all the atoms within a distance of 3 Å from the ligand. This information has now been included in the revised version of the manuscript.
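
      The 75% similarity criterion for consensus pockets can be made concrete with a small sketch. It rests on assumptions that are not stated in the response: each pocket is represented as a set of lining residue numbers (a stand-in for its surface), and the shared fraction is measured relative to the smaller pocket. The residue numbers and function names are hypothetical, and this is not the implementation used by the cited tools.

```python
def shared_fraction(pocket_a, pocket_b):
    """Fraction of the smaller pocket's residues also found in the other pocket.

    Pockets are sets of residue numbers, used here as a proxy for surface.
    """
    if not pocket_a or not pocket_b:
        raise ValueError("pockets must contain at least one residue")
    smaller = min(len(pocket_a), len(pocket_b))
    return len(pocket_a & pocket_b) / smaller

def are_similar(pocket_a, pocket_b, threshold=0.75):
    """Two pockets are treated as one consensus pocket if they share >= threshold."""
    return shared_fraction(pocket_a, pocket_b) >= threshold

# Hypothetical pockets reported by three different predictors.
pocket_tool_1 = {210, 214, 216, 246, 249, 250}
pocket_tool_2 = {210, 214, 216, 249, 250, 253}
pocket_tool_3 = {40, 44, 47, 284, 288}

print(are_similar(pocket_tool_1, pocket_tool_2))
print(are_similar(pocket_tool_1, pocket_tool_3))
```

      Normalising by the smaller pocket is one possible reading of "shared surface"; normalising by the union or by the larger pocket would make the criterion stricter.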

      Author response image 1.

      AGR1.135 docking in CXCR4 using the updated protocol for ligand docking. Cartoon representation colored in gray with TMV and TMVI shown in blue and pink, respectively. AGR1.135 is shown in stick representation with carbons in yellow, oxygens in red and nitrogens in blue.

      (7) Specify the meaning of "minimal interaction energy" and where (if present) the interaction scores are reported in the text.

      By minimal interaction energy we mean the best docking score obtained in our docking studies. These data were not included in the previous manuscript due to space restrictions but are now included in the revised manuscript.

      (8) You performed docking studies using GLIDE to identify potential binding sites for the small compounds on the CXCR4 protein. The top-scoring binders were then subjected to further refinement using PELE simulations. However, I realize that a detailed description of the specific binding modes of these compounds was not provided in the text. Please make the description of binding poses more detailed

      Firstly, to assess the reliability of this method, a PELE study was carried out for the control molecule IT1t, a small drug-like isothiourea derivative that has been crystallized in complex with CXCR4 (PDB code: 3ODU). IT1t is a CXCR4 antagonist that binds to the CXCL12 binding cavity and inhibits HIV-1 infection (Das D. Antimicrob. Agents Chemother. 2015; Dekkers S. et al. J. Med. Chem. 2023). Of the best five trajectories, two had clearly better binding energies and corresponded to almost the same predicted pose of the molecule. Although the predicted binding mode was not exactly the same as the one in the crystal structure, the approximation was very good, which validates the approach. Although PELE is a suitable technique to find potential binding sites, the predicted poses must be subsequently refined using docking programs.

      Analyzing the best trajectories for the remaining ligands, at least one of the best-scored poses was always located at the orthosteric binding site of CXCR4. Even though these poses showed good binding energies, they were discarded as the in vitro biological experiments indicated that the compounds were unable to block CXCL12 binding or CXCL12-mediated inhibition of cAMP release or CXCR4 internalization. Collectively, these data indicated that the selected compounds did not behave as orthosteric inhibitors of CXCR4. The CXCL12 binding pocket is the biggest cavity in CXCR4, and so PELE may tend to place the molecules near it. However, all the compounds presented other feasible binding sites with a comparable binding energy.

      AGR1.135 and AGR1.137 showed interesting poses between TMV and TMVI with very good binding energy (-51.4 and -37.2 kcal/mol, respectively). This was precisely the region we had previously selected for the in silico screening, as previously described (see response to point 3).

      AGR1.131 showed two poses with low binding energy that were placed between helices TMI and TMVII (-43.6 kcal/mol) and between helices TMV and TMVI (-39.8 kcal/mol). This compound was unable to affect CXCL12-mediated chemotaxis and was therefore used as an internal negative control, as it was selected in the in silico screening with the same criteria as the other compounds but failed to alter any CXCL12-mediated functions. PELE studies nonetheless provided different binding sites for each molecule, which had to be further studied using docking to obtain a more accurate binding mode. In line with the previous comment, we repeated the analysis using AlphaFold and the rest of the procedure described (see our response to point 6) and calculated the binding energies for all the compounds using Schrödinger’s MM-GBSA procedure (Greenidge P.A. et al. J. Chem. Inf. Model. 2013). Calculations were performed in two ways: first, assuming that the ligand and target are fixed; second, with an energy minimization of all the atoms within a distance of 3 Å from the ligand. Using the first method, AGR1.135 and AGR1.137 showed poses between TMV and TMVI with -56.4 and -62.4 kcal/mol, respectively, and AGR1.131 had a pose between TMI and TMVII with -61.6 kcal/mol. Using the second method, AGR1.135 and AGR1.137 showed poses between TMV and TMVI with -57.9 and -67.6 kcal/mol, respectively, and AGR1.131 had a pose between TMI and TMVII with -62.2 kcal/mol.

      This information is now included in the text.

      (9) (2) Experimental Design: -Justify the choice of treating Jurkat cells with a concentration of 50 μM of the selected compound. Consider exploring different concentrations and provide a rationale for the selected dosage. Additionally, clearly identify the type of small compound used in the initial experiment.

      The revised version contains a new panel in Fig. 1B to show a more detailed kinetic analysis with different concentrations (1-100 µM) of the compounds in the Jurkat migration experiments. In all cases, 100 µM nearly completely abrogated cell migration, but in order to reduce the amount of DMSO added to the cells we selected 50 µM for further experiments, as this concentration inhibited 50-75% of ligand-induced cell migration. Regarding the type of small compounds used in the initial experiments, they were compounds included in the library described in reference #24 (Sebastian-Pérez V. et al. Med. Biol. Chem. 2017), which contains heterocyclic compounds. We would note that we do not consider AGR1.137 a final compound. We think that there is scope to develop AGR1.137-based second-generation compounds with greater solubility in water and greater specificity or affinity for CXCR4, and to evaluate delivery methods that could increase activity.

      (10) Avoid reporting details in rounded parentheses within the text; consider relocating such information to the Materials and Methods section or figure captions for improved readability.

      Most of the rounded parentheses within the text have been eliminated in the revised version of the manuscript to improve readability.

      (11) Elaborate on the virtual screening approach using GLIDE software, specifying the targeted site and methodology employed.

      For the virtual screening, we used the Glide module (SP and XP scoring functions) included in the Schrödinger software package, utilizing the corresponding 3D target structure and our MBC library (Sebastián-Pérez V et al. J. Chem. Inf. Model. 2017). The center of the catalytic pocket was selected as the centroid of the grid. In the grid generation, a scaling factor of 1.0 for the van der Waals radii and a partial charge cutoff of 0.25 were used. The SP poses of each compound were then rescored with the XP scoring function of Glide. In XP mode, ligand sampling was flexible, Epik state penalties were added, and an energy window of 2.5 kcal/mol was used for ring sampling. In the energy minimization step, the distance-dependent dielectric constant was 4.0, with a maximum of 100,000 minimization steps. In the clustering, poses were considered duplicates and discarded if the RMS deviation was less than 0.5 Å and the maximum atomic displacement was less than 1.3 Å.
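
      The duplicate-pose criterion described above can be sketched as follows. This is only an illustration of the stated thresholds (RMS deviation below 0.5 Å and maximum atomic displacement below 1.3 Å), not Glide's internal implementation; the function names and the representation of a pose as a list of (x, y, z) tuples with atoms in matching order are assumptions made for the example.

```python
import math

# Thresholds quoted in the response text (Å).
RMSD_CUTOFF = 0.5
MAX_DISPLACEMENT_CUTOFF = 1.3


def pose_deviation(pose_a, pose_b):
    """Return (RMSD, maximum atomic displacement) in Å for two poses.

    Hypothetical helper: each pose is a list of (x, y, z) tuples,
    with atoms listed in the same order in both poses.
    """
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and have the same atom count")
    distances = [math.dist(a, b) for a, b in zip(pose_a, pose_b)]
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    return rmsd, max(distances)


def is_duplicate(pose_a, pose_b):
    """A pose pair counts as a duplicate only if BOTH criteria are met."""
    rmsd, max_disp = pose_deviation(pose_a, pose_b)
    return rmsd < RMSD_CUTOFF and max_disp < MAX_DISPLACEMENT_CUTOFF
```

      Requiring both conditions means that a pose with a low overall RMSD but one strongly displaced atom is still kept as a distinct pose.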

      (12) Provide clarity on the statement that AGR1.131 "theoretically" binds the same motif, explaining the docking procedure used for this determination.

      In the in silico screening, AGR1.131 was one of the 40 selected compounds that showed, according to the PELE analysis (see answer to point 8), a pose with low binding energy (-39.8 kcal/mol) between the TMV and TMVI helices, which is the area selected for the screening. Nonetheless, using the initial workflow, its best pose was placed between helices TMI and TMVII (-43.7 kcal/mol). In conclusion, although AGR1.131 could also be accommodated in the TMV-TMVI region, its most favorable pose was in the area between TMI and TMVII. In addition, the compound was included in the biological screening, where it did not affect CXCL12-mediated chemotaxis. We thus decided to use it as an internal negative control, as it has a skeleton very similar to AGR1.135 and AGR1.137 and can interact with the TM domains of CXCR4 without promoting biological effects. This statement has been clarified in the revised text.

      (13) Toxicity Testing:

      -Enhance the explanation of the approach to testing the toxicity of the compound in Jurkat cells. Consider incorporating positive controls to strengthen the assessment and clarify the experimental design.

      All the selected compounds in the in silico screening were initially tested for propidium iodide incorporation in treated cells in a toxicity assay, and some of them were discarded for further experiments (e.g., AGR1.103 and VSP3.1).

      Further evaluation of Jurkat cell viability was determined by cell cycle analysis using propidium iodide.  Supplementary Fig. 1B included the percentage of each cell cycle phase, and data indicated no significant differences between the treatments tested. Nevertheless, at the suggestion of the reviewer, and to clarify this issue, positive controls inducing Jurkat cell death (staurosporine and hydrogen peroxide) have also been included in the new Supplementary Fig. 2. The new figure also includes a table showing the percentage of cells in each cell-cycle phase.  

      (14) In the Results section concerning "AGR1.135 and AGR1.137 blocking CXCL12-mediated CXCR4 nanoclustering and dynamics", several points can be improved to enhance clarity and coherence: 1. Specificity of Low Molecular Weight Compounds:  

      -Clearly articulate how AGR1.135 and AGR1.137 specifically target homodimeric CXCR4 and provide an explanation for their lack of impact on heterodimeric CXCR4-CCR5 in that region.

      First of all, we should clarify that when we talk about receptor nanoclustering, oligomers refer to complexes including 3 or more receptors and, therefore, the residues involved in these interactions can differ from those involved in receptor dimerization. Moreover, our FRET experiments did not indicate that the compounds alter receptor dimerization (see new Supplementary Fig. 7). Of note, mutant receptors unable to oligomerize can still form dimers (Martínez-Muñoz L. et al. Mol. Cell 2018; García-Cuesta E.M. et al. Proc. Natl. Acad. Sci. USA 2022). Additionally, we believe that these oligomers can also include other chemokine receptors/proteins expressed at the cell membrane, which we are currently studying using different models and techniques.

      We have results supporting the existence of CCR5/CXCR4 heterodimers (Martínez-Muñoz L et al. Proc. Natl. Acad. Sci. USA 2014), in line with the data published by Di Marino et al. However, in the current study we have not evaluated the impact of the selected compounds on other CXCR4 complexes distinct from CXCR4 oligomers. Our Jurkat cells do not express CCR5 and, therefore, we cannot discuss whether AGR1.137 affects CCR5/CXCR4 heterodimers. The chemokine field is very complex and most receptors can form dimers (homo- and heterodimers) as well as oligomers (Martinez-Muñoz L., et al Pharmacol & Therap. 2011) when co-expressed. To evaluate different receptor combinations in the same experiment is a complex task, as the number of potential combinations between distinct expressed receptors makes the analysis very difficult. We started with CXCR4 as a model, to continue later with other possible CXCR4 complexes. In addition, for the analysis of CCR5/CXCR4 dynamics, it is much better to use dual-TIRF techniques, which allow the simultaneous detection of two distinct molecules coupled to different fluorochromes.

      Regarding the data of Di Marino et al., it is possible that the compounds might also affect heterodimeric conformations of CXCR4. This aspect has also been broached in the revised discussion. We would again note that we evaluated CXCR4 oligomers and not monomers or dimers; this is especially relevant when we compare the residues involved in these processes as they might differ depending on the receptor conformation considered. This issue was also hypothesized by Di Marino et al. (see our response to point 4).

      (15) When referring to "unstimulated" cells, provide a more detailed explanation to elucidate the experimental conditions and cellular state under consideration.

      Unstimulated cells refer to the cells in basal conditions, that is, cells in the absence of CXCL12. For TIRF-M experiments, transiently-transfected Jurkat cells were plated on glass-bottomed microwell dishes coated with fibronectin; these are the unstimulated cells. To observe the effect of the ligand, dishes were coated as above plus CXCL12 (stimulated cells). We have clarified this point in the material and methods section of the revised version.

      (16) 2. Paragraph Organization

      -Reorganize the second paragraph to eliminate redundancy and improve overall flow. A more concise and fluid presentation will facilitate reader comprehension and engagement.

      The second paragraph has been reorganized to improve overall flow.

      (17) Ensure that each paragraph contributes distinct information, avoiding repetition and redundancy.

      We have carefully revised each paragraph of the manuscript to avoid redundancy.

      (18) 3. Claim of Allosteric Antagonism:

      -Exercise caution when asserting that "AGR1.135 and AGR1.137 behave as allosteric antagonists of CXCR4" based on the presented results. Consider rephrasing to reflect that the observed effects suggest the potential allosteric nature of these compounds, acknowledging the need for further investigations and evidence.

      To avoid misinterpretations on the effect of the compounds on CXCR4, as we have commented in our response to point 2, we have substituted the term allosteric inhibitors with negative allosteric modulators, which refer to molecules that act by binding a site distinct from the orthosteric site, and selectively block some downstream signaling pathways, whereas others induced by the same endogenous or orthosteric agonist are unaffected (Gao Z.-G. & Jacobson K.A. Drug Discov. Today Technol. 2013). Our data indicate that the selected small compounds do not block ligand binding or G protein activation or receptor internalization, but inhibit receptor oligomerization and ligand-mediated directed cell migration.

      (19) In the Results section discussing the "incomplete abolition of CXCR4-mediated responses in Jurkat cells by AGR1.135 and AGR1.137", several points can be refined for better clarity and completeness:  1. Inclusion of Positive Controls: 

      -Consider incorporating positive controls in relevant experiments to provide a comparative benchmark for assessing the impact of AGR1.135 and AGR1.137. This addition will strengthen the interpretation of results and enhance the experimental rigor. 

      The in vivo experiments (Fig. 7E,F) used AMD3100, an orthosteric antagonist of CXCR4, as a positive control. We also included AMD3100, as a positive control of inhibition when evaluating the effect of the compounds on CXCL12 binding (Fig. 3, new Supplementary Fig. 3). The revised version of the manuscript also includes the effect of this inhibitor on other relevant CXCL12-mediated responses such as cell migration (Fig. 1B), receptor internalization (Fig. 3A), cAMP production (Fig. 3C), ERK1/2 and AKT phosphorylation (Supplementary Fig. 4), actin polymerization (Fig. 4A), cell polarization (Fig. 4B, C) and cell adhesion (Fig. 4D), to facilitate the interpretation of the results and improve the experimental rigor.

      (20) 2. Clarification of Terminology: 

      -Clarify the term "CXCR4 internalizes" by providing context, perhaps explaining the process of receptor internalization and its relevance to the study.

      We refer to CXCR4 internalization as a CXCL12-mediated endocytosis process that results in a reduction of CXCR4 levels on the cell surface. We use CXCR4 internalization in this study for two purposes. First, for CXCR4 and other chemokine receptors, internalization processes are mediated by ligand-induced clathrin vesicles (Venkatesan et al 2003), a process that triggers CXCR4 aggregation in these vesicles. We have previously determined that the oligomers of receptors detected by TIRF-M remain unaltered in cells treated with inhibitors of clathrin vesicle formation and of internalization processes (Martinez-Muñoz L. et al. Mol. Cell 2018). Moreover, we have described a mutant CXCR4 that cannot form oligomers but internalizes normally in response to CXCL12 (Martinez-Muñoz L. et al. Mol. Cell 2018). The observation in this manuscript of normal CXCL12-mediated endocytosis in the presence of the negative allosteric modulators of CXCR4 that abrogate receptor oligomerization reinforces the idea that the oligomers detected by TIRF are not related to the receptor aggregates involved in endocytosis. Second, receptor internalization is not affected by the allosteric compounds, indicating that they downregulate some CXCL12-mediated signaling events but not others (new Fig. 3).

      All these data have been included in the revised discussion of the manuscript.

      (21) Elaborate on the meaning of "CXCL12 triggers normal CXCR4mut internalization" to enhance reader understanding.

      We have previously described a triple-mutant CXCR4 (K239L/V242A/L246A; CXCR4mut). The mutant residues are located in the N-terminal region of TMVI, close to the cytoplasmic region, thus limiting the CXCR4 pocket described in this study (see our response to point 3). This mutant receptor dimerizes but neither oligomerizes in response to CXCL12 nor supports CXCL12-induced directed cell migration, although it can still trigger some Ca2+ flux and is internalized after ligand activation (Martinez-Muñoz L. et al. Mol. Cell 2018).  We use the behavior of this mutant (CXCR4mut) to show that the CXCR4 oligomers and the complexes involved in internalization processes are not the same and to explain why we evaluated CXCR4 endocytosis in the presence of the negative allosteric modulators.

      As we indicated in a previous answer to the reviewer, these issues have been re-elaborated in the revised version.

      (22) 3. Discrepancy in CXCL12 Concentration:

      -Address the apparent discrepancy between the text stating, "...were stimulated with CXCL12 (50 nM, 37°C)," and the figure caption (Fig. 3A) reporting a concentration of 12.5 nM. Rectify this inconsistency and provide an accurate and clear explanation.

      We apologize for this error, which is now corrected in the revised manuscript. With the exception of the cell migration assays in Transwells, where the optimal concentration was established at 12.5 nM, in the remaining experiments the optimal concentration of CXCL12 employed was 50 nM. These concentrations were optimized in previous works of our laboratory using the same type of experiment. We should also remark that in the experiments using lipid bilayers or TIRF-M experiments, CXCL12 is used to coat the plates and therefore it is difficult to determine the real concentration of the ligand that is retained in the surface of the plates after the washing steps performed prior to adding the cells. In addition, we use 100 nM CXCL12 to create the gradient in the chambers used to perform the directed-cell migration experiments.

      (23) 4. Speculation on CXCL12 Binding:

      -Refrain from making speculative statements, such as "These data suggest that none of the antagonists alters CXCL12 binding to CXCR4," unless there is concrete evidence presented up to that point. Clearly outline the results that support this conclusion.

      Figure 3B and Supplementary Figure 3 show CXCL12-ATTO700 binding by flow cytometry in cells pretreated with the negative allosteric modulators. We have also included AMD3100, the orthosteric antagonist, as a control for inhibition. While these experiments showed no major effect of the compounds on CXCL12 binding, we cannot discard small changes in the affinity of the interaction between CXCL12 and CXCR4. In consequence we have re-written these statements.

      (24) 5. Corroboration of Data:

      -Specify where the corroborating data from immunostaining and confocal analysis are reported, ensuring readers can access the relevant information to support the conclusions drawn in this section.

      In agreement with the suggestion of the reviewer, the revised manuscript includes data from immunostaining and confocal analysis to complement Fig. 4B (new Fig. 4C). The revised version also includes some representative videos for the TIRF experiments shown in Figure 2 to improve clarity.

      (25) In the Results section concerning "AGR1.135 and AGR1.137 antagonists and their direct binding to CXCR4", several aspects need clarification and refinement for a more comprehensive and understandable presentation: 1. Workflow Clarification:

      -Clearly articulate the workflow used for assessing the binding of AGR1.135 and AGR1.137 to CXCR4. Address the apparent contradiction between the inability to detect a direct interaction and the utilization of Glide for docking in the TMV-TMVI cleft.

      To address the direct interaction of the compounds with CXCR4, we intentionally avoided modifying the small compounds with different labels, which could affect their properties. We therefore attempted a fluorescence spectroscopy strategy to formally prove the ability of the small compounds to bind CXCR4, but this failed because AGR1.135 is yellow in color, which interfered with the determinations. We also tried a FRET strategy (see new Supplementary Fig. 7) and detected a significant increase in FRET efficiency of CXCR4 homodimers when AGR1.135 was evaluated, but again the yellow color interfered with FRET determinations. Moreover, AGR1.137 did not modify the FRET efficiency of CXCR4 dimers. Therefore, we were unable to directly detect the interaction of the compounds with CXCR4.

      We elected to develop an indirect strategy; in silico, we evaluated the binding-site using docking and molecular dynamics to predict the most promising CXCR4 binding residues involved in the interaction with the selected compounds. Next, we generated point mutant receptors of the predicted residues and re-evaluated the behavior of the allosteric antagonists in a CXCL12-induced cell migration experiment. Obviously, we first discarded those CXCR4 mutants that were not expressed on the cell membrane as well as those that were not functional when activated with CXCL12. Using this strategy, we eliminated the interference due to the physical properties of the compounds and demonstrated that if the antagonism of a compound is reversed in a particular CXCR4 mutant it is because the mutated residue participates or interferes with the interaction between CXCR4 and the compound, thus assuming (albeit indirectly) that the compound binds CXCR4. 

      To select the specific mutations included in the analysis, our strategy was to generate point mutations in residues present in the TMV-TMVI pocket of CXCR4 that had not been directly proposed as critical residues involved in chemokine engagement, signal initiation, signal propagation, or G protein-binding, based on the extensive mutational study published by Wescott et al. (Wescott M.P. et al. Proc. Natl. Acad. Sci. USA 2016).

      (26) Provide a cohesive explanation of the transition from docking evaluation to MD analysis, ensuring a transparent representation of the methodology.

      Based on the aim of this work, the workflow shown in Author response image 2 was proposed to predict the binding mode of the selected molecules. First, a CXCR4 model was generated to reconstruct some unresolved parts of the protein structure; then a binding site search using PELE software was performed to identify the most promising binding sites; subsequently, docking studies were performed to refine the binding mode of the molecules; and finally, molecular dynamics simulations were run to determine the most stable poses and to predict the residues that we should mutate to test whether the compounds interact with CXCR4.

      Author response image 2.

      Workflow followed to determine the binding mode of the studied compounds.

      (27) 2. Choice of Software and Techniques:

      -Justify the use of "AMBER14" and the PELE approach, considering their potential obsolescence.

      These experiments were performed five years ago when the project was initiated. As the reviewer indicates, AMBER14 and PELE approaches might perhaps be considered obsolescent. Thus, we have predicted the structure of the target using AlphaFold (Jumper J. et al, Nature 2021) and the sequence available under UniProt entry P61073. The complete analysis performed (see our response to point 4) confirmed that the compounds bound the selected pocket, as we had originally determined using PELE. These new analyses have been incorporated into the revised manuscript.

      (28)-Discuss the role of the membrane in the receptor-ligand interaction. Elaborate on how the lipidic double layer may influence the binding of small compounds to GPCRs embedded in the membrane.

      Biological membranes are vital components of living organisms, providing a diffusion barrier that separates cells from the extracellular environment, and compartmentalizing specialized organelles within the cell. In order to maintain the diffusion barrier and to keep it electrochemically sealed, a close interaction of membrane proteins with the lipid bilayer is necessary. It is well known that this is important, as many membrane proteins undergo conformational changes that affect their transmembrane regions and that may regulate their activity, as seen with GPCRs (Daemen F.J. & Bonting S.L., Biophys. Struct. Mech. 1977; Gether U. et al. EMBO J. 1997). The lateral and rotational mobility of membrane lipids supports the sealing function while allowing for the structural rearrangement of membrane proteins, as they can adhere to the surface of integral membrane proteins and flexibly adjust to a changing microenvironment. In the case of the first atomistic structure of CXCR4 (Wu B. et al. Science 2010), it was indicated that for dimers, monomers interact only at the extracellular side of helices V and VI, leaving at least a 4-Å gap between the intracellular regions, which is presumably filled by lipids. In particular, they indicated that the channel between TMV and TMVI that connects the orthosteric chemokine binding pocket to the lipid bilayer is occupied by an oleic acid molecule. Recently, Di Marino et al., analyzing the dimeric structure of CXCR4, found a cholesterol molecule placed in between the two protomers, where it engages a series of hydrophobic interactions with residues located in the area between TMI and TMVI (Leu132, Val214, Leu216, Leu246, and Phe249). The polar head of cholesterol forms an H-bond with Tyr135 that further stabilizes its binding mode. This finding confirms that cholesterol might play an important role in mediating and stabilizing receptor dimerization, as seen in other GPCRs (Pluhackova, K., et al. PLoS Comput. Biol. 2016). 
      In addition, we have previously observed that, independently of the structural changes on CXCR4 triggered by lipids, the local lipid environment also regulates CXCR4 organization, dynamics and function at the cell membrane and modulates chemokine-triggered directed cell migration. Prolonged treatment of T cells with bacterial sphingomyelinase promoted the complete and sustained breakdown of sphingomyelins and the accumulation of the corresponding ceramides, which altered both membrane fluidity and CXCR4 nanoclustering and dynamics. Under these conditions, CXCR4 retained some CXCL12-mediated signaling activity but failed to promote efficient directed cell migration (Gardeta S.R. et al. Front. Immunol. 2022). Collectively, these data demonstrate the key role that lipids play in stabilizing CXCR4 conformations and in regulating its lateral mobility, thereby influencing its associated functions. These considerations have been included in the revised version of the manuscript.

      (29) 3. Stable Trajectories and Binding Mode Superimposition -Specify the criteria for defining "stable trajectories" to enhance reader understanding.

      There are several ways to describe the stability of an MD simulation, based on the convergence of energies, distances or ligand-target interactions, among others. In this work, we use the expression “stable trajectories” to refer to simulations in which the ligand trajectory converges and the ligand RMSD does not fluctuate by more than 0.25 Å. This definition is now included in the revised text.
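
      As a minimal sketch of how such a criterion could be checked, the snippet below flags a trajectory as stable when the spread (maximum minus minimum) of the per-frame ligand RMSD stays within 0.25 Å. Reading "fluctuate" as this spread, the optional restriction to the final frames, and the function name are assumptions made for illustration; the response does not specify how the fluctuation was measured.

```python
def is_stable_trajectory(ligand_rmsd, tolerance=0.25, window=None):
    """Check a per-frame ligand RMSD series (Å) against a fluctuation limit.

    Hypothetical helper. `window` optionally restricts the check to the
    last `window` frames, e.g. to skip an initial equilibration phase.
    """
    if not ligand_rmsd:
        raise ValueError("ligand_rmsd must contain at least one frame")
    if window is not None and window <= 0:
        raise ValueError("window must be a positive number of frames")
    frames = ligand_rmsd if window is None else ligand_rmsd[-window:]
    # Assumed reading of the criterion: total spread within the tolerance.
    return max(frames) - min(frames) <= tolerance
```

      For example, a series that drifts early and then plateaus would fail the check over the whole run but pass it when only the final frames are considered.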

      (30)  Clarify the meaning behind superimposing the two small compounds and ensure that the statement in the figure caption aligns with the information presented in the main text.

      We apologize for the error in the previous Fig. 5A and in its legend. The figure was created by superimposing the protein component of the poses for the two compounds, AGR1.135 and AGR1.137, rather than the compounds themselves. As panel 5A was confusing, we have modified all Fig. 5 in the revised manuscript to improve clarity.

      (31) 4. Volume Analysis and Distances:

      -Provide details on how the volume analysis was computed and how distances were accounted for. Consider adding a figure to illustrate these analyses, aiding reader comprehension.

      The cleft search and analysis were performed using the default settings of SURFNET (Laskowski R.A. J. Mol. Graph. 1995) included in the PDBsum server (Laskowski R.A. et al. Trends Biochem. Sci. 1997). The first run of the input model for CXCR4 3ODU identified a promising cleft of 870 Å³ in the lower half of the region flanked by TMV and TMVI, highlighting this area as a possible small molecule binding site (Fig. I, only for review purposes). Analysis of the cleft occupied by AGR1.135 showed two independent cavities of 434 Å³ and 1381 Å³ that were not connected to the orthosteric site. The same procedure for AGR1.137 revealed two distinct clefts of 790 Å³ and 580 Å³, respectively (Fig. I, only for review purposes). Analysis of the atomic distances between the protein residues and the compounds was performed using the PISA server (Krissinel E. & Henrick K. J. Mol. Biol. 2007). (Please see our response to point 3 and the corresponding figure).

      (32) 5. Mutant Selection and Relevance:

      -Clarify the rationale behind selecting the CXCR4 mutants used in the study. Consider justifying the choice and exploring the possibility of performing an alanine (ALA) scan for a more comprehensive mutational analysis.  

      The selection of the residues to be mutated along the cleft was first based on their presence in the proposed cleft and the direct interaction of the compounds with them, either by hydrogen bonding or by hydrophobic interactions. Secondly, none of the mutated residues belonged to the critical residues involved in transmitting the signal generated by the interaction of CXCL12 with the receptor. In any case, mutants producing a non-functional CXCR4 at the cell membrane were discarded after FACS analysis and chemotaxis experiments. Finally, the length and nature of the resulting mutations were designed mainly to occlude the cleft by introducing long residues such as lysines (I204K, L208K) or to alter hydrophobic interactions by changing the carbon side chain composition of the residues in the cleft. We agree that an alanine scan would have been an alternative strategy to evaluate the residues involved in the interactions of the compounds.

      (33) Reevaluate the statement regarding the relevance of the Y256F mutation for the binding of AGR1.137. If there is a significant impact on migration in the mutant (Fig. 6B), elaborate on the significance in the context of AGR1.137 binding.

      In the revised discussion we provide more detail on the relevance of Y256F mutation for the binding of AGR1.137 as well as for the partial effect of G207I and R235L mutations. The predicted interactions for each compound are depicted in new Fig. 6 C, D after LigPlot+ analysis (Laskowski R.A. & Swindells M.B. J. Chem. Inf. Model. 2011), showing that AGR1.135 interacted directly with the receptor through a hydrogen bond with Y256. When this residue was mutated to F, one of the anchor points for the compound was lost, weakening the potential interaction in the region of the upper anchor point.

      It is not clear how the Y256F mutation will affect the binding of AGR1.137, but other potential contacts cannot be ruled out since that portion of the compound is identical in both AGR1.135 and AGR1.137. This is especially true for its neighboring residues in the alpha helix, F249, L208, as shown in 3ODU structure (Fig. 6D), which are shown to be directly implicated in the interaction of both compounds. Alternatively, we cannot discard that Y256 interacts with other TMs or lipids stabilizing the overall structure, which could reverse the effect of the mutant at a later stage (Author response image 3).

      Author response image 3.

      Cartoon representation of Y256 and its intramolecular interactions in the CXCR4 Xray solved structure 3ODU. TMV helix is colored in blue and TMVI in pink.

      (34) Address the apparent discrepancy in residue involvement between AGR1.135 and AGR1.137, particularly if they share the same binding mode in the same cleft.

      AGR1.135 and AGR1.137 exhibit comparable yet distinct binding modes, engaging with CXCR4 within a molecular cavity formed by TMV and TMVI. AGR1.135 binds to CXCR4 through three hydrogen bonds, two on the apical side of the compound that interact with residues TMV-G207 and TMVI-Y256 and one on the basal side that interacts with TMVI-R235 (Fig. 5A). This results in a more extended and rigid conformation when sharing hydrogen bonds, with both TMs occupying a surface area of 400 Å² and a length of 20 Å in the cleft between TMV and TMVI (Supplementary Fig. 8A). AGR1.137 exhibits a distinct binding profile, interacting with a more internal region of the receptor. This interaction involves the formation of a hydrogen bond with TMIII-V124, which induces a conformational shift in the TMVI helix towards an active conformation (Fig. 5B; Supplementary Fig. 13). Moreover, AGR1.137 may utilize the carboxyl group of V124 in TMIII and overlap with AGR1.135 binding in the cavity, interacting with the other 19 residues dispersed between TMV and TMVI to create an interaction surface of 370 Å² along 20 Å (Supplementary Fig. 8B). This is illustrated in the new Fig. 5B. AGR1.137 lacks the phenyl ring present in AGR1.135, resulting in a shorter compound with greater difficulty in reaching the lower part of TMVI where R235 sits.

      Author response image 4.

      AGR1.135 and AGR1.137 interaction with TMV and TMVI.  The model shows the location of the compounds within the TMV-VI cleft, illustrated by a ribbon and stick representation. The CXCR4 segments of TMV and TMVI are represented in blue and pink ribbons respectively, and side chains for some of the residues defining the cavity are shown in sticks. AGR1.135 and AGR1.137 are shown in stick representation with carbon in yellow, nitrogen in blue, oxygen in red, and fluorine in green. Hydrogen bonds are indicated by dashed black lines, while hydrophobic interactions are shown in green. The figure reproduces the panels A, B of Fig. 5 in the revised manuscript.

      (35) In the Results section regarding "AGR1.137 treatment in a zebrafish xenograft model", the following points can be refined for clarity and completeness: 1. Cell Line Choice for Zebrafish Xenograft Model:

      -Explain the rationale behind the choice of HeLa cells for the zebrafish xenograft model when the previous experiments primarily focused on Jurkat cells. Address any specific biological or experimental considerations that influenced this decision.

      As far as we know, there are no available models of tumors in zebrafish using Jurkat cells. We looked for a tumoral cell system that expresses CXCR4 and could be transplanted into zebrafish. HeLa cells are derived from a human cervical tumor, express a functional CXCR4, and have been previously used for tumorigenesis analyses in zebrafish (Brown H.K. et al. Expert Opin. Drug Discover. 2017; You Y. et al Front. Pharmacol. 2020). These cells grow in the fish and disseminate through the ventral area and can be used to determine primary tumor growth and metastasis. Nonetheless, we first analyzed in vitro the expression of a functional CXCR4 in these cells (Supplementary Fig. 10A), whether AGR1.137 treatment specifically abrogated CXCL12-mediated directed cell migration (Fig. 7A, B), and whether it affected cell proliferation (Supplementary Fig. 10B). As HeLa cells reproduce the in vitro effects detected for the compounds in Jurkat cells, we used this model in zebrafish. These issues were already discussed in the first version of our manuscript.

      (36) 2. Toxicity Assessment in Zebrafish Embryos: 

      -Clarify the basis for stating that AGR1.137 is not toxic to zebrafish embryos. Consider referencing the Zebrafish Embryo Acute Toxicity Test (ZFET) and provide relevant data on lethal concentration (LC50) and non-lethal toxic phenotypes such as pericardial edema, head and tail necrosis, malformation, brain hemorrhage, or yolk sac edema.

      Tumor growth and metastasis kinetics within the zebrafish model have been extensively evaluated in many publications (White R. et al. Nat. Rev. Cancer. 2013; Astell K.R. and Sieger D. Cold Spring Harb. Perspect. Med. 2020; Chen X. et al. Front. Cell Dev. Biol. 2021; Weiss J.M. et al. eLife 2022; Lindhal G. et al NPJ Precis. Oncol. 2024). Our previous experience using this model shows that tumors start to show more pronounced proliferation and a lower degree of apoptosis from day 4 onwards, but we cannot keep the tumor-bearing larvae for that long for ethical reasons and also because we do not see much scientific benefit in unnecessarily extending the experiments. Anti-proliferative or pro-apoptotic effects of drugs can still be observed within the three days, even if this then typically appears as a larger reduction relative to controls (rather than as smaller growth, as is commonly seen in, for example, mouse tumor models). Initially, before testing the compounds, we characterized the evolution of implanted tumors in our system and how much they metastasize over time in the absence of treatment (Author response image 5).

      The in vivo experiments were planned to validate efficacious concentrations of the investigated drugs rather than to derive in vivo IC50 or other values, which require testing of multiple doses. We have, however, included an additional concentration to show concentration-dependence and therefore on-target specificity of the drugs in the revised version of the manuscript (data also being elaborated in ongoing experiments). At this stage, we believe that adding the LC50 does not provide interesting new knowledge, and it is standard to only show results from the experimental endpoint (in our case 3 days post implantation). We agree that showing these new data points strengthens the manuscript and facilitates independent evaluation and conclusions to be drawn from the presented data. We have created new graphs where datapoints for each compound dose are shown.  

      Author response image 5.

      Evolution of the tumors and metastasis over time in the absence of any treatment. HeLa cells were labeled with 8 µg/mL Fast-DiI™ oil and then implanted in the dorsal perivitelline space of 2-day-old zebrafish embryos. Tumors were imaged within 2 hours of implantation and re-imaged every 24 h for three days. Changes in tumor size were evaluated as tumor area at day 1, 2 and 3 divided by tumor area at day 0, and metastasis was evaluated as the number of cells disseminated to the caudal hematopoietic plexus at day 1, 2 and 3 divided by the number of cells at day 3.
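
      The normalization described in this legend can be illustrated with a minimal sketch. The numbers below are invented placeholders rather than measurements from the study, and the helper name fold_change is hypothetical; the code only shows the arithmetic, assuming one area value per day for a single larva.

```python
from typing import List, Sequence


def fold_change(values: Sequence[float], reference: float) -> List[float]:
    """Divide each value by a reference value (the legend's normalization)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return [v / reference for v in values]


# Hypothetical tumor areas (arbitrary units) at days 0, 1, 2 and 3.
tumor_area = [1200.0, 1100.0, 950.0, 900.0]
# Tumor size at days 1-3 relative to day 0.
relative_tumor_size = fold_change(tumor_area[1:], tumor_area[0])

# Hypothetical counts of disseminated cells at days 1, 2 and 3.
disseminated = [4, 9, 12]
# Metastasis at days 1-3 relative to the day-3 count, as stated in the legend.
relative_metastasis = fold_change(disseminated, disseminated[-1])
```

      A value below 1 for relative tumor size corresponds to a tumor smaller than at implantation.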

      Regarding the statement that AGR1.137 was not toxic, this was based on visual inspection of the zebrafish larvae at the end of the experiment, which also revealed a lack of drug-related mortality in these experiments. There are a number of differences in how our experiment was run compared with the standardized ZFET. ZFET evaluates toxicity from 0 hours post-fertilization to 1 or 2 days post-fertilization, whereas here we exposed zebrafish from 2 days post-fertilization to 5 days post-fertilization. The ZFET furthermore requires that the embryos are raised at 26ºC, whereas we kept the temperature as close as possible to a physiologically relevant temperature for the tumor cells (36ºC). In the ZFET, embryos are incubated in 96-well plates, whereas for our studies we required larger wells to be able to manipulate the larvae and avoid well edge-related imaging artefacts, and we therefore used 24-well plates. As such, the ZFET was for various reasons not applicable to our experimental settings. Because our focus was on efficacy rather than on rigorously determining the LD50 or other toxicity-related measurements, and we found that the targeted dose was tolerated, we did not evaluate multiple doses, including lethal doses of the drug, and are therefore not able to determine an LD50/LC50. We also did not find drug-induced non-lethal toxic phenotypes in this study, and so we cannot elaborate further on such phenotypes other than to simply state that the drug is well tolerated at the given doses. Therefore, the reference to ZFET in the manuscript was eliminated.

      (37) If supplementary information is available, consider providing it for a comprehensive understanding of toxicity assessments. 

      The effective concentration used in the zebrafish study was derived from the in vitro experiments. That being said, and as elaborated in our response to comment 36, we have added data for one additional dose to show the dose-dependent regulation of tumor growth and metastasis. 

      (38) 3. Optimization and Development of AGR1.137: 

      -Justify the need for further optimization and development of AGR1.137 if it has a comparable effect to AMD3100. Explain the specific advantages or improvements that AGR1.137 may offer over AMD3100. 

      AGR1.137 is highly hydrophobic and is very difficult to handle, particularly in in vivo assays; thus, for the negative allosteric modulators to be used clinically, it would be very important to increase their solubility in water. Contrastingly, AMD3100 is a water-soluble compound. Before using the zebrafish model, we performed several experiments in mice using AGR1.137, but the inhibitory results were highly variable, probably due to its hydrophobicity. We also believe that it would be important to increase the affinity of AGR1.137 for CXCR4, as the use of lower concentrations of the negative allosteric modulator would limit potential in vivo side effects of the drug. On the other hand, we are also evaluating distinct administration alternatives, including encapsulation of the compounds in different vehicles. These alternatives may also require modifications of the compounds. 

      AMD3100 is an orthosteric inhibitor and therefore blocks all the signaling cascades triggered by CXCL12. For instance, we observed that AMD3100 treatment blocked CXCL12 binding, cAMP inhibition, calcium flux, cell adhesion and cell migration (Fig. 3, Fig. 4), whereas the effects of AGR1.137 were restricted to CXCL12-mediated directed cell migration. Although AMD3100 was well tolerated by healthy volunteers in a single-dose study, it also promoted some mild and reversible events, including white blood cell count elevations and variations of urine calcium just beyond the reported normal range (Hendrix C.W. et al. Antimicrob. Agents Chemother. 2000). To treat viral infections, continuous daily dosing requirements of AMD3100 were impractical due to severe side effects including cardiac arrhythmias (De Clercq E. Front Immunol. 2015). For AMD3100 to be used clinically, it would be critical to control the timing of administration. In addition, side effects after long-term administration are a potential problem. Shorter-term usage and lower doses would be fundamental keys to its success in clinical use (Liu T.Y. et al. Exp. Hematol. Oncol. 2016). The use of a negative allosteric modulator that blocks cell migration but does not affect other signaling pathways triggered by CXCL12 would be, at least in theory, more specific and produce fewer side effects. These ideas have been incorporated into the revised discussion to reflect potential advantages or improvements that AGR1.137 may offer over AMD3100.

      (39) 4. Discrepancy in AGR1.137 and AMD3100 Effects:

      -Discuss the observed discrepancy where AGR1.137 exhibits similar effects to AMD3100 but only after 48 hours. Provide insights into the temporal dynamics of their actions and potential implications for the experimental design.

      Images and data shown in Fig. 7E, F correspond to days 0 and 3 after HeLa cell implantation (tumorigenesis) and only to day 3 in the case of metastasis data. The revised version contains the effect of two distinct doses of the compounds (10 and 50 µM, for AGR1.135 and AGR1.137 and 1 and 10 µM for AMD3100). 

      (40) In the "Discussion" section, there are several points that require clarification and refinement to enhance the overall coherence and depth of the analysis: 1. Reduction of Side-Effects:

      -Provide a more detailed explanation of how the identified compounds, specifically AGR1.135 and AGR1.137, contribute to the reduction of side effects. Consider discussing specific mechanisms or characteristics that differentiate these compounds from existing antagonists.

      The sentence indicating that AGR1.135 and AGR1.137 contribute to reducing side effects is entirely speculative, as we have no experimental evidence to support it. We have therefore corrected this in the revised version. The origin of the sentence was that orthosteric antagonists typically bind to the same site as the endogenous ligand, thus blocking its interaction with the receptor. Therefore, orthosteric inhibitors (e.g., AMD3100) block all signaling cascades triggered by the ligand and therefore their functional consequences. However, the compounds described in this project are essentially negative allosteric modulators, that is, they bind to a site distinct from the orthosteric site, inducing a conformational change in the receptor that does not alter the binding of the endogenous ligand, and therefore block some specific receptor-associated functions without altering others. We observed that AGR1.137 blocked receptor oligomerization and directed cell migration, whereas CXCL12 still bound CXCR4, triggered calcium mobilization, inhibited cAMP release and promoted receptor internalization. This is why we speculated on the limitation of side effects. The statements have nonetheless been revised in the new version of the manuscript.

      (41) 2. Binding Site Clarification:

      -Address the apparent discrepancy between docking the small compounds in a narrow cleft formed by TMV and TMVI helices and the statement that AGR1.131 binds elsewhere. Clarify the rationale behind this assertion

      After the in silico screening, a total of 40 compounds were selected. These compounds showed distinct degrees of interaction with the cleft formed by TMV and TMVI and even with other potential interaction sites on CXCR4, with the exception of the ligand binding site according to the data described by Wescott et al. (PNAS 2016 113:9928-9933), as this possibility was discarded in the initial approach of the in silico screening. According to PELE analysis, AGR1.131 was one of the 40 selected compounds that showed a pose with low binding energy, -39.8 kcal/mol, between TMV and TMVI helices, that is, it might interact with CXCR4 through the selected area for the screening. It nonetheless also showed a best pose placed between helices TMI and TMVII, -43.7 kcal/mol. In any case, the compound was included in the biological screening, where it was unable to impact CXCL12-mediated chemotaxis (Fig. 1B). We then focused on AGR1.135 and AGR1.137, as they showed a higher inhibitory effect on CXCL12-mediated migration, and on AGR1.131 as an internal negative control. AGR1.131 has a skeleton very similar to the other compounds (Fig. 1C) and can interact with the TM domains of CXCR4 without promoting effects. None of the three compounds affected CXCL12 binding, CXCL12-mediated inhibition of cAMP release, or receptor internalization. However, whereas AGR1.135 and AGR1.137 blocked CXCL12-mediated CXCR4 oligomerization and directed cell migration towards CXCL12 gradients, AGR1.131 had no effect in these experiments (Fig. 3, Fig. 4).

      Next, we performed additional theoretical calculations (PELE, docking, MD) to inspect in detail the potential binding modes of active and inactive molecules. Based on these additional calculations, we identified that whereas AGR1.135 and AGR1.137 showed preferential binding to the molecular pocket between TMV and TMVI, the best pose for AGR1.131 was located between TMI and TMVII, as the initial experiments indicated. These observations and data have been clarified in the revised discussion.

      (42) 3. Impact of Chemical Modifications:

      -Discuss the consequences of the distinct chemical groups in AGR1.135, AGR1.137, and AGR1.131, specifically addressing how variations in amine length and chemical nature may influence binding affinity and biological activity. Provide insights into the potential effects of these modifications on cellular responses and the observed outcomes in zebrafish. 

      The main difference between AGR1.131 and the other two compounds is the higher flexibility of AGR1.131 due to the additional CH2 linker, together with the lack of a piperazine ring. The additional CH2 linking the phenyl ring increases the flexibility of AGR1.131 when compared with AGR1.135 and AGR1.137, and the absence of the piperazine ring might be responsible for its lack of activity, as it makes this compound able to bind to CXCR4 (Fig. 1C).

      AGR1.137 was chosen in a second round. The additional presence of the tertiary amine (in the piperazine ring) allows the formation of quaternary ammonium salts in the aqueous medium and its substituents to increase its solubility (Fig 1C). This characteristic might be related to the absence of toxic effects of the compound in the zebrafish model.

      (43) 4. Existence of Distinct CXCR4 Conformational States: 

      -Provide more detailed support for the statement suggesting the "existence of distinct CXCR4 conformational states" responsible for activating different signaling pathways. Consider referencing relevant studies or experiments that support this claim.

      Classical models of GPCR allostery and activation, which describe an equilibrium between a single inactive and a single signaling-competent active conformation, cannot account for the complex pharmacology of these receptors. The emerging view is that GPCRs are highly dynamic proteins, and ligands with varying pharmacological properties differentially modulate the balance between multiple conformations.

      Just as a single photograph from one angle cannot capture all aspects of an object in movement, no one biophysical method can visualize all aspects of GPCR activation. In general, there is a tradeoff between high-resolution information on the entire protein and dynamic information on limited regions. In the former category, crystal and cryo-electron microscopy (cryoEM) structures have provided comprehensive, atomic-resolution snapshots of scores of GPCRs both in inactive and active conformations, revealing conserved conformational changes associated with activation. However, different GPCRs vary considerably in the magnitude and nature of the conformational changes in the orthosteric ligand-binding site following agonist binding (Venkatakrishnan A.J.V. et al. Nature 2016). Spectroscopic and computational approaches provide complementary information, highlighting the role of conformational dynamics in GPCR activation (Latorraca N.R.V. et al. Chem. Rev. 2017). In the absence of agonists, the receptor population is typically dominated by conformations closely related to those observed in inactive-state crystal structures (Manglik A. et al. Cell 2015). While agonist binding drives the receptor population towards conformations similar to those in active-state structures, a mixture of inactive and active conformations remains, reflecting “loose” or incomplete allosteric coupling between the orthosteric and transducer pockets (Dror R.O. et al. Proc. Natl. Acad. Sci. USA 2011). Surprisingly, for some GPCRs, and under some experimental conditions, a substantial fraction of unliganded receptors already reside in an active-like conformation, which may be related to their level of basal or constitutive signaling (Staus D.P. et al. J. Biol. Chem. 2019; Ye L. et al. Nature 2016). In our case, the negative allosteric modulators did not alter ligand binding and had only minor effects on specific CXCL12-mediated functions such as inhibition of cAMP release or receptor internalization, among others, but failed to regulate CXCL12-mediated actin dynamics and receptor oligomerization. Collectively, these data suggest that the described compounds alter the active conformation of CXCR4 and therefore support the presence of distinct receptor conformations that explain a partial activation of the signaling cascade.

      All these observations are now included in the revised discussion of the manuscript.

      (44) 5. Equilibrium Shift and Allosteric Ligands: 

      -Clarify the statement about "allosteric ligands shifting the equilibrium to favor a particular receptor conformation". Support this suggestion with references or experimental evidence

      In a previous answer (see our response to point 2), we explain why we define the compounds as negative allosteric modulators. These compounds do not bind the orthosteric binding site or a site distinct from the orthosteric site that alters the ligand-binding site. Their effect should be due to changes in the active conformation of CXCR4, which allow some signaling events whereas others are blocked. Our functional data thus support that, through the same receptor, the compounds separate distinct receptor-mediated signaling cascades; that is, our data suggest that CXCR4 has conformational heterogeneity. It is known that GPCRs exhibit more than one “inactive” and “active” conformation, and the endogenous agonists stabilize a mixture of multiple conformations. Biased ligands or allosteric modulators can achieve their distinctive signaling profiles by modulating this distribution of receptor conformations (Wingler L.M. & Lefkowitz R.J. Trends Cell Biol. 2020). For instance, some analogs of angiotensin II do not appreciably activate Gq signaling (e.g., increases in IP3 and Ca2+) but still induce receptor phosphorylation, internalization, and mitogen-activated protein kinase (MAPK) signaling (Wei H. et al. Proc. Natl. Acad. Sci. USA 2003). Some of these ligands activate Gi and G12 in bioluminescence resonance energy transfer (BRET) experiments (Namkung Y. et al. Sci. Signal. 2018). A similar observation was described in the case of CCR5, where some chemokine analogs promoted G protein subtype-specific signaling bias (Lorenzen E. et al. Sci. Signal. 2018). Structural analyses of distinct GPCRs in the presence of different ligands show that they vary considerably in the magnitude and nature of the conformational changes in the orthosteric ligand-binding site following agonist binding (Venkatakrishnan A.J.V. et al. Nature 2016). Yet, these changes modify conserved motifs in the interior of the receptor core and induce common conformational changes in the intracellular site involved in signal transduction. That is, these modifications might be considered distinct receptor conformations.

      The revised discussion contains some of these interpretations to support our statement about the stabilization of a particular receptor conformation triggered by the negative allosteric modulators. 

      (45) 6. Refinement of Binding Mode: 

      -Clarify the workflow for obtaining the binding mode, particularly the role of GLIDE and PELE. Clearly explain how these software tools were used in tandem to refine the binding mode. 

      The computational sequential workflow applied in this project included, i) Protein model construction, ii) Virtual screening (Glide), iii) PELE, iv) Docking (AutoDock and Glide) and v) Molecular Dynamics (AMBER).

      Glide was applied for the structure-based virtual screening to explore which compounds could fit and interact with the previously selected binding site.

      After the identification of theoretically active compounds (modulators of CXCR4), additional calculations were done to identify a potential binding site. PELE was used for this purpose, to study how the compounds could bind on the whole surface of the target (TMV-TMVI). By applying PELE, we avoided biasing the calculation, and we found that the trajectories with better interaction energies identified the cleft between TMV and TMVI as the binding site for AGR1.135 and AGR1.137, and not for AGR1.131. AGR1.131 showed a pose with low binding energy, -39.8 kcal/mol, between TMV and TMVI helices, that is, it might interact with CXCR4 in the selected area for the screening. But it also showed a better pose placed between helices TMI and TMVII, -43.7 kcal/mol (see our response to point 41). These data have now been confirmed using Schrödinger’s MM-GBSA procedure (see our response to points 6 and 8). In any case, the compound was included in the biological screening, where it was unable to affect CXCL12-mediated chemotaxis (Fig. 1B). Docking and MD simulations were then performed to study and refine the specific binding mode in this cavity. These data were important to choose the CXCR4 mutations required to test whether the compounds reversed their behavior. In these experiments we also confirmed that AGR1.131 had a better pose on the TMI-TMVII region.

      (46) 7. Impact of Compound Differences on CXCR4-F249L mutant: 

      -Provide visual aids, such as figures, and additional experiments to support the statement about differences in the behavior of AGR1.135 and AGR1.137 on cells expressing CXCR4-F249L mutant. Elaborate on the closer interaction suggested between the triazole group of AGR1.137 and the F249 residue

      At the reviewer’s suggestion, Fig. 5 has been modified to incorporate a closer view of the interactions identified, and new panels in new Fig. 6 have been added to show in detail the effect of the mutations selected on the structure of the cleft between TMV and TMVI. The main difference between AGR1.135 and AGR1.137 is how the triazole group interacts with F249 and L216 (Author response image 6). In AGR1.137, the three groups are aligned in a parallel organization, which appears to be more effective; this might be due to a better adaptation of this compound to the cleft, since there is only one hydrogen bond with V124. In AGR1.135, the compound interacts with the phenyl ring of F249 and has a stronger interaction at the apical edge to stabilize its position in the cleft. However, there is still an additional interaction present. When changing F249 to L (Fig. VIIA, B, only for review purposes) and showing the two most likely rotamers resulting from the mutation, it is observed that rotamer B is in close proximity to the compound, which may cause the binding to either displace or adopt an alternative conformation that is easier to bind into the cleft. As previously mentioned, it is likely that AGR1.135 can displace the mutant rotamer and bind into the cleft more easily due to its higher affinity.

      Author response image 6.

      Cartoon representation of the interaction of CXCR4 F249L mutant with AGR1.135 (A) and AGR1.137 (B). The two most probable conformations of leucine rotamers are represented in cyan (A and B conformations). Van der Waals interactions are depicted in blue cyan dashed lines, hydrogen bonds in black dashed lines. CXCR4 segments of TMV and TMVI are colored in blue and pink, respectively.

      (47) In the "Materials and Methods" section, the computational approach for the "discovery of CXCR4 modulators" requires significant revision and clarification. The following suggestions aim to address the identified issues: 1. Structural Modeling: 

      -Reconsider the use of SWISS-MODEL if there is an available PDB code for the entire CXCR4 structure. Clearly articulate the rationale for choosing one method over the other and explain any limitations associated with the selected approach. 

      The SWISS-MODEL server allows for automated comparative modeling of 3D protein structures and pioneered the field of automated modeling. At the time we started this project, it was the most accurate method to generate reliable 3D protein structure models.

      As explained above, we have now predicted the structure of the target using AlphaFold (Jumper J. et al, Nature 2021) and performed several additional experiments that confirm that the small compounds bind the selected pocket as the original strategy indicated (see our response to point 6). (Fig. II, only for review purposes).

      (48) 2. Parametrization of Small Compounds: 

      -Provide a detailed description of the parametrization process for the small compounds used in the study. Specify the force field and parameters employed, considering the obsolescence of AMBER14 and ff14SB. Consider adopting more contemporary force fields and parameterization strategies. 

      When we performed these experiments, some years ago, the force fields applied (ff14SB, AMBER14 used in MD or OPLS2004 in docking with Glide) were well accepted and were gold standards. It is, however, true that the force fields have evolved in the past few years. Moreover, in the case of the MD simulations, to consider the parameters of the ligands that are not contained within the force field, we performed an additional parameterization as a standard methodology. We first performed an ab initio optimization of the ligand geometry at the B3LYP/6-311+G(d) level using Gaussian 09, Revision A.02, followed by a single-point energy calculation of ESP charges at the HF/6-311+G(d) level on the optimized structure. As the last step of the parametrization, the antechamber module was used to adapt these charges and additional parameters for MD simulations.

      (49) 3. Treatment of Lipids and Membrane: 

      -Elaborate on how lipids were treated in the system. Clearly describe whether a membrane was included in the simulations and provide details on its composition and structure. Address the role of the membrane in the study and its relevance to the interactions between CXCR4 and small compounds 

      To stabilize CXCR4 and more accurately reproduce the real environment in the MD simulation, the system was embedded in a lipid bilayer using the Membrane Builder tool (Sunhwan J. et al. Biophys. J. 2009) from the CHARMM-GUI server. The membrane was composed of 175 molecules of the phospholipid 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) in each leaflet. The protein-membrane complex was solvated with TIP3 water molecules. Chloride ions were added up to a concentration of 0.15 M in water, and sodium ions were added to neutralize the system. This information was previously described in detail.
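
      As an illustration of the ion bookkeeping implied above, the following sketch estimates how many chloride ions correspond to 0.15 M for a given number of water molecules and how many sodium ions then neutralize the system. The water count and the net solute charge are invented placeholders, the function names are hypothetical, and the molarity of pure water is taken as approximately 55.5 mol/L; this is not the procedure CHARMM-GUI runs internally.

```python
WATER_MOLARITY = 55.5  # mol/L, approximate value for pure water (assumption)


def ions_for_concentration(n_water: int, molar: float) -> int:
    """Estimate the ion count that gives `molar` mol/L in `n_water` waters."""
    return round(molar * n_water / WATER_MOLARITY)


def sodium_to_neutralize(n_chloride: int, solute_charge: int) -> int:
    """Sodium ions needed so that solute + Cl- + Na+ carries zero net charge."""
    n_sodium = n_chloride - solute_charge
    if n_sodium < 0:
        raise ValueError("solute charge exceeds chloride count; add more Cl-")
    return n_sodium


# Hypothetical system: 20,000 water molecules and a net solute charge of +9.
n_cl = ions_for_concentration(20_000, 0.15)
n_na = sodium_to_neutralize(n_cl, solute_charge=9)
total_charge = 9 - n_cl + n_na
```

      With these placeholder inputs the total charge evaluates to zero, which is the condition the neutralization step is meant to satisfy.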

      (50) 4. Molecular Dynamics Protocol: 

      -Provide a more detailed and coherent explanation of the molecular dynamics protocol. Clarify the specific steps, parameters, and conditions used in the simulations. Ensure that the protocol aligns with established best practices in the field.

      Simulations were run on an Asus 1151 h170 LVX-GTX-980Ti workstation, with an Intel Core i7-6500 K Processor (12 M Cache, 3.40 GHz) and 16 GB DDR4 2133 MHz RAM, equipped with a Nvidia GeForce GTX 980Ti available for GPU (Graphics Processing Unit) computations. MD simulations were performed using AMBER14 (Case D.A. et al. AMBER 14, Univ. of California, San Francisco, USA, 2014) with ff14SB (Maier J.A. et al. J. Chem. Theory Comput. 2015) and lipid14 (Dickson C. J. et al. J. Chem. Theory Comput. 2014) force fields in the NPT thermodynamic ensemble (constant pressure and temperature). Minimization was performed using 3500 Steepest Descent steps and 4500 Conjugate Gradient steps three times, firstly considering only hydrogens, next considering only water molecules and ions, and finally minimizing all atoms. During equilibration, the system temperature was raised from 0 to 300 K at constant volume, with everything except ions and water molecules held fixed. After thermalization, several density equilibration phases were performed. In the production phase, 50 ns MD simulations without position restraints were calculated using a time step of 2 fs. Trajectories of the most interesting poses were extended to 150 ns. All bonds involving hydrogen atoms were constrained with the SHAKE algorithm (Lippert R.A. et al. J. Chem. Phys. 2007). A cutoff of 8 Å was used for the Lennard-Jones interaction and the short-range electrostatic interactions. Berendsen barostat (Berendsen H.J. et al. J. Chem. Phys. 1984) and Langevin thermostat were used to regulate the system pressure and temperature, respectively. All trajectories were processed using CPPTRAJ (Roe D.R. & Cheatham III T.E. J. Chem. Theory Comput. 2013) and visualized with VMD (Visual Molecular Dynamics) (Humphrey W. et al. J. Mol. Graphics. 1996). To reduce the complexity of the data, Principal Component Analysis (PCA) was performed on the trajectories using CPPTRAJ.
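
      For readers who wish to translate the stated simulation lengths into integration steps, a short sketch of the arithmetic follows. It uses only the figures given above (2 fs time step, 50 ns and 150 ns trajectories, three minimization rounds of 3500 + 4500 steps); the function name md_steps is hypothetical and the code does not reproduce any AMBER input file.

```python
def md_steps(length_ns: float, timestep_fs: float) -> int:
    """Number of integration steps for a trajectory of `length_ns` nanoseconds."""
    femtoseconds_per_ns = 1_000_000
    return round(length_ns * femtoseconds_per_ns / timestep_fs)


# Production runs described in the protocol.
production_steps = md_steps(50, 2.0)   # 50 ns at 2 fs per step
extended_steps = md_steps(150, 2.0)    # 150 ns at 2 fs per step

# Minimization: 3500 Steepest Descent + 4500 Conjugate Gradient, three rounds.
minimization_steps = (3500 + 4500) * 3
```

      Under these settings the 50 ns and 150 ns runs correspond to 25 million and 75 million steps, respectively.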

      (51) Consider updating the molecular dynamics protocol to incorporate more contemporary methodologies, considering advancements in simulation techniques and software.

      In our answers to points 6 and 47, we describe why we used the technology based on SWISS-MODEL and PELE analysis and how we have now used AlphaFold and other more contemporary methodologies to confirm that the small compounds bind the selected pocket.

      (52) Figure 1A: 

      •  Consider switching to a cavity representation for CXCL12 to enhance clarity and emphasize the cleft.

      Fig. 1A has been modified to emphasize the cleft.

      (53) Explicitly show the TMV-TMVI cleft in the figure for a more comprehensive visualization. 

      In Fig. 1A we have added an insert to facilitate TMV-TMVI visualization.

      (54) Figure 1B: 

      •  Clearly explain the meaning of the second DMSO barplot to avoid confusion. 

      To clarify this panel, we have modified the figure and the figure legend. Panel B now includes a complete titration of the three compounds analyzed in the manuscript.  The first bar shows cell migration in the absence of both treatment with AMD3100 and stimulation with CXCL12.  The second bar shows migration in response to CXCL12 in the absence of AMD3100. The third bar shows the effect of AMD3100 on CXCL12-induced migration, as a known control of inhibition of migration.  We hope that this new representation of the data results is clearer.

      (55) Figure 1C: 

      •  Provide a clear legend explaining the significance of the green shading on the small compounds. 

      The legend for Fig. 1C has been modified accordingly to the reviewer’s suggestion.

      (56) Figure 2: 

      •  Elaborate on the role of fibronectin in the experiment and explain the specific contribution of CD86-AcGFP.

      The ideal situation for TIRF-M determinations is to employ cells on a physiological substrate complemented with or without chemokines. Fibronectin is a substrate widely used in different studies that allows cell adhesion, mimicking a physiological situation. Jurkat cells express alpha4beta1 and alpha5beta1 integrins that mediate adhesion to fibronectin (Seminario M.C. et al. J. Leuk. Biol. 1999).

      Regarding the use of CD86-AcGFP in TIRF-M experiments, we determine the number of receptors in individual trajectories of CXCR4 using, as a reference, the MSI value of CD86-AcGFP that strictly showed a single photobleaching step (Dorsch S. et al. Nat. Methods 2009).

      We preferred to use CD86-AcGFP in cells instead of AcGFP on glass, to exclude any potential effect on the different photodynamics exhibited by AcGFP when bound directly to glass. In any case, this issue has been clarified in the revised version.
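
      The calibration idea described above can be sketched as follows, assuming that the receptor number of a particle is estimated as its MSI divided by the mean MSI of the monomeric CD86-AcGFP reference. This ratio is our illustrative assumption about the calculation, the intensity values are invented placeholders, and the function name is hypothetical.

```python
from statistics import mean
from typing import Sequence


def receptors_per_particle(particle_msi: float,
                           monomer_msi: Sequence[float]) -> float:
    """Estimate receptor number as particle MSI over the monomer reference MSI."""
    reference = mean(monomer_msi)
    if reference <= 0:
        raise ValueError("monomer reference MSI must be positive")
    return particle_msi / reference


# Hypothetical MSI values of CD86-AcGFP spots with a single photobleaching step.
cd86_reference = [980.0, 1020.0, 1000.0]
# Hypothetical MSI of one CXCR4-AcGFP trajectory.
estimate = receptors_per_particle(2100.0, cd86_reference)
```

      With these placeholder values the particle would be estimated to contain about two receptors.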

      (57) Figure 3D: 

      •  Include a plot for the respective band intensity to enhance data presentation 

      The plot showing the band intensity analysis of the experiments shown in Fig. 3D was already included in the original version (see old Supplementary Fig. 3). However, in the revised version, we include these plots in the same figure as panels 3E and 3F.  As a control of inhibition of CXCL12 stimulation, we have also included a new figure (Supplementary Fig. 4) showing the effect of AMD3100 on CXCL12-induced activation of Akt and ERK as analyzed by western blot.

      (58) Consider adding AMD3100 as a control for comparison. 

      In agreement with the reviewer’s suggestion, we have added the effect of AMD3100 in most of the functional experiments performed.

      (59) Figure 4: 

      •  Address the lack of positive controls in Figure 4 and consider their inclusion for a more comprehensive analysis. 

      DMSO bars correspond to the control of the experiment, as they represent the effect of CXCL12 in the absence of any allosteric modulator. As previously described in this point-by-point reply, DMSO bars correspond to the control performed with the solvent with which the small compounds, at maximum concentration, are diluted.  Therefore, they show the effect of the solvent on CXCL12 responses. In any case, and in order to facilitate the comprehension of the figure we have also added the controls in the absence of DMSO to demonstrate that the solvent does not affect CXCL12-mediated functions, together with the effect of the orthosteric inhibitor AMD3100. In addition, we have also included representative images of the effect of the different compounds on CXCL12-induced polarization (Fig. 4C).

      (60) In Figure 4A, carefully assess overlapping error bars and ensure accurate interpretation. If necessary, consider alternative representation. 

      We have tried alternative representations of the data in Fig. 4A, but in all cases the figure was unclear. We believe that the representation used in the original manuscript is the clearest and most appropriate. Nevertheless, we have now included significance values as a table annexed to the figure, as well as the effect of AMD3100 as a control of inhibition.

      (61) Supplementary Figure 1A: 

      •  Improve the clarity of bar plots for better understanding. Consider reordering them from the most significant to the least. 

      This was a good idea, and therefore Supplementary Fig. 1A has been reorganized to improve clarity.

      (62) Supplementary Figure 1C: 

      •  Clarify the rationale behind choosing the 12.5 nM concentration and explain if different concentrations of CXCL12 were tested. 

      In old Supplementary Fig. 1C, we used untreated cells, that is, CXCL12 was not present in the assay.  These experiments were performed to test the potential toxicity of DMSO (solvent) or the negative allosteric modulators on Jurkat cells. The 12.5 nM concentration of CXCL12 mentioned in the figure legend applied only to panels A and B, as indicated in the figure legend. We previously optimized this concentration for Jurkat cells using different concentrations of CXCL12 between 5 and 100 nM.  Nevertheless, we have reorganized old supplementary fig. 1 and clarified the figure legend to avoid misinterpretations (see Supplementary Fig 1A, B and Supplementary Fig. 2A, B).

      (63) Explain the observed reduction in fluorescence intensity for AGR1.135. 

      The cell cycle analysis has been moved from Supplementary Fig. 1C to a new Supplementary Fig. 2.  It now includes the flow cytometry panels to show fluorescence intensity as a function of the number of cells analyzed (Panel 1A) as well as a table (panel B) with the percentage of cells in each phase of the cell cycle. We believe that the apparent reduction in fluorescence that the reviewer observes is mainly due to the number of events analyzed. However, we have changed the flow cytometry panels for others that are more representative and included a table with the mean of the different results. When we determined the percentage of cells in each cell cycle phase, we observed that it looks very similar in all the experimental conditions. That is, none of the compounds affected any of the cell cycle phases. We have also included the effect of H2O2 and staurosporine as control compounds inducing cell death and cell cycle alteration of Jurkat cells.

      (64) Supplementary Table 1: 

      •  Include a column specifying the scoring for each compound to provide a clear reference for readers. 

      To provide a clear reference for readers, we have now included the inhibitory effect of each compound on Jurkat cell migration in the revised version of this table. 

      (65) Minor Points 

      Page 2 - Abstract: Rephrase the first sentence of the abstract to enhance fluidity. 

      Although the entire manuscript was revised by a professional English editor, we appreciate the valuable comments of this reviewer and we have corrected these issues accordingly.

      (66) Page 2 - Abstract: Explicitly define "CXCR4" as "C-X-C chemokine receptor type 4" the first time it appears.

      We have not used C-X-C chemokine receptor type 4 the first time it appears in the abstract. CXCR4 is an acronym normally accepted to identify this chemokine receptor, and it is used as CXCR4 in many articles published in eLife. However, we introduce the complete name the first time it appears in the introduction.

      (67) Page 2 - Abstract: Explicitly define "CXCL12" as "C-X-C motif chemokine 12" the first time it is mentioned. 

      As we have discussed in the previous response, we have not used C-X-C motif chemokine 12 the first time CXCL12 appears in the abstract, as it is a general acronym normally accepted to identify this specific chemokine, even in eLife papers. However, we introduce the complete name the first time it appears in the introduction section.

      (68) Page 2 - Abstract: Explicitly define "TMV and TMVI" upon its first mention.

      The acronym TM has been defined as “Transmembrane” in the revised version.

      (69) Page 2 - Abstract: Review the use of "in silico" in the sentence for accuracy and consider revising if necessary.

      With the term “in silico” we want to refer to those experiments performed on a computer or via computer simulation software. We have carefully reviewed its use in the new version of the manuscript.

      (70) Page 2 - Abstract: Add a comma after "compound" in the sentence, "We identified AGR1.137, a small compound that abolishes...".

      A comma after “compound” has been added in the revised sentence.

      (71) Page 2 - Significance Statement: Rephrase the first sentence of the "Significance Statement" to avoid duplication with the abstract.

      The first sentence of the Significance Statement has been revised to avoid duplication with the abstract. 

      (72) Page 2 - Significance Statement: Break down the lengthy sentence, "Here, we performed in silico analyses..." for better readability. 

      The sentence starting with “Here, we performed in silico analyses…” has been broken down in the revised manuscript.

      (73) Page 2 - Introduction: Replace "Murine studies" with a more specific term for clarity.

      The term “murine studies” is normally used to refer to experimental studies developed in mice. We have nonetheless rephrased the sentence.

      (74) Page 3 - Introduction: Rephrase the sentence for clarity: "Finally, using a zebrafish model, ..."

      The sentence has now been rephrased for clarity.

      (75) Results-AGR1.135 and AGR1.137 block CXCL12-mediated CXCR4 nanoclustering and dynamics: 

      Rephrase the sentence for clarity: "Retreatment with AGR1.135 and AGR1.137, but not with AGR1.131, substantially impaired CXCL12-mediated receptor nanoclustering.”

      The sentence has been rephrased for clarity.

      (76) Results - AGR1.135 and AGR1.137 incompletely abolish CXCR4-mediated responses in Jurkat cells: Clarify the sentence: "In contrast to the effect promoted by AMD3100, a binding-site antagonist of CXCR4..."

      The sentence has been modified for clarity.

      (77) Consider using "orthosteric" instead of "binding-site" antagonist.

      The term orthosteric is now used throughout to refer to a binding site antagonist.

      (78) Discussion: Use the term "in silico" only when necessary.

      We have carefully reviewed the use of “in silico” in the manuscript.

      (79) Discussion: Clarify the sentence: "...not affect neither CXCR2-mediated cell migration...". Confirm if "CXCL12" is intended.

      The sentence refers to the chemokine receptor CXCR2, which binds the chemokine CXCL2. To test the specificity of the compounds for the CXCL12/CXCR4 axis, we evaluated CXCL2-mediated cell migration.  The results indicated that CXCL2/CXCR2 axis was not affected by the negative allosteric modulators, whereas CXCL12-mediated cell migration was blocked.  The sentence has been clarified in the new version of the manuscript.

      (80) Figure 4B: Bold the "B" in the figure label for consistency.

      The “B” in Fig. 4B has been bolded.

      Reviewer #2

      (1) Fig 2. The SPT data is sub-optimal in its presentation as well as analysis. Example images should be shown. The analysis and visualization of the data should be reconsidered for improvements. Graphs with several hundreds, in some conditions over 1000 tracks, per condition are very hard to compare. The same (randomly selected representative set) number of data points should be shown for better visualization. Also, more thorough analyses like MSD or autocorrelation functions are lacking - they would allow enhanced overall representation of the data.

      In agreement with the reviewer’s comment, we have modified the representation of Fig. 2. We have carefully read the paper published by Lord et al. (Lord S. J. et al., J. Cell Biol. 2020) and have applied their recommendations for this type of data. We have also included, as supplementary material, representative videos of the TIRF-M experiments to allow readers to visualize the original images. Regarding the MSD analyses, they were performed to determine all D1-4 values. According to Manzo & García-Parajo (Manzo C. & García-Parajo M.F. Rep. Prog. Phys. 2015), due to the finite trajectory length, the MSD curve at large tlag has poor statistics and deviates from linearity. However, the diffusion coefficient (D1-4) can be estimated by fitting the short-tlag region of the MSD plot, which gives a more accurate idea of the behavior of the particles. Accordingly, we show D1-4 values and not MSD data. 
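      As an illustration only, a short-lag diffusion coefficient of this kind can be sketched as follows. The sketch assumes two-dimensional motion, a time-averaged MSD, and an ordinary least-squares line through the first four time lags, with D1-4 taken as slope/4; the frame interval, the simulated trajectory and the function names are hypothetical and do not reproduce the authors' analysis software.

```python
import random


def msd(track, lag):
    """Time-averaged mean square displacement of a 2D track at an integer frame lag."""
    disp = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for (x1, y1), (x2, y2) in zip(track, track[lag:])
    ]
    return sum(disp) / len(disp)


def d_1_4(track, dt):
    """Short-lag diffusion coefficient from a linear fit of MSD over lags 1 to 4.

    For 2D Brownian motion MSD(t) = 4*D*t (plus a constant offset from
    localization error), so D is the fitted slope divided by 4.
    """
    if len(track) < 5:
        raise ValueError("need at least 5 positions to evaluate lags 1 to 4")
    times = [lag * dt for lag in range(1, 5)]
    values = [msd(track, lag) for lag in range(1, 5)]
    t_mean = sum(times) / len(times)
    v_mean = sum(values) / len(values)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / sum(
        (t - t_mean) ** 2 for t in times
    )
    return slope / 4.0


# Hypothetical test case: simulated free diffusion with a known D.
rng = random.Random(0)
D_TRUE = 0.1  # um^2/s, hypothetical
DT = 0.1      # s per frame, hypothetical
sigma = (2.0 * D_TRUE * DT) ** 0.5  # per-axis step standard deviation
track = [(0.0, 0.0)]
for _ in range(2000):
    x, y = track[-1]
    track.append((x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)))

estimate = d_1_4(track, DT)
print(f"D1-4 estimate: {estimate:.3f} um^2/s (simulated with D = {D_TRUE})")
```

      Restricting the fit to the first few lags follows the reasoning given above: each short lag is averaged over many displacement pairs, whereas large lags rely on few pairs and become noisy.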

      Due to the space restrictions, it is very difficult to include all the figures generated, but, only for review purposes, we included in this point-by-point reply some representative plots of the MSD values as a function of the time from individual trajectories showing different types of motion obtained in our experiments (Author response image 7).

      Author response image 7.

      Representative MSD plots from individual trajectories of CXCR4-AcGFP showing different types of motion: A) confined, B) Brownian/Free, C) direct transport of CXCR4-AcGFP particles diffusing at the cell membrane detected by SPT-TIRF in resting JKCD4 cells.

      Further analysis, such as the classification based on particle motion, has not been included in this article. This classification uses the moment scaling spectrum (MSS), described by Ewers H. et al. 2005 PNAS, and requires particles with longer trajectories (>50 frames). Only for review purposes, we include a figure showing the percentage of the MSS-based particle motion classification for each condition. As expected, most long trajectories are confined, with a slight increase in the percentage upon CXCL12 stimulation in all conditions, except in cells treated with AGR1.137 (Author response image 8).

      Author response image 8.

      Effects of the negative allosteric modulators on the types of motion of CXCR4. Percentage of single trajectories with different types of motion, classified by MSS (DMSO: 58 particles in 59 cells on FN; 314 in 63 cells on FN+CXCL12; AGR1.131: 102 particles in 71 cells on FN; 258 in 69 cells on FN+CXCL12; AGR1.135: 86 particles in 70 cells on FN; 120 in 77 cells on FN+CXCL12; AGR1.137: 47 particles in 66 cells on FN; 74 in 64 cells on FN+CXCL12) n = 3.

      (2) Fig 3. The figure legends have inadequate information on concentrations and incubation times used, both for the compounds and other treatments like CXCL12 and forskolin. For the Western blot data, the quantification should also be added to the main figure. The compounds, particularly AGR1.137, seem to lead to augmented stimulation of pAKT and pERK. This should be discussed.

      The Fig. 3 legend has been corrected in the revised manuscript. Fig. 3D now contains representative western blots and the densitometry evaluation of these experiments. As the reviewer indicates, we also detected in the western blot included, augmented stimulation of pAKT and pERK in cells treated with AGR1.137. However, as shown in the densitometry analysis, no significant differences were noted between the data obtained with each compound. As a control of inhibition of CXCL12 stimulation we have included a new Supplementary Fig. 4 showing the effect of AMD3100 on CXCL12-induced activation of Akt and ERK as analyzed by western blot.

      (3) Fig. 4 immunofluorescence data on polarization as well as the flow chamber data lack the representative images of the data. The information on the source of the T cells is missing. Not clear if this experiment was done on bilayers or on static surfaces.

      Representative images for the data shown in Figure 4B have been added in the revised figure (Fig. 4C). The experiments in Fig. 4B were performed on static surfaces. As indicated in the material and methods section, primary T cell blasts were added to fibronectin-coated glass slides and were then stimulated or not with CXCL12 (5 min at 37°C) prior to fixing, permeabilizing and staining them with phalloidin. Primary T cell blasts were generated from PBMCs isolated from buffy coats that were activated in vitro with IL-2 and PHA as indicated in the material and methods section.

      (4) The data largely lacks titration of different concentrations of the compounds. How were the effective concentration and treatment times determined? What happens at higher concentrations? It is important to show, for instance, if the CXCL12 binding gets inhibited at higher concentrations. Most experiments were performed with 50 µM, but HeLa cell data with 100 µM. Why and how was this determined? 

      The revised version contains a new panel in Fig. 1B to show a more detailed kinetic analysis with different concentrations (1-100 µM) of the compounds in the migration experiments using Jurkat cells. We chose 50 µM for further studies as it was the concentration that inhibited 50-75% of the ligand-induced cell migration. 
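      As an illustration only, the selection criterion stated above (a concentration giving 50-75% inhibition of ligand-induced migration) can be sketched as follows. The sketch assumes that inhibition is computed relative to the vehicle (DMSO) control after subtracting basal migration; the function name and every number are hypothetical and are not the data of Fig. 1B.

```python
def percent_inhibition(migrated_treated, migrated_vehicle, migrated_basal=0.0):
    """Percent inhibition of ligand-induced migration relative to the vehicle control."""
    induced_vehicle = migrated_vehicle - migrated_basal
    if induced_vehicle <= 0:
        raise ValueError("vehicle control shows no ligand-induced migration")
    induced_treated = migrated_treated - migrated_basal
    return 100.0 * (1.0 - induced_treated / induced_vehicle)


# Hypothetical dose-response: concentration (uM) -> migrated cells (% of vehicle).
dose_response = {1: 95.0, 10: 70.0, 50: 40.0, 100: 30.0}
vehicle = 100.0

inhibition = {c: percent_inhibition(m, vehicle) for c, m in dose_response.items()}
# Concentrations whose inhibition falls inside the 50-75% window.
in_window = [c for c, pct in inhibition.items() if 50.0 <= pct <= 75.0]
print(inhibition)
print("concentrations in the 50-75% window:", in_window)
```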

      We have also included the effect of two doses of the compounds (10 and 50 µM) in the zebrafish model as well as AMD3100 (1 and 10 µM) as control (new Fig. 7D, E). Tumors were imaged within 2 hours of implantation and tumor-bearing embryos were treated with either vehicle (DMSO) alone, AGR1.131 or AGR1.137 at 10 and 50 µM or AMD3100 at 1 and 10 µM for three days, followed by re-imaging.

      Regarding the amount of CXCL12 used in these experiments, with the exception of cell migration assays in Transwells, where the optimal concentration was established at 12.5 nM, in all the other experiments the optimal concentration of CXCL12 employed was 50 nM. In the case of the directional cell migration assays, we use 100 nM to create the chemokine gradient in the device. These concentrations have been optimized in previous works of our laboratory using these types of experiments. It should also be noted that in the experiments using lipid bilayers or TIRF-M experiments, CXCL12 is used to coat the plates and therefore it is difficult to determine the real concentration that is retained on the surface after the washing steps performed prior to adding the cells.

      (5) The authors state that they could not detect direct binding of the compounds and CXCR4. It should be reported what approaches were tried and discussed why this was not possible. 

      We attempted a fluorescence spectroscopy strategy to formally prove the ability of AGR1.135 to bind CXCR4, but this strategy failed because the compound has a yellow color that interfered with the determinations. We also tried a FRET strategy (see supplementary Fig. 7) and detected a significant increase in FRET efficiency of CXCR4 homodimers in cells treated with AGR1.135; this effect was due to the yellow color of this compound, which interferes with FRET determinations. In the same assays, AGR1.137 did not modify FRET efficiency for CXCR4 homodimers and therefore we cannot assume that AGR1.137 binds to CXCR4. All these data have been considered in the revised discussion.

      (6) The proliferation data in Supplementary Figure 1 lacks controls that affect proliferation and indication of different cell cycle stages. What is the conclusion of this data? More information on the effects of the drug to cell viability would be important.

      Toxicity in Jurkat cells was first determined by propidium iodide incorporation. Some compounds (i.e., AGR1.103 and VSP3.1) were discarded from further analysis as they were toxic for cells. In a deeper analysis of cell toxicity, even if these compounds did not kill the cells, we checked whether they could alter the cell cycle of the cells. New Supplementary Fig. 2 includes a table (panel B) with the percentage of cells in each cell cycle phase, and no differences between any of the treatments tested were detected. 

      Nevertheless, to clarify this issue the revised version of the figure also includes H2O2 and staurosporine stimuli to induce cell death and cell cycle alterations as controls of these assays.

      (7) The flow data in Supplementary Figure 2 should be statistically analysed. 

      Bar graphs corresponding to the old Supplementary Fig. 2 (new Supplementary Fig. 3) are shown in Fig. 3B. We have also incorporated the corresponding statistical analysis to this figure. 

      (8) In general, the authors should revise the figure legends to ensure that critical details are added. 

      We have carefully revised all the figure legends in the new version of the manuscript.

      (9) Bar plots are very poor in showing the heterogeneity of the data. Individual data points should be shown whenever feasible. Superplot-type of representation is strongly advised (https://doi.org/10.1083/jcb.202001064).

      We have carefully read the paper published by Lord et al. (Lord S. J. et al., J. Cell Biol. 2020) and have applied their recommendations to our TIRF-M data (see revised Fig. 2).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript by Liu et al explores the role of the UPR and immune regulators in the evaluation of nutritional quality in C. elegans. They identify neuronal UPR activation and the MAPK PMK-1 as key responders to low food quality. In particular, the data suggest that these pathways are activated by low levels of vitamin C synthesis that result from the low sugar levels present in heat-killed E. coli.

      Strengths:

      The results are intriguing and expand our understanding both of physiological food evaluation systems, and of the known roles of stress response pathways in organismal physiology. The authors use a range of techniques, encompassing imaging, metabolomic analysis, gene expression analysis, and behavioural assays, to support their claims.

      Thank you for your thorough review and acknowledgment of the strengths of our study.

      Weaknesses:

      There is limited mechanistic analysis in the study. In particular, how does low vitamin C trigger UPR activation? This is an intriguing finding that, if followed up, could potentially reveal a novel mechanism of UPR activation. In addition, how is the activation of the PMK-1 pathway driven by/coordinated with UPR activation? The data in some figures is not as convincing as it could be: the magnitude of the effect size is small in the supplementation experiments, and the statistical tests used are not always appropriate to enable multiple comparisons.

      (1) There is limited mechanistic analysis in the study. In particular, how does low vitamin C trigger UPR activation? This is an intriguing finding that, if followed up, could potentially reveal a novel mechanism of UPR activation. 

      Thank you for highlighting the need for further mechanistic analysis in our study. We appreciate the opportunity to clarify the process by which low vitamin C triggers UPR activation.

      Our investigation revealed that the vitamin C content in heat-killed E. coli (HK-E. coli) is comparable to that of live E. coli or HK-yfbR mutant E. coli (Figure 4-figure supplement 1A), indicating that the induction of unfolded protein response (UPR) in C. elegans by HK-E. coli is not solely attributed to low vitamin C levels but rather involves other unidentified factors.

      Through metabolomic analysis, we observed significant decreases in sugar levels, including lactose, D-(+)-sucrose, and D-(+)-glucose, in HK-E. coli (Figure 3B, Table S1). Notably, supplementing D-(+)-glucose effectively inhibited UPRER, immune response, and avoidance behavior induced by HK-E. coli (Figure 3E-H). These findings suggest that the deficiency in sugars in HK-E. coli triggers a stress response and avoidance behavior in animals, which can be alleviated by D-(+)-glucose supplementation.

      Furthermore, when comparing heat-killed E. coli mutant yfbR (HK-yfbR) to HK-E. coli, we observed significantly higher sugar levels, including lactose and D-(+)-sucrose, in HK-yfbR (Figure 3B). This was accompanied by reduced UPRER in animals feeding on HK-yfbR (Figure 3-figure supplement 1B), indicating that higher sugar levels may inhibit the induction of UPRER by low-quality food.

      Considering that the synthesis of vitamin C (VC) occurs through the glucuronate pathway, utilizing D-glucose as a precursor [1, 2] (Figure 4A), we investigated whether the vitamin C biosynthesis pathway is involved in evaluating low-quality food using D-glucose. Contrary to our initial hypothesis, animals fed live E. coli did not exhibit higher glucose levels compared to those fed low-quality food (HK-E. coli). Our results indicate that animals maintain similar VC levels when fed ideal food (live E. coli) compared to low-quality food (HK-E. coli) (Figure 4B), suggesting that animals do not stimulate VC biosynthesis under favorable food conditions. However, supplementation of D-GlcA or E. coli-yfbR mutation in HK-E. coli significantly improved VC levels when animals were fed low-quality food (HK-OP50) (Figure 4B, 4C). Moreover, VC or D-glucuronate (D-GlcA) supplementation inhibited HK-E. coli-induced UPRER (Figure 4D), indicating that glucose boosts the animal's ability to adapt to unfavorable food environments by increasing VC levels, thereby inhibiting UPRER, but not under favorable food conditions.

      These findings shed light on the complex interplay between vitamin C, sugar levels, and UPR activation, providing valuable insights into the mechanisms underlying food evaluation and stress response pathways in organisms.

      Overall, we are grateful for the reviewer's constructive feedback, which motivates us to continue our efforts to understand how the UPR response contributes to the complexities of food evaluation and behavioral responses in organisms.

      (2) In addition, how is the activation of the PMK-1 pathway driven by/coordinated with UPR activation?

      Thank you for your insightful inquiry. In our discussion section, we have addressed this question by integrating new data and discussion to provide insights into the coordination between PMK-1 pathway activation and UPR activation.

      Previous studies have demonstrated that activating innate immunity, specifically the PMK-1 MAPK pathway, results in a reduction in translation [3], as well as a shutdown of food digestion in animals [4], likely aimed at reducing protein translation and cellular metabolism. To further investigate this relationship, we measured the translation level of animals fed with heat-killed E. coli (HK-E. coli) and found a significant reduction in total translation ability in these animals (Figure 5-figure supplement 1D). This observation suggests that activating innate immunity through the PMK-1 MAPK pathway may serve as a mechanism to slow down translation progress, thereby alleviating the pressure on the unfolded protein response (UPR) and preventing excessive UPRER activation.

      By integrating these findings, we propose a model wherein activation of the PMK-1 pathway coordinates with UPR activation to regulate translation and cellular metabolism in response to low-quality food. This coordinated response likely serves to maintain cellular homeostasis and prevent detrimental effects associated with excessive UPRER activation.

      These insights contribute to our understanding of the intricate interplay between innate immunity, cellular stress responses, and metabolic regulation in organisms facing nutritional challenges.

      (3) The data in some figures is not as convincing as it could be: the magnitude of the effect size is small in the supplementation experiments, and the statistical tests used are not always appropriate to enable multiple comparisons.

      We appreciate the reviewers' concerns regarding the data presentation and statistical analyses in some of our figures. In response to this feedback, we have made revisions to improve the robustness and clarity of our statistical methods.

      All statistical analyses were conducted using GraphPad Prism 8.0 software. Specifically, a two-tailed unpaired t-test was employed for the statistical analysis of two groups of samples, while one-way or two-way ANOVA was utilized for the statistical analysis of more than two groups of samples. These adjustments ensure appropriate statistical comparisons and enhance the reliability of our findings.
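      As an illustration only, the two test statistics named above can be sketched with the Python standard library. The sketch assumes equal variances for the unpaired t-test and computes the statistics only (no p-values and no post hoc multiple-comparison correction); it is not the GraphPad Prism procedure, and the data values are hypothetical.

```python
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's t statistic (pooled variance) for two independent groups."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1.0 / na + 1.0 / nb)) ** 0.5


def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical measurements for three conditions.
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
rescued = [1.5, 2.5, 3.5]

print("t (control vs treated):", unpaired_t(control, treated))
print("F (three groups):", one_way_anova_f(control, treated, rescued))
```

      With exactly two groups the ANOVA F statistic equals the square of the pooled-variance t statistic, so the two tests agree; with more than two groups, a single ANOVA avoids running many uncorrected pairwise t-tests.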

      Reviewer #2 (Public Review):

      Summary:

      In this work, the authors aim to better understand how C. elegans detects and responds to heat-killed (HK) E. coli, a low-quality food. They find that HK food activates two canonical stress pathways, ER-UPR, and innate immunity, in the nervous system to promote food aversion. Through the creative use of E. coli genetics and metabolomics, the authors provide evidence that the altered carbohydrate content of HK food is the trigger for the activation of these stress responses and that supplementation of HK food with sugars (or their biosynthetic product, vitamin C), reduces stress pathway induction and food avoidance. This work makes a valuable addition to the literature on metabolite detection as a mechanism for the evaluation of nutritional value; it also provides some new insight into the physiologically relevant roles of well-known stress pathways in modulating behavior.

      Strengths:

      -The work addresses an important question by focusing on understanding how the nervous system evaluates food quality and couples this with behavioral change.

      -The work takes full advantage of the tools available in this powerful system and builds on extensive previous studies on feeding behavior and stress responses in C. elegans.

      -Creative use of E. coli genetics and metabolite profiling enabled the identification of carbohydrate metabolism as a candidate source of food-quality signals.

      -For the most part, the studies are rigorous and logically designed, providing good support for the authors' model.

      We deeply appreciate the reviewer's insightful assessment of our study's strengths. 

      Weaknesses:

      -It is not clear how the mechanism identified here is connected to previously described, related processes. In particular, it is not clear whether this mechanism has a role in the detection of other low-quality foods. Further, the specificity of the ability of sugar/vitamin C to suppress stress pathway induction is unclear (i.e., does sugar/vitamin C have any effect on the activation of these pathways through other means?). Additionally, the relationship of this pathway to the vitamin B2-sensing mechanism previously described by the senior author is unclear. These issues do not weaken confidence in the authors' conclusions, but they do reduce the potential significance of the work.

      (1) In particular, it is not clear whether this mechanism has a role in the detection of other low-quality foods. 

      Thank you for your valuable feedback. In response to your inquiry, we investigated whether the UPRER (IRE-1/XBP-1) - Innate immunity (PMK-1/p38 MAPK) axis is specific to evaluating low-quality food (HK-E. coli) or if it plays a broader role in food detection.

      We conducted behavioral assays using N2, pmk-1, and xbp-1 mutant animals fed with normal E. coli food, inedible food (Saprophytic staphylococci) [4], and pathogenic food (Pseudomonas aeruginosa-PA14) [5]. We found that N2, pmk-1, and xbp-1 mutant worms did not exhibit avoidance behavior when presented with normal food (OP50). However, both N2 and xbp-1 mutant worms were able to escape from the inedible food, Saprophytic staphylococci (N2 worms were predominantly found on the border areas of the bacterial lawn, and xbp-1 mutant worms on the border and inside the lawn), whereas pmk-1 mutant worms did not exhibit this avoidance behavior. Notably, N2 and xbp-1 mutant worms exhibited even more pronounced avoidance behavior when exposed to Pseudomonas aeruginosa, whereas pmk-1 mutant worms were more susceptible to infection by this pathogen (Figure 2-figure supplement 2C). These findings suggest that the UPR-Immunity pathway plays a crucial role in helping animals avoid low-quality food (HK-E. coli) by triggering an avoidance response. In contrast, the Innate immunity pathway, mediated by PMK-1/p38 MAPK, appears to play a key role in evaluating unfavorable food sources, such as HK-E. coli, Saprophytic staphylococci, and Pseudomonas aeruginosa, and helping animals avoid these environments.

      (2) Further, the specificity of the ability of sugar/vitamin C to suppress stress pathway induction is unclear (i.e., does sugar/vitamin C have any effect on the activation of these pathways through other means?). 

      Thank you for your inquiry regarding the specificity of the ability of sugar/vitamin C to suppress stress pathway induction. We aimed to address this question by investigating whether high levels of VC inhibit other stress-induced UPRER pathways.

      Previous studies have shown that both Tunicamycin [6] and pathogenic bacteria, such as Pseudomonas aeruginosa-PA14 [5], induce UPRER in C. elegans. In response to your query, we conducted experiments to examine whether VC supplementation inhibits UPRER induced by these stressors. Our findings indicate that VC supplementation does not inhibit UPRER induced by either Tunicamycin or PA14 (Author response image 1).

      These results suggest that while sugar/vitamin C may suppress stress pathway induction in the context of low-quality food, its effects may not extend to other stressors that induce UPRER through different mechanisms. This insight helps clarify the specificity of sugar/vitamin C's role in modulating stress pathway activation, contributing to a better understanding of the broader regulatory networks involved in stress response in C. elegans.

      Author response image 1.

      VC supplementation does not inhibit Tunicamycin or PA14-induced UPRER.

      (3) Additionally, the relationship of this pathway to the vitamin B2-sensing mechanism previously described by the senior author is unclear.

      In response to your comment, we would like to clarify the relationship of our pathway to the previously described vitamin B2-sensing mechanism we found. Previous studies have demonstrated that heat-killed E. coli (HK-E. coli) serves as a low-quality food source incapable of supporting the growth of C. elegans larvae, whereas supplementation with vitamin B2 (VB2) can restore animal growth [7].

      This study investigates the role of sugar deficiency in HK-E. coli, which induces the UPRER-immune response and avoidance behavior in C. elegans. Surprisingly, our findings indicate that supplementing HK-E. coli with carbohydrates such as D-Glc and D-GlcA does not promote animal development (Figure 3-figure supplement 2G), suggesting that carbohydrates are not essential for supporting animal growth on this food source. However, we did observe that carbohydrates play a critical role in inhibiting the UPRER-immune response induced by sugar deficiency in HK-E. coli.

      -The authors claim that the induction of the innate immune pathway reporter irg-5::GFP is "abolished" in pmk-1(RNAi) animals, but Figure S2K seems to show a clear GFP signal when these animals are fed HK-OP50. Similarly, the claim that feeding WT animals HK-OP50 enriches phospho-PMK-1 levels (Fig 2E) is unconvincing - only one western blot is shown, with no quantification, and there is a smear in the critical first lane.

      (1) The authors claim that the induction of the innate immune pathway reporter irg-5::GFP is "abolished" in pmk-1(RNAi) animals, but Figure S2K seems to show a clear GFP signal when these animals are fed HK-OP50. 

      We sincerely appreciate the reviewer's attention. To address this concern, we have replaced the images with higher resolution, larger ones in Figure 2-figure supplement 1-I. These updated images provide a clearer representation of the data, ensuring that all details are readily visible and enabling a more accurate interpretation of the results.

      (2) Similarly, the claim that feeding WT animals HK-OP50 enriches phospho-PMK-1 levels (Fig 2E) is unconvincing - only one western blot is shown, with no quantification, and there is a smear in the critical first lane.

      Thank you. Following the reviewer’s suggestion, we repeated some of the western blots. We have replaced Figure 2E and quantified the relative intensity of pPMK-1/tubulin. We also provide the uncropped western blot images as source data (“raw-data WB” file).

      -The rationales for some of the paper's hypotheses could be improved. For example, the rationale for screening the E. coli mutant library is that some mutants, when heat-killed, may be missing a metabolite that induces the ER-UPR. A more straightforward hypothesis might be that some mutant E. coli strains aberrantly induce the ER-UPR when *not* heat-killed, because they are missing a metabolite that prevents stress pathway induction. This is not in itself a major concern, but it would be useful for the authors to provide a rationale for their hypothesis.

      Thank you for the insightful suggestion. We acknowledge the importance of providing a clear rationale for our hypotheses in the paper. In response to this feedback, we have enhanced the discussion section to better elucidate the rationale behind our hypotheses.

      One limitation of our study is the lack of explanation for why HK-E. coli activates UPRER and immunity. We hypothesized that when heat-killed, HK-E. coli may lack or contain altered levels of certain metabolites that either activate or inhibit UPRER and immunity, respectively. Additionally, we speculated that E. coli mutants killed by heat may lack metabolites that activate UPRER and immunity, or conversely, have increased levels of metabolites that inhibit these pathways.

      Fortunately, our investigation led to the discovery of the E. coli mutant yfbR, which inhibits UPRER and immunity by increasing carbohydrates that aid in resisting these stress pathways. Moving forward, we intend to further explore the intricate relationship between HK-E. coli and UPRER-immunity. This will be a key focus of our future research efforts.

      -The authors do not provide any explanation for some unexpected results from the E. coli screen. Earlier in the paper, the authors found that innate immune signaling is downstream of ER-UPR activation. However, of the 20 E. coli mutants that, when heat-killed, "did not induce... the UPR-ER reporter," 9 of them still activate the innate immune response. This seems at odds with the authors' simple model since it suggests that low-quality food can induce innate immune signaling independently of the ER-UPR. Further, only one of the 9 has an effect on behavior, even though failure to activate the innate immune pathway might be expected to lead to a behavioral defect in all of these.

      Thank you for your understanding, and we apologize for any confusion caused by our earlier statement. To provide clarification, our study revealed that out of the 20 E. coli mutants examined, none activated the UPRER. Among these mutants, 9 did not induce immunity, and interestingly, one out of these 9 mutants demonstrated the ability to inhibit avoidance behavior.

      This diversity in phenotypic outcomes can be attributed to the varied metabolites present in different E. coli mutants. To thoroughly evaluate the effects of these mutants, we conducted a comprehensive three-step screening process, utilizing UPRER marker, immunity marker, and avoidance behavior assays.

      Through this rigorous approach, we identified the E. coli mutant, yfbR, which exhibited the desired inhibitory effects on UPRER, immunity, and avoidance behavior.

      Subsequently, we conducted a metabolomics analysis of various food qualities (HK-K12, HK-yfbR, and Live-K12). Our findings revealed higher sugar levels in HK-yfbR and Live-K12 compared to HK-K12 (Figure 3B, Figure 3-figure supplement 2A, and Table S1), indicating that sugar deficiency might trigger the UPRER, immunity responses, and subsequent avoidance behavior.

      -In a number of places, the writing style can make the authors' arguments difficult to follow.

      Thanks for the reviewer’s efforts. We have corrected all of these errors and polished the language of the paper.

      -Some of the effect sizes observed by the authors are exceedingly small (e.g, the suppression of hsp-4::gfp induction by sugar supplementation in Figs 3C-E), raising some concern about the biological significance of the effect.

      Thank you for your feedback. In response to your concern, we have included additional clarification in the manuscript.

      We have added the following statement: “While sugar effectively inhibits the HK-E. coli-induced UPRER and immune response, it does not fully suppress it to the extent observed with live-E. coli (Figure 3C-F). This implies that additional nutrients present in live-E. coli might also contribute to the inhibition of UPRER and immune response.”

      This addition helps to address the observation that some effect sizes appear small, providing context and suggesting potential factors that may influence the outcomes. 

      -In some cases, there is a discrepancy between the fluorescence images and their quantitation (e.g., Figure 3E, where the effect of glucose on GFP fluorescence seems much stronger in the image than in the graph).

      Thank you for your valuable suggestion. In response, we have revised our image selection process to ensure impartiality. We now randomly select images to ensure they accurately represent the quantified data without bias. More details regarding this update can be found in Author response image 2.

      Author response image 2.

      Additional original images corresponding to Figure 3E.

      Reviewer #3 (Public Review):

      Summary:

      Animals can evaluate food quality in many ways. In contrast to the rapid sensory evaluation with smell and taste, the mechanism of slow nutrient sensation and its impact on food choice is unexplored. The authors utilize C. elegans larvae and their bacterial food as an elegant model to tackle this question and reveal the detailed molecular mechanism to avoid nutrient-poor foods.

      Strengths:

      The strength of this study is that they identified the molecular identities of the critical players in bacterial food and C. elegans using unbiased approaches, namely metabolome analysis, E. coli mutant screening, and RNA sequencing. Furthermore, they strengthen their findings by thorough experiments combining multiple methods such as genetics, fluorescent reporter analysis, and Western blot.

      Thank you for highlighting the strengths of our study. 

      Weaknesses:

      The major caveat of this study is the reporter genes. The transcriptional reporters were used to monitor the UPRER and immune responses in the intestine of C. elegans.

      However, their tissue-specific rescue experiments suggest that the genes in the UPRER and immune response function in the neurons. Thus, we should carefully interpret the results of the reporter genes.

      Thank you for your insightful comment. We appreciate the opportunity to address your concerns regarding the interpretation of our reporter gene data.

      Upon reevaluation, we observed strong induction of the UPRER reporter (Phsp-4::GFP) (8) and immunity reporter (Pirg-5::GFP) (9) both in the intestine (Figure 1F-G) and in neurons (Figure 1-figure supplement 2A) in response to feeding unfavorable food (HK-E. coli). This suggests that both the UPRER and immune pathways may indeed respond to low-quality food (HK-E. coli) in multiple tissues of C. elegans. While we acknowledge that our tissue-specific rescue experiments suggest a role for these pathways in neurons, the intestinal fluorescence of Phsp-4::GFP or Pirg-5::GFP is easily observable and scorable. Therefore, we chose to focus our further analyses on the intestine for practical reasons.

      Overall, this work provides convincing data to support their model. In the C. elegans field, the behaviors of larvae are not well studied compared to adults. This work will pose an interesting question about the difference between larvae and adults in nutrition sensing in C. elegans and provide a framework and candidate molecules to be studied in other organisms.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Major suggestions:

      (1) My major overall comment is that the paper would be substantially strengthened by more mechanistic analysis. In particular, how does low vitamin C trigger UPR activation? This is an intriguing finding and it would be important to see it more fully explored.  

      Our study revealed that the vitamin C content in HK-E. coli is comparable to that of live E. coli or HK-yfbR (Figure 4-figure supplement 1A), suggesting that the induction of unfolded protein response (UPR) in C. elegans by HK-E. coli is not attributed to low vitamin C levels, but rather to unknown factors.

      Metabolomic analysis showed that the sugar levels, including lactose, D-(+)-sucrose, and D-(+)-glucose, were significantly decreased in HK-E. coli (Figure 3B, Table S1).

      Furthermore, we found that supplementing D-(+)-glucose effectively inhibited UPRER (Figure 3E), immune response (Figure 3F, 3G, and Figure 3-figure supplement 2D), and avoidance behavior (Figure 3H) induced by HK-E. coli. Our findings suggest that the deficiency in sugars in HK-E. coli triggers a stress response and avoidance behavior in animals, which can be alleviated by D-(+)-glucose supplementation.

      Notably, when E. coli was heat-killed, we observed that the sugar levels, including lactose and D-(+)-sucrose, were significantly higher in the heat-killed E. coli mutant yfbR (HK-yfbR) compared to HK-E. coli (Figure 3B). Moreover, we found that UPRER was reduced in animals feeding HK-yfbR (Figure 3-figure supplement 1B), indicating that higher sugar levels may inhibit the induction of UPRER by low-quality food.

      The synthesis of vitamin C (VC) occurs through the glucuronate pathway, utilizing D-glucose as a precursor (1, 2) (Figure 4A). This led us to investigate whether the vitamin C biosynthesis pathway uses D-glucose to evaluate low-quality food. In this study, we found that animals feeding on live E. coli, which should produce more VC, exhibit higher glucose levels. However, our results show that animals maintain similar VC levels when fed ideal food (live E. coli) compared to low-quality food (HK-E. coli) (Figure 4B), suggesting that animals do not stimulate VC biosynthesis under favorable food conditions. In contrast, when animals are fed low-quality food (HK-OP50), we found that D-GlcA supplementation (Figure 4C) or the E. coli yfbR mutation (Figure 4B) can increase VC levels. Moreover, we found that VC or D-glucuronate (D-GlcA) supplementation inhibited HK-E. coli-induced UPRER (Figure 4D). These data indicate that glucose boosts the animal's ability to adapt to unfavorable food environments by increasing VC levels, thereby inhibiting UPRER, but not in favorable food conditions.

      In addition, we asked whether a high level of VC inhibits UPRER induced by other stresses. Previous studies have shown that tunicamycin (6) and the pathogenic bacterium Pseudomonas aeruginosa PA14 (5) induce UPRER in C. elegans. We found that VC supplementation does not inhibit tunicamycin- or PA14-induced UPRER (Author response image 3).

      Author response image 3.

      VC supplementation does not inhibit Tunicamycin or PA14-induced UPRER.

      In addition, how is the activation of the PMK-1 pathway driven by/coordinated with UPR activation? 

      If the authors do not want to pursue these directions experimentally in this study, the discussion would be strengthened by considering these questions and identifying candidate regulatory mechanisms for further exploration.

      In this study, we found that heat-killed E. coli (HK-E. coli), a low-sugar food, triggers the cellular unfolded protein response (UPRER) and an immune response. We also demonstrated that 1) the activation of UPRER by low-quality food depends on IRE-1/XBP-1, and 2) activation of the immune response (PMK-1) is downstream of XBP-1 in responding to low-quality food.

      how is the activation of the PMK-1 pathway driven by/coordinated with UPR activation? 

      In the Discussion, we added new data and text to address the reviewer’s question.

      A previous study has shown that activating innate immunity (PMK-1 MAPK) leads to a reduction in translation (3). Our own previous research has also demonstrated that PMK-1 activation causes a shutdown of food digestion in animals (4), likely to reduce protein translation and cellular metabolism. To investigate this further, we measured translation in animals fed HK-E. coli and found that overall translation is significantly reduced in these animals (Figure 5-figure supplement 1D). This finding suggests that activating innate immunity (PMK-1 MAPK) may serve as a mechanism to slow translation, thereby alleviating the pressure on the unfolded protein response (UPR) and preventing excessive UPRER activation.

      (2) Figure 2C: The data shows that xbp-1 mutants are significantly more likely to leave heat-killed E. coli. However, no other conditions are examined. Is this avoidance defect specific to heat-killed E. coli, or is it a more general effect of xbp-1 mutants - that is, are other conditions that evoke avoidance also affected by mutation of xbp-1? Is feeding behavior on regular E. coli altered in this background? The finding would be more relevant if the authors could clarify or provide more context for their claims here.

      We then asked whether UPRER (IRE-1/XBP-1) - Innate immunity (PMK-1/p38 MAPK) axis is specific to evaluate low-quality food (HK-E. coli). We examined the avoidance behavior phenotype of wild-type and mutant L1 animals by placing them on various food conditions, including normal E. coli food, inedible food (Saprophytic staphylococci) and pathogenic food (Pseudomonas aeruginosa-PA14), for a 24-hour period. We found that N2, pmk-1, and xbp-1 mutant worms did not exhibit avoidance behavior when presented with normal food (OP50). However, both N2 and xbp-1 mutant worms were able to escape from inedible food, Saprophytic staphylococci, whereas pmk-1 mutant worms did not show this avoidance. Notably, xbp-1 mutant worms exhibited even more pronounced avoidance behavior when exposed to Pseudomonas aeruginosa, whereas pmk-1 mutant worms were more susceptible to infection by this pathogen (Figure 2-figure supplement 2C). These findings suggest that the UPR-Immunity pathway plays a crucial role in helping animals avoid low-quality food by triggering an avoidance response. In contrast, the Innate immunity pathway, which is mediated by PMK-1/p38 MAPK, appears to play a key role in evaluating unfavorable food sources, such as HK-E. coli, Saprophytic staphylococci, and Pseudomonas aeruginosa, and helping animals avoid these environments.

      (3) Figure 3C-F: The magnitude of the changes between conditions shown in these panels is small. To what extent does this supplementation represent a full rescue? The findings would be strengthened if figures/images for the control condition (non-HK E. coli) were shown for comparison to allow the reader to assess the extent to which UPR/PMK-1 activation is rescued.

      In response to a reviewer's suggestion, we included live-E. coli as a control in our study. Notably, our data revealed that the addition of lactose, D-(+)-sucrose, and D-(+)-glucose partially inhibited the HK-E. coli-induced unfolded protein response (UPRER) and immune response, suggesting that other nutrients present in live-E. coli may also play a role in inhibiting UPRER.

      We added this to the manuscript: “While sugar effectively inhibits the HK-E. coli-induced UPRER and immune response, it does not fully suppress it to the extent observed with live-E. coli (Figure 3C-F). This implies that additional nutrients present in live-E. coli might also contribute to the inhibition of UPRER and immune response.”

      (4) Figure 5B-D: The magnitude of changes shown between conditions here again appear to be very small, even those labelled as statistically significant. It is important to ensure that the correct statistical tests have been used to assess the significance of these differences (see below).

      All statistical analyses were performed in GraphPad Prism 8.0. A two-tailed unpaired t-test was used to compare two groups of samples; one-way or two-way ANOVA was used to compare more than two groups.

      (5) Methods: In the "Statistical analysis" section, the authors state that "All statistical analyses were performed using Student's t-test". However, this is not the appropriate test to use in experiments where multiple comparisons are made, which is true in several instances across the paper. In these cases, a more appropriate statistical test should be used.

      All statistical analyses were performed in GraphPad Prism 8.0. A two-tailed unpaired t-test was used to compare two groups of samples; one-way or two-way ANOVA was used to compare more than two groups.
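
As an illustration of the test choice discussed above, the sketch below computes a one-way ANOVA F statistic for three groups in plain Python. It is not the analysis used in the study, which was run in GraphPad Prism. The group names and intensity values are hypothetical placeholders chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical reporter intensities (arbitrary units), NOT data from the study.
samples = {
    "condition_A": [1.0, 2.0, 3.0],
    "condition_B": [2.0, 3.0, 4.0],
    "condition_C": [3.0, 4.0, 5.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(samples.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A single omnibus test of this kind avoids running every pairwise t-test separately, which would inflate the false-positive rate. In practice the p-value would be taken from the F distribution with a statistics package, and pairwise differences would be examined with a post hoc procedure that corrects for multiple comparisons.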

      Minor suggestions:

      (1) Figure S2: RNAi is usually delivered in a different E. coli strain, HT115. Is this the case with the RNAi knockdowns in Figure S2, and given that diet can influence UPR activation, is it possible that this different diet could change the phenotypes observed?

      This should be clarified by the authors.

      In this study, all RNAi experiments involved bleaching adult animals under RNAi strain culture conditions to obtain L1 animals. Subsequently, L1 animals were transferred to HK-E. coli OP50 for phenotype analysis. In response to a reviewer's suggestion, we observed that L1 animals obtained from mothers fed E. coli strains OP50, HT115, or K12 exhibited similar UPR induction under HK-E. coli OP50 feeding conditions (Author response image 4). These findings suggest that variations in diet did not alter the UPR phenotypes.

      Author response image 4.

      L1 animals obtained from mothers fed E. coli strains OP50, HT115, or K12 exhibited similar UPR induction under HK-E. coli OP50 feeding conditions 

      Reviewer #2 (Recommendations For The Authors):

      Line 182: "irg-5::GFP" should be "hsp-4::gfp".

      Thanks for the reviewer’s efforts. We have corrected this error.

      Reviewer #3 (Recommendations For The Authors):

      Major comments:

      (1) The reporter genes of UPRER and immune response were analyzed in the intestine throughout the study. On the other hand, their rescue experiments suggest that these pathways function in the neurons. They should provide the fluorescence data in the neurons at least for Figures 1F and 1G to confirm that the intestinal response matches the neuronal response and mention that further analyses were done in the intestine for easy scoring.

      Consistent with the results of the RNA sequencing (RNA-seq) analysis, the UPRER reporter (Phsp-4::GFP) (8) and immunity reporter (Pirg-5::GFP) (9) were strongly induced in the intestine (Figure 1F-G) and in neurons (Figure 1-figure supplement 2A) by feeding unfavorable food (HK-E. coli), suggesting that UPRER and immune pathways may respond to low-quality food (HK-E. coli). Because intestinal fluorescence (Phsp-4::GFP or Pirg-5::GFP) is easy to observe and score, further analyses were done in the intestine.

      (2) I have concerns about the interpretation of the p-PMK-1 data. Although the authors described that "p-PMK-1 is prominently increased" in the text (Line 150), it is unclear on the data (Figure 2E). Similarly, the authors' statement "p-PMK-1 is decreased in animals with D-GlcA (F).." was not fully supported by the data in Figure 4F. The experiment should be repeated and quantified. Moreover, pPMK-1 showed single bands in Figure 2E, but double bands in Figure 3G, 4F, and 4G. The authors should explain why that is the case and which band we should look at for Figures 3G, 4F, and 4G.

      Following the reviewer’s suggestion, we repeated some of the western blots. After a longer exposure, we found two bands for pPMK-1 (Figure 2E, new data; and “raw-data WB” file). The VHP-1 phosphatase is known to inhibit PMK-1 (3). In our previous study, we found that vhp-1(RNAi) treatment hyperactivates p-PMK-1 (lower band) (4). In contrast, both bands disappear in the pmk-1 mutant (Author response image 5). Thus, the lower band corresponds to pPMK-1. We have replaced Figure 2E and quantified the relative intensity of pPMK-1/tubulin. We also provide the uncropped western blot images as source data (“raw-data WB” file).

      Author response image 5.

      In our previous study, we found that vhp-1(RNAi) treatment hyperactivates p-PMK-1 (lower band) (4). In contrast, both bands disappear in the pmk-1 mutant. These images are taken from our previous study (4).

      (3) Heat-killed E. coli (HK-E. coli) is low-quality because the lack of sugar cannot support the growth of C. elegans larvae (Qi and Han, Cell, 2018). Thus, animals do not show the UPRER-immune response and avoidance when HK-E. coli is supplemented with sugars such as glucose (Line 225-227). If these sugars are the key, C. elegans larvae should be able to grow better with HK-E. coli supplemented with glucose. Authors should address this possibility.

      Previous studies have shown that heat-killed E. coli (HK-E. coli) is a low-quality food source that cannot support the growth of C. elegans larvae (7). Here, we found that sugar deficiency in HK-E. coli induces the UPRER-immune response and avoidance behavior in C. elegans. Given this, we investigated whether sugar supplementation could promote animal growth when fed HK-E. coli. To our surprise, supplementing HK-E. coli with carbohydrates (D-Glc, D-GlcA) did not support animal development (Figure 3-figure supplement 2G), suggesting that carbohydrates are not essential for supporting animal growth on this food source. However, we did find that carbohydrates are critical for inhibiting the UPRER-immune response induced by sugar deficiency in HK-E. coli.

      (4) Line 884: Instead of the Student's t-test, the ANOVA should be used for multiple comparisons.

      All statistical analyses were performed in GraphPad Prism 8.0. A two-tailed unpaired t-test was used to compare two groups of samples; one-way or two-way ANOVA was used to compare more than two groups.

      (5) Although the results are interesting and convincing, the manuscript needs some careful editing and proofreading. As far as I could catch, there are more than 100 errors and typos, as I summarized in minor comments. I recommend the authors proofread thoroughly to make this work easier to read.

      Thanks for the reviewer’s efforts. We have corrected all of these errors and polished the language of the paper.

      Minor comments:

      (1) Line 30: nature -> natural

      (2) Line 86: elegnas -> elegans

      (3) Line 93: the17h -> the 17h

      (4) Line 97: response -> respond

      (5) Line 106: responded -> respond

      (6) Line 107-109: Add references for the three reporters

      (7) Line 114: immune -> immune pathway

      (8) Line 118: immune depended -> immune-dependent

      (9) Line 128, 594, 596: deferentially -> differentially

      (10) Line 131: Explain what IRE-1-mediated splicing of xbp-1 with references

      (11) Line 170: XPB-1 -> XBP-1

      (12) Line 179: URP -> UPR

      (13) Line 181: hsp-4::GFP -> Phsp-4::GFP

      (14) Line 183: Italicize E. coli; mutant -> mutants

      (15) Line 184: irg-5::GFP -> Pirg-5::GFP (2 places)

      (16) Line 197, 203, 206, 207: Lactose -> lactose

      (17) Line 206, 209, 217, 225, 228, 232, 237, 262, 442, 445, 604, 739: Glucose -> glucose

      (18) Line 218: Sugars deficiency -> sugar deficiency

      (19) Line 229: found contribute to -> found to contribute to

      (20) Line 235, 537, 539, 587, 599, 642, 855: Italicize E. coli

      (21) Line 236: same -> the same

      (22) Line 239: I recommend adding "in C. elegans". This study uses both E. coli and C. elegans genetics. Sometimes, it is confusing which organism was mentioned. It should be applied where it is necessary.

      (23) Line 240: additional -> addition

      (24) Line 339, 642: Italicize kgb-1

      (25) Line 390: Italicize Pseudomonas aeruginosa, Bacillus thuringiensis, Staphylococcus aureus, and Serratia marcescens

      (26) Line 394: wiht -> with

      (27) Line 400, 550: Change ER to superscript; Italicize ire-1, xbp-1, and pmk-1

      (28) Line 415: xpb-1 -> xbp-1

      (29) Line 460, 525, 531, 532, 617, 655: Italicize yfbR

      (30) Line 457, 468, 472, 475, 482, 497, 513, 624, 629, 633, 733, 758: Vitamin -> vitamin

      (31) Line 459: Make it clear what is the relationship between vitamin C and TAA

      (32) Line 527: Do not italicize mutant

      (33) Line 538: Phsp-6:GFP -> Phsp-6::GFP (to match other descriptions)

      (34) Line 540: Phsp-4:GFP -> Phsp-4::GFP (to match other descriptions)

      (35) Line 540: Italicize hsp-4

      (36) Line 543: Pirg-5:GFP -> Pirg-5::GFP (to match other descriptions) and italicize irg-5

      (37) Line 550, 881: Innate -> innate

      (38) Line 557, 560, 564, 838: Do not italicize HK

      (39) Line 561: Remove the extra space before "three"

      (40) Line 575, 577: Reporter -> reporter

      (41) Line 575, 607: Italicize Phsp-4::GFP

      (42) Line 577: immunity -> Immunity; Italicize Pirg-5::GFP

      (43) Line 585, 653: keio -> Keio

      (44) Line 586: hsp-4::GFP -> Phsp-4::GFP

      (45) Line 586, 589 (2 places): irg-5::GFP -> Pirg-5::GFP

      (46) Line 597: Remove "all"

      (47) Line 600: Trehalose -> trehalose

      (48) Line 609: Italicize Pirg-5::GFP

      (49) Line 615: critically -> critical

      (50) Line 636: Remove "+"

      (51) Line 656 (2 places), 682: Do not italicize OP50

      (52) Line 664: Lead -> lead

      (53) Line 681: Describe the composition of NGM or show the reference. Since this paper examines nutrition, the composition of the medium is crucial.

      (54) Line 686-706: Italicize all allele names. Be consistent with how to write the promoter to avoid confusion (e.g., ttx-3p -> Pttx-3). Be consistent with how to describe the transgene (e.g., Phsp-4::GFP(zcIs4) -> zcIs4[Phsp-4::GFP])

      (55) Line 710: Describe the composition of LB or show the reference. Since this paper examines nutrition, the composition of the medium is crucial.

      (56) Line 709, 856 (2 places), 858: Do not italicize K12 to make it consistent

      (57) Line 719: Podr-1p:RFP -> Podr-1::RFP

      (58) Line 722, 724: Italicize ges-1 and xbp-1

      (59) Line 723: Pges-1:xbp-1::GFP -> Pges-1::xbp-1::GFP

      (60) Line 735: Glucuronic -> glucuronic

      (61) Line 748: I believe it is 5 mm instead of 0.5 mm

      (62) Line 750: The equation should be (5 mm)^2/(17.5 mm)^2

      (63) Line 759: Remove the period after "pattern".

      (64) Line 766: Describe how they were synchronized

      (65) Line 774: Italicize Psysm-1p::GFP

      (66) Line 785: Insert a space before "until"

      (67) Line 787: the mutant -> mutant

      (68) Line 789, 792, 793, 795 (2 places): GPF -> GFP

      (69) Line 791: next -> Next; an -> a

      (70) Line 799: Remove a space before "MRC".

      (71) Line 804: I do not understand what "until adulthood" means in this context; Remove a space before "by". (I recommend searching double space and correcting it.)

      (72) Line 853: Metabolome -> metabolome

      (73) Line 893-1082: Species and gene names should be italicized in Reference

      (74) Figures 1F, 1G, S2F, S2G: The panels' order should match the bar graphs' order. The apparent difference in the representative data does not match the marginal difference in the bar graph in Fig. 1G. The authors should double-check the results.

      (75) Figure 1F, 2A, 2B, 3C, 3D, 3E, 4D, 4I, S1J, S2A, S2B, S2I, S3B, S3F, S3H: hsp-4::GFP -> Phsp-4::GFP

      (76) Figure 1G, 2D, 3F, 4E, 4J, S1K, S2H, S3C, S3I: irg-5::GFP -> Pirg-5::GFP

      (77) Figure 6: Liquids -> Lipids; Italicize ire-1, xbp-1, pmk-1

      (78) Figure S1I: hsp-6::GFP -> Phsp-6::GFP

      (79) In the legend for Figure S1 after Figure S1, (A), (B)... were duplicated. It is OK in the corresponding main text (Line 530)

      (80) Figure S2F, S3G, S4C, S4D: sysm-1::GFP -> Psysm-1::GFP

      (81) Figure S2G: irg-1::GFP -> Pirg-1::GFP

      (82) Figure S3H and S3I: Describe which ones are Glu + conditions
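
For reference, the expression proposed in minor comment (62) can be evaluated directly. The short sketch below only performs the arithmetic on the two lengths quoted by the reviewer; what the two lengths represent in the assay is not stated in this excerpt, so no interpretation is attached to the result.

```python
from fractions import Fraction

# Lengths quoted in minor comment (62), in millimetres.
inner_mm = Fraction(5)
outer_mm = Fraction("17.5")

# (5 mm)^2 / (17.5 mm)^2, kept exact as a fraction; the units cancel.
ratio = (inner_mm / outer_mm) ** 2

print(ratio)                    # exact value
print(round(float(ratio), 4))   # decimal approximation
```

Using the squared lengths, as the reviewer suggests, gives a ratio of 4/49 (about 0.08), whereas the unsquared ratio 5/17.5 would be 2/7 (about 0.29).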

      References: 

      (1) Patananan AN, Budenholzer LM, Pedraza ME, Torres ER, Adler LN, Clarke SG. The invertebrate Caenorhabditis elegans biosynthesizes ascorbate. Arch Biochem Biophys 569, 32-44 (2015).

      (2) Yabuta Y, et al. L-Ascorbate Biosynthesis Involves Carbon Skeleton Rearrangement in the Nematode Caenorhabditis elegans. Metabolites 10 (2020).

      (3) Weaver BP, Weaver YM, Omi S, Yuan W, Ewbank JJ, Han M. Non-Canonical Caspase Activity Antagonizes p38 MAPK Stress-Priming Function to Support Development. Dev Cell 53, 358-369 e356 (2020).

      (4) Geng S, et al. Gut commensal E. coli outer membrane proteins activate the host food digestive system through neural-immune communication. Cell Host Microbe 30, 1401-1416 e1408 (2022).

      (5) Richardson CE, Kooistra T, Kim DH. An essential role for XBP-1 in host protection against immune activation in C. elegans. Nature 463, 1092-1095 (2010).

      (6) Harding HP, et al. An Integrated Stress Response Regulates Amino Acid Metabolism and Resistance to Oxidative Stress. Molecular Cell 11, 619-633 (2003).

      (7) Qi B, Kniazeva M, Han M. A vitamin-B2-sensing mechanism that regulates gut protease activity to impact animal’s food behavior and growth. eLife 6, e26243 (2017).

      (8) Calfon M, et al. IRE1 couples endoplasmic reticulum load to secretory capacity by processing the XBP-1 mRNA. Nature 415, 92-96 (2002).

      (9) Bolz DD, Tenor JL, Aballay A. A Conserved PMK-1/p38 MAPK Is Required in Caenorhabditis elegans Tissue-specific Immune Response to Yersinia pestis Infection. The Journal of Biological Chemistry 285, 10832-10840 (2010).

    1. Reviewer #2 (Public Review):

      Summary:

      This paper addresses an important computational problem in learning and memory. Why do related memory representations sometimes become more similar to each other (integration) and sometimes more distinct (differentiation)? Classic supervised learning models predict that shared associations should cause memories to integrate, but these models have recently been challenged by empirical data showing that shared associations can sometimes cause differentiation. The authors have previously proposed that unsupervised learning may account for these unintuitive data. Here, they follow up on this idea by actually implementing an unsupervised neural network model that updates the connections between memories based on the amount of coactivity between them. The authors use their modeling framework to simulate three recent empirical studies, showing that their model captures aspects of these findings that are hard to account for with supervised learning.

      Overall, this is a strong and clearly described work that is likely to have a positive impact on computational and empirical work in learning and memory. While the authors have written about some of the ideas discussed in this paper previously, a fully implemented and openly available model is a clear advance that will benefit the field. It is not easy to translate a high-level description of a learning rule into a model that actually runs and behaves as expected. The fact that the authors have made all their code available makes it likely that other researchers will extend the model in numerous interesting ways, many of which the authors have discussed and highlighted in their paper.

      Strengths:

      The authors succeed in demonstrating that unsupervised learning with a simple u-shaped rule can produce results that are qualitatively in line with the empirical reports. In each of the three models, the authors manipulate stimulus similarity (following Chanales et al.), shared vs distinct associations (following Favila et al.), or learning strength (a stand-in for blocked versus interleaved learning schedule; following Schlichting et al.). In all cases, with hand-tuning of additional parameters, the authors are able to produce model representations that fit the empirical results, but that can't easily be accounted for by supervised learning. Demonstrating these effects isn't trivial and a formal modeling framework for doing so is a valuable contribution. Overall, the work is very thorough. The authors investigate many different aspects of the learning dynamics (learning rate, oscillation strength, hidden layer overlap, etc.) in these models and produce several key insights. Of particular value are their demonstrations that when differentiation occurs, it occurs very quickly and asymmetrically and results in anti-correlated representations, as well as the distinction between symmetric and asymmetric integration in their model. The authors thoroughly acknowledge the relative difficulty of producing differentiation in their models relative to integration, and are now more clear about why they don't necessarily view this as a mismatch with the empirical data. The authors are also more clear about the complicated activation dynamics in their model and why critical ranges for some parameters can't be given: the number of interacting parameters means that there are many combinations that could produce the critical activation dynamics and thus the same result. Despite this complexity, the paper is very clearly written; the authors do a good job of both formally describing their model and giving readers a high level sense of how many of their critical model components work.

      Weaknesses:

      Though the u-shaped learning rule is essential to this framework, the paper doesn't do any formal investigation of this learning rule or comparison with other learning rules. The authors do have a strong theoretical interest in this rule as well as experimental precedent for testing this rule, which they now thoroughly discuss in the paper. Still, a stronger argument in support of the nonmonotonic plasticity hypothesis could have been made by comparing this learning rule to alternatives. Additionally, the authors' choice of strongly prewiring associations makes it difficult to think about how their model maps onto experimental contexts where associations are only weakly learned. However, the authors thoroughly acknowledge why this was necessary and discuss this limitation in the paper.

    2. eLife assessment

      This paper presents important computational modeling work that provides a mechanistic account for how memory representations become integrated or differentiated (i.e., having distinct neural representations despite being similar in content). The authors provide convincing evidence that simple unsupervised learning in a neural network model, which critically weakens connections of units that are moderately activated by multiple memories, can account for three empirical findings of differentiation in the literature. The paper also provides insightful discussion on the factors contributing to differentiation as opposed to integration, and makes new predictions for future empirical work.

    3. Reviewer #1 (Public Review):

      Ritvo and colleagues present an impressive suite of simulations that can account for three findings of differentiation in the literature. This is important because differentiation (in which items that have some features in common, or share a common associate, are less similar to one another than are unrelated items) is difficult to explain with classic supervised learning models, as these predict the opposite (i.e., an increase in similarity). A few of their key findings are that differentiation requires a high learning rate and low inhibitory oscillations, and is virtually always asymmetric in nature.

      This paper was very clear and thoughtful, and an absolute joy to read. The model is simple and elegant, and powerful enough to re-create many aspects of existing differentiation findings. The interrogation of the model and presentation of the findings were both extremely thorough. The potential for this model to be used to drive future work is huge.

      The authors have been very responsive to my previous reviews and I have no further concerns and identify no major weaknesses.

    4. Reviewer #3 (Public Review):

      This paper proposes a computational account for the phenomenon of pattern differentiation (i.e., items having distinct neural representations when they are similar). The computational model relies on a learning mechanism of the nonmonotonic plasticity hypothesis, fast learning rate and inhibitory oscillations. In the revised paper, the authors justified the initialization of the model, added empirical evidence supporting the use of two turning points in the NMPH function and provided details of the learning mechanisms of the model. The relatively simple architecture of the model makes its dynamics accessible to the human mind. Furthermore, using similar model parameters, this model produces simulated data consistent with empirical data of pattern differentiation. The authors also provide insightful discussion on the factors contributing to differentiation as opposed to integration.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Ritvo and colleagues present an impressive suite of simulations that can account for three findings of differentiation in the literature. This is important because differentiation (in which items that have some features in common, or share a common associate, are less similar to one another than are unrelated items) is difficult to explain with classic supervised learning models, as these predict the opposite (i.e., an increase in similarity). A few of their key findings are that differentiation requires a high learning rate and low inhibitory oscillations, and is virtually always asymmetric in nature.

      This paper was very clear and thoughtful, and an absolute joy to read. The model is simple and elegant, and powerful enough to re-create many aspects of existing differentiation findings. The interrogation of the model and presentation of the findings were both extremely thorough. The potential for this model to be used to drive future work is huge. I have only a few comments for the authors, all of which are relatively minor.

      (1) I was struck by the fact that the "zone" of repulsion is quite narrow, compared with the zone of attraction. This was most notable in the modeling of Chanales et al. (i.e., just one of the six similarity levels yielded differentiation). Do the authors think this is a generalizable property of the model or phenomenon, or something idiosyncratic to do with the current investigation? It seems curious that differentiation findings (e.g., in hippocampus) are so robustly observed in the literature despite the mechanism seemingly requiring a very particular set of circumstances. I wonder if the authors could speculate on this point a bit; for example, might the differentiation zone be wider when competitor "pop up" is low (i.e., low inhibitory oscillations), which could help explain why it's often observed in hippocampus? This seems related a bit to the question about what makes something "moderately" active, or how could one ensure "moderate" activation if they were, say, designing an experiment looking at differentiation.

      We thank the reviewer for this comment. In the previous version of the manuscript, in the section entitled “Differentiation Requires a High Learning Rate and Is Sensitive to Activation Dynamics”, we discussed some reasons why differentiation may be more likely to be found in the hippocampus – namely, the high learning rate of the hippocampus and the sparsity of hippocampal activation patterns (pp. 27-28):

      “These results have implications for where to look for differentiation in the brain. Our finding that differentiation requires a high learning rate suggests that differentiation will be more evident in the hippocampus than in neocortex, insofar as hippocampus is thought to have a higher learning rate than neocortex (McClelland et al., 1995). In keeping with this prediction, numerous studies have found differentiation effects in hippocampus but not in neocortical regions involved in sensory processing (e.g., Chanales et al., 2017; Favila et al., 2016; Zeithamova et al., 2018). At the same time, some studies have found differentiation effects in neocortex (e.g., Schlichting et al., 2015; Wammes et al., 2022). One possible explanation of these neocortical differentiation effects is that they are being “propped up” by top-down feedback from differentiated representations in the hippocampus. This explanation implies that disruptions of hippocampal processing (e.g., lesions, stimulation) will eliminate these neocortical differentiation effects; we plan to test this prediction in future work.

      Additionally, the simulations where we adjusted the oscillation amount (using our model of Schlichting et al., 2015) imply that differentiation will be most evident in brain regions where it is relatively hard to activate competitors. Given the U shape of the NMPH learning rule, limiting competitor activity makes it less likely that plasticity will “cross over” from weakening (and differentiation) to strengthening (and integration). Thus, within the hippocampus, subregions with sparser activity (e.g., dentate gyrus, and to a lesser extent, CA3; Barnes et al., 1990; GoodSmith et al., 2017; West et al., 1991) will be more prone to differentiation. There is strong empirical support for this prediction. For example, Wammes et al. (2022) manipulated the similarity of stimuli in a statistical learning experiment and found that moderate levels of visual similarity were associated with significant differentiation in the dentate gyrus but not other subregions. Also, numerous studies have found greater differentiation in dentate gyrus / CA3 than in CA1 (e.g., Dimsdale-Zucker et al., 2018; Wanjia et al., 2021; Molitor et al., 2021; Kim et al., 2017; but see Zheng et al., 2021).”

      In the revised draft we have supplemented this discussion with a new section entitled “Reconciling the Prevalence of Differentiation in the Model and in the Data” (pp. 30-31):

      “A key lesson from our model is that, from a computational perspective, it is challenging to obtain differentiation effects: The region of parameter space that gives rise to differentiation is much smaller than the one that gives rise to integration (for further discussion of this issue, see the section in Methods on Practical Advice for Getting the Model to Show Differentiation). However, the fact that integration is more prevalent in our simulations across parameter configurations does not mean that integration will be more prevalent than differentiation in real-life circumstances. What really matters in predicting the prevalence of differentiation in real life is how the parameters of the brain map on to parameters of the model: If the parameters of the brain align with regions of model parameter space that give rise to differentiation (even if these regions are small), this would explain why differentiation has been so robustly observed in extant studies. Indeed, this is exactly the case that we sought to make above about the hippocampus – i.e., that its use of especially sparse coding and a high learning rate will give rise to the kinds of neural dynamics that cause differentiation (as opposed to integration). As another example, while it is true that half of the overlap conditions in our simulation of Chanales et al. (2021) give rise to integration, this does not imply that integration will occur half of the time in the Chanales et al. (2021) study; it may be that the levels of overlap that are actually observed in the brain in Chanales et al. (2021) are more in line with the levels of overlap that give rise to differentiation in our model.”

      (2) With real fMRI data we know that the actual correlation value doesn't matter all that much, and anti-correlations can be induced by things like preprocessing decisions. I am wondering if the important criterion in the model is that the correlations (e.g., as shown in Figure 6) go down from pre to post, versus that they are negative in sign during the post learning period. I would think that here, similar to in neural data, a decrease in correlation would be sufficient to conclude differentiation, but would love the authors' thoughts on that.

      We thank the reviewer for bringing this up. In the paper, we define differentiation as the moving apart of representations – so we agree with the reviewer that it would be appropriate to conclude that differentiation is taking place when correlations go down from pre to post.

      In addition to the definitional question (“what counts as differentiation”), one can also ask the mechanistic question of what is happening in the model at the (simulated) neuronal level in conditions where differentiation (i.e., an average decrease in similarity from pre to post) occurs. Here, the model’s answer is clear: When the similarity of two pairmates decreases, it is because the pairmates have acquired anticorrelated representations at the (simulated) neuronal level. When similarity decreases on average from pre to post, but the average “post” similarity value is not negative, this is because there is a mix of outcomes across runs of the model (due to variance in the initial, random model weights and also variance in the order in which items are presented across training epochs) – some runs lead to differentiation (manifested as anticorrelated pairmate representations) whereas others lead to no change or integration. The average pre-to-post change depends on the relative frequencies with which these different outcomes occur.
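
      The averaging logic described above can be illustrated with a minimal sketch. All numbers here (the "pre" similarity, the per-outcome "post" similarities, and the outcome proportions) are hypothetical values chosen for illustration; they are not taken from the model or the paper. The point is only that a mixture of anticorrelated, unchanged, and integrated runs can produce an average decrease in similarity while the average "post" value stays positive.

```python
import numpy as np

def mean_post_similarity(proportions, post_values):
    """Average post-learning similarity, weighting each outcome by its share of runs."""
    proportions = np.asarray(proportions, dtype=float)
    post_values = np.asarray(post_values, dtype=float)
    if not np.isclose(proportions.sum(), 1.0):
        raise ValueError("outcome proportions must sum to 1")
    return float(np.dot(proportions, post_values))

# Hypothetical values, for illustration only.
PRE_SIMILARITY = 0.25
POST_BY_OUTCOME = {
    "differentiation": -0.15,  # anticorrelated pairmates on these runs
    "no_change": 0.25,         # similarity stays at its pre-learning value
    "integration": 0.60,       # pairmates become more similar
}
OUTCOME_PROPORTIONS = [0.4, 0.5, 0.1]  # same order as POST_BY_OUTCOME

avg_post = mean_post_similarity(OUTCOME_PROPORTIONS, list(POST_BY_OUTCOME.values()))
avg_change = avg_post - PRE_SIMILARITY
print(f"mean post similarity: {avg_post:.3f}, mean change: {avg_change:.3f}")
```

      With these assumed values, the mean similarity drops from pre to post even though the mean "post" value is not negative, because only a subset of runs end up anticorrelated.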

      We have made several edits to the paper to clarify this point.

      We added a new section under “Results” in our simulation of Chanales et al. (2021) entitled, “Pairs of Items that Differentiate Show Anticorrelated Representations” (p. 15):

      “Figure 6B also highlights that, for learning rates where robust differentiation effects occur in aggregate (i.e., there is a reduction in mean pattern similarity, averaging across model runs), these aggregate effects involve a bimodal distribution across model runs: For some model runs, learning processes give rise to anticorrelated representations, and for other model runs the model shows integration; this variance across model runs is attributable to random differences in the initial weight configuration of the model. The aggregate differentiation effect is therefore a function of the proportion of model runs showing differentiation (here, anticorrelation) and the proportion of model runs showing integration. The fact that differentiation shows up as anticorrelation in the model's hidden layer relates to the learning effects discussed earlier:

      Unique competitor units are sheared away from (formerly) shared units, so the competitor ends up not having any overlap with the target representation (i.e., the level of overlap is less than you would expect due to chance, which mathematically translates into anticorrelation). We return to this point and discuss how to test for anticorrelation in the Discussion section.”
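
      The link between "less overlap than chance" and anticorrelation can be checked with a short worked example. The sketch below assumes binary hidden-layer patterns with 6 active units each (the set point mentioned in the paper); the layer size of 50 units is an assumption made here for illustration, as are the function names. For two such patterns that share s units out of k active units in a layer of N units, the Pearson correlation is (s/k - k/N) / (1 - k/N), which is negative whenever s is below the chance level of k*k/N.

```python
import numpy as np

def binary_pattern_correlation(n_units, k_active, n_shared):
    """Pearson correlation between two binary patterns with k_active units on
    in each pattern and n_shared of those units in common."""
    if not 0 <= n_shared <= k_active or 2 * k_active - n_shared > n_units:
        raise ValueError("inconsistent layer size, activity level, or overlap")
    a = np.zeros(n_units)
    b = np.zeros(n_units)
    a[:k_active] = 1.0
    start = k_active - n_shared  # shift b so that exactly n_shared units coincide
    b[start:start + k_active] = 1.0
    return float(np.corrcoef(a, b)[0, 1])

def closed_form_correlation(n_units, k_active, n_shared):
    """Same quantity computed from the closed-form expression."""
    p = k_active / n_units
    return (n_shared / k_active - p) / (1.0 - p)

N_UNITS = 50   # assumed layer size, for illustration only
K_ACTIVE = 6   # number of active hidden units per pattern

for shared in (2, 1, 0):
    r = binary_pattern_correlation(N_UNITS, K_ACTIVE, shared)
    print(f"shared units = {shared}: r = {r:.3f}")
```

      Under these assumptions, chance overlap is 6 * 6 / 50 = 0.72 units, so patterns sharing 2 or 1 units are positively correlated, whereas patterns with no shared units are anticorrelated.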

      We added new text to the “Take-Home Lessons” section in the Chanales et al. (2021) simulation (p. 17):

      “In particular, the simulations expose some important boundary conditions for when representational change can occur according to the NMPH (e.g., that differentiation depends on a large learning rate, but integration does not), and the simulations provide a more nuanced account of exactly how representations change (e.g., that differentiation driven by the NMPH is always asymmetric, whereas integration is sometimes asymmetric and sometimes symmetric; and that, when differentiation occurs on a particular model run, it tends to give rise to anticorrelated representations in the model's hidden layer).”

      We added new text to the “Nature of Representational Change” section in the Favila et al. (2016) simulation (p. 21):

      “Figure 8 - Supplement 1 also indicates that, as in our simulation of Chanales et al. (2021), individual model runs where differentiation occurs show anticorrelation between the pairmate representations, and gradations in the aggregate level of differentiation that is observed across conditions reflect differences in the proportion of trials showing this anticorrelation effect.”

      We added new text to the “Take-Home Lessons” section in the Favila et al. (2016) simulation (p. 21):

      “As in our simulation of Chanales et al. (2021), we found that the NMPH-mediated differentiation was asymmetric, manifested as anticorrelation between pairmate representations on individual model runs, and required a high learning rate, leading to abrupt representational change.”

      We added new text to the “Nature of Representational Change” section in the Schlichting et al. (2015) simulation (p. 26):

      “Also, as in our other simulations, when differentiation occurs on a particular model run it tends to give rise to anticorrelated representations (results not shown).”

      We added new text to the “Take-Home Lessons” section in the Schlichting et al. (2015) simulation (pp. 26-27):

      “As in the other versions of our model, differentiation requires a high learning rate, and – on model runs when it occurs – it is asymmetric and gives rise to anticorrelated representations.”

      We added new text at the start of the Discussion (p. 27):

      “In addition to qualitatively replicating the results from the studies we simulated, our model gives rise to several novel predictions – most notably, that differentiation driven by the NMPH requires a rapid learning rate and, when it occurs for a particular pair of items, it is asymmetric and gives rise to anticorrelated representations.”

      We also added a new section in the Discussion entitled “Testing the Model's Prediction about Anticorrelation”, which (among other things) highlights the reviewer’s point that fMRI pattern similarity values can be affected by preprocessing choices (p. 30):

      “Even though we operationally define differentiation as a reduction in similarity with learning, the way that it actually shows up on individual model runs is as anticorrelation between pairmates; in the model, the size of the aggregate differentiation effect is determined by the proportion of model runs that show this anticorrelation effect (vs. no change or integration). This implies that, if we could get a clean measurement of the similarity of pairmates in an experiment, we might see a multimodal distribution, with some pairmates showing anticorrelation, and others showing increased correlation (integration) or no change in similarity. This kind of clean readout of the similarity of individual pairs might be difficult to obtain with fMRI; it is more feasible that this could be obtained with electrophysiology. Another challenge with using fMRI to test this prediction is that anticorrelation at the individual-neuron level might not scale up to yield anticorrelation at the level of the BOLD response; also, fMRI pattern similarity values can be strongly affected by preprocessing choices – so a negative pattern similarity value does not necessarily reflect anticorrelation at the individual-neuron level. A final caveat is that, while we predict that differentiation will show up as anticorrelation in the brain region that gives rise to the differentiation effect, this might not translate into anticorrelation in areas that are downstream of this region (e.g., if the hippocampus is the source of the differentiation effect, we would expect anticorrelation there, but not necessarily in neocortical regions that receive input from the hippocampus; we revisit this point later in the discussion, when we address limitations and open questions).”

      We added new text in the Discussion, under “Limitations and Open Questions” (p. 31):

      “Importantly, while hippocampus can boost the representation of unique features in neocortex, we expect that neocortex will continue to represent shared perceptual features (e.g., in Favila et al., 2016, the fact that both pairmates are photos of barns). For this reason, in paradigms like the one used by Favila et al. (2016), the predicted effect of hippocampal differentiation on neocortical representations will be a reduction in pattern similarity (due to upregulation in the representation of unique pairmate features) but neocortex should not cross over into anticorrelation in these paradigms (due to its continued representation of shared perceptual features). Indeed, this is exactly the pattern that Wanjia et al. (2021) observed in their study, which used similar stimuli to those used in Favila et al. (2016).”

      Lastly, we updated the Abstract (p. 1):

      “What determines when neural representations of memories move together (integrate) or apart (differentiate)? Classic supervised learning models posit that, when two stimuli predict similar outcomes, their representations should integrate. However, these models have recently been challenged by studies showing that pairing two stimuli with a shared associate can sometimes cause differentiation, depending on the parameters of the study and the brain region being examined. Here, we provide a purely unsupervised neural network model that can explain these and other related findings. The model can exhibit integration or differentiation depending on the amount of activity allowed to spread to competitors – inactive memories are not modified, connections to moderately active competitors are weakened (leading to differentiation), and connections to highly active competitors are strengthened (leading to integration). The model also makes several novel predictions – most importantly, that when differentiation occurs as a result of this unsupervised learning mechanism, it will be rapid and asymmetric, and it will give rise to anticorrelated representations in the region of the brain that is the source of the differentiation. Overall, these modeling results provide a computational explanation for a diverse set of seemingly contradictory empirical findings in the memory literature, as well as new insights into the dynamics at play during learning.”

      (3) For the modeling of the Favila et al. study, the authors state that a high learning rate is required for differentiation of the same-face pairs. This made me wonder what happens in the low learning rate simulations. Does integration occur?

      For the same-face condition of the Favila simulation, lowering learning rate does not result in an overall integration effect:

      Author response image 1.

      In other cases, we do see integration emerge at lower learning rates – e.g., in the Schlichting interleaved condition we see a small integration effect emerge for a learning rate value of 0.3:

      Author response image 2.

      Our view is that, while integration can emerge at low learning rates, it is not a reliable property of the model – in some cases, there is a “window” of learning rates where there is enough learning to drive integration but not enough to drive differentiation, and in other cases there is not. Given this lack of reliability across simulations, we would prefer not to discuss this in the paper.

      This paradigm has a lot of overlap with acquired equivalence, and so I am thinking about whether these are the sorts of small differences (e.g., same-category scenes and perhaps a high learning rate) that bias the system to differentiate instead of integrate.

      We agree that it would be very interesting to use the model to explore acquired equivalence and related phenomena, but we think it is out of scope of the current paper. We have added some text to the Discussion under “Limitations and Open Questions” (p. 32):

      “Another important future direction is to apply the model to a wider range of learning phenomena involving representational change – for example, acquired equivalence, which (like some of the studies modeled here) involves linking distinct stimuli to a shared associate (see, e.g., Honey and Hall, 1989; Shohamy and Wagner, 2008; Myers et al., 2003; Meeter et al., 2009; de Araujo Sanchez and Zeithamova, 2023). It is possible that some of these phenomena might be better explained by supervised learning, or a mixture of unsupervised and supervised learning, than by unsupervised learning alone.”

      (4) For the simulations of the Schlichting et al. study, the A and B appear to have overlap in the hidden layer based on Figure 9, despite there being no similarity between the A and B items in the study (in contrast to Favila et al., in which they were similar kinds of scenes, and Chanales et al., in which they were similar colors). Why was this decision made? Do the effects depend on some overlap within the hidden layer? (This doesn't seem to be explained in the paper that I saw though, so maybe it's just a visualization error?)

      Overlap in the pretrained hidden representations of A and B is not strictly necessary for these effects – it would be possible to reconfigure other parameters to get high levels of competition even if there were no overlap (e.g., by upregulating the strengths of connections from shared input features). Having said that, it is definitely true that overlap between the pretrained hidden representations boosts competition, and we think it is justified to posit this in the Schlichting simulation. We have now added an explanation for this in the paper (p. 23):

      New text in Schlichting, “Knowledge Built into the Network”:

      “Matching the previous two simulations, we pretrained the weights so the hidden representations of the stimuli initially had 2/6 units in common. Even though the A and B stimuli used in the actual experiment did not have obvious feature overlap (they were randomly selected novel objects), it is important to note that the hidden layer is not simply a representation of the sensory features of the A and B stimuli; the hidden layer also receives input from the output layer, which represents the shared associate of A and B (X). We think that the presence of this shared associate justifies our use of initially-overlapping hidden representations.”

      (5) It seems as though there were no conditions under which the simulations produced differentiation in both the blocked and intermixed conditions, which Schlichting et al. observed in many regions (as the present authors note). Is there any way to reconcile this difference?

      We thank the reviewer for bringing this up. If we set the connection strength between X (in the output layer) and A (in the hidden layer) in the blocked condition to .9 instead of .999 (keeping this connection strength at .8 for the interleaved condition) and we set Osc to .0615, we observe differentiation in both conditions.

      Rather than replacing the original results in the paper, which would entail re-making the associated videos, etc., we have added a supplementary figure (Figure 10 - Supplement 1), which is included on p. 46.

      We also added the following to the Results section of the Schlichting simulation in the main text (p. 26):

      “Figure 10 - Supplement 1 shows results from an alternative parameterization where, in the low-oscillation-amplitude condition, differentiation is observed in both the blocked and interleaved conditions (mirroring results from Schlichting et al., 2015, who found differentiation in both conditions in several regions of interest, including parts of the hippocampus and medial prefrontal cortex).”

      (6) A general question about differentiation/repulsion and how it affects the hidden layer representation in the model: Is it the case that the representation is actually "shifted" or repelled over so it is no longer overlapping? Or do the shared connections just get pruned, such that the item that has more "movement" in representational space is represented by fewer units on the hidden layer (i.e., is reduced in size)? I think, if I understand correctly, that whether it gets shifted vs. reduced would depend on the strength of connections along the hidden layer, which would in turn depend on whether it represents some meaningful continuous dimension (like color) or not. But, if the connections within the hidden layer are relatively weak and it is the case that representations become reduced in size, would there be any anticipated consequences of this (e.g., cognitively/behaviorally)?

      The representations are shifted – this is discussed in the Chanales results section:

      “Because the activity “set point” for the hidden layer (determined by the kWTA algorithm) involves having 6 units active, and the unique parts of the competitor only take up 4 of these 6 units, this leaves room for activity to spread to additional units. Given the topographic projections in the output layer, the model is biased to “pick up” units that are adjacent in color space to the currently active units; because activity cannot flow easily from the competitor back to the target (as a result of the aforementioned severing of connections), it flows instead away from the target, activating two additional units, which are then incorporated into the competitor representation. This sequence of events (first a severing of the shared units, then a shift away from the target) completes the process of neural differentiation, and is what leads to the behavioral repulsion effect in color recall (because the center-of-mass of the color representation has now shifted away from the target).”
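
      The "set point" idea in the passage above can be sketched with a deliberately simplified k-winners-take-all step. This is a binary top-k stand-in written for illustration, not the kWTA implementation used in the model; the function name and the net-input values are hypothetical. It shows how, once only 4 strongly driven competitor units remain, a set point of 6 leaves room for the 2 next-most-excited units to join the representation.

```python
import numpy as np

def k_winners_take_all(net_input, k=6):
    """Return a boolean mask marking the k units with the largest net input.

    Simplified stand-in for kWTA: real implementations typically set an
    inhibition level rather than hard-selecting the top k units.
    """
    net_input = np.asarray(net_input, dtype=float)
    if not 0 < k <= net_input.size:
        raise ValueError("k must be between 1 and the number of units")
    winners = np.argsort(net_input)[-k:]
    active = np.zeros(net_input.size, dtype=bool)
    active[winners] = True
    return active

# Hypothetical net inputs for a 10-unit layer after the shared units (0-2) have
# been severed from the competitor: units 3-6 are the competitor's unique units,
# and units 7-8 are neighbouring units that receive weaker input.
net_input = [0.0, 0.0, 0.0, 0.9, 0.9, 0.9, 0.9, 0.5, 0.4, 0.1]
active_units = np.flatnonzero(k_winners_take_all(net_input, k=6))
print(active_units)
```

      In this toy case the four unique units stay active and the two neighbouring units with the next-highest input are recruited, which mirrors the "shift away from the target" described in the quoted passage.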

      Reviewer #2 (Public Review):

      This paper addresses an important computational problem in learning and memory. Why do related memory representations sometimes become more similar to each other (integration) and sometimes more distinct (differentiation)? Classic supervised learning models predict that shared associations should cause memories to integrate, but these models have recently been challenged by empirical data showing that shared associations can sometimes cause differentiation. The authors have previously proposed that unsupervised learning may account for these unintuitive data. Here, they follow up on this idea by actually implementing an unsupervised neural network model that updates the connections between memories based on the amount of coactivity between them. The goal of the authors' paper is to assess whether such a model can account for recent empirical data at odds with supervised learning accounts. For each empirical finding they wish to explain, the authors built a neural network model with a very simple architecture (two input layers, one hidden layer, and one output layer) and with prewired stimulus representations and associations. On each trial, a stimulus is presented to the model, and inhibitory oscillations allow competing memories to pop up. Pre-specified u-shaped learning rules are used to update the weights in the model, such that low coactivity leaves model connections unchanged, moderate coactivity weakens connections, and high coactivity strengthens connections. In each of the three models, the authors manipulate stimulus similarity (following Chanales et al), shared vs distinct associations (following Favila et al), or learning strength (a stand-in for blocked versus interleaved learning schedule; following Schlichting et al) and evaluate how the model representations evolve over trials.
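
      The u-shaped rule summarized above (no change at low coactivity, weakening at moderate coactivity, strengthening at high coactivity) can be sketched as a piecewise-linear function. The turning points and magnitudes below are hypothetical values chosen for illustration and are not the parameters used in the paper; the function name is likewise an assumption.

```python
import numpy as np

# Hypothetical turning points of the u-shaped function, for illustration only.
COACTIVITY_POINTS = [0.0, 0.2, 0.45, 0.7, 1.0]
WEIGHT_CHANGE_POINTS = [0.0, 0.0, -1.0, 0.0, 1.0]

def u_shaped_weight_change(coactivity, learning_rate=1.0):
    """Weight change as a nonmonotonic (u-shaped) function of coactivity.

    Low coactivity: no change. Moderate coactivity: weakening (negative).
    High coactivity: strengthening (positive).
    """
    coactivity = np.clip(coactivity, 0.0, 1.0)
    shape = np.interp(coactivity, COACTIVITY_POINTS, WEIGHT_CHANGE_POINTS)
    return learning_rate * shape

for c in (0.1, 0.45, 0.9):
    print(f"coactivity = {c:.2f} -> weight change = {u_shaped_weight_change(c):+.3f}")
```

      Under this sketch, scaling `learning_rate` changes the size of each update but not the coactivity ranges in which weakening or strengthening occurs.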

      As a proof of principle, the authors succeed in demonstrating that unsupervised learning with a simple u-shaped rule can produce qualitative results in line with the empirical reports. For instance, they show that pairing two stimuli with a common associate (as in Favila et al) can lead to *differentiation* of the model representations. Demonstrating these effects isn't trivial and a formal modeling framework for doing so is a valuable contribution. Overall, the authors do a good job of both formally describing their model and giving readers a high-level sense of how their critical model components work, though there are some places where the robustness of the model to different parameter choices is unclear. In some cases, the authors are very clear about this (e.g. the fast learning rate required to observe differentiation). However, in other instances, the paper would be strengthened by a clearer reporting of the critical parameter ranges.

      We thank the reviewer for raising this point. The interdependence of parameters in our model makes it infeasible to identify critical parameter ranges. We have added a paragraph to the “Approach to Parameterization and Data Fitting” section in the Methods to address this point (p. 33):

      “The overall goal of this modeling work is to account for key empirical regularities regarding differentiation and integration and to establish boundary conditions on these regularities. As such, the modeling work described below focuses more on qualitative fits to general properties of the data space than on quantitative fits to results from specific studies. Automatic parameter optimization is not feasible for this kind of model, given the large number of model parameters and the highly interactive, nonlinear nature of competitive dynamics in the model; consequently, model fitting was done by hand.

      These complex interactions between parameters also make it infeasible to list “critical parameter ranges” for generating particular model outcomes. Our experience in working with the model has been that activation dynamics are what matter most for learning, and that disparate parameter sets can give rise to the same activation dynamics and -- through this -- the same learning effects; likewise, similar parameter sets can give rise to different activation dynamics and different learning outcomes. Consequently, in this paper we have focused on characterizing the dynamics that give rise to different learning effects (and how they can be affected by local parameter perturbations, e.g., relating to learning rate and oscillation size), rather than the – impossible, we believe – task of enumerating the full set of parameter configurations that give rise to a particular result.”

      For instance, it's clear from the manipulation of oscillation strength in the model of Schlichting et al that this parameter can dramatically change the direction of the results. The authors do report the oscillation strength parameter values that they used in the other two models, but it is not clear how sensitive these models are to small changes in this value.

      In some cases, the effects of oscillation strength are relatively smooth. For example, in the Favila simulation, increasing the oscillation amplitude Osc effectively recapitulates the U-shaped curve (i.e., higher levels of Osc lead to more competitor activation, which initially leads to weakening / differentiation but then gives way to strengthening / integration), as is shown for the Favila Different Face condition in this plot:

      Author response image 3.

      In the Chanales 2/6 overlap condition, the effects of varying Osc are more nonlinear:

      Author response image 4.

      We think this is attributable to the increased “all-or-none” recurrent dynamics in this simulation (due to the recurrent projections within the output layer), which make it more difficult to evoke moderate (vs. high) levels of activation. This difficulty in reliably obtaining graded activation dynamics is likely a consequence of the small-scale (“toy”) nature of the model and the simple inhibitory mechanisms employed here, as opposed to being a generalizable property of the brain – presumably, the actual brain employs more nuanced and effective means of controlling activation. Furthermore, we don’t think that the high prevalence of integration in the model’s parameter space necessarily translates into a prediction that integration should be more prevalent overall – see the new “Reconciling the Prevalence of Differentiation in the Model and in the Data” section described in response to one of the reviewer’s other points below. Due to the paper already being quite long, we have opted not to include the above plots / discussion in the paper.

      Similarly, it's not clear whether the 2/6 hidden layer overlap (only explicitly manipulated in the model of Chanales et al) is required for the other two models to work.

      When we were parameterizing the model, we opted to keep the 2/6 level of overlap for all of the simulations and we adjusted other parameters to fit the data; in part, this was because overlap can only be adjusted in discrete jumps, whereas other influential parameters in the model can be adjusted in a more graded, real-valued way. Our use of 2/6 overlap (as opposed to, say, 1/6 or 3/6 overlap) for the Favila and Schlichting models was done out of convenience, and should not be interpreted as a strong statement that this particular level of overlap is necessary for obtaining differentiation; we could easily get the model to show differentiation given other overlap levels by adjusting other parameters.

      Finally, though the u-shaped learning rule is essential to this framework, the paper does little formal investigation of this learning rule. It seems obvious that allowing the u-shape to collapse too much toward a horizontal line would reduce the model's ability to account for empirical results, but there may be other more interesting features of the learning rule parameterization that are essential for the model to function properly.

      Given that the paper is already quite long, we have opted not to include further exploration of the parameters of the U-shaped learning rule in the paper. However, for the reviewer’s information, we report the effects of a few illustrative manipulations of these parameters below. As a general principle, the effects of these manipulations make sense in light of the theoretical framework described in the paper.

      For example, the parameter “DRevMag” controls the size of the negative “dip” in the U-shaped curve (more negative values = a larger dip). Given that this negative dip is essential for severing weights to competitors and causing differentiation, shifting DRevMag upwards towards zero should shift the balance of the model away from differentiation and towards integration. This is indeed what we observe, as shown in this parameter sweep from the Chanales simulation:

      Author response image 5.

      As another example: The “DRev” parameter controls where the U-shaped curve transitions from negative weight change to positive weight change. Lower values of DRev mean that the region of coactivity values leading to negative weight change will be smaller, and the region of coactivity values leading to positive weight change will be larger. As such, we would expect that lower values of DRev would bias the model toward integration. That is indeed the case, as shown in this parameter sweep from the Schlichting Blocked simulation:

      Author response image 6.
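
      The two parameter manipulations described above can be illustrated with a minimal sketch. The piecewise-linear form below, the placement of the trough halfway between the no-change threshold and the reversal point, and the names `d_thr`, `d_rev`, `d_rev_mag`, and `d_max_mag` are assumptions made for illustration only; they follow the verbal description in this response, and the exact function used in the model may differ.

```python
def u_shaped_dw(coactivity, d_thr=0.1, d_rev=0.6, d_rev_mag=-0.5, d_max_mag=1.0):
    """Hypothetical U-shaped rule mapping coactivity in [0, 1] to a weight change.

    d_thr:      below this coactivity nothing changes (assumed name)
    d_rev:      coactivity at which weakening gives way to strengthening
    d_rev_mag:  depth of the negative dip (more negative = larger dip)
    d_max_mag:  weight change at a coactivity of 1.0 (assumed name)
    Requires d_thr < d_rev < 1.
    """
    trough = (d_thr + d_rev) / 2.0  # assumption: dip is deepest at the midpoint
    if coactivity <= d_thr:
        return 0.0
    if coactivity <= trough:
        # descend linearly from 0 to the bottom of the dip
        return d_rev_mag * (coactivity - d_thr) / (trough - d_thr)
    if coactivity <= d_rev:
        # climb linearly from the bottom of the dip back to 0
        return d_rev_mag * (d_rev - coactivity) / (d_rev - trough)
    # strengthening region: climb linearly from 0 to d_max_mag
    return d_max_mag * (coactivity - d_rev) / (1.0 - d_rev)


def sweep(**params):
    """Return (fraction of coactivity values that weaken, deepest weight change)."""
    grid = [i / 100 for i in range(101)]
    changes = [u_shaped_dw(c, **params) for c in grid]
    weakened = sum(1 for dw in changes if dw < 0) / len(grid)
    return weakened, min(changes)


if __name__ == "__main__":
    for mag in (-0.5, -0.3, -0.1):
        print("d_rev_mag", mag, sweep(d_rev_mag=mag))
    for rev in (0.8, 0.6, 0.4):
        print("d_rev", rev, sweep(d_rev=rev))
```

      In this sketch, moving `d_rev_mag` toward zero makes the dip shallower, and lowering `d_rev` shrinks the range of coactivity values that produce weakening; both changes leave less room for the weight severing that drives differentiation.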

      There are a few other points that may limit the model's ability to clearly map onto or make predictions about empirical data. The model(s) seems very keen to integrate and do so more completely than the available empirical data suggest. For instance, there is a complete collapse of representations in half of the simulations in the Chanales et al model, and the blocked simulation in the Schlichting et al model also seems to produce nearly complete integration. Even if the Chanales et al paper had observed some modest behavioral attraction effects, this model would seem to over-predict integration. The authors somewhat implicitly acknowledge this when they discuss the difficulty of producing differentiation ("Practical Advice for Getting the Model to Show Differentiation") and not of producing integration, but don't address it head on.

      We thank the reviewer for this comment – R1 had a similar comment. We have added a new section to the Discussion to address this point (p. 30):

      “Reconciling the Prevalence of Differentiation in the Model and in the Data.

      A key lesson from our model is that, from a computational perspective, it is challenging to obtain differentiation effects: The region of parameter space that gives rise to differentiation is much smaller than the one that gives rise to integration (for further discussion of this issue, see the section in Methods on Practical Advice for Getting the Model to Show Differentiation). However, the fact that integration is more prevalent in our simulations across parameter configurations does not mean that integration will be more prevalent than differentiation in real-life circumstances. What really matters in predicting the prevalence of differentiation in real life is how the parameters of the brain map on to parameters of the model: If the parameters of the brain align with regions of model parameter space that give rise to differentiation (even if these regions are small), this would explain why differentiation has been so robustly observed in extant studies. Indeed, this is exactly the case that we sought to make above about the hippocampus – i.e., that its use of especially sparse coding and a high learning rate will give rise to the kinds of neural dynamics that cause differentiation (as opposed to integration). As another example, while it is true that half of the overlap conditions in our simulation of Chanales et al. (2021) give rise to integration, this does not imply that integration will occur half of the time in the Chanales et al. (2021) study; it may be that the levels of overlap that are actually observed in the brain in Chanales et al. (2021) are more in line with the levels of overlap that give rise to differentiation in our model.”

      Second, the authors' choice of strongly prewiring associations in the Chanales and Favila models makes it difficult to think about how their model maps onto experimental contexts where competition is presumably occurring while associations are only weakly learned. In the Chanales et al paper, for example, the object-face associations are not well learned in initial rounds of the color memory test. While the authors do justify their modeling choice and their reasons have merit, the manipulation of AX association strength in the Schlichting et al model also makes it clear that the association strength has a substantial effect on the model output. Given the effect of this manipulation, more clarity around this assumption for the other two models is needed.

      We thank the reviewer for bringing this up. We have edited the section entitled “A Note on Prewiring Representations” in the Methods to further justify our choice to prewire associations in the Chanales and Favila models (p. 37):

      “In our model, our practice of “prewiring” memory representations for the A and B pairmates serves two functions. In some cases, it is meant to stand in for actual training (as in the blocked / interleaved manipulation; the connections supporting the AX association are prewired to be stronger in the blocked condition than in the interleaved condition). However, the other, more fundamental role of prewiring is to ensure that the A and B input patterns evoke sparse distributed representations in the hidden layer (i.e., where some units are strongly active but most other units are inactive). In the real brain, this happens automatically because the weight landscape has been extensively sculpted by both experience and evolution. For example, in the real hippocampus, when the second pairmate is presented for the first time, it will evoke a sparse distributed representation in the CA3 subfield (potentially overlapping with the first pairmate’s CA3 representation) even before any learning of the second pairmate has occurred, due to the strong, sparse mossy fiber projections that connect the dentate gyrus to CA3 (McNaughton & Morris, 1987). As discussed above, we hypothesize that this initial, partial overlap between the second pairmate’s representation and the first pairmate’s representation can lead to pop-up of the unique features of the first pairmate’s representation, triggering learning that leads to differentiation or integration. In our small-scale model, we are effectively starting with a “blank brain”; in the absence of prewiring, the A and B inputs would activate overly diffuse representations that do not support these kinds of competitive dynamics. As such, prewiring in our model is necessary for proper functioning. The presence of prewired A and B representations should therefore not be interpreted as reflecting a particular training history (except in the blocked / interleaved case above); rather, these prewired representations constitute the minimum step we would take to ensure well-defined competitive dynamics in our small-scale model.

      The fact that connection strengths serve this dual function – sometimes reflecting effects of training (as in our simulation of Schlichting et al., 2015) and in other cases reflecting necessary prewiring – complicates the interpretation of these strength values in the model. Our view is that this is a necessary limitation of our simplified modeling approach – one that can eventually be surmounted through the use of more biologically-detailed architectures (see Limitations and Open Questions in the Discussion).”

      Overall, this is strong and clearly described work that is likely to have a positive impact on computational and empirical work in learning and memory. While the authors have written about some of the ideas discussed in this paper previously, a fully implemented and openly available model is a clear advance that will benefit the field. It is not easy to translate a high-level description of a learning rule into a model that actually runs and behaves as expected. The fact that the authors have made all their code available makes it likely that other researchers will extend the model in numerous interesting ways, many of which the authors have discussed and highlighted in their paper.

      Reviewer #3 (Public Review):

      This paper proposes a computational account for the phenomenon of pattern differentiation (i.e., items having distinct neural representations when they are similar). The computational model relies on a learning mechanism of the nonmonotonic plasticity hypothesis, fast learning rate and inhibitory oscillations. The relatively simple architecture of the model makes its dynamics accessible to the human mind. Furthermore, using similar model parameters, this model produces simulated data consistent with empirical data of pattern differentiation. The authors also provide insightful discussion on the factors contributing to differentiation as opposed to integration. The authors may consider the following to further strengthen this paper:

      The model compares different levels of overlap at the hidden layer and reveals that partial overlap seems necessary to lead to differentiation. While I understand this approach from the perspective of modeling, I have concerns about whether this is how the human brain achieves differentiation. Specifically, if we view the hidden layer activation as a conjunctive representation of a pair that is the outcome of encoding, differentiation should precede the formation of the hidden layer activation pattern of the second pairmate. Instead, the model assumes such a pattern already exists before differentiation. Maybe the authors indeed argue that mechanistically differentiation follows initial encoding that does not consider similarity with other memory traces?

      Related to the point above, because the simulation setup is different from how differentiation actually occurs, I wonder how valid the prediction of asymmetric reconfiguration of hidden layer connectivity pattern is.

      We thank the reviewer for this comment. In the revised manuscript, we have edited the “Note on Prewiring Representations” in the Methods to clarify how our assumptions about prewiring relate to what we really think is happening in the brain (p. 37):

      “In our model, our practice of “prewiring” memory representations for the A and B pairmates serves two functions. In some cases, it is meant to stand in for actual training (as in the blocked / interleaved manipulation; the connections supporting the AX association are prewired to be stronger in the blocked condition than in the interleaved condition). However, the other, more fundamental role of prewiring is to ensure that the A and B input patterns evoke sparse distributed representations in the hidden layer (i.e., where some units are strongly active but most other units are inactive). In the real brain, this happens automatically because the weight landscape has been extensively sculpted by both experience and evolution. For example, in the real hippocampus, when the second pairmate is presented for the first time, it will evoke a sparse distributed representation in the CA3 subfield (potentially overlapping with the first pairmate’s CA3 representation) even before any learning of the second pairmate has occurred, due to the strong, sparse mossy fiber projections that connect the dentate gyrus to CA3 (McNaughton & Morris, 1987). As discussed above, we hypothesize that this initial, partial overlap between the second pairmate’s representation and the first pairmate’s representation can lead to pop-up of the unique features of the first pairmate’s representation, triggering learning that leads to differentiation or integration. In our small-scale model, we are effectively starting with a “blank brain”; in the absence of prewiring, the A and B inputs would activate overly diffuse representations that do not support these kinds of competitive dynamics. As such, prewiring in our model is necessary for proper functioning. The presence of prewired A and B representations should therefore not be interpreted as reflecting a particular training history (except in the blocked / interleaved case above); rather, these prewired representations constitute the minimum step we would take to ensure well-defined competitive dynamics in our small-scale model.

      The fact that connection strengths serve this dual function – sometimes reflecting effects of training (as in our simulation of Schlichting et al., 2015) and in other cases reflecting necessary prewiring – complicates the interpretation of these strength values in the model. Our view is that this is a necessary limitation of our simplified modeling approach – one that can eventually be surmounted through the use of more biologically-detailed architectures (see Limitations and Open Questions in the Discussion).”

      Although, as the authors mentioned, there haven't been formal empirical tests of the relationship between learning speed and differentiation/integration, I am also wondering to what degree the prediction of fast learning being necessary for differentiation is consistent with current data. According to Figure 6, the learning rates that lead to differentiation in the 2/6 condition achieved differentiation after just one shot most of the time. On the other hand, Guo et al (2021), for example, showed that humans may need a few blocks of training and test to start showing differentiation.

      We thank the reviewer for mentioning this. We have added a paragraph to the “Differentiation Requires a High Learning Rate and Is Sensitive to Activity Dynamics” section of the Discussion that addresses this point (pp. 28-29):

      “Although the results from Wanjia et al. (2021) provide strong support for the model's prediction that differentiation will be abrupt, they raise another question: What explains variance across items in when this abrupt change takes place? The answer to this question remains to be seen, but one possibility is encoding variability: If we assume that participants stochastically sample (i.e., attend to) the features of the scene pairmates, it is possible that participants might initially fail to sample the features that distinguish the scene pairmates, which can be quite subtle – and if the distinguishing features of the pairmates are not represented in high-level visual regions (i.e., the pairmates are represented in these regions as having the same features), this could delay the onset of differentiation until the point at which the distinguishing features happen (by chance) to be sampled.”

      Related to the point above, the high learning rate prediction also seems to be at odds with the finding that the cortex, which has slow learning (according to the theory of complementary learning systems), also shows differentiation in Wammes et al (2022).

      We now address this point in the section of the Discussion entitled “Differentiation Requires a High Learning Rate and Is Sensitive to Activity Dynamics” (p. 27):

      “Our finding that differentiation requires a high learning rate suggests that differentiation will be more evident in the hippocampus than in neocortex, insofar as hippocampus is thought to have a higher learning rate than neocortex (McClelland et al., 1995). In keeping with this prediction, numerous studies have found differentiation effects in hippocampus but not in neocortical regions involved in sensory processing (e.g., Chanales et al., 2017; Favila et al., 2016; Zeithamova et al., 2018). At the same time, some studies have found differentiation effects in neocortex (e.g., Schlichting et al., 2015; Wammes et al., 2022). One possible explanation of these neocortical differentiation effects is that they are being “propped up” by top-down feedback from differentiated representations in the hippocampus.”

      More details about the learning dynamics would be helpful. For example, equation(s) showing how activation, learning rate and the NMPH function work together to change the weight of connections may be added. Without the information, it is unclear how each connection changes its value after each time point.

      We thank the reviewer for this comment. We have made two major changes to address this concern. First, we have edited the “Learning” section within “Basic Network Properties” in the main text (pp. 6-7):

      “Connection strengths in the model between pairs of connected units x and y were adjusted at the end of each trial (i.e., after each stimulus presentation) as a U-shaped function of the coactivity of x and y, defined as the product of their activations on that trial. The parameters of the U-shaped learning function relating coactivity to change in connection strength (i.e., weakening / strengthening) were specified differently for each projection where learning occurs (bidirectionally between the input and hidden layers, the hidden layer to itself, and the hidden to output layer). Once the U-shaped learning function for each projection in each version of the model was specified, we did not change it for any of the various conditions. Details of how we computed coactivity and how we specified the U-shaped function can be found in the Methods section.”

      Second, we have added the requested equations to the “Learning” part of the Methods (pp. 37-38):

      On the right side of the function, strong activation leads to strengthening of the connectivity, which I assume will lead to stronger activation on the next time point. The model has an upper limit of connection strength to prevent connections from strengthening too much. The same idea can be applied to the left side of the function: instead of having two turning points, it can be a linear function such that low activation keeps weakening the connection until the lower limit is reached. This way the NMPH function can take a simpler form (e.g., two line segments if you think the weakening and strengthening take different rates) and may still simulate the data.

      We thank the reviewer for mentioning this. We have added a new paragraph in the “Learning” section of the Methods to justify the particular shape of the learning curve (pp. 38-39):

      “Evidence for the U-shaped plasticity function used here (where low activation leads to no change, moderate activation leads to weakening, and higher levels of activation lead to strengthening) was previously reviewed in Ritvo et al. (2019). In brief, there are three lines of work that support the U shape: First, multiple neurophysiological studies have found that moderate postsynaptic depolarization leads to synaptic weakening and higher levels of depolarization lead to synaptic strengthening (e.g., Artola et al., 1990; Hansel et al., 1996). Second, human neuroscience studies have used pattern classifiers, applied to fMRI and EEG data, to measure memory activation, and have related this measure to subsequent memory accessibility; several studies using this approach have found that low levels of activation lead to no change in memory strength, moderate levels of activation lead to impaired subsequent memory, and higher levels of activation lead to increased subsequent memory (e.g., Newman and Norman, 2010; Detre et al., 2013; Kim et al., 2014; for related findings, see Lewis-Peacock and Norman, 2014; Wang et al., 2019). Third, a recent human fMRI study by Wammes et al. (2022) manipulated memory activation by varying the visual similarity of pairmates and observed a U-shaped function relating visual similarity to representational change in the hippocampus, whereby low levels of pairmate similarity were associated with no change, moderate levels of similarity were associated with differentiation, and the differentiation effect went away at higher levels of similarity.”

      We have also included a pointer to this new paragraph in the “Nonmonotonic Plasticity Hypothesis” section of Introduction (p. 2):

      “(for further discussion of the empirical justification for the NMPH, see the Learning subsection in the Methods)”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A few additional minor things about data presentation and the like:

      (1) Figure 1 legend - a more general description of how to interpret the figure might be helpful for more naive readers (e.g., explaining how one can visualize in the schematic that there is overlap in the hidden layer between A and B). Also, from the Figure 1 depiction, it's not clear what is different about the setup from the initial left hand side panels in A, B, C, to make it such that activity spreads strongly to A in panel A, weakly in panel B, and not at all in panel C since the weights are the same. Is there a way to incorporate this into the graphic, or describe it in words?

      To address this point, we have added the following text to the Figure 1 caption (p. 3):

      “Note that the figure illustrates the consequences of differences in competitor activation for learning, without explaining why these differences would arise. For discussion of circumstances that could lead to varying levels of competitor activation, see the simulations described in the text.”

      (2) I believe not all of the papers cited on lines 193-195 actually have similarity manipulations in them. I'd recommend double checking this list and removing those less relevant to the statement.

      Thank you for pointing this out; we have removed the Ballard reference and we have clarified what we mean by similarity reversal (p. 7):

      “The study was inspired by recent neuroimaging studies showing “similarity reversals”, wherein stimuli that have more features in common (or share a common associate) show less hippocampal pattern similarity (Favila et al., 2016; Schlichting et al., 2015; Molitor et al., 2021; Chanales et al., 2017; Dimsdale-Zucker et al., 2018; Wanjia et al., 2021; Zeithamova et al., 2018; Jiang et al., 2020; Wammes et al., 2022).”

      (3) I wanted a bit more detail about how the parameters were set in the main paper, not just in the methods. Even something as brief as noting that model fitting was done by hand by tweaking parameters to re-create the empirical patterns (if I'm understanding correctly) would have been helpful for me.

      To address this point, we have added the following text under “Basic Network Properties” (p. 4):

      “Our goal was to qualitatively fit key patterns of results from each of the aforementioned studies. We fit the parameters of the model by hand as they are highly interdependent (see the Methods section for more details).”

      (4) In Figure 4E, it would be helpful to describe the x and y axes of the MDS plots in the legend.

      To address this point, we have added the following new text to the Figure 4 caption that clarifies how the MDS plots were generated (p. 11):

      “MDS plots were rotated, shifted, and scaled such that pairmate 1_before is located at (0,0), pairmate 2_before is located directly to the right of pairmate 1_before, and the distance between pairmate 1_before and pairmate 2_before is proportional to the baseline distance between the pairmates.”
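
      The alignment described in this caption can be sketched as a similarity transform on 2-D coordinates. This is an illustrative sketch, not the authors' plotting code: the function name `align_mds`, the point labels, and the choice to pass the target distance as `baseline_distance` are all assumptions. Reflections, which MDS solutions also leave undetermined, are not handled here.

```python
import math


def align_mds(points, anchor1, anchor2, baseline_distance):
    """Shift, rotate, and scale 2-D points (dict of name -> (x, y)).

    After the transform, points[anchor1] sits at (0, 0) and points[anchor2]
    sits on the positive x-axis at distance baseline_distance.
    """
    x1, y1 = points[anchor1]
    x2, y2 = points[anchor2]
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("anchors coincide, so the orientation is undefined")
    cos_t, sin_t = dx / length, dy / length  # direction from anchor1 to anchor2
    scale = baseline_distance / length

    aligned = {}
    for name, (x, y) in points.items():
        sx, sy = x - x1, y - y1          # shift so anchor1 is the origin
        rx = cos_t * sx + sin_t * sy     # rotate so anchor2 lands on the x-axis
        ry = -sin_t * sx + cos_t * sy
        aligned[name] = (scale * rx, scale * ry)
    return aligned


if __name__ == "__main__":
    # Hypothetical MDS coordinates before and after learning.
    raw = {
        "p1_before": (1.0, 1.0),
        "p2_before": (1.0, 3.0),
        "p1_after": (0.0, 1.0),
        "p2_after": (2.0, 3.0),
    }
    for name, xy in align_mds(raw, "p1_before", "p2_before", 1.0).items():
        print(name, xy)
```

      Because the same shift, rotation, and scale are applied to every point, relative distances among the "before" and "after" positions are preserved up to a common factor.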

      (5) Figure 6 - at first I thought the thicker line was some sort of baseline, but I think it is just many traces on top of one another. If other readers may be similarly confused, perhaps this could be stated.

      Thanks for this comment. We have updated Figure 6 (p. 16).

      We have also updated the caption.

      (6) I am having a lot of difficulty understanding the terms "competitor-to-competitor," "competitor-to-target/shared," and "target/shared-to-target/shared," and therefore I don't fully get Figure 5. I think it might be helpful to expand the description of these terms where they are first introduced in the paper (p. 13?). I think I am missing something crucial here, and I am not quite sure what that is, which I know is not very helpful! But, to narrate my confusion a bit, I thought that these terms would somehow relate to connections between different connections of the network. For example, is competitor-to-competitor within the hidden layer? Or is this somehow combining across relevant connections that might span different pairs of layers in the model? And, I really have no idea why it is "target/shared."

      Thank you for these comments. We have updated Figure 5 and we have also made several changes to the main text and the figure caption to address these points.

      Changes to the main text (p. 13):

      “Whether symmetric or asymmetric integration occurs depends on the relative strengths of connections between pairs of unique competitor units (competitor-competitor connections) compared to connections between unique competitor units and shared units (competitor-shared connections) after the first trial (Figure 5; note that the figure focuses on connections between hidden units, but the principle also applies to connections that span across layers). Generally, coactivity between unique competitor units (competitor-competitor coactivity) is less than coactivity between unique competitor units and shared units (competitor-shared coactivity), which is less than coactivity between unique target units and shared units (target-shared coactivity).”
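
      A small worked example may help with the ordering of coactivities stated in this passage. Since coactivity is defined as the product of two units' activations, the ordering follows whenever unique competitor units are less active than both shared units and unique target units. The activation values below are hypothetical and chosen only for illustration.

```python
def coactivity(act_x, act_y):
    """Coactivity of two units, defined as the product of their activations."""
    return act_x * act_y


# Hypothetical activation levels on one trial (assumed values, not model output).
target_act = 1.0       # unique target units: strongly active
shared_act = 0.9       # shared units: strongly active
competitor_act = 0.4   # unique competitor units: moderately active

competitor_competitor = coactivity(competitor_act, competitor_act)
competitor_shared = coactivity(competitor_act, shared_act)
target_shared = coactivity(target_act, shared_act)

if __name__ == "__main__":
    print("competitor-competitor:", competitor_competitor)
    print("competitor-shared:    ", competitor_shared)
    print("target-shared:        ", target_shared)
```

      Under a U-shaped rule, these three connection types can therefore fall in different regions of the learning function on the same trial, which is why their relative strengths after the first trial can diverge.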

      (7) Relatedly, in Figure 13, I understand how some competitor-to-target/shared connections could be spared in the bottom instance given panel B. However, I'm struggling to understand how that relates to the values in the corresponding chart in panel A. What about panel A, bottom (vs. the top) means lower coactivities between some competitor-to-target/shared? Is it because if the noise level is higher, the "true" activation of competitor-to-target/shared connections is weaker? I think again, I'm missing something critical here, and I wonder if other readers may be in the same situation. (I know the authors described this also on p. 36, but I'm still confused!)

      We have updated Figure 13 to clarify these points.

      (8)  In Figure 9, I believe there is no caption for panel D. Also, it looks as though the item unit active for A and B is the same. I wonder if this is an error?

      Thank you for catching these errors! They have both been fixed.

      Reviewer #2 (Recommendations For The Authors):

      -Perhaps I missed it, but I think defining coactivity (how it is computed) in the main text would be useful for readers, as this is critical for understanding the model. I did find it in the methods.

      We thank the reviewer for this suggestion. We have updated the “Learning” section within “Basic Network Properties” in the main text to address this point (pp. 6-7):

      “Connection strengths in the model between pairs of connected units x and y were adjusted at the end of each trial (i.e., after each stimulus presentation) as a U-shaped function of the coactivity of x and y, defined as the product of their activations on that trial. The parameters of the U-shaped learning function relating coactivity to change in connection strength (i.e., weakening / strengthening) were specified differently for each projection where learning occurs (bidirectionally between the input and hidden layers, the hidden layer to itself, and the hidden to output layer). Once the U-shaped learning function for each projection in each version of the model was specified, we did not change it for any of the various conditions. Details of how we computed coactivity and how we specified the U-shaped function can be found in the Methods section.”
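
      As a rough illustration of the learning rule described in the quoted passage, the following minimal sketch computes coactivity as the product of two activations and maps it to a weight change through a piecewise-linear U-shaped function. The function names, the piecewise-linear form, and all breakpoint and magnitude values are hypothetical placeholders chosen for illustration; the actual parameterization for each projection is given in the Methods section of the manuscript.

      ```python
      def coactivity(act_x: float, act_y: float) -> float:
          """Coactivity of units x and y on a trial: the product of their activations."""
          return act_x * act_y


      def u_shaped_delta(coact: float,
                         low: float = 0.2,      # hypothetical: below this, no change
                         trough: float = 0.4,   # hypothetical: point of maximal weakening
                         high: float = 0.6,     # hypothetical: crossover to strengthening
                         max_weaken: float = -0.1,
                         max_strengthen: float = 0.1) -> float:
          """Piecewise-linear U-shaped map from coactivity to change in connection strength.

          Low coactivity gives no change, moderate coactivity gives weakening,
          and high coactivity gives strengthening.
          """
          c = min(max(coact, 0.0), 1.0)  # clamp to the valid coactivity range
          points = [(0.0, 0.0), (low, 0.0), (trough, max_weaken),
                    (high, 0.0), (1.0, max_strengthen)]
          # Linear interpolation between consecutive breakpoints.
          for (x0, y0), (x1, y1) in zip(points, points[1:]):
              if x0 <= c <= x1:
                  t = (c - x0) / (x1 - x0)
                  return y0 + t * (y1 - y0)
          return 0.0  # unreachable for c in [0, 1]; kept as a safe default


      if __name__ == "__main__":
          for ax, ay in [(0.3, 0.3), (0.5, 0.8), (0.9, 1.0)]:
              c = coactivity(ax, ay)
              print(f"coactivity={c:.2f} -> delta_w={u_shaped_delta(c):+.3f}")
      ```

      Under these assumed settings, weakly coactive pairs are left unchanged, moderately coactive pairs are weakened, and strongly coactive pairs are strengthened, which is the qualitative pattern the text attributes to the U-shaped function.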

      -The modeling results in the different face condition are at odds with the data for the Favila et al model (they observe some differentiation in the paper and the model predicts no change). This could be due to a number of unmodeled factors, but it is perhaps worth noting.

      Thank you for pointing this out. It is possible to better capture the pattern of results observed by Favila et al. in their paper (with some differentiation in the different-face condition and even more differentiation in the same-face condition) by slightly adjusting the model parameters (specifically, by setting the oscillation amplitude Osc for the hidden layer to .1 instead of .067).

      Rather than replacing the old (Osc = .067) results in the paper, which would entail re-making the associated videos, etc., we have added a supplementary figure (Figure 8 - Supplement 1; see p. 45).

      We also added new text to the Favila Results, under “Differentiation and Integration” (p. 20):

      “Note also that the exact levels of differentiation that are observed in the different-face and same-face conditions are parameter dependent; for an alternative set of results showing some differentiation in the different-face condition (but still less than is observed in the same-face condition), see Figure 8 - Supplement 1.”

      -Related to my comment in the public review about pre-wiring associations, in the caption for Figure 9 (Schlichting model), the authors report "In both conditions, the pre-wired connection linking the "item B" hidden units to the "item X" output unit is set to .7. In the interleaved condition, the connection linking the "item A" hidden units to the "item X" output unit is set to .8, to reflect some amount of initial AX learning. In the blocked condition, the connection linking the "item A" hidden units to the "item X" output unit is set a higher value (.999), to reflect extra AX learning." What are the equivalent values for the other models, especially the Favila model since the structure is the same as Schlichting? I understood all the "strong" connections to be .99 unless otherwise stated. If that's the case, I don't understand why the blocked Schlichting model and the Favila model produce opposite effects. More clarity would be useful here.

      We have added a new paragraph to the results section for the Schlichting model (under “Differentiation and Integration”) to clarify why the blocked Schlichting model and the Favila model show different results (p. 24):

      “Note that the key feature driving integration in the blocked condition of this simulation is not the high strength of the connection from X to A on its own – rather, it is the asymmetry in the pretrained connection strengths from X to A (.999) and from X to B (.7). This asymmetry, which is meant to reflect the extensive training on A-X that occurred before the initial presentation of B-X, results in the A-X hidden representation decisively winning the competition during B-X presentation, which then leads to the B input also being linked to this representation (i.e., integration). It is instructive to compare this to the same-face condition from our simulation of Favila et al. (2016): In that simulation, the two pairmates are also linked strongly (.99 initial connection strength) to a shared associate, but in that case the connections are equally strong, so there is more balanced competition -- in this case, the competitor representation only comes to mind moderately (instead of displacing the target representation), so the result is differentiation instead of integration.”

      -The meaning of the different colored dots in Figure 5 is a bit hard to keep track of, even given the legend labels. The figure might benefit from a model sketch highlighting each of the different coactivity types. The left side of Fig 13 was useful but again somehow mapping on the colors would help further. Another note on these figures: what does having two dots of each color mean? Is it just an illustration of the variance? There would be more dots if there was one dot per coactivity value.

      We have updated Figure 5 and Figure 13 to clarify these points (including a clarification that the dots only represent a subset of the possible pairings between units).

      -While I appreciate the goal of the paper is to account for these three studies, readers who aren't familiar with or specifically interested in these studies may appreciate a small amount of intuition on why formalizing unsupervised learning models may be broadly important for computational investigations of learning/memory/cognition.

      We have added the following text under “Basic Network Properties” in the Introduction to address this point (p. 4):

      “Achieving a better understanding of unsupervised learning is an important goal for computational neuroscience, given that learning agents have vastly more opportunities to learn in an unsupervised fashion than from direct supervision (for additional discussion of this point, see, e.g., Zhuang et al., 2021).”

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this useful study, the authors report the efficacy, hematological effects, and inflammatory response of the BPaL regimen (containing bedaquiline, pretomanid, and linezolid) compared to a variation in which Linezolid is replaced with the preclinical development candidate spectinamide 1599, administered by inhalation in tuberculosis-infected mice. The authors provide convincing evidence that supports the replacement of Linezolid in the current standard of care for drug-resistant tuberculosis. However, a limitation of the work is the lack of control experiments with bedaquiline and pretomanid only, to further dissect the relevant contributions of linezolid and spectinamide in efficacy and adverse effects.

      We acknowledge that the lack of bedaquiline and pretomanid monotherapy groups is a limitation of our study; however, similar studies examining the contribution of bedaquiline and pretomanid to BPaL have already been published (references #4 and #60 in revised manuscript). Our goal was to compare BPaS with BPaL, with the understanding that TB treatment requires multidrug therapy. We omitted monotherapy groups to reduce the complexity of the studies, because the multidrug groups require very large numbers of animals and very intensive and complex dosing schedules. Even if B or Pa alone had better efficacy than the BPa or BPaL combination, patients would not be treated with only B or Pa because of the very high risk of developing resistance to B and/or Pa. If resistance to B or Pa develops, the field will lose very effective drugs against TB.

      Although the manuscript is well written overall, a re-formulation of some of the stated hypotheses and conclusions, as well as the addition of text to contextualize translatability, would improve value.

      The manuscript has been edited to address these critiques. Answers to individual critiques are below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript is an extension of previous studies by this group looking at the new drug spectinamide 1599. The authors directly compare therapy with BPaL (bedaquiline, pretomanid, linezolid) to a therapy that substitutes spectinamide for linezolid (BPaS). The Spectinamide is given by aerosol exposure and the BPaS therapy is shown to be as effective as BPaL without adverse effects. The work is rigorously performed and analyses of the immune responses are consistent with curative therapy.

      Strengths:

      (1) This group uses 2 different mouse models to show the effectiveness of the BPaS treatment.

      (2) Impressively the group demonstrates immunological correlates associated with Mtb cure with the BPaS therapy.

      (3) Linezolid is known to inhibit ribosomes and mitochondria whereas spectinamide does not. The authors clearly demonstrate the lack of adverse effects of BPaS compared to BPaL.

      Weaknesses:

      (1) Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

      We have already reported the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and its delivery from an RS01 Plastiape inhaler device (reference #59 in revised manuscript). To address this critique, we added a last paragraph to the Discussion: “It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler” (reference #59 in revised manuscript).

      Reviewer #2 (Public Review):

      Summary:

      Replacing linezolid (L) with the preclinical development candidate spectinamide 1599, administered by inhalation, in the BPaL standard of care regimen achieves similar efficacy, and reduces hematological changes and proinflammatory responses.

      Strengths:

      The authors not only measure efficacy but also quantify histological changes, hematological responses, and immune responses, to provide a comprehensive picture of treatment response and the benefits of the L to S substitution.

      The authors generate all data in two mouse models of TB infection, each reproducing different aspects of human histopathology.

      Extensive supplementary figures ensure transparency. 

      Weaknesses:

      The articulation of objectives and hypotheses could be improved.

      We edited the text to read: “The AEs were associated with the long-term administration of the protein synthesis inhibitor linezolid. Spectinamide 1599 (S) is also a protein synthesis inhibitor of Mycobacterium tuberculosis with an excellent safety profile, but which lacks oral bioavailability. Here, we propose to replace L in the BPaL regimen with spectinamide administered via inhalation and we demonstrate that inhaled spectinamide 1599, combined with BPa (the BPaS regimen), has similar efficacy to that of the BPaL regimen while simultaneously avoiding the L-associated AEs.”

      Reviewer #3 (Public Review):

      Summary:

      In this paper, the authors sought to evaluate whether the novel TB drug candidate, spectinamide 1599 (S), given via inhalation to mouse TB models, and combined with the drugs B (bedaquiline) and Pa (pretomanid), would demonstrate similar efficacy to that of BPaL regimen (where L is linezolid). Because L is associated with adverse events when given to patients long-term, and one of those is associated with myelosuppression (bone marrow toxicity) the authors also sought to assess blood parameters, effects on bone marrow, immune parameters/cell effects following treatment of mice with BPaS and BPaL. They conclude that BPaL and BPaS have equivalent efficacy in both TB models used and that BPaL resulted in weight loss and anemia (whereas BPaS did not) under the conditions tested, as well as effects on bone marrow.

      Strengths:

      The authors used two mouse models of TB that are representative of different aspects of TB in patients (which they describe well), intending to present a fuller picture of the activity of the tested drug combinations. They conducted a large body of work in these infected mice to evaluate efficacy and also to survey a wide range of parameters that could inform the effect of the treatments on bone marrow and on the immune system. The inclusion of BPa controls (in most studies) and also untreated groups led to a large amount of useful data that has been collected for the mouse models per se (untreated) as well as for BPa - in addition to the BPaS and BPaL combinations which are of particular interest to the authors. Many of these findings related to BPa, BPaL, untreated groups, etc corroborate earlier findings and the authors point this out effectively and clearly in their manuscript. To go further, in general, it is a well-written and cited article with an informative introduction.

      Weaknesses:

      The authors performed a large amount of work with the drugs given at the doses and dosing intervals stated, but at present, there is no exposure data available in the paper. It would be of great value to understand the exposures achieved in plasma at least (and in the lung if more relevant for S) in order to better understand how these relate to clinical exposures that are observed at marketed doses for B, Pa, and L as well as to understand the exposure achieved at the doses being evaluated for S. If available as historical data this could be included/cited. Considering the great attempts made to evaluate parameters that are relevant to clinical adverse events, it would add value to understand at what exposures drug effects such as anemia, weight loss, and bone marrow effects are being observed. It would also be of value to add an assessment of whether the weight loss, anemia, or bone marrow effects observed for BPaL are considered adverse, and the extent to which we can translate these effects from mouse to patient (i.e. what are the limitations of these assessments made in a mouse study?). For example, is the small weight loss seen as significant, or is it reversible? Is the magnitude of the changes in blood parameters similar to the parameters seen in patients given L? In addition, it is always challenging to interpret findings for combinations of drugs, so the addition of language to explain this would add value: for example, how confident can we be that the weight loss seen for only the BPaL group is due to L as opposed to a PK interaction leading to an elevated exposure and weight loss due to B or Pa?

      We fully agree with this critique, but the studies suggested by the reviewer are very expensive and logistically and resource intensive. The data reported in this manuscript were used as preliminary data in an R01 application to NIH-NIAID that included the studies proposed above by this reviewer. The authors are glad to report that the application received a fundable score and is currently under consideration for funding by NIH-NIAID. A summary of the proposed future studies is included in the last paragraph of the Discussion in the revised manuscript.

      Turning to the evaluations of activity in mouse TB models, unfortunately, the evaluations of activity in the BALB/c mouse model as well as the spleens of the Kramnik model resulted in CFU below/at the limit of detection and so, to this reviewer's understanding of the data, comparisons between BPaL and BPaS cannot be made and so the conclusion of equivalent efficacy in BALB/c is not supported with the data shown. There is no BPa control in the BALB/c study, therefore it is not possible to discern whether L or S contributed to the activity of BPaL or BPaS; it is possible that BPa would have shown the same efficacy as the 3 drug combinations. It would be valuable to conduct a study including a BPa control and with a shorter treatment time to allow comparison of BPa, BPaS, and BPaL. 

      We agree with the reviewer that these studies need to be done. Some of them were recently published by our colleague Dr. Lyons (reference #60 in revised manuscript). The studies proposed by the reviewer will be performed under a new award under consideration for funding by NIH-NIAID; a summary of the future studies is included in the last paragraph of the Discussion in the revised manuscript.

      In the Kramnik lungs, as the authors rightly note, the studies do not support any contribution of S or L to BPa - i.e. the activity observed for BPa, BPaL, and BPaS did not significantly differ. Although the conclusions note equivalency of BPaL and BPaS, which is correct, it would be helpful to also include BPa in this statement;

      We edited the text and now include BPa in this statement (line 191), as requested.

      It would be useful to conduct a study dosing for a longer period of time or assessing a relapse endpoint, where it is possible that a contribution of L and/or S may be seen - thus making a stronger argument for S contributing an equivalent efficacy to L. The same is true for the assessment of lesions - unfortunately, there was no BPa control meaning that even where equivalency is seen for BPaL and BPaS, the reader is unable to deduce whether L or S made a contribution to this activity.

      This has been added to the future plans in the last paragraph of the Discussion:

      “Future studies are already under consideration for funding by NIH-NIAID to understand the pharmacokinetics of mono, binary and ternary combinations of BPaS. These studies also aim to identify the optimal dose level and dosing frequency of each regimen along with their efficacy and relapse-free sterilization potential. Studies are also planned using a model-based pharmacokinetic-pharmacodynamic (PKPD) framework, guided by an existing human BPa PKPD model (reference #61 in revised manuscript), to find allometric human dose levels, dosing frequencies and treatment durations that will inform the experimental design of future clinical studies.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

      A last paragraph was added to the Discussion: “It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler”. We have already reported the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and its delivery from an RS01 Plastiape inhaler device (reference #59 in revised manuscript).

      Reviewer #2 (Recommendations For The Authors):

      Major comments

      The Abstract lacks focus and could more clearly convey the key messages.

      Edited as requested 

      The two mouse models and why they were chosen need to be described earlier. Currently, it's covered in the first section of the Discussion, but the reader needs to understand the utility of each model in answering the questions at hand before the first results are described, either in the introduction or in the opening section of the results.

      Thank you for the suggestion; we agree. We moved the first paragraph of the Discussion to the end of the Introduction.

      Line 130: Please justify the doses and dosing frequency for S. A reference to a published manuscript could suffice if compelling.

      The doses and regimens were previously reported by our groups in references #21 and #22 of the revised manuscript:

      (21) Robertson GT, Scherman MS, Bruhn DF, Liu J, Hastings C, McNeil MR, et al. Spectinamides are effective partner agents for the treatment of tuberculosis in multiple mouse infection models. J Antimicrob Chemother. 2017;72(3):770–7.

      (22) Gonzalez-Juarrero M, Lukka PB, Wagh S, Walz A, Arab J, Pearce C, et al. Preclinical Evaluation of Inhalational Spectinamide-1599 Therapy against Tuberculosis. ACS Infect Dis. 2021;7(10):2850–63. 

      Figures 1 E to H: several "ns" are missing, please add them.

      Edited as requested 

      Line 184 to 190: suggest moving the body weight plots to a Supplemental Figure, and at least double the size of the histology images to convey the message of lines 192-203.

      Please include higher magnification insets to illustrate the histopathological findings. In that same section, please add a sentence or two describing the lesion scoring concept/method. It is a nice added feature, not widespread in the field, and deserves a brief description.

      Edited as requested. We added a detailed description of the scoring method in the Materials and Methods under histopathology and lesion scoring.

      Line 206: please add an introductory sentence explaining why one would expect S to cause (or not) hematological disruption, and why MCHC and RDW were chosen initially (they are markers of xyz). The first part of Figure 3 legend belongs to the Methods.

      To address this critique, we added in lines 225-226: “The effect of L in the blood profile of humans and mouse has been reported (references #38-42 in revised manuscript) but the same has not been reported for S”. In lines 229-230 we added: “Of 20-blood parameters evaluated, two blood parameters were affected during treatment”.

      The first part of Figure 3 legend belongs to the Methods.

      We edited the Figure 3 legend to read: “During therapy of mice in Figure 1, the blood was collected at 1, 2- and 4-weeks posttreatment. The complete blood count was collected in VETSCAN® HM5 hematology analyzer (Zoetis)”.

      Line 218: please explain why the 4 blood parameters that are shown were selected, out of the 20 parameters surveyed.

      We added an explanation in lines 239-240: “out of 20 blood parameters evaluated, a total of four blood parameters were affected at 2 and 4-weeks-of treatment”.

      Line 243 and again Line 262 (similar to comment Line 206): please add an introductory paragraph explaining the motivation to conduct this analysis and the objective. Can the authors put the experiment in the context of their hypothesis?

      To address this critique, we added in lines 235-237: “The Nix-TB trial associated the long-term administration of L within the BPaL regimen as the causative agent resulting in anemia in patients treated with the BPaL regimen (5).”

      Figure 4C (and the plasma and lung equivalent in the SI). This figure needs adequate labeling of axes: X axis = LOG CFU? Please add tick marks for all plots since log CFU is only shown for the bottom line. Y axes have no units: pg/mL as in B?

      The figure legends were edited to add (Y axis: pg/mL) and (X axis: log10 CFU).

      Line 255-256: please remove "pronounced" and "profound". There is a range of CFU reduction and cytokine reduction, from minor to major. The correlation trend is clear and those words are not needed.

      Edited as requested 

      Line 277-289, Figure 6: given the heterogeneity of a C3HeB/FeJ mouse lung (TB infected), and the very heterogeneous cell population distribution in these lungs (Fig. 6A), the validity of whole lung analysis on 2 or 3 mice (the legend should state what 1, 2 and 3 means, individual mice?) is put into question. "F4/80+ cells were observed significantly higher in BPaS compared to UnRx control": Figure S14 suggests a statistically significant difference, but nothing is said about the other cell type, which appears just as much reduced in BPaS compared to UnRx as F4/80+. Overall, sampling the whole lung for these analyses should be mentioned as a limitation in the Discussion.

      We agree with the reviewer that, visually, populations other than F4/80+ appear to differ significantly. We re-ran the two-way ANOVA with Tukey's test, and only the comparison between BPaS and UnRx for F4/80 is significant.

      We edited Figure S16 (previously S14) to add "ns" for every comparison.

      Figure 6A was edited: n = 2 mice for UnRx and n = 3 mice each for BPaL and BPaS.

      Line 355-360: "The BPa and BPaL regimens altered M:E in the C3HeB/FeJ TB model by suppressing myeloid and inducing erythroid lineages" This suggests that altered M:E is not associated with L, putting into question the comparison between BPaS, BPaL, and UnRx. Can the authors comment on how M:E is altered in BPa and not in BPaS?

      Our interpretation of this result is that the addition of S in our BPaS regimen was capable of restoring the M:E ratio altered by BPa and BPaL. This interpretation is included in the main text in lines 263-264 and has now also been added to the abstract.

      Line 379: discuss the limitations of working with whole lungs.

      We are sorry, but we do not understand this request. In our studies we always work with whole lungs when the expected course of histopathology/infection among lung lobes is highly variable (as is the case in the C3HeB/FeJ TB model).

      Concluding paragraph: "Here we present initial results that are in line with these goals." If such a bold claim is made, there needs to be a discussion on the translatability of the route of administration and the dose of S. Otherwise, please rephrase.

      We added the following last paragraph to discussion:

      To conclude, the TB drug development field is working towards developing shorter and safer therapies with a common goal of developing new multidrug regimens of low pill burden that are accessible to patients, of short duration (ideally 2-3 months) and consist of 3-4 drugs of novel mode-of-action with proven efficacy, safety, and limited toxicity. Here we present initial results for new multidrug regimens containing inhaled spectinamide 1599 that are in line with these goals. It is proposed that human use of spectinamides 1599 will be administered using a dry powder formulation delivered by the RS01 Plastiape dry powder inhaler. We already reported on the aerodynamic properties of dry powder spectinamide 1599 within #3 HPMC capsules and delivered from a RS01 Plastiape inhaler device (reference #59 in revised manuscript). Future studies are already under consideration for funding by NIH-NIAID to understand the pharmacokinetics of mono, binary and ternary combinations of BPaS. These studies also aim to identify the optimal dose level and dosing frequency of each regimen along with their efficacy and relapse-free sterilization potential. Studies are also planned using a model-based pharmacokinetic-pharmacodynamic (PKPD) framework, guided by an existing human BPa PKPD model (references #60 and 61 in revised manuscript), to find allometric human dose levels, dosing frequencies and treatment durations that will inform the experimental design of future clinical studies.

      Minor edits

      Adverse events, not adverse effects (side effects)

      Edited as requested

      BALB/c (not Balb/c, please change throughout).

      Edited as requested

      Line 92: replace 'efficacy' with potency or activity.

      Edited as requested

      "Live" body weight: how is that different from "body weight"? Suggest deleting "live" throughout, or replace with "longitudinally recorded" if that's what is meant, although this is generally implied.

      Edited as requested

      The last line of Figure 2 legend is disconnected. 

      Line 331: delete "human".

      Edited as requested

      Reviewer #3 (Recommendations For The Authors):

      We thank the reviewer for these suggestions. The data presented in this manuscript, covering 4 weeks of treatment along with monitoring of the effects of therapy on blood, bone marrow, and immunity, have been submitted as part of an R01 application to NIH-NIAID, which has received a fundable score and is under funding consideration. All the points suggested by the reviewer(s) here are included in the research proposed in the R01 application, including manufacturing and physicochemical characterization of larger-scale batches of spectinamide dry powders and evaluation of their aerodynamic performance for human or animal use, as well as pharmacokinetic and efficacy studies to determine the optimal dose level and dosing frequency for new multidrug regimens containing spectinamides. These studies include mono, binary, and ternary combinations of each multidrug regimen along with their efficacy and relapse-free sterilization potential. These studies will also develop PK/PD simulation-based allometric scaling to aid in human dose projections for inhalation. We hope the reviewer will understand that, altogether, these studies will take 4-5 years.

      Although I truly appreciate the great efforts of the authors, I suggest that in order to better evaluate the contribution of S versus L to BPa in these models, repeat studies be run that:

      (a) include BPa groups to allow the contribution of S and L to be assessed.

      This is included in the research proposed in the R01 application mentioned above.

      (b) use shorter treatment times in BALB/c to allow comparisons at end of Tx CFU above the LOD.

      We have added new data for 2 weeks of treatment with BPaL and BPaS in BALB/c mice infected with Mtb, which had been removed from the previous submission of this manuscript.

      (c) use longer treatment times and ideally a relapse endpoint in Kramnik to allow assessment of L and S as contributors to BPa (i.e. give a chance to see better efficacy of BPaL or BPaS versus BPa) and also measure plasma exposures of all drugs (or lung levels if this is the translatable parameter for S) to allow detection of any large DDI and also understand the translation to the clinic. Related to the safety parameters, it would be really great to understand whether or not the observations for BPaL would be labeled adverse in a toxicology study/in a clinical study, and it would be useful to include information on the magnitude of observations seen here versus in the clinic (eg for the hematological parameters).

      The research proposed in the R01 application mentioned above includes extensive PK studies, extended treatment periods beyond 1 month (2-5 months, as needed to reach culture-negative organs), and of course relapse studies.

      Minor point: I suggest rewording "high safety profile" when describing spectinamides in the intro - or perhaps qualify the length of dosing where the drug is well tolerated

      "high safety profile" was replaced by “an acceptable safety profile”

    2. eLife assessment

      In this useful study, the authors report the efficacy, hematological effects, and inflammatory response of the BPaL regimen (containing bedaquiline, pretomanid, and linezolid) compared to a variation in which Linezolid is replaced with the preclinical development candidate spectinamide 1599, administered by inhalation in tuberculosis-infected mice. The authors provide convincing evidence that supports the replacement of Linezolid in the current standard of care for drug-resistant tuberculosis. The work will be of interest to those studying tuberculosis treatment regimens.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript entitled "A Modified BPaL Regimen for Tuberculosis Treatment replaces Linezolid with Inhaled Spectinamides" by Malik Zohaib Ali et al. is an extension of previous studies by this group looking at the new drug spectinamide 1599. The authors directly compare therapy with BPaL (bedaquiline, pretomanid, linezolid) to a therapy that substitutes spectinamide for linezolid (BPaS). The Spectinamide is given by aerosol exposure and the BPaS therapy is shown to be as effective as BPaL without adverse effects. The work is rigorously performed and analyses of the immune responses are consistent with curative therapy.

      Strengths:

      (1) This group uses 2 different mouse models to show the effectiveness of the BPaS treatment.

      (2) Impressively the group demonstrates immunological correlates associated with Mtb cure with the BPaS therapy.

      (3) Linezolid is known to inhibit ribosomes and mitochondria whereas spectinamide does not. The authors clearly demonstrate the lack of adverse effects of BPaS compared to BPaL.

      Weaknesses:

      (1) Although this is not a weakness of this paper, a sentence describing how the spectinamide would be administered by aerosolization in humans would be welcomed.

    4. Reviewer #2 (Public Review):

      Summary:

      Replacing linezolid (L) with the preclinical development candidate spectinamide 1599, administered by inhalation, in the BPaL standard of care regimen achieves similar efficacy and reduces hematological changes and pro-inflammatory responses.

      Strengths:

      The authors not only measure efficacy but also quantify histological changes, hematological responses and immune responses, to provide a comprehensive picture of treatment response and the benefits of the L to S substitution.

      The authors generate all data in two mouse models of TB infection, each reproducing different aspects of human histopathology.

      Extensive supplementary figures ensure transparency.

      Weaknesses:

      Articulation of objectives and hypotheses can be improved, as suggested below.

    5. Reviewer #3 (Public Review):

      Summary:

      In this paper, the authors sought to evaluate whether the novel TB drug candidate, spectinamide 1599 (S), given via inhalation to mouse TB models, and combined with the drugs B (bedaquiline) and Pa (pretomanid), would demonstrate similar efficacy to that of the BPaL regimen (where L is linezolid). Because L is associated with adverse events when given to patients long term, and one of those is associated with myelosuppression (bone marrow toxicity), the authors also sought to assess blood parameters, effects on bone marrow, and immune parameters/cell effects following treatment of mice with BPaS and BPaL. They conclude that BPaL and BPaS have equivalent efficacy in both TB models used and that BPaL resulted in weight loss and anemia (whereas BPaS did not) under the conditions tested, as well as effects on bone marrow.

      Strengths:

      The authors used two mouse models of TB that are representative of different aspects of TB in patients (which they describe well), intending to present a fuller picture of the activity of the tested drug combinations. They conducted a large body of work in these infected mice to evaluate efficacy and also to survey a wide range of parameters that could inform the effect of the treatments on bone marrow and on the immune system. The inclusion of BPa controls (in most studies) and also untreated groups led to a large amount of useful data that has been collected for the mouse models per se (untreated) as well as for BPa, in addition to the BPaS and BPaL combinations which are of particular interest to the authors. Many of these findings related to BPa, BPaL, untreated groups etc. corroborate earlier findings and the authors point this out effectively and clearly in their manuscript. To go further, in general, it is a well written and cited article with an informative introduction.

      Weaknesses:

      The authors performed a large amount of work with the drugs given at the doses and dosing intervals stated, but there is no exposure data available at this time. The authors intend to evaluate exposure-effect relationships in future work. An understanding of the exposures at which the efficacy and adverse effects are seen will assist in the translation of these findings to the clinic.

      In addition, it is always challenging to interpret findings for combinations of drugs and for now, the data available cannot attribute confidence to the weight loss seen for only the BPaL group to L specifically, as opposed to a PK interaction leading to an elevated exposure and weight loss due to B or Pa. It is not yet possible, then, to state that what is seen are "L-associated AEs"; this is assumed only.

      The evaluations of activity in the BALB/c mouse model as well as the spleens of the Kramnik model resulted in CFU below/at the limit of detection, so comparisons between BPaL and BPaS cannot be made and the conclusion of equivalent efficacy in BALB/c is not supported by the data shown. There is no BPa control in the BALB/c study, therefore it is not possible to discern whether L or S contributed to the activity of BPaL or BPaS. The same is true for the assessment of lesions: unfortunately, there was no BPa control, meaning that even where equivalency is seen for BPaL and BPaS, the reader is unable to deduce whether L or S made a contribution to this activity.

      Although these weaknesses limit what we can learn from the current body of data, the authors note that further studies will be done to increase understanding of the points above.

    1. eLife assessment

      ImmCellTyper presents a useful toolkit for CyTOF data analysis, integrating BinaryClust for semi-supervised clustering and cell type annotation. The evidence supporting the findings is convincing, with appropriate and validated methodology. This tool will be helpful to researchers in immunology and cytometry, offering a robust solution for cell type identification and differential analysis.

    2. Reviewer #3 (Public Review):

      Summary:

      ImmCellTyper is a new toolkit for Cytometry by time-of-flight data analysis. It includes BinaryClust, a semi-supervised clustering tool (which takes into account the prior biological knowledge), designed for automated classification and annotation of specific cell types and subpopulations. ImmCellTyper also integrates a variety of tools to perform data quality analysis, batch effect correction, dimension reduction, unsupervised clustering, and differential analysis.

      Strengths:

      The proposed algorithm takes into account the prior knowledge.

      The results on different benchmarks indicate competitive or better performance (in terms of accuracy and speed) depending on the method.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript presented a useful toolkit designed for CyTOF data analysis, which integrates 5 key steps as an analytical framework. A semi-supervised clustering tool was developed, and its performance was tested in multiple independent datasets. The tool was compared to human experts as well as supervised and unsupervised methods. 

      Strengths: 

      The study employed multiple independent datasets to test the pipeline. A new semi-supervised clustering method was developed. 

      Weaknesses: 

      The examination of the whole pipeline is incomplete. Lack of descriptions or justifications for some analyses. 

      We thank the reviewer for the overall summary and comments on this manuscript. In the last part of the results, we showcased the functionalities of ImmCellTyper in the covid dataset, including quality check, BinaryClust clustering, cell abundance quantification, state marker expression comparison within each identified cell type, cell population extraction, subpopulation discovery using unsupervised methods, and data visualization. We added more descriptions in the text based on the reviewer’s suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors have developed marker selection and k-means (k=2) based binary clustering algorithm for the first-level supervised clustering of the CyTOF dataset. They built a seamless pipeline that offers the multiple functionalities required for CyTOF data analysis. 

      Strengths: 

      The strength of the study is the potential use of the pipeline for the CyTOF community as a wrapper for multiple functions required for the analysis. The concept of the first line of binary clustering with known markers can be practically powerful. 

      Weaknesses: 

      The weakness of the study is that there's little conceptual novelty in the algorithms suggested from the study and the benchmarking is done in limited conditions. 

      We thank the reviewer for the overall summary and comments on this manuscript. While the concept of binary clustering by k-means is not novel, BinaryClust only uses it on individual markers to identify positive and negative cells, then combines the result with the pre-defined matrix for cell type identification. This has not been introduced elsewhere. Furthermore, ImmCellTyper streamlines the entire analysis process and enhances data exploration on multiple levels. For instance, users can evaluate functional marker expression level/cellular abundance across both main cell types and subpopulations. Also, this computational framework leverages the advantages of both semi-supervised and unsupervised clustering methods to facilitate subpopulation discovery. We believe these contributions warrant consideration as advancements in the field.

      As for the benchmarking, we limited the depth to main cell types rather than subpopulations. The reason is that we only apply BinaryClust to identify main cell types. For cell subset discovery, the unsupervised methods integrated in this pipeline have already been published and are widely used by the research community. Therefore, additional benchmarking does not seem to be necessary.
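
      The per-marker binarisation described in this response can be illustrated with a minimal Python sketch. It is not the BinaryClust implementation: the function names, the toy expression values, and the two-marker definition matrix are all assumptions made for illustration. Each marker is split into negative and positive cells by one-dimensional k-means with k=2, and each cell's +/- profile is then matched against a pre-defined matrix.

```python
def two_means_threshold(values, iters=50):
    """1D k-means with k=2: return the midpoint between the two centroids."""
    lo, hi = min(values), max(values)  # initialise centroids at the extremes
    for _ in range(iters):
        mid = (lo + hi) / 2.0  # in 1D the cluster boundary is the centroid midpoint
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        if not low or not high:
            break  # degenerate marker: all cells fall on one side
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def binarize(cells, markers):
    """Label every cell '+' or '-' for each marker, independently per marker."""
    thresholds = {m: two_means_threshold([c[m] for c in cells]) for m in markers}
    return [{m: "+" if c[m] > thresholds[m] else "-" for m in markers} for c in cells]


def annotate(profile, definitions):
    """Return the first cell type whose +/- rule matches the cell's profile."""
    for cell_type, rule in definitions.items():
        if all(profile[m] == sign for m, sign in rule.items()):
            return cell_type
    return "unclassified"


# Hypothetical transformed expression values for four cells.
cells = [
    {"CD3": 4.1, "CD19": 0.2},
    {"CD3": 3.8, "CD19": 0.1},
    {"CD3": 0.3, "CD19": 3.9},
    {"CD3": 0.2, "CD19": 0.3},
]
# Hypothetical definition matrix (illustrative, not the authors' matrix).
definitions = {
    "T cell": {"CD3": "+", "CD19": "-"},
    "B cell": {"CD3": "-", "CD19": "+"},
}
labels = [annotate(p, definitions) for p in binarize(cells, ["CD3", "CD19"])]
print(labels)
```

      Cells that match no rule stay unclassified in this sketch, which mirrors the idea that unsupervised clustering can then be applied within each annotated population.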

      Reviewer #3 (Public Review): 

      Summary: 

      ImmCellTyper is a new toolkit for Cytometry by time-of-flight data analysis. It includes BinaryClust, a semi-supervised clustering tool (which takes into account prior biological knowledge), designed for automated classification and annotation of specific cell types and subpopulations. ImmCellTyper also integrates a variety of tools to perform data quality analysis, batch effect correction, dimension reduction, unsupervised clustering, and differential analysis. 

      Strengths: 

      The proposed algorithm takes into account the prior knowledge. 

      The results on different benchmarks indicate competitive or better performance (in terms of accuracy and speed) depending on the method. 

      Weaknesses: 

      The proposed algorithm considers only CyTOF markers with binary distribution. 

      We thank the reviewer for the overall summary and comments on this manuscript. Binary classification can be considered an imitation of the human gating strategy, as it is applied to each marker. For example, when characterizing CD8 T cells, we aim for the CD19-CD14-CD3+CD4- population, which is binary in nature (either positive or negative) and follows the same logic as the method (BinaryClust) we developed. Results indicated that it works very well for well-defined main cell lineages, particularly when the expression of the defining marker is not continuous. However, the limitation is in subpopulation identification, because a handful of markers behave in a continuous manner, so we suggest an unsupervised method after BinaryClust. This brings the additional advantage of identifying unknown subsets beyond our current knowledge, which none of the semi-supervised tools can achieve. To address the reviewer’s concern, we considered the limitation of binary distribution, but it does not profoundly affect the application of the pipeline.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Many thanks for the reviewers’ comments and suggestions. Please see the point-by-point response below:

      (1) The style of in-text reference citation is not consistent. Many do not have published years.

      The style of the reference citation has been revised and improved.  

      (2) The font size in the table of Figure 1 is too small, so is Figure 2. 

      The font size has been increased.

      (3) Is flowSOM used as part of BinaryClust? How should the variable running speed of BinaryClust be interpreted, given that it is occasionally slower and sometimes faster than flowSOM in the datasets?

      To answer the reviewer’s question, flowSOM is not a part of BinaryClust. They are separate clustering methods that have been incorporated into the ImmCellTyper pipeline. As described in Figure 1, BinaryClust, a semi-supervised method, is used to classify the main cell lineages, while flowSOM, an unsupervised method, is recommended here for further subpopulation discovery. So, they operate independently of each other. To avoid confusion, we slightly modified Figure 1 for clarification.

      Regarding the variability in running speed in Figure 4, the performance of algorithms can indeed be influenced by the characteristics of the datasets, such as size and complexity. The differences observed between the covid dataset and the MPN dataset, such as marker panel, experimental protocol, and data acquisition process, could account for this variation. Our explanation is that flowSOM better suits the data structure of the covid dataset, which might be the reason why it is slightly faster to analyse compared to the MPN dataset. Moreover, for the covid dataset, the runtime for both BinaryClust and flowSOM is less than 100s, and the difference is not notable.

      (4) In the Method section ImmCellTyper workflow overview, it is difficult to link the description of the pipeline to Figure 8. There are two sub-pipelines in the text and seven steps in the figure. What are their relations? Some steps are not introduced in the text, such as Data transformation and SCE object construction. What is co-factor 5?

      Figure 8 provides an overview of the entire workflow for CyTOF data analysis, starting from the raw fcs file data and proceeding until downstream analysis (seven steps). But the actual implementation of the pipeline was divided into two separate sections, as outlined in the vignettes of the ImmCellTyper GitHub page (https://github.com/JingAnyaSun/ImmCellTyper/tree/main/vignettes).

      Users will initially run ‘Intro_to_batch_exam_correct’ to perform data quality check and identify potential batch effects, followed by ‘Intro_to_data_analysis’ for data exploration. We agree with the reviewer that the method for this section is a bit confusing, so we’ve added more description for clarification.

      In processing mass cytometry data, the arcsinh (inverse hyperbolic sine) transformation is commonly applied to handle zero values and skewed distributions, and to improve visualization as well as clustering performance. The co-factor is a parameter that scales down the data to control the width of the linear region before the arcsinh transformation. We usually get the best results by using co-factor 5 for CyTOF data.
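
      The transformation described here (the inverse hyperbolic sine, arcsinh, applied after dividing by a co-factor) can be sketched in a few lines of Python. The function name and the input values are assumptions for illustration; only the co-factor of 5 comes from the response above.

```python
import math


def arcsinh_transform(values, cofactor=5.0):
    """Scale raw intensities by the co-factor, then apply arcsinh."""
    return [math.asinh(v / cofactor) for v in values]


# Hypothetical raw ion counts for one marker.
raw = [0.0, 0.5, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw)
print(transformed)
```

      Values well below the co-factor are mapped almost linearly and zero stays at zero, while large values are compressed roughly logarithmically. A larger co-factor widens the near-linear region.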

      (5) For differential analysis, could the pipeline analyze paired/repeated samples?

      For the statistical step, ImmCellTyper supports both two-study group comparison using Mann-Whitney Wilcoxon test, and multiple study group comparison (n>2) using Kruskal Wallis test followed by post hoc analysis (pairwise Wilcoxon test or Dunn’s test) with multiple testing correction using Benjamini-Hochberg Procedure.

      Certainly, this pipeline allows flexibility: users can also extract the raw cell frequency data and apply whichever statistical methods suit their design.
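
      As an illustration of the multiple testing correction named above, the following is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. It is not code from ImmCellTyper; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four pairwise comparisons.
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(pvals)
print(adjusted)
```

      Each p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the rank of the raw p-values.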

      (6) In Figure 2A, the range of the two axes is different for Dendritic cells, which could be misleading. Why the agreement is bad for dendritic cells?

      The range for the axes is automatically adapted to the data structure, which explains why they may not necessarily be equal. The correlation coefficient for DCs is 0.958; compared to other cell types (> 0.99), it is relatively worse but does not indicate poor agreement.

      Moreover, the abundance of DCs is much lower than that of other cell types, comprising approximately 2-5% of whole cells. As a result, even small differences in abundance may appear as significant variations. For example, a difference of 1% in DC abundance represents a 2-fold change, which can be perceived as substantial.

      Overall, while the agreement for DCs may appear comparatively lower, it is not necessarily indicative of poor performance, considering both the correlation coefficient and the relative abundance of DCs compared to other cell types.

      (7) In the Results section BinaryClust achieves high accuracy, what method was used to get the p-value, such as lines 212, 213, etc.?

      The accuracy of BinaryClust was tested using F-measure and ARI against ground truth (manual gating), the detailed description/calculation can be found in methods. For line 212 and 213, the p-value was calculated using ANOVA for the interaction plot shown in Figure 3. We’ve now added the statistical information into the figure legend.   
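
      For readers unfamiliar with the F-measure, a minimal per-cell-type sketch is given below. It assumes predicted labels are compared one-to-one with manual gating labels; the exact calculation used in the manuscript is described in its methods and may differ. The function name and the toy labels are hypothetical.

```python
def f_measure(truth, predicted, label):
    """F-measure (harmonic mean of precision and recall) for one cell type."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no correctly labelled cells of this type
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical manual gating labels and predicted labels for five cells.
truth = ["T", "T", "B", "B", "DC"]
predicted = ["T", "B", "B", "B", "DC"]
scores = {label: f_measure(truth, predicted, label) for label in ("T", "B", "DC")}
print(scores)
```

      A value of 1 means every cell of that type was recovered with no false assignments, so rare populations such as DCs can be scored on the same scale as abundant ones.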

      (8) The performance comparison between BinaryClust and LDA is close. The current comparison design looks unfair. Given LDA only trained using half data, LDA may outperform BinaryClust.

      It is true that LDA was trained using half of the data. This is because the method requires manual gating results as a training dataset to build a model, which is then applied to the rest of the files to label cell types. Here we used 50% of the whole dataset as the training set. We are of course very happy to implement any additional suggestions for a better partition ratio.

      (9) There are 5 key steps in the proposed workflow. However, not every step was presented in the Results.

      Thanks for the comments. The results primarily focused on demonstrating the precision and performance of BinaryClust in comparison with ground truth and existing tools. Additionally, a case study showcasing the application/functions of the entire pipeline in a dataset was also presented. Due to limitation in space, the implementation details of the pipeline were described in the method section and github documentations, which users/readers can easily access.

      Reviewer #2 (Recommendations For The Authors): 

      The tools suggested by the authors could be potentially useful to the community. However, it's difficult to understand the conceptual novelty of the algorithms suggested here. The concept of binary clustering has been described before (https://doi.org/10.1186/s12859-022-05085-z, https://doi.org/10.1152/ajplung.00104.2022), and it mainly utilizes k-means clustering set to generate binary clusters based on selected markers. Other algorithms associated with the package are taken from other studies.

      We acknowledge the reviewer’s comment regarding the novelty of our method. While the concept of binary clustering by k-means has been previously described for transcriptome data, our approach applies it to CyTOF data analysis, which has not been introduced elsewhere. Furthermore, ImmCellTyper streamlines the entire analysis process and enhances data exploration on multiple levels. For instance, users can evaluate functional marker expression level/cellular abundance across both main cell types and subpopulations. Also, as stated in the manuscript, this computational framework leverages the advantages of both semi-supervised and unsupervised clustering methods to facilitate subpopulation discovery. We believe these contributions warrant consideration as advancements in the field.

      In addition, the benchmarking of clustering performance, especially to reproduce manual gating and comparison to tools such as flowSOM is not comprehensive enough. The result for the benchmarking test could significantly vary depending on how the authors set the ground truth (resolution of cell type annotations). The authors should compare the tool's performance by changing the depth of cell type annotations. Especially, the low abundance cell types such as gdT cells or DCs were not effectively captured by the suggested methods. 

      Thanks for the comment. We appreciate the reviewer’s concern. However, as illustrated in Figure 1, our approach uses BinaryClust, a semi-supervised method, to identify main cell types rather than directly targeting subpopulations. The reason is that a semi-supervised method relies on the user's prior definitions and is therefore limited in its ability to discover novel subsets. In the ImmCellTyper framework, an unsupervised method is subsequently applied for subset exploration following the BinaryClust step.

      Regarding benchmarking, we focused on testing the precision of BinaryClust for main cell type characterization, because that is what the method is used for in the pipeline, and we believe this is sufficient. As for cell subset discovery, the unsupervised methods we integrated have already been published and are widely used by the research community. Therefore, additional benchmarking does not seem to be necessary.

      Moreover, as shown in Figure 3 and Table 1, our results indicated that the F-measure for DCs and gdT cells in BinaryClust is 0.80 and 0.92 respectively, which were very close to ground truth and outperformed flowSOM, demonstrating its effectiveness. 

      We hope these clarifications address the reviewer’s concern.

      Minor comments: 

      (1) In Figure 4, it's perplexing to note that BinaryClust shows the slowest runtime for the COVID dataset, compared to the MPN dataset, which features a similar number of cells. What causes this variation? Is it dependent on the number of markers utilized for the clustering? This should be clarified/tested. 

      Thanks for the comment, but we are not sure that we fully understand the question. As shown in Figure 4, BinaryClust has a slightly higher runtime in the MPN dataset than in the covid dataset, which is reasonable because the MPN dataset contains around 1.6 million more cells than the covid dataset.

      (2) Some typos are noted: 

      - DeepCyTOF and LDA use a maker expression matrix extracted → "marker"?* 

      Corrected.

      - Datasets(Chevrier et al.)which → spacing* 

      Corrected.

      - This is due to the method's reliance → spacing*

      Corrected.

      Reviewer #3 (Recommendations For The Authors): 

      Is it possible to accommodate more than two levels within the clustering process, i.e., can the proposed semi-supervised clustering tool be extended to multi-levels instead of binary?

      Thanks for the comments. Binary classification can be considered an imitation of the human gating strategy, as it is applied to each marker. For example, when characterizing CD8 T cells, we aim for the CD19-CD14-CD3+CD4- population, which is binary in nature (either positive or negative) and follows the same logic as the method (BinaryClust) we developed. Results indicated that it works very well for well-defined main cell lineages. However, the limitation is in subpopulation identification, because a handful of markers behave in a continuous manner, so we would suggest an unsupervised method after BinaryClust. This brings the additional advantage of identifying unknown subsets beyond our current knowledge, which none of the semi-supervised tools can achieve. To answer the reviewer’s question, it is possible to set the number to 3, 4, or 5 rather than just 2, but considering the design and rationale of the entire framework (as described in the manuscript and above), it doesn’t seem to be necessary.

      Could you please comment on why on the COVID dataset, BinaryClust was slower as compared to flowSOM?

      Thanks for the question. The performance of algorithms can indeed be affected by the characteristics of the datasets, such as their size and complexity. The covid and MPN datasets differ in various aspects including marker panel, experimental protocol, and data acquisition process, among others, which would account for the observed variation in speed. So, our explanation is that flowSOM suits the structure of the covid dataset better than that of the MPN dataset. Additionally, for the covid dataset, both BinaryClust and flowSOM have runtimes of less than 100s, and the difference between the two isn’t particularly dramatic.

      Minor errors: 

      Line#215 "(ref) " reference is missing

      Added.

      Figure 3, increase the font of the text in order to improve readability. 

      Increased.

      Line#229 didn't --> did not. 

      Corrected

      Line#293 repetition of the reference. 

      The repetition is due to the format of the citation, which has been revised.

    1. eLife assessment

      This joint computational/experimental study demonstrates the ability of synthetic peptides derived from the stalk-tethered agonist in Polycystin-1 (PC1) to re-activate signaling by a stalkless C-terminal fragment of PC1. The study is valuable as it discovered peptide agonists for PC1 and the integrated in vitro and in silico approach is potentially applicable to the analysis of related systems. Following the revision, the line of evidence presented in the current manuscript is considered convincing.

    2. Reviewer #1 (Public Review):

      Summary:

      This research used cell-based signaling assay and Gaussian-accelerated molecular dynamics (GaMD) to study peptide-mediated signaling activation of Polycystin-1 (PC1), which is responsible for the majority of autosomal dominant polycystic kidney disease (ADPKD) cases. Synthetic peptides of various lengths derived from the N-terminal portion of the PC1 C-terminal fragment (CTF) were applied to HEK293T cells transfected with stalkless mouse CTF expression construct. It was shown that peptides including the first 7, 9, and 17 residues of the N-terminal portion could activate signaling to the NFAT reporter. To further understand the underlying mechanism, docking and peptide-GaMD simulations of peptides composed of the first 9, 17, and 21 residues from the N-terminal portion of the human PC1 CTF were performed. These simulations revealed the correlation between peptide-CTF binding and PC1 CTF activation characterized by the close contact (salt bridge interaction) between residues R3848 and E4078. Finally, a Potts statistical model was inferred from diverged PC1 homologs to identify strong/conserved interacting pairs within PC1 CTF, some of which are highly relevant to the findings from the peptide GaMD simulations. The peptide binding pockets identified in the GaMD simulations may serve as novel targets for design of therapeutic approaches for treating ADPKD.

      Strengths:

      (1) The experimental and computational parts of this study complement and mostly support each other, thus increasing the overall confidence in the claims made by the authors.

      (2) The use of exogenous peptides and a stalkless CTF in the GaMD is a step forward compared to earlier simulations using the full CTF, CTF mutants, or the stalkless CTF alone. And it led to findings of novel binding pockets.

      (3) Since the PC1 shares characteristics with the Adhesion class of GPCRs, the approaches used in this work may be extended to other similar systems.

      Weaknesses:

      (1) Only results for selected peptides (p9, p17, p21) binding with the protein were shown. It would be interesting to see the interaction between some (if not all) of the other peptides with the protein.

      (2) The convergence of the simulations is not very good. The results should be interpreted qualitatively rather than quantitatively because large variations in the free energy profile were seen between different replicates. Although these simulations might have identified representative low-energy binding conformations of the peptides, whether they have explored all possible conformations is still a question.
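
      To make the convergence concern concrete, the sketch below shows how a one-dimensional free energy profile can be estimated from sampled values of a reaction coordinate by Boltzmann inversion, F = -kT ln P. This is an assumption-laden illustration: it treats the samples as unbiased, whereas GaMD trajectories additionally require energetic reweighting of the boost potential, which is omitted here. The function name, the bin width, and the sample values are hypothetical.

```python
import math

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_profile(samples, bin_width, temperature=300.0):
    """Histogram the samples and return {bin start: F in kcal/mol}, minimum at 0."""
    kT = KB_KCAL_PER_MOL_K * temperature
    counts = {}
    for x in samples:
        b = math.floor(x / bin_width)
        counts[b] = counts.get(b, 0) + 1
    max_count = max(counts.values())
    # Unweighted Boltzmann inversion; GaMD reweighting is NOT applied here.
    return {b * bin_width: -kT * math.log(c / max_count)
            for b, c in sorted(counts.items())}


# Hypothetical distances (in Angstrom) sampled in one replicate.
replicate = [1.2] * 8 + [3.7] * 2
profile = free_energy_profile(replicate, bin_width=1.0)
print(profile)
```

      Computing such a profile separately for each replicate and comparing the positions and depths of the minima is one simple way to judge how well independent simulations agree.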

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript, "Activation of Polycystin-1 Signaling by Binding of Stalk-derived Peptide Agonists", is by Miao and coworkers. Autosomal dominant polycystic kidney disease (ADPKD) is a major form of polycystic kidney disease (PKD). To provide better treatment and avoid side effects associated with currently available options, the authors investigated an interesting GPCR, polycystin-1 (PC1), as a potential therapeutic target. In vitro and in silico studies were combined to identify peptide agonists for PC1 and to elucidate their roles in PC1 signaling. Overall, regarding the significance of the findings, this work described valuable peptide agonists for PC1, and the combined in vitro and in silico approach can be useful to study a complex system like PC1. However, the strength of the evidence is incomplete, as more experiments are needed as controls to validate the computational observations. The work appears premature.

      Strengths:

      (1) This work first described the experimental discovery of short peptides designed to mimic the stalk region of PC1, followed by computational investigation using docking and MD simulations. PC1 is a complex membrane protein and an emerging target for ADPKD, but it can be challenging to study. The knowledge and the peptide discovery can be valuable and useful to understand the mechanism and potential modulation of PC1.

      (2) The authors published the mechanistic study of PC1 and identified key interacting residues such as N3074-S3585 and R3848-E4078, using very similar techniques (PNAS 2022, 119(19), e2113786119). This work furthers this research by identifying peptides that are stalk mimics for PC1 activation.

      (3) Eight peptides were designed and tested experimentally first; three were computationally studied with docking and GaMD simulations to understand their mechanism(s).

      Weaknesses:

      (1) The selectivity of the peptides between PC1 and PC2 remains unknown in this revision.

      Overall, my comments were mostly addressed properly.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors demonstrate the activation of Polycystin-1 (PC1), a G-protein coupled receptor, using small peptides derived from its original agonist, the stalk TA protein. In the experimental part of the study, the authors performed cellular assays to check the peptide-induced reactivation of a mutant form of PC1 which does not contain the stalk agonist. The experimental data is supported by computational studies using state-of-the-art Gaussian accelerated Molecular Dynamics (GaMD) and bioinformatics analysis based on sequence covariance. The computer simulations revealed the mechanistic details of the binding of the said peptides with the mutant PC1 protein and discovered different bound, unbound, and intermediate conformations depending on the peptide size and sequence. The use of reliable and well-established molecular simulation algorithms and the physiological relevance of this protein to autosomal dominant polycystic kidney disease (ADPKD) make this work particularly valuable.

      Strengths:

      This work is exploratory and its goal is to establish that small peptides can be used to probe the PC1 signaling process. The authors have provided sufficient evidence to justify this claim. Their GaMD simulations have produced free-energy landscapes that differentiate the interaction of PC1 with three different synthetic peptides and demonstrate the associated conformational dynamics of the receptor protein. Their trajectory analysis and sequence covariance analysis could identify residue-specific interactions that facilitate this process. The authors also performed residue-wise and total interaction energy calculations to substantiate their findings.

      Weaknesses:

      The reported free energy landscapes are not fully converged. But they are still sufficient to gain biological insight.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      This research used cell-based signaling assay and Gaussian-accelerated molecular dynamics (GaMD) to study peptide-mediated signaling activation of Polycystin-1 (PC1), which is responsible for the majority of autosomal dominant polycystic kidney disease (ADPKD) cases. Synthetic peptides of various lengths derived from the N-terminal portion of the PC1 C-terminal fragment (CTF) were applied to HEK293T cells transfected with stalkless mouse CTF expression construct. It was shown that peptides including the first 7, 9, and 17 residues of the N-terminal portion could activate signaling to the NFAT reporter. To further understand the underlying mechanism, docking and peptide-GaMD simulations of peptides composed of the first 9, 17, and 21 residues from the N-terminal portion of the human PC1 CTF were performed. These simulations revealed the correlation between peptide-CTF binding and PC1 CTF activation characterized by the close contact (salt bridge interaction) between residues R3848 and E4078. Finally, a Potts statistical model was inferred from diverged PC1 homologs to identify strong/conserved interacting pairs within PC1 CTF, some of which are highly relevant to the findings from the peptide GaMD simulations. The peptide binding pockets identified in the GaMD simulations may serve as novel targets for the design of therapeutic approaches for treating ADPKD.

      We greatly appreciate the reviewer’s encouraging and positive comments. The reviewer’s specific comments are addressed pointwise below and changes to the text will be highlighted in yellow in the revised manuscript.

      (1) The GaMD simulations all include exogenous peptides, thus lacking a control where no such peptide is present (and only stalkless CTF). An earlier study (PNAS 2022 Vol. 119 No. 19 e2113786119) covered this already, but it should be mentioned here that there was no observation of close/activation for the stalkless CTF.

      We appreciate the reviewer’s concern about the lack of a control where no exogenous peptide is present. As suggested by the reviewer, we are adding more details about the study on the stalkless CTF as a control in the Introduction of the revised manuscript. 

      (2) Although 5 independent trajectories were generated for each peptide, the authors did not provide sufficient details regarding the convergence of the simulation. This leaves some uncertainties in their results. Given that the binding poses changed relative to the starting docked poses for all three peptides, it is possible that some other binding pockets and/or poses were not explored.

      We appreciate the reviewer’s comment regarding the convergence of the simulation results. This is clarified in the revised manuscript as: 

      “We have calculated free energy profiles of individual simulations for each system, including the p9, p17, and p21, as shown below (Figs. S5, S6 and S8). For the p9 peptide, the “Bound” low-energy state was consistently identified in the 2D free energy profile of each individual simulation (Fig. S5). For the p17 peptide, Pep-GaMD simulations were able to refine the peptide conformation from the “Unbound” to the “Intermediate” and “Bound” states in Sim1 and Sim5, while the peptide reached only the “Intermediate” state in the other three simulations (Fig. S6). For the p21 peptide, Pep-GaMD was able to refine the peptide docking conformation to the “Bound” state in all the five individual simulations (Fig. S8).”

      “It is important to note that the free energy profiles calculated from GaMD simulations of PC1 CTF were not fully converged since certain variations were observed among the individual simulations. Nevertheless, these calculations allowed us to identify representative low-energy binding conformations of the peptides.”
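      As an illustration of the per-simulation comparison described in this response, the sketch below builds a 2D free energy profile from the samples of a single trajectory, so that the location of the minimum can be compared across replicas. It is a minimal example under stated assumptions, not the authors' workflow: the function name `pmf_2d`, the bin settings, and the optional `weights` argument are hypothetical, and GaMD reweighting is not performed unless per-frame weights are supplied.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def pmf_2d(x, y, bins, ranges, temperature=300.0, weights=None):
    """Return a 2D free energy profile (kcal/mol) from per-frame coordinates.

    x, y    : 1D arrays of the two reaction coordinates, one value per frame
    bins    : number of bins (int or pair of ints)
    ranges  : [[xmin, xmax], [ymin, ymax]]
    weights : optional per-frame weights (hypothetical hook for reweighting)
    """
    hist, _, _ = np.histogram2d(x, y, bins=bins, range=ranges, weights=weights)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        free_energy = -KB * temperature * np.log(prob)  # empty bins become +inf
    # Shift the profile so that the global minimum sits at zero.
    return free_energy - free_energy[np.isfinite(free_energy)].min()


def minimum_bin(free_energy):
    """Return the (i, j) bin index of the global free energy minimum."""
    return np.unravel_index(np.argmin(free_energy), free_energy.shape)
```

      Calling `pmf_2d` once per trajectory and comparing `minimum_bin` across the resulting profiles gives a quick, qualitative check of whether the replicas agree on the low-energy state.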

      (3) The free energy profiles (Figures 2 to 4) based on the selected coordinates provide important information regarding binding and CTF conformational change. However, it is a coarse-grained representation, and complementary analyses such as RDFs and/or contact maps between the peptide and CTF residues might be helpful to understand the details of their interactions. These details are currently only available in the text.

      Following the reviewer's suggestion, we have now included a set of protein contact maps showing contacts between the peptides and the TOP domain for each peptide in the representative "Bound” state in revised Supplementary Information (Fig. S4). The contact maps serve to visualize the list of contacts mentioned in the main text. This will be clarified in the revised manuscript.

      (4) The use of a stalkless CTF is necessary for studying the functions of the exogenous peptides. However, the biological relevance of the stalkless CTF to ADPKD was not clearly explained, if any.

      We appreciate the reviewer’s comment. As correctly assessed by the reviewer, the stalkless CTF is not a biological form of PC1 observed in ADPKD, but rather was used as the simplest or least complex system in which the activities and binding of exogenous peptides could be studied. However, in ADPKD, there are numerous missense mutations reported within the GPCR autoproteolysis-inducing (GAIN) domain that have been shown to prevent or inhibit cleavage at the GPCR-coupled proteolysis site (GPS). Loss of PC1 GPS cleavage, which is known to cause ADPKD, would retain or sequester the stalk tethered agonist within the interior of the GAIN domain, which would presumably interfere with interactions between stalk tethered agonist residues and the remainder of the CTF. Furthermore, there are 10 single nucleotide polymorphisms reported within the stalk sequence (ADPKD Variant Database; https://pkdb.mayo.edu/welcome), most of which we have found to significantly reduce CTF-mediated activation of the NFAT reporter (Magenheimer BS, et al., Constitutive signaling by the C-terminal fragment of polycystin-1 is mediated by a tethered peptide agonist; bioRxiv 2021.08.05.455255). In particular, the ADPKD-associated G3052R stalk mutation that was analyzed along with the stalkless CTF by GaMD simulations (Pawnikar et al., PNAS, 2022) has the same reduction in activity as the stalkless CTF in the cellular signaling reporter assays and the same loss of closed conformation interactions in GaMD analyses. As such, we believe the stalkless CTF has biological relevance from the aspect that it mimics the deficiency in signaling activation observed for PC1 CTF stalk mutants.

      This is clarified in the revised manuscript in the Introduction, page 5 (“constructs encoding a stalkless PC1 CTF (a nonbiological mutant of PC1 with deletion of the first 21 N-terminal residues of CTF) and three ADPKD-associated…”), and near the beginning of the Discussion, page 16, where the biological relevance of studying the stalkless CTF is explained.

      (5) The authors might want to clarify if a stalkless CTF is commonly seen in ADPKD, or if it is just a construct used for this study.

      The stalkless CTF is not a biological form of PC1, but rather a construct used for this study. This was clarified in the revised manuscript (see response above).

      (6) (Pages 7-8) "...we generated expression constructs of mouse (m) PC1 consisting of the CD5 signal peptide sequence fused in frame with the stalk sequence of mCTF ...". What is the CD5 signal peptide sequence here? What is its use?

      The CD5 signal peptide sequence is “MPMGSLQPLATLYLLGMLVASVLG” from the T cell surface glycoprotein, CD5. Since the N-terminus of PC1 CTF is derived from a posttranslational, autocatalytic, endoproteolytic cleavage event, this isoform is already membrane-embedded and therefore lacks its endogenous signal peptide. The CD5 signal peptide coding sequence is added to the PC1 CTF expression constructs in order to ensure translation and insertion of the encoded protein at the endoplasmic reticulum. Additional details were added to the Experimental Procedures, page 2 of Supporting Information.

      (7) (Page 8) "All peptides were appended with a C-terminal, 7-residue hydrophilic sequence (GGKKKKK) to increase solubility". How did the authors make sure that this sequence has no influence on the signaling? 

      To determine the possible effect of the hydrophilic GGKKKKK sequence on signaling, we had a ‘solubility tag’ peptide (LGGKKKKK) synthesized and purified by GenScript. It was necessary to add an N-terminal Leu residue to the 7-residue hydrophilic tag sequence in order for the highly hydrophilic peptide to be recovered. Effect of treatment with the solubility tag peptide on activation of the NFAT reporter was assessed for both empty vector- and ∆stalkCTF-transfected cells in 3 separate signaling experiments (see figure below). Each experiment also included a negative control treatment (no peptide/culture medium only addition) and a positive control treatment (stalk peptide p17). The p17 peptide we had available was derived from the stalk sequence of human PC1 that differs from the mouse PC1 sequence at residues 15 and 17, which are two poorly conserved positions within the stalk sequence (see Reviewer 2, Response 3). In the first experiment with the solubility tag and human p17 peptides (B in figure below), we inadvertently used the empty expression vector and ∆stalkCTF expression construct from mouse PC1. After realizing our error, we then performed 2 additional signaling experiments (C and D in figure below) with the ‘correct’ human ∆stalkCTF expression construct and empty vector. In the revised manuscript, we have provided the results from each of the 3 experiments as Fig. S2 (below).

      (8) (Page 9) "Using a computational model of the ΔStalk PC1 CTF developed previously". The authors might want to expand here a little to give a short review about the structure preparation.

      We appreciate the reviewer’s suggestion regarding the addition of details for structure preparation for the stalkless CTF. We have added these details in the section “Docking and Pep-GaMD simulations of peptide agonist binding to stalkless PC1 CTF” on page 10 of the revised manuscript: “The cryo-EM structure of human PC1-PC2 complex (PDB: 6A70) was used to build the computational model for WT PC1 CTF. As the protein had several missing regions including the Stalk and several loops, homology modeling of the missing regions was done using I-TASSER web server. Using the WT PC1 CTF model, the computational model for ΔStalk was generated by deleting the first 21 residues (3049-3069) of the WT PC1. Using the structure for stalkless CTF, we successfully docked the p9, p17 and p21 stalk peptides with HPEPDOCK. The peptides all bound to the TOP domain and the interface between the TOP domain and extracellular loop 1 (ECL1) of CTF.”

      (9) How was "contact" defined when counting the number of contacts used in the 2D PMFs (Figures 2-4)?

      We appreciate the reviewer’s comment regarding the definition of the number of contacts used in the 2D PMFs. This has been clarified in the revised manuscript as: “The number of contacts is calculated between any atom pairs within 4 Å distance of the peptide and extracellular domains of PC1 protein.”
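      The contact definition quoted above can be sketched for a single frame as follows. This is a minimal illustration rather than the authors' analysis script: the function name `count_contacts` and the array layout (one row of x, y, z coordinates in Å per atom) are assumptions, and atom selection for the peptide and the extracellular domains is left to the caller.

```python
import numpy as np


def count_contacts(peptide_xyz, receptor_xyz, cutoff=4.0):
    """Count peptide-receptor atom pairs separated by at most `cutoff` Å.

    peptide_xyz  : (N, 3) array of peptide atom coordinates
    receptor_xyz : (M, 3) array of receptor (extracellular domain) coordinates
    """
    peptide_xyz = np.asarray(peptide_xyz, dtype=float)
    receptor_xyz = np.asarray(receptor_xyz, dtype=float)
    # Pairwise displacement vectors via broadcasting: shape (N, M, 3).
    diff = peptide_xyz[:, None, :] - receptor_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return int(np.count_nonzero(dist <= cutoff))
```

      For large selections, a neighbor-search structure would be more memory-efficient than the full N × M distance matrix used here.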

      (10) How was the ranking of GaMD clusters done? It looks from Figure 3A that the "intermediate" state is more favorable compared to the "bound" state, but it was claimed in the text the "bound" state was ranked 1st. 

      Thanks to the reviewer for this comment. It has been clarified in the revised Supplementary Information: “Three independent Pep-GaMD simulations were combined to perform structural clustering using the hierarchical agglomerative clustering algorithm in CPPTRAJ. A 3 Å RMSD cutoff was used for each peptide system. PyReweighting was then applied to calculate the original free energy values of each peptide structural cluster with a cutoff of 500 frames. The structural clusters were finally ranked according to the reweighted free energy values.” And in the revised main text: “It is important to note that the free energy profiles calculated from GaMD simulations of PC1 CTF were not fully converged since certain variations were observed among the individual simulations. The free energy values of 2D PMF minima shown in Figure 3A could differ from those in the 1D PMF minima of peptide structural clusters, especially with the usage of distinct reaction coordinates. Nevertheless, these calculations allowed us to identify representative low-energy binding conformations of the peptides.”
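      To make the ranking step concrete, the sketch below estimates a reweighted free energy for each structural cluster from per-frame boost potentials and sorts the clusters from most to least favorable. It is a hedged illustration and not the PyReweighting code: it assumes a second-order cumulant expansion of the boost potential, treats the frame cutoff as a minimum cluster size, and uses hypothetical names (`rank_clusters`, `cluster_ids`, `boost`).

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def rank_clusters(cluster_ids, boost, temperature=300.0, min_frames=500):
    """Rank clusters by reweighted free energy (kcal/mol, lowest first).

    cluster_ids : per-frame cluster label
    boost       : per-frame boost potential dV in kcal/mol
    min_frames  : clusters with fewer frames are skipped (assumed cutoff meaning)
    """
    beta = 1.0 / (KB * temperature)
    cluster_ids = np.asarray(cluster_ids)
    boost = np.asarray(boost, dtype=float)

    log_weight = {}
    for cid in np.unique(cluster_ids):
        dv = boost[cluster_ids == cid]
        if dv.size < min_frames:
            continue
        # ln(N_c * <exp(beta*dV)>_c) with the ensemble average approximated by
        # a second-order cumulant expansion: beta*mean + 0.5*beta^2*variance.
        log_weight[int(cid)] = (
            np.log(dv.size) + beta * dv.mean() + 0.5 * beta**2 * dv.var()
        )

    if not log_weight:
        return []
    reference = max(log_weight.values())
    free_energy = {c: -(lw - reference) / beta for c, lw in log_weight.items()}
    return sorted(free_energy.items(), key=lambda item: item[1])
```

      Because the ranking depends on both cluster population and the boost-potential statistics, a sparsely populated cluster can outrank a larger one, which is one reason cluster ranks need not mirror the depth of minima in a 2D PMF built on different coordinates.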

      (11) When mentioning residue pair distances, such as in the sentence "The distance between the TOP domain residue R3848 and PL residue E4078 was 3.8 Å (Fig. 4D)" on page 12, it should be clarified if these distances are average distance, or a statistical error can be given.

      We appreciate the reviewer’s comment regarding the TOP Domain and PL distance between residues R3848-E4078. This has been clarified on page 14 in the revised manuscript as:

      “The distance between the TOP domain residue R3848 and PL residue E4078 was 3.8 Å. The distance was extracted from the top-ranked structural cluster of the p21 bound to the ΔStalk CTF, corresponding to the “Closed/Active” low-energy conformational state. (Fig. 4E)”.

      (12) More analysis of the GaMD can be performed. For example, the authors observed a single "bound" state for p21, but there must be some flexibility in the peptide and the protein itself. The authors might want to consider adding some plots illustrating the flexibility of the peptide residues (for example, a RMSD plot). Contact maps can also be added to visualize the results currently discussed in the text. 

      We thank the reviewer for their constructive suggestions. To characterize flexibility of the peptide and protein in the revised manuscript, we have added plots of the TOP-PL interaction distance between residues R3848-E4078 in PC1, the radius of gyration (Rg) of p21 and root-mean square deviation (RMSD) of p21 relative to the starting HPEPDOCK conformation of the peptide in the new Fig. S7. The peptide-protein contact map has also been added in the new Fig. S4.
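      The two flexibility measures named here, radius of gyration and RMSD relative to a reference pose, can be computed per frame as sketched below. This is an illustrative example under assumptions, not the authors' script: the function names are hypothetical, coordinates are taken as (N, 3) arrays in Å, and the RMSD is computed without superposition, which presumes the frames were already aligned (for example on the receptor).

```python
import numpy as np


def radius_of_gyration(xyz, masses):
    """Mass-weighted radius of gyration of one frame (same units as xyz)."""
    xyz = np.asarray(xyz, dtype=float)
    masses = np.asarray(masses, dtype=float)
    center = np.average(xyz, axis=0, weights=masses)
    squared = np.sum((xyz - center) ** 2, axis=1)
    return float(np.sqrt(np.average(squared, weights=masses)))


def rmsd(xyz, reference):
    """RMSD between a frame and a reference pose, with no fitting applied."""
    xyz = np.asarray(xyz, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(np.sum((xyz - reference) ** 2, axis=1))))
```

      Applying both functions to every frame of a trajectory yields the time series that such plots display.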

      (13) (Page 7) In the sentence `...sampled the "Closed/Active" low-energy state relative to the large number of Stalk-TOP contacts`, I suggest using "related to" instead of "relative to"

      We thank the reviewer for the comment, and we have replaced "relative to" with "related to" in the following sentence: `...sampled the "Closed/Active" low-energy state relative to the large number of Stalk-TOP contacts`.

      (14) (Page 7) In the sentence `Our previous study utilized expression constructs of human PC1 CTF, however, in order to prepare for ...`, "PC1 CTF, however," -> "PC1 CTF. However,"

      We thank the reviewer for the comment, and we have replaced "PC1 CTF, however," with "PC1 CTF. However," in the following sentence: `Our previous study utilized expression constructs of human PC1 CTF, however, in order to prepare for ...`.

      Reviewer 2:

      The autosomal dominant polycystic kidney disease (ADPKD) is a major form of polycystic kidney disease (PKD). To provide better treatment and avoid side effects associated with currently available options, the authors investigated an interesting GPCR, polycystin-1 (PC1), as a potential therapeutic target. In vitro and in silico studies were combined to identify peptide agonists for PC1 and to elucidate their roles in PC1 signaling. Overall, regarding the significance of the findings, this work described valuable peptide agonists for PC1 and the combined in vitro and in silico approach can be useful to study a complex system like PC1. However, the strength of the evidence is incomplete, as more experiments are needed as controls to validate the computational observations. The work appears premature.

      We greatly appreciate the reviewer’s encouraging and positive comments. The reviewer’s specific comments are addressed pointwise below and changes to the text will be highlighted in yellow in the revised manuscript.

      (1) The therapeutic potential of PC1 peptide agonists is unclear in the introduction. For example, while the FDA-approved drug Jynarque was mentioned, the text was misleading as it sounded like Jynarque targeted PC1. In fact, it targets another GPCR, the vasopressin receptor 2 (V2). A clear comparison of targeting PC1 over V2 pathways and their therapeutic relevance can help the readers better understand the importance of this work. Importantly, a clear background on the relationship between PC1 agonism and treatments for ADPKD is necessary.

      We understand the confusion that was caused by the brevity of our introductory paragraph and will clarify the differences in therapeutic targeting between Jynarque and our PC1 stalk-derived peptides in the revised manuscript. We will also expound on the rationale for targeting PC1 agonism as a therapeutic approach for ADPKD versus Jynarque. For example: It is known that ADPKD disease severity is dependent on the functional levels of PC1. Jynarque is a small molecule antagonist of the arginine vasopressin receptor 2, V2R, whose signaling and cAMP production have been shown to be increased in ADPKD. As this drug targets one of the downstream aberrant pathways, it is only capable of slowing disease progression and has numerous undesirable side effects. We reasoned that a therapeutic agent capable of stimulating and thus augmenting PC1 signaling function would be a safer, cyst initiation-proximal treatment capable of preventing cyst formation with few side effects.

      (2) PC1 is a complex membrane protein, and most figures focus on the peptide-binding site. For general readers (or readers that did not read the previous PNAS publication), it is hard to imagine the overall structure and understand where the key interactions (e.g., R3848-E4078) are in the protein and how peptide binding affects locally and globally. I suggest enhancing the illustrations.

      We thank the reviewer for the constructive comment on adding more illustrations for the PC1 protein to understand the overall structure and the location of the key interaction R3848-E4078. We have included these suggestions and modified the main figures in the revised manuscript.

      (3) The authors used the mouse construct for the cellular assays and the peptide designs in preparation for future in vivo assays. This is helpful in understanding biology, but the relevance of drug discovery is weakened. Related to Point 1, the therapeutic potential of PC1 peptide agonist is largely missing.

      The therapeutic potential of a PC1 peptide agonist is addressed in response #1 above. As mentioned in the manuscript and recognized by the reviewer, the cellular signaling assays were performed with the mouse PC1 CTF expression construct and with peptides based on the mouse PC1 stalk sequence for future, pre-clinical studies, while the peptide binding studies were performed with the human PC1 stalk sequence. We feel the relevance for drug discovery is not significantly weakened for a number of reasons: 1) as shown in Fig. 1A, the stalk sequence is highly conserved between mouse and human PC1; specifically, there are only 2 residue differences present within peptides p17 and p21. One of the differences is a ‘semi-conservative’ Gln-Arg substitution at peptide residue 15, while the second difference is a conservative Ile-Val substitution at peptide residue 17; 2) we have found that an Arg to Cys mutation within the mouse PC1 CTF stalk has the same effect on signaling as the corresponding human Gln to Cys ADPKD-associated mutation which was analyzed in Pawnikar et al., 2022; 3) both peptide residues 15 and 17 represent highly variable positions within the PC1 stalk as shown in the sequence logo (below) of the stalk sequence from 16 vertebrate species; and 4) while addressing the potential effect of the hydrophilic solubility tag on stalk peptide-mediated rescue of CTF∆stalk signaling (see Reviewer 1 comments, point #7), we utilized the ‘human’ version of p17 as a positive control and tested its activation with both mouse and human CTF∆stalk expression constructs and found that human p17 peptide was also capable of stimulating the mouse CTF∆stalk protein (Fig. S2).

      Author response image 1.

      (4) More control experiments are needed. For example, a 7-residue hydrophilic sequence (GGKKKKK) is attached to the peptide design to increase solubility. This 7-residue peptide should be tested for PC1 activation as a control. Second, there is no justification for why the peptide design must begin with residue T3041. Can other segments of the stalk also be agonists?

      As mentioned above for Reviewer 1, the hydrophilic peptide has been synthesized and tested for activation of signaling by the stalkless CTF in the revised manuscript as Fig. S2. The design of peptides that begin with residue T3041 of mouse PC1 CTF is modeled on numerous similar studies for the family of adhesion GPCRs. Optimization of the binding and activity of the PC1 peptide agonist will be investigated in future studies and could include such parameters as whether the peptide must include the first residue and whether subsegments of the stalk are also agonists; however, we feel these questions are beyond the scope of this initial report.

      (5) There are some major concerns about the simulations: The GaMD simulations showed different binding sites of p-21, p-17, and p-9, and the results report the simulated conformations as "active conformational states". However, these are only computational findings without structural biology or mutagenesis data to validate. Further, neither docking nor the simulation data can explain the peptide SAR. Finally, it will be interesting if the authors can use docking or GaMD and explain why some peptide designs (like P11-P15) are less active (as control simulations).

      The reviewer brings up an important observation regarding differences in binding sites between peptides p9, p17 and p21. We will include discussion of this observation and our interpretations to the revised manuscript. While the present study is focused on identification of initial peptides that are able to activate the PC1 CTF, we shall include further mutation experiments and simulations, peptide SAR and optimization of the lead peptides in future studies. This has been clarified in the revised manuscript.

      (6) Additional experiments for the controls and for validating the simulations. Additional simulations to explain the SAR.

      We appreciate the reviewer’s comment for additional experiments for the controls and additional simulations to explain the SAR. For future studies, we shall include further mutation experiments and simulations, peptide SAR and optimization of the lead peptides.

      (7) What is the selectivity of the peptides between PC1 and PC2?

      We have not tested the selectivity of the peptides for PC1 versus PC2 primarily because transfection of PC2 does not activate the NFAT reporter. However, it is possible that co-transfection of PC2 with the PC1 CTF could alter stalk peptide binding. This will be important to consider in future studies.

      Reviewer 3:

      The authors demonstrate the activation of Polycystin-1 (PC1), a G-protein coupled receptor, using small peptides derived from its original agonist, the stalk TA protein. In the experimental part of the study, the authors performed cellular assays to check the peptide-induced reactivation of a mutant form of PC1 which does not contain the stalk agonist. The experimental data is supported by computational studies using state-of-the-art Gaussian accelerated Molecular Dynamics (GaMD) and bioinformatics analysis based on sequence covariance. The computer simulations revealed the mechanistic details of the binding of the said peptides with the mutant PC1 protein and discovered different bound, unbound, and intermediate conformations depending on the peptide size and sequence. The use of reliable and well-established molecular simulation algorithms and the physiological relevance of this protein to autosomal dominant polycystic kidney disease (ADPKD) make this work particularly valuable.

      We greatly appreciate the reviewer’s encouraging and positive comments. The reviewer’s specific comments are addressed pointwise below and changes to the text will be highlighted in yellow in the revised manuscript.

      (1) No control has been used for the computational (GaMD) study as the authors only report the free energy surface for 3 highly agonistic peptides but for none of the other peptides that did not induce an agonistic effect. Therefore, in the current version, the reliability of the computational results is not foolproof.

      We appreciate the reviewer’s concern about the lack of control with the other peptides that did not induce an agonistic effect. To address the reviewer’s concern, we have included more details on the study of the stalkless CTF and the solubility tag peptide (Fig. S2) as controls in the revised manuscript.

      (2) All discussions about the residue level interactions focused only on geometric aspects (distance, angle, etc.) but not the thermodynamic aspect (e.g. residue-wise interaction energy). Considering they perform a biased simulation, the lack of interaction energy analysis only provides a qualitative picture of the mechanism.

      As mentioned by the reviewer, we have added MM/PBSA analysis results in the revised manuscript and SI.

      Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) analysis was performed to calculate the binding free energies of peptides p9, p17 and p21 to PC1 CTF. The analysis was performed using the trajectory in which the peptide was bound to the receptor. In MM/PBSA, the binding free energy of the ligand (L) to the receptor (R) to form the complex (RL) is calculated as:

      $$\Delta G_{bind} = G_{RL} - G_{R} - G_{L}$$

      where $G_{RL}$ is the Gibbs free energy of the complex RL, $G_{R}$ is the Gibbs free energy of the molecule R in its unbound state and $G_{L}$ is the Gibbs free energy of the molecule L in its unbound state.

      $\Delta G_{bind}$ can be divided into contributions of different interactions as:

      $$\Delta G_{bind} = \Delta H - T\Delta S$$

      in which

      $$\Delta H = \Delta E_{MM} + \Delta G_{sol}$$

      $$\Delta E_{MM} = \Delta E_{int} + \Delta E_{elec} + \Delta E_{vdW}$$

      $$\Delta G_{sol} = \Delta G_{PB/GB} + \Delta G_{SA}$$

      $$\Delta G_{SA} = \gamma \cdot SASA + b$$

      where $\Delta E_{MM}$, $\Delta G_{sol}$, $\Delta H$ and $-T\Delta S$ are the changes in the gas-phase molecular mechanics (MM) energy, solvation free energy, enthalpy and conformational entropy upon ligand binding, respectively. $\Delta E_{MM}$ includes the changes in the internal energies $\Delta E_{int}$ (bond, angle and dihedral energies), electrostatic energies $\Delta E_{elec}$, and the van der Waals energies $\Delta E_{vdW}$. $\Delta G_{sol}$ is the sum of the electrostatic solvation energy $\Delta G_{PB/GB}$ (polar contribution) and the nonpolar contribution $\Delta G_{SA}$ between the solute and the continuum solvent. The polar contribution is calculated using either the Poisson-Boltzmann (PB) or Generalized Born (GB) model, while the nonpolar energy is usually estimated using the solvent-accessible surface area (SASA), where $\gamma$ is the surface tension coefficient and $b$ is the constant offset. The change in conformational entropy $-T\Delta S$ is usually calculated by normal-mode analysis on a set of conformational snapshots taken from MD simulations. However, due to the large computational cost, changes in the conformational entropy are usually neglected; we followed this practice because we were concerned mainly with the relative binding free energies of similar peptide ligands.
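      The decomposition described above can be turned into a short bookkeeping example that shows how the individual terms combine into a binding free energy. It is a sketch under assumptions: the class and field names are hypothetical, the numbers in the usage note are illustrative placeholders rather than results from this study, and the entropy term defaults to zero to mirror the choice of neglecting it.

```python
from dataclasses import dataclass


@dataclass
class MMPBSATerms:
    """Energy changes upon binding, all in kcal/mol unless noted."""
    e_int: float           # change in internal (bond, angle, dihedral) energy
    e_elec: float          # change in electrostatic energy
    e_vdw: float           # change in van der Waals energy
    g_polar: float         # polar solvation term (PB or GB)
    sasa: float            # solvent-accessible surface area term, in Å^2
    gamma: float           # surface tension coefficient, kcal/(mol*Å^2)
    offset: float = 0.0    # constant offset b
    minus_t_ds: float = 0.0  # -T*dS; zero when entropy is neglected

    def e_mm(self) -> float:
        return self.e_int + self.e_elec + self.e_vdw

    def g_nonpolar(self) -> float:
        return self.gamma * self.sasa + self.offset

    def g_sol(self) -> float:
        return self.g_polar + self.g_nonpolar()

    def delta_h(self) -> float:
        return self.e_mm() + self.g_sol()

    def g_bind(self) -> float:
        return self.delta_h() + self.minus_t_ds
```

      For example, `MMPBSATerms(e_int=0.0, e_elec=-100.0, e_vdw=-50.0, g_polar=120.0, sasa=-1000.0, gamma=0.005).g_bind()` combines a gas-phase term of -150 kcal/mol with a solvation term of 115 kcal/mol; these inputs are made up solely to show the arithmetic.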

      MM/PBSA analysis was performed using the gmx_MMPBSA software with the following command line:

      gmx_MMPBSA -O -i mmpbsa.in -cs com.tpr -ci index.ndx -cg 1 13 -ct com_traj.xtc -cp topol.top -o FINAL_RESULTS_MMPBSA.dat -eo FINAL_RESULTS_MMPBSA.csv

      Input file for running MM/PBSA analysis:

      &general
      sys_name="Prot-Pep-CHARMM",
      startframe=1, endframe=200,
      # In gmx_MMPBSA v1.5.0 we have added a new PB radii set named charmm_radii.
      # This radii set should be used only with systems prepared with CHARMM force fields.
      # Uncomment the line below to use charmm_radii set
      # PBRadii=7,
      /
      &pb
      # radiopt=0 is recommended which means using radii from the prmtop file for both the PB calculation and for the NP
      # calculation
      istrng=0.15, fillratio=4.0, radiopt=0
      /

      The relative rank of the overall peptide binding free energies (Table S1) was consistent with the experimental signaling data, i.e., p21>p9>p17, for which p21 showed the most favorable binding free energy (-40.29±6.94 kcal/mol).

      (3) It is not mentioned clearly whether the reader should interpret the free energy landscapes quantitatively or qualitatively. Considering no error analysis or convergence plots are reported for the GaMD free energy surfaces, it may be assumed the results are qualitative. The readers should consider this caveat and not try to quantitatively reproduce these free energy landscapes with other comparable techniques.

      We appreciate the reviewer’s comment on whether the free energy landscapes should be interpreted quantitatively or qualitatively. The presented free energy landscapes could be considered semi-quantitative since the simulations are not fully converged. This will be clarified in the revised manuscript as: “It is important to note that the free energy profiles calculated from GaMD simulations of PC1 CTF were not fully converged since certain variations were observed among the individual simulations. Nevertheless, these calculations allowed us to identify representative low-energy binding conformations of the peptides.”

      (4) Energy decomposition analysis similar to the following paper (https://pubs.acs.org/doi/10.1021/bi201856m) should be provided to understand the residue level enthalpic contribution in the peptide-protein interaction.

      As mentioned by the reviewer, we have performed residue-wise interaction energy analysis and included the analysis results in the revised manuscript and SI.

      Residue-wise interaction energy analysis was performed on peptides p9, p17 and p21 using the trajectory in which the peptide was bound to the PC1 CTF using the gmx_MMPBSA software with the following command line:

      gmx_MMPBSA -O -i mmpbsa.in -cs com.tpr -ct com_traj.xtc -ci index.ndx -cg 3 4 -cp topol.top -o FINAL_RESULTS_MMPBSA.dat -eo FINAL_RESULTS_MMPBSA.csv -do FINAL_DECOMP_MMPBSA.dat -deo FINAL_DECOMP_MMPBSA.csv

      Input file for running residue-wise energy decomposition analysis:

      &general
      sys_name="Decomposition", startframe=1, endframe=200,
      # forcefields="leaprc.protein.ff14SB"
      /
      &gb
      igb=5, saltcon=0.150,
      /
      # Make sure to include at least one residue from both the receptor
      # and peptide in the print_res mask of the &decomp section.
      # This requirement is automatically fulfilled when using the within keyword.
      # http://archive.ambermd.org/201308/0075.html
      &decomp
      idecomp=2, dec_verbose=3, print_res="A/854-862 A/1-853",
      /

      Residue-wise energy decomposition analysis allowed us to identify key residues that contributed the most to the peptide binding energies. These included residues T1 and V9 in p9 (Table S2), residues T1, R15 and V17 in p17 (Table S3), and residues P10, P11, P19 and P21 in p21 and residue W3726 in the PC1 CTF (Table S4). The energetic contributions of these residues appeared to correlate with the sequence coevolution predicted from the Potts model.

      (5) To showcase the reliability of the computational approach, the authors should perform the MD simulation studies with one peptide that did not show any significant agonistic effect in the experiment. This will work as a control for the computational protocol and will demonstrate the utility of the pep-GaMD simulation in this work.

      We appreciate the reviewer’s concern about the lack of control with the other peptides that did not induce an agonistic effect. It is difficult for us to add more MD simulations on the other peptides because the student who performed the simulations left after completing their PhD. But to address the reviewer’s concern, we have included more details on the study of the stalkless CTF as a control in the revised manuscript.

      (6) To assess the accuracy of the computational results the authors should mention (either in the main text or SI) whether the reported free energy surfaces were the average of the five simulations or computed from one simulation. In the latter case, free energy surfaces computed from the other four simulations should be provided in the SI. In addition, how many binding unbinding events have been observed in each simulation should be mentioned.

      We appreciate the reviewer’s comment regarding convergence of the simulation free energy surfaces. In response to Reviewer 1, we have calculated free energy profiles of individual simulations for each system, including the p9, p17, and p21 (Figs. S5, S6 and S8). 

      “We have calculated free energy profiles of individual simulations for each system, including the p9, p17, and p21 (Figs. S5, S6 and S8). For the p9 peptide, the “Bound” low-energy state was consistently identified in the 2D free energy profile of each individual simulation (Fig. S5). For the p17 peptide, Pep-GaMD simulations were able to refine the peptide conformation from the “Unbound” to the “Intermediate” and “Bound” states in Sim1 and Sim5, while the peptide reached only the “Intermediate” state in the other three simulations (Fig. S6). For the p21 peptide, Pep-GaMD was able to refine the peptide docking conformation to the “Bound” state in all the five individual simulations (Fig. S8).”

      “It is important to note that the free energy profiles calculated from GaMD simulations of PC1 CTF were not fully converged since certain variations were observed among the individual simulations. Nevertheless, these calculations allowed us to identify representative low-energy binding conformations of the peptides.”

    1. eLife assessment

      This valuable study analyzes the role of rpgrip1l encoding a ciliary transition zone component in the development of neuroinflammation and scoliotic phenotypes in zebrafish. Through proteomic and experimental validation in vivo, the authors demonstrated increased Annexin A2 expression and astrogliosis in the brains of scoliosis fish. Anti-inflammatory drug treatment restored normal spine development in these mutant fish, thus providing additional convincing evidence for the role of neuroinflammation in the development of scoliosis in zebrafish.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Djebar et al. perform a comprehensive analysis of mutant phenotypes associated with the onset and progression of scoliosis in zebrafish ciliary transition zone mutants rpgrip1l and cep290. They determine that rpgrip1l is required in foxj1a-expressing cells for normal spine development, and that scoliosis is associated with brain ventricle dilations, loss of Reissner fiber polymerization, and the loss of 'tufts' of multi-cilia surrounding the subcommissural organ (the source of Reissner substance). Informed by transcriptomic and proteomic analyses, they identify a neuroinflammatory response in rpgrip1l and cep290 mutants that is associated with astrogliosis and CNS macrophage/microglia recruitment. Furthermore, anti-inflammatory drug treatment reduced scoliosis penetrance and severity in rpgrip1l mutants. Based on their data, the authors propose a feed-forward loop between astrogliosis, induced by perturbed ventricular homeostasis, and immune cell recruitment as a novel pathogenic mechanism of scoliosis in zebrafish ciliary transition zone mutants.

      Strengths:

      - Comprehensive characterization of the causes of scoliosis in ciliary transition zone mutants rpgrip1l and cep290
      - Comparison of rpgrip1l mutants pre- and post-scoliosis onset allowed authors to identify specific phenotypes as being correlated with spine curvature, including brain ventricle dilations, loss of Reissner fiber, and loss of cilia in proximity to the subcommissural organ
      - Elegant genetic demonstration that increased urotensin peptide levels do not account for spinal curvature in rpgrip1l mutants
      - The identification of astrogliosis and Annexin over-expression in glial cells surrounding diencephalic and rhombencephalic ventricles as being correlated with scoliosis onset and severe curve progression is a very interesting finding, which may ultimately inform pathogenic mechanisms driving spine curvature

      Weaknesses:

      - The fact that cilia loss/dysfunction and Reissner fiber defects cause scoliosis in zebrafish is already well established in the literature, as is the requirement for cilia in foxj1a-expressing cells
      - Neuroinflammation has already been identified as the underlying pathogenic mechanism in at least 2 previously published scoliosis models (zebrafish ptk7a and sspo mutants)
      - Anti-inflammatory drugs like aspirin, NAC and NACET have also previously been demonstrated to suppress scoliosis onset and severe curve progression in these models

      Therefore, although similar observations in rpgrip1l and cep290 mutants (as reported here) add to a growing body of literature that supports a common biological mechanism underlying spine curvature in zebrafish, novelty of reported findings is diminished.

      - Although authors demonstrate that astrogliosis and/or macrophage or microglia cell recruitment are correlated with scoliosis, they do not formally demonstrate that these events are sufficient to drive spine curvature. Thus, the functional consequences of astrogliosis and microglia infiltration remain uncertain.
      - Authors do not investigate the effect of anti-inflammatory treatments on other phenotypes they have correlated with spinal curve onset (like ventricle dilation, Reissner fiber loss, and multi-cilia loss around the subcommissural organ). This would help to identify causal events in scoliosis.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Djebar et al. investigated the role and the underlying mechanism of the ciliary transition zone protein Rpgrip1l in zebrafish spinal alignment. They showed that rpgrip1l mutant zebrafish develop body curvature with nearly full penetrance at juvenile stages. The mutant fish have cilia defects associated with ventricular dilations and loss of the Reissner fibers. Scoliosis onset and progression are also strongly associated with astrogliosis and neuroinflammation, and anti-inflammatory drug treatment prevents scoliosis in mutant zebrafish, suggesting a novel pathogenic mechanism for human idiopathic scoliosis. This study is quite comprehensive with high-quality data, and the manuscript is well written, providing important information on how the ciliary transition zone protein functions in maintaining the zebrafish body axis straightness.

      Strengths:

      Very clear and comprehensive analysis of the mutant zebrafish.

    4. Author response:

      The following is the authors’ response to the original reviews.

      (1) Please provide more background about Rpgrip1l in the introduction, particularly past studies of the mammalian homolog of Rpgrip1l, if any. Is there any human disease associated with Rpgrip1l? Do these patients have a scoliosis phenotype?

      • We have added more background on the human ciliopathies caused by RPGRIP1L mutations and on their occasional association with early onset scoliosis (lines 45-54 page 2 in the introduction, see cited references). 

      (2) The allele is a large deficiency of most of the coding region of rpgrip1l, can you give details in the Supplementary data of how you show this by genotyping? It would be good to explain that this mutation is most likely behaving as a null, if you have RNAseq data that supports this please note that. Otherwise, it may be incorrect to assume it is a null allele as your shorthand nomenclature states. If you do not have stronger evidence that the deficiency allele is behaving as a null allele, then please think about using an allele nomenclature as outlined at ZFIN:  

      • We now describe in the results section (Lines 72-76, page 3) the extent of the deletion in rpgrip1l∆/∆ (22 exons out of 26), which creates an early stop at position 88 of 1256 amino acids. We have submitted our two novel mutant lines to ZFIN: rpgrip1l∆ is recorded as rpgrip1l bps1 and rpgrip1l ex4 as rpgrip1l bps2, and we provide this information in the text. Transcriptomics data confirmed that this allele behaves as a null, as the most down-regulated transcript found in the brain of rpgrip1l∆/∆ is the rpgrip1l transcript itself (volcano plot in Fig 5A, described in the results, Lines 270-71, page 9).

      • We have also provided in Supplementary Figure 1 A’ a picture of a typical genotyping gel for the rpgrip1l∆ allele. Sequences of both CRISPR guide RNAs and genotyping primers are provided in the Materials and Methods section.

      (3) Throughout the manuscript, the authors refer to zebrafish mutant phenotypes as "juvenile scoliosis". However, scoliosis may not appear until 11 weeks post-fertilization in some animals. After 6-8 weeks of age, it would be more appropriate to describe the phenotype as "late-onset or adult scoliosis" to differentiate between other reported scoliosis mutants (such as hypomorphic or dominant negative alleles of scospondin) that start body curvatures at 3-5 dpf.

      • We think that rpgrip1l-/- scoliosis does qualify as a “juvenile scoliosis”, as shown by the time course displayed in Fig 1B: rpgrip1l-/- scoliosis develops asynchronously between 4 weeks and 9 weeks (from 0.8 cm/1 cm to 1.6 cm, corresponding to juvenile stages according to Parichy et al, 2009 PMID: 19891001), after which it reaches a plateau. Half of the mutants are already scoliotic by 5 weeks and no scoliosis develops at adult stage, i.e., from 10 weeks on. We have acknowledged the late onset scoliosis in page 3 line 93.

      (4) A more careful demonstration of the individual vertebrae, using magnified high-resolution pictures in Figures 1D-G, should be made to more clearly show no obvious vertebral malformations are present. 

      • We now provide a movie in Sup Data that presents 3D views of controls and mutant spines, which show the intervertebral spaces as well as vertebral shape and size. With these images we could exclude vertebral fusion and the presence of dysmorphic vertebrae.

      (5) On page 5: the authors comment on transgenic expression of RPGRIP1L in foxj1a-lineages as "rescuing" scoliosis. This terminology is confusing, as rescuing a condition could be interpreted as inducing it where it was once absent. "Suppressing" scoliosis may be a more appropriate term. 

      • We agree with the reviewers that the term “rescue” is confusing. We changed it to “suppress” in the title of the paragraph (line 95 page 3) and within the text (line 115 page 3).

      (6) On page 5, lines 155-156: the authors state that "Indeed, no tissue-specific rescue has been performed yet in zebrafish ciliary gene mutants". This is misleading, as ptk7a and katnb1 mutations both disrupt cilia, and transgenic reintroduction of both ptk7a and katnb1 in foxj1a-expressing lineages has previously been shown to suppress cilia defects as well as scoliosis in these models. The statement should be removed for accuracy.

      • We agree that we were not precise enough in our sentence: when we mentioned “ciliary gene” mutants, we were referring to genes whose products are enriched within cilia and directly affecting ciliogenesis, cilia content and maintenance such as TZ or BBS genes, without encompassing genes like ptk7 and katnb1 whose products perform multiple functions on top of cilia maintenance such as Wnt signalling and remodelling of the whole microtubule network respectively. We have therefore modified our sentence by adding zebrafish ciliary “TZ and BBS” genes (line 104, page 4).

      (7) Figure 2: panels A-B: In the text (line 196) you state that cilia length was increased and that Arl13b content was severely reduced. However, Panel B shows no significant length difference between scoliotic mutants and controls. This statement and graph should be corrected for accuracy. Also, the Arl13b staining is difficult to see in panel A - can channels be split, and/or quantified? 

      • We have now split the Arl13b and glutamylated tubulin channels (Fig 2 A-C”). We think that the reduction of Arl13b staining intensity is now obvious in both straight and scoliotic mutants (compare 2A” with 2B” and 2C”). We were not able to quantify Arl13b staining using ciliary masks from glutamylated tubulin staining since the two stainings only partially overlap along the length of the cilium, Arl13b being more distal than glutamylated tubulin (Fig 2A’).

      • Ciliary length was significantly increased (from 3.4 to 5.3 µm) in straight rpgrip1l-/-, while the values for scoliotic rpgrip1l-/- were heterogeneous (mean 4.1 µm) and therefore not significantly different from controls. This heterogeneity stems from the combined presence of both shorter and longer cilia in scoliotic fish, a finding we interpreted as the potential breakage over time of the extra-long and thin cilia observed in scoliotic fish (as in Sup figure 1 H’’’, Sup Fig 2M’ and 2O’).

      • We changed the text to be more accurate: we now state that cilia length increased in straight mutants, and became more heterogeneous than controls in scoliotic mutants (line 143-144, page 5).

      (8) Figure 3: Page 7, line 206: authors state that SCO-spondin secreting cells varied in number along SCO length. What is the evidence that these cells secrete SCO-spondin? The staining shown in Figure 3L-O appears to demonstrate extracellular accumulation of sspo:GFP. What is the evidence that this staining originated from cells in proximity to it? 

      The claim of SCO-secreting cells in Figure 2E-J is confusing. I assume you are using anatomy to infer the SCO is captured in these sections. This should be done in sspo-GFP animals (as in Figure 3) and/or dual anti-body labeling can be done to show SCO-secreting cells and cilia. 

      • We now show in Supplementary Figure 2 A-D a double staining for Sco-spondin-GFP and cilia (Ac-tub, Glu-Tub). Analyzing GFP staining along SCO length on successive sections, we identified the SCO producing cells on the diencephalic dorsal midline by their position under the posterior commissure (PC, which forms an Acetylated Tubulin positive arch), and counted the nuclei surrounded by cytoplasmic GFP from the most anterior region (24 cells wide, Sup Fig 2A-A’) to the most posterior region (4-8 cells wide, Sup Fig 2 C).

      • Furthermore, the close-ups presented on Fig 2A’ and 2B’ allow detection of the cytoplasmic Sspo-GFP staining around SCO nuclei, above the region presenting primary cilia pointing towards the diencephalic ventricle, both in controls and mutants at scoliosis onset (tail-up mutants), showing that the extracellular staining in B’ very likely originates from these cells. In these tail-up mutants, extracellular Sspo aggregates have not yet filled the whole diencephalic ventricle as in Fig 3 N and Q.

      (9) Figure 5: Is the transcriptome data and proteomic data consistent for any transcripts and encoded protein products? Please highlight those consistent targets in both analyses. 

      • We would like to emphasize that the transcriptomic study was performed at scoliosis onset, at 5 weeks, while the proteomics analysis was performed at adult stage (3 months) so they cannot be directly compared.

      Moreover, low abundance proteins (such as centrosomal proteins and transcription factors like Foxj1a) are not detected by label-free proteomics without a prior subcellular fractionation procedure (Lindemann et al, 2017 PMID: 28282288). The extraction protocol also does not allow purification of short neuropeptides such as Urp1-2.

      Nevertheless, we found four targets in common, now highlighted in red in Fig 5, Panel E: Anxa2, complement proteins C4 and C7a, and Stat3, all related to immune response, a GO term enriched in both studies as explained in the text (Lines 308-311, page 10).

      The absence of many inflammation markers or immune response proteins at adult stage in scoliotic mutants most probably indicates a transient inflammatory episode at scoliosis onset, while astrogliosis, as detected by GFAP staining, increases with scoliosis severity. Along the same lines, the two-fold increase of Lcp1 cells within the tectum is present before axis curvature (in straight mutants) and disappears in scoliotic fish (Graph G in Sup Figure S5) as explained in the text, Lines 378-381, page 12.

      (10) Supplementary Figure 1 F-H: What stage/age samples were used for SEM? It is only stated that they were 'adults'. It is also stated that cilia tufts in straight rpgrip1l-/- fish were morphologically normal but 'less dense'- this was not obvious from the figure. Can density be quantified? (otherwise, data does not support the statement). Similarly, can the statement that "cilia of mono-ciliated ependymal cells showed abnormal irregular structures compared to controls, with either bulged or thinner parts" be supported with measurements/quantification? 

      • The SEM study was performed on 3-month-old fish, 3 controls and 5 mutants. We added this information in the figure legend. We could not quantify the number of ciliary tufts in the brain ventricle of the sole straight mutant that was analyzed. We therefore removed the statement that cilia were less dense in the straight mutant. Along the same lines, we mentioned that we could find mutant cilia of irregular shape, as shown in Supplementary Figure S1 (F”, G’’, H’’ and H’’’) (page 4, lines 124-129).

      (11) Supplementary Figure 1D-E is never mentioned in the text. The Supplemental Figure legend also refers to a graph of cilia length that is not in the figure itself. As a result, many of the subsequent panel references are out of register. 

      • We now provide the correct version of the legend and refer to Sup Fig 1D-E in the text (page 3, lines 79-81) and its legend, page 53, lines 1616-1620.

      (12) Supplementary Figure 2A-F: Of interest, in panels C and F, it looks as though sspo:GFP is accumulating on cilia within the ventricles of rpgrip1l mutants. Can this be explored? Is it possible that abnormal aggregation of SSPO on cilia is ultimately leading to cilia loss, as you report for multi-ciliated cells surrounding the subcommissural organ? This could be a very interesting finding and possible mechanism for cilia loss.

      • Our observation of all brain sections led us to conclude that the majority of Sspo-GFP aggregates were floating within the brain ventricles of rpgrip1l-/- fish while a portion of aggregates were stuck on ventricle walls, in close contact with cilia, as now shown on Supplementary figure S2 B’ and outlined in the legend, page 54, lines 1634-1637. We agree that the contact between Sspo aggregates and cilia might have damaging consequences, either on cilia maintenance or on immune reaction induction, and we now mention these possibilities in the discussion, page 16, lines 524-526. These research lines will be explored in the near future.

      (13) Supplementary Figure 5A-F is not mentioned in the manuscript. Please clarify the role of Anxa2 in neuroinflammation. Is increased Anxa2 expression in rpgrip1l mutant zebrafish reduced after anti-inflammatory drug treatment? What is the expression level of anxa2 in cep290 mutant zebrafish? 

      • We have now added a mention of Supplementary Figure 5A-F in the text, page 10, lines 328-331.

      • We unfortunately did not have enough histological material to test Anxa2 staining on NACET-treated fish after performing GFAP and Lcp1 staining, nor for dilatation measurement or quantification of multiciliated cells. We agree this would have helped to better define which defect might be an indirect consequence of an inflammatory environment.

      • We tested the expression level of Anxa2 in cep290-/- fish. No labelling above control level was detected on cep290-/- brain sections that were positive for GFAP (N = 5). As GFAP staining in 3-4 weeks cep290-/- was not as intense and widespread as in adult rpgrip1l-/- (50% of GFAP + cells compared to 100% in the SCO for example), we concluded that Anxa2 expression may be upregulated after widespread or long-term astrogliosis/inflammation. Alternatively, Anxa2 overexpression could be specific to rpgrip1l-/- fish. 

      (14) A summary diagram at the end would be helpful for understanding the main findings. 

      We added a Graphical Abstract summarizing the main conclusions and hypotheses of this study. It is mentioned and explained in the Discussion section, p. 16 lines 504-508 and 516-529. 

      (15) The sspo-GFP zebrafish line should be listed in the STAR methods section: 

      The sspo-GFP line is now listed in the STAR methods, Scospondin-GFPut24, (Troutwine et al., 2020 PMID: 32386529), p.43, last line.

    1. eLife assessment

      This important work describes a compelling analysis of DNA damage-induced changes in nascent RNA transcripts, and a genome-wide screening effort to identify the responsible proteins. A significant discovery is the inability of arrested cells to undergo DNA damage-induced gene silencing, which is attributed to an inability to mediate ATM-induced transcriptional repression. This work will be of general interest to the DNA damage, repair, and transcription fields, with a potential impact on the cancer field.

    2. Reviewer #1 (Public Review):

      This manuscript by Tyler and colleagues describes a thorough analysis of IR-induced changes in nascent RNA transcripts, and a genome-wide screening effort to identify the responsible proteins. The findings extend previous work describing DNA damage-induced transcriptional repression from DNA breaks in cis to bulk genomic DNA damage. A significant discovery is the inability of arrested cells to undergo DNA damage-induced gene silencing, which, at least at the rDNA locus, is attributed to an inability to mediate ATM-induced transcriptional repression. While the findings add to our knowledge of how DNA damage affects gene expression, there are several limitations to the current study that remain inadequately addressed. In addition, some of the proposed conclusions seem speculative and should be marked as such, omitted or experimentally supported.

      Two major concerns were as follows and have been addressed as outlined in the authors' response to this review:

      (1) The CRISPR screen designed to detect regulators of damage-induced transcriptional repression is based on EU incorporation following a 7-day selection of stable knockout cells. As the authors point out, cell cycle arrest reduces rDNA transcription on its own. The screen, which assesses changes in sgRNA distribution in EU high cells, is thus likely to be dominated by factors that affect cell cycle progression. This is exemplified in the analyses of top hits related to neddylation. The screen's limitations in terms of identifying DDR effectors of damage-induced silencing need to be clearly stated.

      (2) The authors confirm previous findings of DNA damage-induced repression of rDNA and histone gene transcription. The authors propose that these highly transcribed genes are more susceptible to silencing than the bulk of protein coding genes and propose a global damage-induced signaling event that is independent of DNA breaks in cis. While this is possible, it is not demonstrated in this manuscript, and the authors should acknowledge alternative explanations. For example, the loci found to be repressed by bulk IR are highly repetitive gene arrays that tend to form nuclear sub-compartments (nucleoli, histone bodies). As such, their likelihood of being in the vicinity of DNA damage is high, at least for a fraction of gene copies. The findings, therefore, remain consistent with cis-induced silencing. Moreover, silencing may spread through the relevant nuclear sub-compartments, consistent with the formation of DNA damage compartments described recently (PMID: 37853125).

      Other comments - also addressed in the authors' response:

      (1) The statement that silencing is due to transcription initiation rather than elongation is not sufficiently supported by the data. Could equivalent nascent transcript reduction not be the result of the suppression of elongating RNA PolII? To draw the proposed conclusion, the authors would need to demonstrate that RNA PolII initiation is altered, using RNA PolII ChIP and/or analysis of relevant RNA PolII phosphorylation patterns.

      (2) The lack of rDNA silencing in arrested cells is interesting, though the underlying mechanism remains unclear. To further corroborate the proposed defect in ATM-mediated signaling, the authors should look directly at ATM and Treacle phosphorylation upstream of TOPBP1.

      (3) The "change in relative heights of the EU low (G1) and EU high (S/G2) peaks" in Figs 5D, 5E and 6B is central to the proposed model of transcriptional changes being affected by cell cycle arrest. These differences should be visualized more clearly and quantified across independent experiments. Ideally, cell cycle stage should be dissected as in Fig. 2B. How do the authors envision cell cycle arrest triggers the defect in transcriptional silencing?

    3. Reviewer #2 (Public Review):

      In this manuscript, the authors attempted to study mechanisms of transcription inhibition in cells treated with IR. They observed that, unlike transcription inhibition induced by UV damage, which depends on the histone chaperone HIRA, IR-induced transcription inhibition is independent of HIRA. Through a CRISPR/Cas9 screen, they identified protein neddylation as important for transcription inhibition. By sequencing nascent RNA, they observed that transcripts down-regulated upon IR treatment are largely from highly transcribed genes, including histone genes and rDNA.

      This study utilized comprehensive approaches to fill a knowledge gap regarding IR-induced transcription inhibition.

      Comments on current version:

      The revised manuscript largely addressed my concerns.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1) The CRISPR screen designed to detect regulators of damage-induced transcriptional repression is based on EU incorporation following a 7-day selection of stable knockout cells. As the authors point out, cell cycle arrest reduces rDNA transcription on its own. The screen, which assesses changes in sgRNA distribution in EU high cells, is thus likely to be dominated by factors that affect cell cycle progression. This is exemplified in the analyses of top hits related to neddylation. The screen's limitations in terms of identifying DDR effectors of damage-induced silencing need to be clearly stated.

      Notably, our screen did identify known DNA damage response effectors of damage-induced silencing; for example, ATM was a top hit, as discussed in the paper and shown in Fig. 5B. We consider that our unbiased approach had advantages because, in addition to finding known DDR effectors, we uncovered novel requirements for transcriptional silencing in response to DNA damage, such as the need for cells to be cycling. We didn’t find the canonical key cell cycle regulators in our screen. One possibility is that cell cycle arrest or cell death upon their knockdown leads to out-competition during the seven-day treatment with doxycycline, resulting in depletion, rather than enrichment, of the targeting gRNAs among cells that maintain transcription 7 days after DNA damage.

      Comment 2) The authors confirm previous findings of DNA damage-induced repression of rDNA and histone gene transcription. The authors propose that these highly transcribed genes are more susceptible to silencing than the bulk of protein-coding genes and propose a global damage-induced signaling event that is independent of DNA breaks in cis. While this is possible, it is not demonstrated in this manuscript, and the authors should acknowledge alternative explanations. For example, the loci found to be repressed by bulk IR are highly repetitive gene arrays that tend to form nuclear sub-compartments (nucleoli, histone bodies). As such, their likelihood of being in the vicinity of DNA damage is high, at least for a fraction of gene copies. The findings, therefore, remain consistent with cis-induced silencing. Moreover, silencing may spread through the relevant nuclear sub-compartments, consistent with the formation of DNA damage compartments described recently (PMID: 37853125). 

      Our reason for “suggest(ing) that the reduced bulk abundance of nascent transcripts after IR may occur in trans as a programmed event” was the gene length-independent and IR dose-independent nature of the gene silencing (shown in Fig. 2D and Fig. 4C), not that rDNA and histone gene expression went down the most after IR. Indeed, we stated that “Those genes that were normally most highly transcribed were repressed after IR, while genes that were normally expressed at intermediate or low levels tended to be induced after IR (Fig. 4A). The mechanistic reason for this is unclear.” We thank the reviewer for the suggestion that this may be due to these genes existing in nuclear sub-compartments. We have now incorporated this possibility into the discussion.

      Other comments: 

      (1) The statement that silencing is due to transcription initiation rather than elongation is not sufficiently supported by the data. Could equivalent nascent transcript reduction not be the result of the suppression of elongating RNA PolII? To draw the proposed conclusion, the authors would need to demonstrate that RNA PolII initiation is altered, using RNA PolII ChIP and/or analysis of relevant RNA PolII phosphorylation patterns.

      Figure 4F shows the distribution of nascent transcript reads throughout the open reading frame of the repressed genes. It shows that the transcript abundance throughout the ORF, including at the 5’ end, is reduced. This pattern is consistent with a defect in initiation. We have now clarified the description of these results to state that: “Our data is consistent with the possibility that the major mechanism for the repression of the ~1,000 protein coding genes after IR is at the transcriptional initiation stage. However, our data do not rule out that elongation may be additionally repressed after IR, as this would not be observed in our analyses due to concomitant repression of transcriptional initiation.” 

      (2) The lack of rDNA silencing in arrested cells is interesting, though the underlying mechanism remains unclear. To further corroborate the proposed defect in ATM-mediated signaling, the authors should look directly at ATM and Treacle phosphorylation upstream of TOPBP1. 

      We would love to have shown that ATM-dependent phosphorylation does not occur upon IR. We attempted this multiple times, but unfortunately the available phospho-Treacle antibodies were not suitable for rigorous analyses in our hands.

      (3) The "change in relative heights of the EU low (G1) and EU high (S/G2) peaks" in Figures 5D, 5E, and 6B is central to the proposed model of transcriptional changes being affected by cell cycle arrest. These differences should be visualized more clearly and quantified across independent experiments. Ideally, the cell cycle stage should be dissected as in Figure 2B. How do the authors envision cell cycle arrest triggers the defect in transcriptional silencing? 

      In the previous version, the last paragraph described one possibility for how rDNA may fail to be repressed in arrested cells after IR, based on the results shown in Fig. 7F and G. We have now added a paragraph in the discussion section beginning “Why would cell cycle arrest in G1 or G2 phases of the cell cycle prevent transcriptional repression of rDNA and histone genes after IR?”

      Reviewer #2:

      (1) Define ERCC normalization. 

      We apologize for this omission. We have now explained ERCC normalization and have added a citation to a commentary on spike-in controls that we wrote in 2015 for further explanation.

      (2) On page 8, the authors speculate that genes involved in immune response after IR were activated due to cytoplasmic DNA in pre-B cells. Where are these cytoplasmic DNAs from? Is there any literature indicating that a 30-minute IR treatment can induce cytoplasmic DNA?

      We have removed this speculation, as there is no evidence currently to support it.

      (3) Related to the points above, are ERVs or repetitive DNA elements up-regulated upon IR treatment, which in turn results in increased expression of genes involved in immune response? 

      The induction of cytokines as a rapid response to irradiation is a major part of the immediate early gene program induced in response to ROS (and now is explained in the manuscript).

      (4) Please explain in the results section how overall levels of transcription determined by EU are reduced after IR, and yet the number of genes with increased expression upon IR treatment is much greater than that of genes with reduced expression.

      We have explained that while fewer genes have reduced expression after IR than have increased expression, those genes that have reduced expression are extremely highly expressed to start with. As a result, the bulk amount of transcripts is reduced after IR.

      (5) Do cells treated with MLN4924 block the down-regulation of histone genes and ribosomal genes? 

      We have not addressed this directly. However, given that the reduction of gene expression that occurs after IR is largely due to repression of histone and rDNA genes, it is safe to speculate that these are the genes that are no longer repressed during cell cycle arrest.

      (6) Is IR-induced down-regulation of histone genes due to cell cycle changes? 

      We do not know for sure if this is the case. It is relevant to note that even without IR, histone expression per se is regulated by cell cycle changes, being lower outside of S phase – and the majority of  non-arrested cells in our study are in S phase (Fig. 2B). As such, arrest of cells per se outside of S phase would be sufficient to reduce histone expression level.

      We would like to thank the reviewers again for their insightful suggestions and comments.

    1. Reviewer #2 (Public Review):

      Summary:

      In the manuscript by Oestreicher et al, the authors use patch-clamp electrophysiology, immunofluorescent imaging of the cochlea, auditory function tests, and single-unit recordings of auditory afferent neurons to probe the unique properties of calcium signaling in cochlear hair cells that allow rapid and sustained neurotransmitter release. The calcium binding proteins (CaBPs) are thought to modify inactivation of the Cav1.3 calcium channels in IHCs that initiate vesicle fusion, reducing the calcium-dependent inactivation (CDI) of the channels to allow sustained calcium influx to support neurotransmitter release. The authors use knockout mice of Cabp1 and Cabp2 in a double knockout (Cabp1/2 DKO) to show that these molecules are required for enabling sustained calcium currents by reducing CDI, enabling proper IHC neurotransmitter release. They further support their evidence by re-introducing Cabp2 using injection of AAV containing the Cabp2 sequence into the cochlea, which restores some of the auditory function and reduces CDI in patch-clamp recordings.

      Strengths:

      Overall the data is convincing that Cabp1/2 is required for reducing CDI in cochlear hair cells, allowing their sustained neurotransmitter release and sound encoding. Figures are well-prepared, recordings are careful and stats are appropriate, and the manuscript is well written. The discussion appropriately considers aspects of the data that are not yet explained and await further experimentation.

      Weaknesses:

      There are some sections of the manuscript that pool data from different experiments with slightly different conditions (wt data from a previous paper, different calcium concentrations, different holding voltages, tones vs clicks, etc). This makes the work harder to follow and more complicated to explain. However, the major conclusion, that cabp1 and 2 work together to reduce calcium dependent inactivation of L-type calcium channels in cochlear inner hair cells, still holds and is well supported. Another minor weakness is that the authors used injections of AAV containing sequences for Cabp2, but do not present data from sham surgeries. In most cases, the improvement of hearing function with AAV injection is believable and should be attributed to the cabp2 function. However, in at least one instance (Figure 4B), the results of the AAV injection experiments may be overinterpreted - the authors show that upon AAV injection, the hair cells have a much longer calcium current recovery following a large, long depolarization to inactivate the calcium channels. Without comparison to a sham surgery, it is not known if this result could be a subtle result of the surgery or indeed due to the Cabp2 expression. The authors have added text acknowledging this, as appropriate.

    2. eLife assessment

      This fundamental work substantially advances our understanding of the role of calcium-binding proteins 1 and 2 (CaBP1 and CaBP2) for generating sustained calcium currents in mouse inner hair cells and their capacity for indefatigable exocytosis. The evidence supporting the conclusions is compelling, with rigorous in vitro and in vivo physiological experiments and state-of-the-art microscopy. The work will be of broad interest to synaptic physiologists, cellular biochemists, and hearing researchers.

    3. Reviewer #1 (Public Review):

      Summary:

      This manuscript dissects the contribution of CaBP1 and CaBP2 to the calcium current in cochlear inner hair cells. The authors measured calcium current inactivation in CaBP1 and 2 double knock-out mice and show that both proteins contribute to the voltage-dependent and calcium-dependent inactivation. Synaptic release was reduced in the double KO. As a consequence, the authors observed a depressed activity within the auditory nerve. Taken together, this study identifies a new player that regulates the stimulation-secretion coupling in the auditory sensory cells.

      Strengths:

      In this study, the authors bring compelling evidence that CaBP 1 and 2 are both involved in the inactivation of the calcium current, from cellular up to system level, by taking care to probe different experimental conditions such as different holding potentials and by rescuing the phenotype with the re-expression of CaBP2. Indeed, while changing the holding potential worsens the secretion, it completely changes the kinetics of the inactivation recovery. It alerts the reader that probing different experimental conditions that may be closer to physiology is better suited to uncovering any deleterious phenotype. This gave pretty solid results.

      Weaknesses:

      Although this study clearly points out that CaBP1 is involved in the calcium current inactivation, it is not clear how CaBP1 and CaBP2 act together (but this is probably beyond the scope of the study). Another point is that the authors re-express CaBP2 to largely rescue the phenotype in the double KO but no data are available to know whether the re-expression of both CaBP1 and CaBP2 would achieve a full recovery and what would be the effect of the sole re-expression of CaBP1 in the double KO.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors attempted to unravel the role of the Ca2+-binding proteins CaBP1 and CaBP2 for the hitherto enigmatic lack of Ca2+-dependent inactivation of Ca2+ currents in sensory inner hair cells (IHCs). As Ca2+ currents through Cav1.3 channels are crucial for exocytosis, the lack of inactivation of those Ca2+ currents is essential for the indefatigable sound encoding by IHCs. Using a deaf mouse model lacking both CaBP1 and CaBP2, the authors convincingly demonstrate that both CaBP1 and CaBP2 together confer a lack of inactivation, with CaBP2 being far more effective. This is surprising given the mild phenotype of the single knockouts, which has been published by the authors before. Re-admission of CaBP2 through viral gene transfer into the inner ear of double-knockout mice largely restored hearing function, normal Ca2+ current properties, and exocytosis.

      Comments on the revised version:

      The authors improved the quality of the figures as requested.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      This manuscript dissects the contribution of CaBP1 and CaBP2 to the calcium current in cochlear inner hair cells. The authors measured calcium current inactivation in CaBP1 and 2 double knock-out mice and showed that both proteins contribute to voltage-dependent and calcium-dependent inactivation. Synaptic release was reduced in the double KO. As a consequence, the authors observed a depressed activity within the auditory nerve. Taken together, this study identifies a new player that regulates the stimulation-secretion coupling in the auditory sensory cells. 

      Strengths: 

      In this study, the authors bring compelling evidence that CaBP 1 and 2 are both involved in the inactivation of the calcium current, from cellular up to system level, and by taking care to probe different experimental conditions such as different holding potentials and by rescuing the phenotype with the re-expression of CaBP2. Indeed, while changing the holding potential worsens the secretion, it completely changes the kinetics of the inactivation recovery. It alerts the reader that probing different experimental conditions that may be closer to physiology is better suited to uncovering any deleterious phenotype. This gave pretty solid results. 

      Weaknesses: 

      Although this study clearly points out that CaBP1 is involved in the calcium current inactivation, it is not clear how CaBP1 and CaBP2 act together (but this is probably beyond the scope of the study). Another point is that the authors re-express CaBP2 to largely rescue the phenotype in the double KO but no data are available to know whether the re-expression of both CaBP1 and CaBP2 would achieve a full recovery and what would be the effect of the sole re-expression of CaBP1 in the double KO.

      We would like to thank the reviewer for the appreciation of our work. We agree that the effect of the sole re-expression of CaBP1 in the double KO remains elusive and have planned to address this question in a follow-up study. 

      Reviewer #2 (Public Review): 

      Summary: 

      In the manuscript by Oestreicher et al, the authors use patch-clamp electrophysiology, immunofluorescent imaging of the cochlea, auditory function tests, and single-unit recordings of auditory afferent neurons to probe the unique properties of calcium signaling in cochlear hair cells that allow rapid and sustained neurotransmitter release. The calcium-binding proteins (CaBPs) are thought to modify the inactivation of the Cav1.3 calcium channels in IHCs that initiate vesicle fusion, reducing the calcium-dependent inactivation (CDI) of the channels to allow sustained calcium influx to support neurotransmitter release. The authors use knockout mice of Cabp1 and Cabp2 in a double knockout (Cabp1/2 DKO) to show that these molecules are required for enabling sustained calcium currents by reducing CDI and enabling proper IHC neurotransmitter release. They further support their evidence by re-introducing Cabp2 using an injection of AAV containing the Cabp2 sequence into the cochlea, which restores some of the auditory function and reduces CDI in patch-clamp recordings. 

      Strengths: 

      Overall the data is convincing that Cabp1/2 is required for reducing CDI in cochlear hair cells, allowing their sustained neurotransmitter release and sound encoding. Figures are well-prepared, recordings are careful and stats are appropriate, and the manuscript is well-written. The discussion appropriately considers aspects of the data that are not yet explained and await further experimentation.

      Weaknesses: 

      There are some sections of the manuscript that pool data from different experiments with slightly different conditions (wt data from a previous paper, different calcium concentrations, different holding voltages, tones vs clicks, etc). This makes the work harder to follow and more complicated to explain. However, the major conclusion, that cabp1 and 2 work together to reduce calcium-dependent inactivation of L-type calcium channels in cochlear inner hair cells, still holds. 

      Another weakness is that the authors used injections of AAV containing sequences for Cabp2, but do not present data from sham surgeries. In most cases, the improvement of hearing function with AAV injection is believable and should be attributed to the cabp2 function. However, in at least one instance (Figure 4B), the results of the AAV injection experiments may be overinterpreted - the authors show that upon AAV injection, the hair cells have a much longer calcium current recovery following a large, long depolarization to inactivate the calcium channels. Without comparison to sham surgery, it is not known if this result could be a subtle result of the surgery or indeed due to the Cabp2 expression. It would be great to see the auditory nerve recordings in AAV-injected animals that have a recovery of ABRs. However, this is a challenging experiment that requires considerable time and resources, so is not required.

      We would like to thank the reviewer for the appreciation of our work. We agree with the reviewer that sham surgery may convey more information that might benefit the interpretation of our data. The recovery experiments were very tedious, and these long patch-clamp paradigms required extremely stable recordings. Based on our observations, we plan to address the recovery kinetics in more detail in the follow-up study. We consider side effects of the surgery (which may mainly affect middle ear function) and of the empty AAV vector on inner hair cell calcium current recovery rather unlikely, but we cannot exclude them. We thus added a sentence in the discussion to alert readers to this. Based on previously published data on the effect of PHP.eB-Cabp2eGFP in WT animals, we expect some (mild) adverse effects on hearing from overexpression of CaBP2 and/or eGFP in the inner ear. In the future, we thus plan to further optimize the treatment. Regarding in vivo recordings from the auditory nerve fibers of the rescued mice, we could not agree more. These are planned for the follow-up study.

      Reviewer #3 (Public Review): 

      Summary: 

      The authors attempted to unravel the role of the Ca2+-binding proteins CaBP1 and CaBP2 for the hitherto enigmatic lack of Ca2+-dependent inactivation of Ca2+ currents in sensory inner hair cells (IHCs). As Ca2+ currents through Cav1.3 channels are crucial for exocytosis, the lack of inactivation of those Ca2+ currents is essential for the indefatigable sound encoding by IHCs. Using a deaf mouse model lacking both CaBP1 and CaBP2, the authors convincingly demonstrate that both CaBP1 and CaBP2 together confer a lack of inactivation, with CaBP2 being far more effective. This is surprising given the mild phenotype of the single knockouts, which has been published by the authors before. Readmission of CaBP2 through viral gene transfer into the inner ear of double-knockout mice largely restored hearing function, normal Ca2+ current properties, and exocytosis. 

      Strengths: 

      (1) In vitro electrophysiology: perforated patch-clamp recordings of Ca2+/Ba2+ currents of inner hair cells (IHCs) from 3-4 week-old mice - very difficult recordings - necessary to not interfere with intracellular Ca2+ buffers, including CaBP1 and CaBP2. 

      (2) Capacitance (exocytosis) recordings from IHCs in perforated patch mode. 

      (3) The insight that a negative holding potential might underestimate the impact of lack of CaBP1/2 on the inactivation of ICa in IHCs. As the physiological holding potential is much more positive than a preferred holding potential in patch clamp experiments it has a strong impact on inactivation in the pauses between depolarization mimicking receptor potentials. This truly advances our thinking about the stimulation of IHCs and accumulating inactivation of the Cav1.3 channels. 

      (4) Insight that the voltage sine method with usual voltage excursions (35 mV) to determine the membrane capacitance (for exocytosis measurements) also favors the inactivated state of Cav1.3 channels 

      (5) Use of double ko mice (for both CaBP1 and CaBP2, DKO) and use of DKO with virally injected CaBP2eGFP into the inner ear. 

      (6) Use of DKO animals/IHCs/SGNs after virus-mediated CaBP2 gene transfer shows a great amount of rescue of the normal ICa inactivation phenotype.

      (7) In vivo measurements of SGN AP responses to sound, which is highly demanding. 

      (8) In vivo measurements of hearing thresholds, DPOAE characteristics, and ABR wave I amplitudes/latencies of DKO mice and DKO+injected mice compared to WT mice. 

      Very thorough analysis and presentation of the data, excellent statistical analysis.

      The authors achieved their aims. Their results fully support their conclusions. The methods used by the authors are state-of-the-art. 

      The impacts on the field are the following:

      Regulation of inactivation of Cav1.3 currents is crucial for the persistent functioning of Cav1.3 channels in sensory transduction. 

      The findings of the authors better explain the phenotype of the human autosomal recessive DFNB93, which is based on the malfunction of CaBP2. 

      Future work - by the authors or others - should address the molecular mechanisms of the interaction of CaBP1 and 2 in regulating Cav1.3 inactivation. 

      Weaknesses: 

      I do not see weaknesses. 

      What is not explained (but was not the aim of the authors) is how the CaBPs 1 and 2 interact with the Cav1.3 channels and with each other to reduce CDI. Also unexplained is why DFNB93, which is based on mutation of the CaBP2 gene, leads to a severe phenotype in humans in contrast to the phenotype of the CaBP2 ko mouse.

      We would like to thank the reviewer for the appreciation of our work and the amount of effort that went into these experiments. These are the questions that we are posing ourselves as well and would like to address them in the future.   

      Recommendations for the authors:

      Reviewing editor: 

      In the Introduction, the authors may also mention that Ca2+-dependent and voltage-dependent inactivation of L-type Ca channels has been reported at ribbon synapses of retinal bipolar cells (see von Gersdorff & Matthews, J Neurosci. 1996, 16(1):115-122). These are critical retinal interneurons involved in the continuous exocytosis of synaptic vesicles onto retinal ganglion cells. 

      We would like to thank the reviewing editor for pointing that out. We have added the reference in the revised version of the manuscript.

      Reviewer #1 (Recommendations For The Authors): 

      Conditions worsen with age but no numbers regarding the threshold shift are provided. 

      For better readability, we have now included click threshold values for both genotypes and age groups in the results section of the MS text.

      Do the authors correlate the re-expression level of CaBP2 using GFP to the rescuing phenotype (for exocytosis or BK channels immunostaining)?

      The restoration of BK expression in the virus-treated IHC was a side observation of our study, which was not performed in sufficient replicates for proper quantification. In the future, we will address this question in greater detail, possibly with improved viral constructs. In a previous study, we attempted to correlate eGFP fluorescence intensity with residual depolarization-evoked calcium current in CaBP2-injected IHC of Cabp2 single KO animals. At that time, we were unable to establish a convincing correlation. This could be related to (i) large variability in the data, possibly requiring much larger datasets to observe potential correlation above the noise, (ii) variable imaging conditions from prep to prep, or (iii) additional parameters that could influence the outcome of the current rescue, e.g. uncontrolled expression of the transgene. However, we did analyse the correlation between ABR click thresholds and mean IHC eGFP fluorescence in another, preliminary set of data that included different viruses at different titres. There, we were able to observe a relatively good correlation. Interestingly, some of the highest expression levels resulted in poorer threshold recovery, which could indicate harmful overexpression. Moreover, the correlation was only detected when the difference in the mean eGFP expression levels per organ was large. Furthermore, significantly less efficient ABR threshold recovery was observed in the non-injected contralateral ears, which showed a significantly lower viral expression of the transgene. In our follow-up study, we will investigate the question of dose dependence of rescue in more detail.

      Reviewer #2 (Recommendations For The Authors): 

      -  There are two paragraphs in the results text about supplemental figure #2, which suggests that it should be moved to the main figures. 

      We would like to thank the reviewer for this suggestion. Figure S2 has now been moved to the main figures (as current Figure 5) and has been modified to accommodate the BK cluster analysis panel. The histogram with the number of ribbon synapses was removed as the data was redundant with the numbers given in the MS text.  

      -  Overall it is hard to distinguish between dark blue and black in many figures, including the dual-color asterisks.

      To improve the readability and clarity of the figures, we exchanged dark blue with magenta. Dual-color asterisks in Fig. 3 were changed to single-color asterisks, and what they refer to is explained in the figure legend.

      -  Figure 4 legend - there is a mis-spelling of cabp in the fourth line from the bottom. 

      -  Figure 4 legend - the last line does not make sense - describes recovery as being both 'much faster' and 'slowest'.

      -  Figure 6 title - consider removing 'nearly blocked' and replacing it with 'impaired'.

      We would like to thank the reviewer for noticing these mistakes, which have been corrected in the revised version, as suggested.

      -  The calculations of VDI and CDI could be better explained, specifically detailing that VDI is calculated first from currents using barium as a divalent, followed by the calculation of CDI. 

      We included an explanatory sentence in the results section as suggested and are additionally referring the readers to the methods section for the mathematical formulas.

      -  Why were two different tests (one parametric and one non-parametric) used for the Figure 3B data? 

      We performed a point-by-point comparison of the data. The choice of test was made based on the distribution and the variance of the data points. We have now opted for a unified test, the t test with Welch correction, which assumes that samples come from normally distributed populations but makes no assumption about equal variances. The outcomes of these tests were similar.
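
      For readers unfamiliar with the Welch correction, the following is a minimal sketch of the test statistic and the Welch-Satterthwaite degrees of freedom, using only the Python standard library. The function name `welch_t` and the sample values are illustrative assumptions, not the authors' analysis code or data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Illustrative samples with clearly unequal variances.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(t_stat, dof)
```

      Unlike Student's t test, the degrees of freedom here are generally non-integer and shrink as the two variances diverge; the p value is then read from a t distribution with that many degrees of freedom.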

      -  The much broader tuning of the auditory nerve fibers is interesting, consider including this in a figure. 

      For recording tuning curves, we use an automated algorithm which adapts the tone burst intensity and frequency depending on the preceding results. The threshold criterion is an increase of spiking by 20 Hz above spontaneous rate. This routine works fairly well in wild-type animals. However, DKO SGNs typically had very high thresholds at >80 dB across all frequencies, which can partly be explained by the fact that they had very low spike rates and did not reach that criterion. Besides tuning curve runs, we also tried systematic frequency sweeps and manual frequency control to determine a best frequency, followed by a rate intensity function at that frequency to determine “best threshold”. 
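
      The threshold criterion described above can be sketched as follows. This is a simplified, non-adaptive illustration over a fixed intensity sweep, not the automated routine used in the study; the function name `rate_threshold` and all rate values are hypothetical.

```python
def rate_threshold(intensities_db, rates_hz, spontaneous_hz, criterion_hz=20.0):
    """Lowest intensity at which the spike rate exceeds the spontaneous
    rate by at least criterion_hz; None if the criterion is never met."""
    for level, rate in sorted(zip(intensities_db, rates_hz)):
        if rate - spontaneous_hz >= criterion_hz:
            return level
    return None

levels = [0, 20, 40, 60, 80]

# Hypothetical wild-type-like unit: rate rises steeply with intensity.
wt_threshold = rate_threshold(levels, [5, 6, 30, 120, 200], spontaneous_hz=5)

# Hypothetical low-rate unit: never reaches spontaneous rate + 20 Hz.
low_rate_threshold = rate_threshold(levels, [1, 1, 2, 5, 15], spontaneous_hz=1)

print(wt_threshold, low_rate_threshold)
```

      The second case shows why a fixed rate criterion can report no threshold, or a very high one, for units with very low spike rates even when they do respond to sound.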

      All this was difficult because, in the DKO SGNs, sound threshold detection was challenged by the strong dependence of spiking on the duration of the preceding silent interval. A preceding stimulus outside the frequency response area or below the activation threshold of the SGN would thus improve spiking by allowing for longer recovery, while a preceding efficient stimulus would reduce it. Thus, the sound threshold determined in a rate level sweep varied depending on the interstimulus interval and possibly even on the (randomized) order in which the intensities were played. 

      A meaningful threshold measure would require long silent interstimulus intervals, i.e. a long recording time. As tuning curves require multiple threshold measures, it seemed impossible to obtain a useful dataset at high quality. As we deemed the spike rate dependence on interstimulus intervals more important than the tuning, we focused instead on tone burst responses acquired at frequency/intensity combinations at which the hair cells and their synapses were maximally activated. In wild-types, these would be tone bursts at characteristic frequency or noise bursts in the saturated part of the rate intensity function, which typically has a dynamic range of 10-25 dB. As we assume (based on DPOAE) that cochlear micromechanics and amplification are mostly normal in the DKOs, we hypothesize that the sensitivity and dynamic range of basilar membrane motion and inner hair cell transduction are normal and that the increase in single unit thresholds and loss of sharp tuning are another readout of synaptic dysfunction. 

      - Figure S2 - please show separate panels for each channel, it is very difficult to make out the changes by eye in the merged panels. 

      Done.  

      - Figure S2 G - the results text stated that the BK channel clusters 'appeared' smaller - why was this not measured? 

      We have performed additional experiments to enable proper analysis of the BK channel clusters. The analysed data show that the BK clusters are considerably larger and more abundant in WT than in CaBP1/2-deficient IHCs of approx. 4-week-old mice. The results of the analysis are included in the immunohistochemistry figure (now Fig. 5) and are further commented on in the results section.

      Reviewer #3 (Recommendations For The Authors): 

      I have only a few minor points on the MS: 

      (1) Some labels in Figure 1 are too small and hard to read, e.g. y-axis in B-F. Wherever you use subscripts on the axes, the labeling needs to be larger.

      (2) Fig. 1A: the colors for CaM and CaBP1.2 are too similar, at least on my printout. Please use more distant colors.

      (3) Reference 24 should be corrected (no longer in press).

      These points have been addressed in the revised version of the MS.

    1. eLife assessment

      The fundamental findings of this work substantially advance our understanding of the impact of the host on its gut microbes. The authors provided compelling evidence at single-cell resolution that the host can drive heterogeneity in the populations of gut microbes with significant consequences for the host physiology.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work, Wang and colleagues used Drosophila-Serratia as a host-microbe model to investigate the impact of the host on gut bacteria. The authors showed that Drosophila larvae reduce S. marcescens abundance in the food likely due to a combination of mechanical force and secretion of antimicrobial peptides. S. marcescens exposed to Drosophila larvae lost virulence to flies and could promote larval growth similar to typical Drosophila gut commensals. These phenotypic changes were reflected in the transcriptome and metabolome of bacteria, suggesting that the host could drive the switch from pathogenicity to commensalism in bacteria. Further, the authors used single-cell bacterial RNA-seq to demonstrate the heterogeneity in gut bacterial populations.

      Strengths:

      This is a valuable work that addresses an important question of the impact of the host on its gut microbes. The authors could convincingly demonstrate that gut bacteria are strongly affected by the host with important consequences for both interacting partners. Moreover, the authors used state-of-the-art bacterial single-cell RNA-seq to reveal heterogeneity in host-associated commensal populations.

      Overall most parts of the study are solid and clear.

    3. Reviewer #3 (Public Review):

      In this study, Wang and coworkers established a model of Drosophila-S. marcescens interactions and thoroughly examined host-microbe bidirectional interactions. They found that:

      (1) Drosophila larvae directly impact microbial aggregation and density;

      (2) Drosophila larvae affect microbial metabolism and cell wall morphology, as evidenced by reduced prodigiosin production and EPS production, respectively;

      (3) Drosophila larvae attenuate microbial virulence;

      (4) Drosophila larvae modulate the global transcription of microbes for adaptation to the host;

      (5) Microbial single-cell RNA sequencing (scRNA-seq) analysis revealed heterogeneity in microbial pathogenicity and growth;

      (6) AMPs are key factors controlling microbial virulence phenotypes.

      Taken together, they concluded that host immune factors such as AMPs are directly involved in the pathogen-to-commensal transition by altering microbial transcription.

      In general, in this revised version, I feel that the authors addressed all the points raised in the previous review process. Specifically, they demonstrated that sub-lethal doses of antibiotics such as kanamycin or ampicillin are sufficient to induce the virulence switch in S. marcescens. Furthermore, by testing IMD pathway mutant animals, they concluded that AMPs play a major role in the commensal-to-pathogen transition. In summary, I appreciate the authors' efforts, and I am satisfied with the revision.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study examines the role of a host in conditions that shift pathogenicity of opportunistic microbes. The use of single-cell microbial transcriptomics and metabolomics to demonstrate the host's effects on pathogen dynamics is interesting and convincing. However, the connection to host antimicrobial peptides driving these effects is incomplete and would benefit from additional evidence and improved explanation in the text. This paper has the potential to be of broad interest to those working in host-microbe (microbiome and pathogen) interactions.

      We thank the editors for handling our manuscript and providing the eLife assessment. We went through each comment and carried out the necessary experiments. In response to the comments, we provide additional evidence in this revised manuscript that further supports our findings.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, Wang and colleagues used Drosophila-Serratia as a host-microbe model to investigate the impact of the host on gut bacteria. The authors showed that Drosophila larvae reduce S. marcescens abundance in the food likely due to a combination of mechanical force and secretion of antimicrobial peptides. S. marcescens exposed to Drosophila larvae lost virulence to flies and could promote larval growth similar to typical Drosophila gut commensals. These phenotypic changes were reflected in the transcriptome and metabolome of bacteria, suggesting that the host could drive the switch from pathogenicity to commensalism in bacteria. Further, the authors used single-cell bacterial RNA-seq to demonstrate the heterogeneity in gut bacterial populations.

      Strengths:

      This is a valuable work that addresses an important question of the effect of the host on its gut microbes. The authors could convincingly demonstrate that gut bacteria are strongly affected by the host with important consequences for both interacting partners. Moreover, the authors used state-of-the-art bacterial single-cell RNA-seq to reveal heterogeneity in host-associated commensal populations.

      Weaknesses:

      Some of the conclusions are not fully supported by the data.

      Specifically, in lines 142-143, the authors claim that larvae antagonize the pathogenicity of S. marcescens based on the survival data. I do not fully agree with this statement. An alternative possibility could be that, since there are fewer S. marcescens in larvae-processed food, flies receive a lower pathogen load and consequently survive. Can the authors rule this out?

      Also, the authors propose that Drosophila larvae induce a transition from pathogenicity to commensalism in S. marcescens and provide nice phenotypic and transcriptomic data supporting this claim. However, is it driven only by transcriptional changes? Considering high mutation rates in bacteria, it is possible that S. marcescens during growth in the presence of larvae acquired mutations causing all the observed phenotypic and transcriptional changes. To test this possibility, the authors could check how long S. marcescens maintains the traits it acquires during growth with Drosophila. If these traits persist after reculturing isolated bacteria, it is very likely they are caused by genome alterations, if not - likely it is a phenotypic switch driven by transcriptional changes.

      We thank the reviewer for providing a feasible method to distinguish the shift in transcriptional profile from genomic mutations. According to this valuable suggestion, we checked phenotypic and transcriptional changes after re-culturing the bacterium that had coexisted with larvae. We found that all phenotypes can be recovered after re-culturing. The new data supported our previous result that a phenotypic switch was driven by transcriptional changes rather than genome mutations. We now add these results to the text with figure supplement 3 (line 147-151, 192-194). Please see the following text.

      “To rule out the possibility that phenotypic alterations could stem from genomic mutations, we examined the prodigiosin yield and CFUs of re-culturing S. marcescens that had coexisted with larvae. Our results showed that neither prodigiosin yield nor CFUs of re-culturing S. marcescens differed from the original strain (Figure 2-figure supplement 3A-C), suggesting that a phenotypic switch was driven primarily by transcriptional reprogramming.” “Consistent with the previous result that this phenotypic switch was driven by transcriptional changes, the expression of virulent and growth genes was recovered after re-culturing (Figure 3-figure supplement 3D, E).”

      Regarding the first question, we acknowledge that the high mortality of flies could result from a higher pathogen load, because the bacterial load of S. marcescens alone was increased. However, host pathogenesis is normally determined by the virulence of pathogens rather than by the number of bacteria. For example, hosts constantly harbor astonishing numbers of commensals in their guts but remain healthy. This suggests that the property (virulence) of a pathogen is more important in determining the health status of the host. Moreover, an increase in the virulence of S. marcescens alone was verified by real-time PCR (Fig. 2F) and TE (Fig. 2G). Taken together, we conclude that the impaired survival of flies challenged with S. marcescens alone mainly arose from an increase in the virulence of S. marcescens. Thanks for your understanding!

      Reviewer #2 (Public Review):

      Summary:

      While many studies have explored the impacts of pathogens on hosts, the effect of hosts on pathogens has received less attention. In this manuscript, Wang et al. utilize Drosophila melanogaster and an opportunistic pathogen, Serratia marcescens, to explore how the host impacts pathogenicity. Beginning with an observation that larval presence and density impacted microbial growth in fly vials (which they assess qualitatively as the amount of 'slick' and quantitatively as microbial load/CFUs), the authors focus on the impact of axenic/germ-free larvae on an opportunistic pathogen S. marcescens. Similar to their observations with general microbial load, they find that larvae reduce the presence of a pinkish slick of Sm, indicative of its secondary metabolite prodigiosin. The presence of larvae alters prodigiosin production, pathogen load, pathogen cellular morphology, and virulence, and this effect is through transcriptional and metabolic changes in the pathogen. Overall, they observe a loss of virulence factors/pathways and an increase in pathways contributing to growth. Given the important role the host plays in this lifestyle shift, the authors then examined host features that might influence these effects, focusing on the role of antimicrobial peptides (Amps). The authors combine the use of synthetic Amps and an Amp-deficient fly line and conclude much of the larval inhibitory effect is due to their production of AMPs.

      Strengths:

      This is a very interesting question and the use of Drosophila-Serratia marcescens is a great model to explore these interactions and effects.

      The authors have an interesting and compelling phenotype and are asking a unique question on the impact of the host on the pathogen. The use of microbial transcriptomics and metabolomics is a strength, especially in order to assess these impacts on the pathogen level and at the single-cell level to capture heterogeneity.

      Weaknesses:

      Overall, the writing style in the manuscript makes it difficult to fully understand and appreciate the data and its interpretation.

      The data on the role of AMPs would benefit from strengthening. Some of the arguments in the text of that section are also counterintuitive. The authors show that △AMP larvae have a reduced impact on Sm as compared to wt larvae, but it seems less mild of an effect than that observed with wt excreta (assuming the same as secreta in Figures 7, should be corrected or harmonized). Higher doses of AMPs give a phenotype similar to wt larvae, but a lower dose (40 ng/ul) gives phenotypes more similar to controls. The authors argue that this data suggests AMPs are the factor responsible for much of the inhibition, but their data seems more to support that it's synergistic- you seem to still need larvae (or some not yet defined feature larvae make, although secreta/excreta was not sufficient) + AMPs to see similar effects as wt. Based on positioning and color scheme guessing that AMP 40ng/ul was used in Figures 7D-H, but could not find this detail in the text, methods, or figure legend and it should be indicated. This section does not seem to be well supported by the provided data, and this inconsistency greatly dampened this reviewer's enthusiasm for the paper.

      We thank the reviewer for the valuable comments and suggestions. We admit that some photos of the pinkish slick (prodigiosin) are counterintuitive in Figure 7 as well as figure supplement 2B. The reason is as follows. S. marcescens alone produced prodigiosin that stayed only on the surface of the fly agar medium. Larvae, however, can agitate the food and create a stratification of prodigiosin, so that even a higher prodigiosin yield inside the food can appear lighter than a surface slick of prodigiosin. We mentioned this in the previous manuscript (line 166-168). This is why some photos of samples treated with excreta and a lower dose of AMP seemed more intense than those with WT larvae. However, we precisely quantified the prodigiosin yield inside the food with a spectrophotometer and provided the prodigiosin yield alongside the photos of the slick. Therefore, we drew our conclusions mainly from the quantification of the prodigiosin yield. We actually used cecropin A for our experiments, so we added this information to the text. We hope that our replies can reignite your enthusiasm for our manuscript, and thanks for your great support!

      Reviewer #3 (Public Review):

      In this study, Wang and coworkers established a model of Drosophila-S. marcescens interactions and thoroughly examined host-microbe bidirectional interactions. They found that:

      (1) Drosophila larvae directly impact microbial aggregation and density;

      (2) Drosophila larvae affect microbial metabolism and cell wall morphology, as evidenced by reduced prodigiosin production and EPS production, respectively;

      (3) Drosophila larvae attenuate microbial virulence;

      (4) Drosophila larvae modulate the global transcription of microbes for adaptation to the host;

      (5) Microbial single-cell RNA sequencing (scRNA-seq) analysis revealed heterogeneity in microbial pathogenicity and growth;

      (6) AMPs are key factors controlling microbial virulence phenotypes.

      Taken together, they concluded that host immune factors such as AMPs are directly involved in the pathogen-to-commensal transition by altering microbial transcription.

      General comments:

      In general, this study is intriguing as it demonstrates that host immune effectors such as AMPs can serve as critical factors capable of modulating microbial transcription for host-microbe symbiosis. However, several important questions remain unanswered. One such question is: What is the mechanism by which AMPs modulate the pathogen-to-commensal transition? One hypothesis suggests that antimicrobial activity may influence microbial physiology, subsequently modulating transcription for the transition from pathogen to commensal. In this context, it is imperative to test various antibiotics with different modes of action (e.g., targeting the cell wall, transcription, or translation) at sub-lethal concentrations to determine whether sub-lethal doses of antimicrobial activity are sufficient to induce the pathogen-to-commensal transition.

      Thank you for the important comments on our manuscript. We checked the effect of antibiotics (5 μg/μl kanamycin and 10 μg/μl ampicillin) on the virulence switch of S. marcescens. We found that both antibiotics at sub-lethal doses similarly resulted in a decrease in prodigiosin yield and virulence gene expression of S. marcescens. Intriguingly, the two antibiotics also resulted in a dramatic decline in the bacterial load and in the expression of genes involved in cell growth. These results suggest that antibiotics reduced virulence primarily by suppressing most bacterial activities.

      We found that larvae and AMPs at 40 μg/μl resulted in a modest decrease in bacterial load and an increase in the relative level of genes involved in cellular proliferation, suggesting that AMPs could maintain the exponential phase of bacterial growth. This result is consistent with the finding that Drosophila larvae can support the long-term persistence of commensals in the shared habitat (DOI: 10.1016/j.cmet.2017.11.011). The inhibition could prevent bacteria from rapidly exhausting their nutritional resources and consequently maintain symbiosis. It is therefore likely that AMPs maintain S. marcescens in the exponential phase of cell growth.

      Author response image 1.

      (A) Representative images of surface slick with S. marcescens alone, with kanamycin (5 μg/μl) and ampicillin (10 μg/μl). (B) The prodigiosin production of S. marcescens alone, with kanamycin (5 μg/μl) and ampicillin (10 μg/μl). n = 6 for each. (C) Bacterial loads of S. marcescens alone, with kanamycin (5 μg/μl) and ampicillin (10 μg/μl). n = 6 for each. (D, E) RT-qPCR analysis of the expression levels of downregulated and upregulated genes in S. marcescens alone, with kanamycin (5 μg/μl) and ampicillin (10 μg/μl). n = 3 for each. Means ± SEMs. If variables have different letters, they are significantly different (p < 0.05). If two variables share a letter, they are not significantly different (p > 0.05). ns, no significance. Kruskal-Wallis test followed by Dunn’s multiple comparisons test.
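
      The test named in this legend (Kruskal-Wallis followed by Dunn's multiple comparisons) can be sketched as follows. This is only an illustrative outline in plain Python: the function names are hypothetical, the Bonferroni adjustment is an assumption (the legend does not say which correction was applied), and the numbers are invented placeholders, not measurements from the study.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_dunn(groups):
    """groups: dict mapping group name -> list of measurements.

    Returns (H, adjusted), where H is the tie-corrected Kruskal-Wallis
    statistic and adjusted maps each pair of group names to a
    Bonferroni-adjusted two-sided Dunn p-value.
    """
    names = list(groups)
    pooled = [v for name in names for v in groups[name]]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Mean rank and size of each group, in the order the data were pooled.
    mean_rank, size, pos = {}, {}, 0
    for name in names:
        size[name] = len(groups[name])
        mean_rank[name] = sum(ranks[pos:pos + size[name]]) / size[name]
        pos += size[name]

    # Tie term: sum of (t^3 - t) over every set of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_sum = sum(t ** 3 - t for t in counts.values())

    h = 12 / (n_total * (n_total + 1)) * sum(
        size[name] * mean_rank[name] ** 2 for name in names
    ) - 3 * (n_total + 1)
    h /= 1 - tie_sum / (n_total ** 3 - n_total)

    # Dunn's z statistic for each pair, then Bonferroni adjustment.
    var_base = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))
    pairs = list(combinations(names, 2))
    adjusted = {}
    for a, b in pairs:
        se = sqrt(var_base * (1 / size[a] + 1 / size[b]))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        adjusted[(a, b)] = min(1.0, p * len(pairs))
    return h, adjusted


# Invented placeholder values (n = 6 per group), for illustration only.
example = {
    "alone": [9.1, 8.7, 9.4, 8.9, 9.0, 9.3],
    "kanamycin": [2.1, 1.8, 2.4, 2.0, 1.9, 2.2],
    "ampicillin": [2.3, 2.6, 1.7, 2.5, 2.05, 2.15],
}
h, adjusted = kruskal_dunn(example)
for pair, p in adjusted.items():
    print(pair, "significant" if p < 0.05 else "ns")
```

      Pairs whose adjusted p-value falls below 0.05 would receive different letters in the figure, and pairs above it would share a letter.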

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Here are some specific points that need to be addressed:

      (1) Lack of statistical analysis for many figures. The authors should perform and report the statistical analysis for all figures where it is currently lacking, specifically, Figures 2C, D, E, F, H; Figures 3E, F; Figures 7G, H; Figure S2E, Figures S3D, E.

      Thanks for your valuable suggestions. We re-checked the manuscript and performed the statistical analysis for these figures.

      (2) For graphs showing dots, it should be specified what exactly individual dots show and how many animals were used per replicate. Also, time points at which specific analysis was performed should be specified.

      We provided the important information in the legends in the revised manuscript.

      (3) Figure 2. No letters illustrating statistical significance are shown, although this is claimed in the legend (line 848).

      We added statistical significance in the updated Figure 2.

      (4) In Figure 7, the authors used AMPs of defined concentration, but it is not specified what exactly these AMPs are. Please provide the full composition of the AMP mix used.

      We used the antimicrobial peptide cecropin A produced by a silkworm. We added this information in the methods line 487-488 and Figure 7 legend.

      (5) Figure S2B. To me, it looks like that medium with larvae is redder than after mechanical force. I find it hard to believe the quantification in panel C that the medium with larvae has 3 times less pigment as compared to the mechanical force.

      Larvae could only agitate the surface of the food (~0.4 cm), whereas sticks completely agitated the food to a depth of up to 3 cm. Thus, the layer of food containing pink pigment seemed much deeper after agitation than with larvae, which was responsible for the counterintuitive appearance. We explained this in the previous manuscript (line 166-168). “Of note, the surface of the slick with agitation appeared lighter than that of larvae, mainly due to a stratification of prodigiosin following agitation.”

      (6) The authors need to proofread the manuscript as there are missing words, terms that need definition, and wrong terms. For example, L86 - naked eye?, L117 - what do the authors mean by co-culture?, L309 - not resist but rather combat, L347 - Species? or competition?, Figure 2A - 2nd?

      We have corrected these errors in the new manuscript. We added an "eye" in L86. Co-culture means “S. marcescens in co-culture”. Interspecies competition for nearly the same or similar nutrients and space occurs in the habitat.

      (7) The authors should reorganize either the text or the figures' order in a way that the figures are described in a consecutive order (Figure 1A, B ... and not Figure 1D first and then 1A).

      Thanks for your valuable advice. We reorganized the order of the text.

      (8) Do the authors have an idea which bacteria they quantified in Figures 1E to 1G? I didn't find the medium that was used for culturing. Also, in Figure 1F, Is the control group comprised of females or males?

      Mixed bacteria (bacteria in the living environment of Drosophila) were quantified on NA medium, which supports the growth of Drosophila microbiota (Jia Y, et al. Nat Commun. 2021) (line 474-475). The control group comprised both males and females at a 1:1 ratio. Similarly, the aged group contained 100 50-day-aged flies, male: female = 1:1. We provided details in the Figure 1 legend (line 849-850, 851-852).

      (9) L118-129. it is not possible to make all these statements without any statistical analysis. To me, at 96h both treatments have the same CFUs, while the authors claim they are different.

      We added statistical analysis in the current version. In fact, S. marcescens alone collapsed after 72 h post-inoculation, and its CFU number declined step by step. The bacterial load of S. marcescens in co-culture was comparable to (at 96 h post-inoculation, p>0.05) or higher than (at 120 h post-inoculation, p<0.001) that of S. marcescens alone, possibly because bacteria alone rapidly exhausted the nutritional resources and collapsed through population suicide. We rewrote this sentence (line 125-129) in the updated manuscript.

      (10) L136. term "symbionts" is not appropriate here.

      We changed “symbionts” to “S. marcescens”.

      (11) In Figure 1, the authors used flies of different fitness: weak, strong, and infertile. They should be specific and describe exactly what these terms mean, are these mutants or treatments that affect the fitness?

      We apologize for the missing information and have added it to the methods and legend. Strong flies (wild-type fly CS), weak flies (yw; Sp/CyO; MKRS/TM6B), infertile flies (dfmr150M null mutant); see the Figure 1 legend (line 849-850).

      (12) Figure S2. The title of this figure is misleading, please modify it. Mechanical force did affect S. marcescens but to a lesser degree as compared to larvae.

      Thank you for your suggestion. We admit that mechanical force affected S. marcescens but to a lesser degree as compared to larvae, so we changed the title to "Biological factors mainly determine S. marcescens lifestyle."

      Reviewer #2 (Recommendations For The Authors):

      General improvement to writing and presentation (see below):

      Describing confluent growth would make more sense than 'slick' and then using descriptions of broken, etc. "colour intensity of the surface slick".

      We used “slick” to describe visible surface films of bacteria, a term that has been used in a previous study (DOI: 10.1038/s43705-023-00307-8). Slick is equivalent to confluent growth, but is simpler and easier to use. For clarity, we added this reference to the text.

      We reorganized the text of Figure 1.

      Suggest more specific language to describe observations. For example: Bacterial loading - S. marcescens growth (for example: the presence of dense fly populations reduced Sm growth).

      Thanks for the suggestions. We replaced some of them.

      Symbiont, microbiota, microbiome, etc were all used interchangeably throughout the manuscript, but I am not sure I would call Sm part of the indigenous microbiome. Suggest to ensure proper usage and then harmonize throughout the ms.

      We used microbes and microbiome to replace symbiont and microbiota, respectively.

      Details missing from the message and Figure legends that would be helpful (including and especially Figure 7 - what AMP concentration?)

      Thanks for the valuable comments. In response, we provided concrete details about the AMPs in the Materials and methods and the Figure 7 legend, including their source and concentration (line 487-488, 954-955). Please see the response below.

      L73: define 'these issues" maybe or lead better with the prior sentence, it is not evident as currently written.

      We changed "to address these issues" to "To investigate whether and/or how the host modulates bacterial lifestyles," and merged the two paragraphs.

      L74: repetitive sentence with the above.

      Thanks for pointing out this detail. We deleted it.

      L86: naked 'eye'.

      Added.

      L87: what is meant by 'weak flies'?

      Genotypes were added in the updated manuscript. Weak fly stocks display weaker activity and generate fewer eggs than WT flies.

      L96: bacterial load, not loading.

      Corrected.

      L128: no evidence to support, could be reflective of increased numbers in dying/dead larvae that impact total numbers in the vial.

      The number of CFUs of S. marcescens alone gradually decreased from 96 h post-inoculation. In addition, we observed a pale biofilm on the surface of the medium at the late stage. Because the numbers of CFUs of S. marcescens alone at the later stages were reduced (compared to the peak load at 48 h post-inoculation), we inferred that the bacteria could have undergone ecological suicide. Ecological suicide of a bacterial population was similarly examined by recording the number of CFUs in the medium over time (Ratzke C, et al. Nat Ecol Evol. 2018.). Taken together, we conclude that the bacteria possibly underwent ecological suicide.

      L129: the prior sentence is in contradiction, reduced load only at early time points in the presence of larvae....

      Thanks for pointing out this detail. We added "before 72 h post-inoculation" to the sentence.

      L134: data is only focused on S marcescens, so inferring to 'symbionts' broadly is outside study.

      We changed “symbionts” to “S. marcescens”.

      L139: sentence poorly written and confusing.

      We re-organized this sentence.

      To this end, we sought to examine the S. marcescens lifestyle switch from pathogenicity to commensalism by assessing the respective survival of flies on the fly medium that had been processed by single or coexisting S. marcescens.

      L189: evidence for long-term symbiosis is not well established in this paper, suggest editing this language throughout to more specifically reflect what the data supports and leave such interpretations to discussion points and future work.

      Thanks for your valuable advice. We deleted “long-term” and “thereby promoting the fitness of symbionts in the long maintenance”.

      L192; used metabolomics to assess the impacts of larvae on bacterial metabolism, as currently written does not make sense.

      We rewrote this sentence. “Next, we investigated whether larvae could further elicit changes in the metabolism of S. marcescens using untargeted metabolomics.”

      L331: the use of monitored here is not correct/odd.

      We changed 'monitored' to 'reshaping’.

      L340: While the authors initially see a cost to Sm in reduced load (CFUs) at 120 h populations associated with larvae become higher - there is also a cost to producing virulence factors, which their RNASeq and metabolomics data support - trade-offs between growth and virulence.

      Thanks for your suggestion. We added “before 72 hours post inoculation” to define the early stage of the bacterial growth in the sentence.

      Reviewer #3 (Recommendations For The Authors):

      (1) Figures 1 A-D: What defines weak and strong flies, and what criteria determine the robustness of flies? How was the experiment conducted? The manuscript lacks details on this matter.

      We thank you for your comments. We lack a formal criterion; our assessment of the robustness of flies comes from daily experience. Weak fly stocks display weak activity and generate fewer eggs than WT flies. Genotypes with different robustness were added to the legend in the updated manuscript.

      (2) The authors mentioned, "Noteworthily, the number of CFUs of S. marcescens alone was lower than S. marcescens in co-cultures at the late stage (at 96 h post inoculation), likely that bacteria rapidly exhausted their nutritional resources and underwent ecological suicide." How did they determine that the bacteria exhausted nutritional resources and underwent ecological suicide? One might speculate that larvae could have removed the bacteria simply by consuming them.

      Thanks for this comment. In fact, there were no larvae inside the vials with S. marcescens alone, so bacterial cells were not consumed. However, the numbers of CFUs of S. marcescens alone at the later stages were reduced (compared to the peak load at 48 h post-inoculation), so we inferred that the bacteria could have undergone ecological suicide. Ecological suicide of a bacterial population was examined by recording the number of CFUs in the medium over time (Ratzke C, et al. Nat Ecol Evol. 2018.). A similar method was applied to the number of CFUs of S. marcescens. Taken together, we conclude that the bacteria possibly underwent ecological suicide.

      (3) Figure 2E: The experimental details should be provided in the text. What was the CFU of the bacteria used in this survival experiment?

      We provided further experimental details in the legend line 869-870. The same amount of inocula was used in both single and coculturing S. marcescens.

      (4) The experimental data in Figures 2G and 2H do not sufficiently prove the relationship between the width of the cell wall and virulence, as it lacks experimental validation.

      Previous studies (DOI: 10.1371/journal.ppat.1005946) reveal that glucosylating toxins on the surface are primary virulence determinants, so an increased surface-anchored polysaccharide and protein profile promotes the virulence of the pathogen. Alterations in cell surface (the width of the cell wall) can be examined by TE. Moreover, TE was used to observe changes in the virulence of S. marcescens (DOI: 10.1093/nar/gkab1186). We think that the width of the cell wall could be used to reflect virulence in S. marcescens.

      (5) While it's acknowledged that agitation decreases the color intensity of the bacteria, comparing mechanical agitation with larval crawling seems inappropriate, as the mechanical forces exerted by both methods are not of the same magnitude.

      Thanks for the suggestion. In fact, food was agitated more heavily by glass sticks than by larvae, because larvae merely agitated the surface of the food (to a depth of about 0.5 cm). If the decrease in bacterial load and color were related only to the magnitude of agitation, larvae would cause a smaller decrease in bacterial load than the sticks. The opposite was observed, which further supports our result that biological factors contribute more to the inhibition of S. marcescens than mechanical force.

      (6) Figure 4D: with this metabolome data, they mentioned, "host suppresses differentiation of S. marcescens into the population with pathogenicity." What evidence supports the claim that downregulation of amino acid metabolism, phosphotransferase system, and ABC transporter directly correlates with decreased pathogenicity?

      Thanks for the comment. Earlier studies showed that amino acid-derived quorum sensing molecules are closely related to bacterial pathogenicity (Defoirdt T. PLoS Pathog. 2019; Wen J, et al. Microbiol Spectr. 2022). Moreover, the phosphotransferase system and ABC transporter can transport and/or produce virulence factors. Therefore, we claimed that downregulation of amino acid metabolism, the phosphotransferase system, and the ABC transporter was directly related to decreased pathogenicity. To support this claim, we added some references to the updated manuscript (line 662-664, 827-830).

      (7) Serotonin: Does serotonin also reduce the virulence of S. marcescens?

      Our preliminary result showed that serotonin could indeed reduce the virulence of S. marcescens (figure supplement 4), because in the presence of serotonin the survival rate of adult flies was increased and the expression levels of virulence-related genes of S. marcescens alone were decreased.

      (8) Figures 6D, E, H, I: The expression of key genes should be verified using quantitative real-time polymerase chain reaction (qRT-PCR), as scRNA-seq expression levels might not accurately reflect the true expression levels.

      Bacterial single-cell RNA-seq can evaluate alterations in gene expression at single-cell resolution. The expression of key genes identified by scRNA-seq changed only in subpopulations, so the average expression of these genes would appear comparable when measured in the bulk population. We are afraid that qRT-PCR would be unsuitable for verifying the expression of genes in subpopulations.

      (9) Figure 7: The authors mentioned. "AMPs were supplemented to fly food". However, I could not find information regarding which AMPs and their respective concentrations (i.e., concentration of each AMP) were used in this study. This is a critical aspect of the research; therefore, details should be provided.

      Thanks for your important suggestions. We used the antimicrobial peptide cecropin A, which is produced by silkworms. We provided this information in the methods line 487-488. The concentrations of cecropin A were added in Figure 7 legend.

      (10) Figure 7: Delta AMP + AMP exhibited a stronger effect on the bacteria compared to AMP alone, indicating that immune effectors other than AMP may be involved. Since the IMD pathway is necessary for most immune effectors, including AMP, it would be interesting to test IMD pathway mutant animals and compare them with Delta AMP.

      We appreciate this important question. Indeed, Delta AMP + AMP exhibited a stronger effect on the bacteria compared to AMP alone. We acknowledge that immune effectors other than AMP may be involved. Alternatively, mechanical force could, to a lesser extent, account for the stronger effect on the bacteria (as explained by larval agitation in figure supplement 2). To rule out the involvement of other immune effectors, we examined the effect of total immune effectors on the bacterial load and the prodigiosin yield of S. marcescens using the IMD pathway mutant (RelE20 larvae). Our result showed that the optical density and yield of prodigiosin in the Delta AMP group did not significantly differ from those in the RelE20 group. Moreover, the load of S. marcescens associated with the Delta AMP mutant was comparable to that of S. marcescens associated with the RelE20 mutant. These results suggested that AMPs play a major role in recapitulating the response of S. marcescens to larvae.

      “To rule out the potential role of other immune effectors, we turned to the IMD pathway mutant RelE20 that is deficient in total immune effectors. Our result showed that the optical density and yield of prodigiosin in RelE20 group did not significantly differ from the ones in Delta AMP group (figure supplement 7A, B). Moreover, the load of S. marcescens associated with RelE20 mutant was comparable to that of S. marcescens associated with Delta AMP mutant (figure supplement 7C).”

      We have now added these results to the text (line 326-331).

    1. eLife assessment

      Guan and colleagues present solid arguments to address the question of how a single neural stem cell produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments are two well-studied RNA-binding proteins: Imp and Syp. This is valuable work that will be of interest to the scientific community.

    2. Reviewer #1 (Public Review):

      This study addresses the temporal patterning of a specific Drosophila CNS neuroblast lineage, focusing on its larval development. They find that a temporal cascade, involving the Imp and Syp genes, changes the fate of one daughter cell/branch, from glioblast (GB) to programmed cell death (PCD), as well as gates the decommissioning of the NB at the end of neurogenesis.

    3. Reviewer #2 (Public Review):

      Guan and colleagues address the question of how a single neuroblast produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments are two well-studied RNA-binding proteins: Imp and Syp. The Authors find that these factors play an important role in determining the number of neurons in their preferred model system of VNC motor neurons coming from a single lineage (LinA/15) by separate functions taking place at specific stages of development of this lineage: influencing the life-span of the LinA neuroblast to control its timely decommissioning and functioning in the Late-born post-mitotic neurons to influence cell death after the appropriate number of progeny is generated. The post-mitotic role of Imp/Syp in regulating programmed-cell death (PCD) is also correlated with a specific code of key transcription factors that are suspected to influence neuronal identity, linking the fate of neuronal survival with its specification. This paper addresses a wide scope of phenotypes related to the same factors, thus providing an intriguing demonstration of how the nervous system is constructed by context-specific changes in key developmental regulators. The bulk of conclusions drawn by the authors are supported by careful experimental evidence, and the findings are a useful addition to an important topic in developmental neuroscience.

    4. Reviewer #3 (Public Review):

      This study by Guan and co-workers focuses on a model neuronal lineage in the developing Drosophila nervous system, revealing interesting aspects about: a) the generation of supernumerary cells, later destined for apoptosis; and, b) new insights into the mechanisms that regulate this process. The two RNA-binding proteins, Imp and Syp, are shown to be expressed in temporally largely complementary patterns, their expression defining early vs later born neurons in this lineage, and thus also regulating the apoptotic elimination. Moreover, neuronal 'fate' transcription factors that are downstream of Imp and signatures of early-born neurons, can also be sufficient to convert later born cells to an earlier 'fate', including survival. The authors provide solid evidence for most of their statements, including the temporal windows during which the early and the later-born motoneurons are generated by this model lineage, how this relates to patterns of cell death by apoptosis and that mis-expression of early-born transcription factors in later-born cells can be sufficient to block apoptosis (part of, and perhaps indicative of the late-born identity). Other studies have previously outlined analogous, mutually antagonistic roles for Imp and Syp during nervous system development in Drosophila, in different parts and at different stages, with which the working model of this study aligns. Overall, this study adds to and extends current working models and evidence on the developmental mechanisms that underlie temporal cell fate decisions.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This study addresses the temporal patterning of a specific Drosophila CNS neuroblast lineage, focusing on its larval development. They find that a temporal cascade involving the Imp and Syp genes changes the fate of one daughter cell/branch from glioblast (GB) to programmed cell death (PCD), and also gates the decommissioning of the NB at the end of neurogenesis.

      I believe there are some inaccuracies in this summary. We address temporal patterning during larval and pupal stages until the adult stage. The Imp and Syp genes change the fate of one daughter cell/branch from survival to programmed cell death (PCD). The change from glioblast (GB) to PCD, which occurs at an early time point, is not addressed here. The main point of the paper is missing:

      • Last-born MNs undergo apoptosis due to their failure to express a functional TF code, and this code is post-transcriptionally regulated by the opposite expression of Imp and Syp in immature MNs.

      Reviewer #2 (Public Review):

      Summary:

      Guan and colleagues address the question of how a single neuroblast produces a defined number of progeny, and what influences its decommissioning. The focus of the experiments is two well-studied RNA-binding proteins: Imp and Syp. The authors find that these factors play an important role in determining the number of neurons in their preferred model system of VNC motor neurons coming from a single lineage (LinA/15) by separate functions taking place at specific stages of development of this lineage: influencing the life-span of the LinA neuroblast to control its timely decommissioning and functioning in the late-born post-mitotic neurons to influence cell death after the appropriate number of progeny is generated. The post-mitotic role of Imp/Syp in regulating programmed cell death (PCD) is also correlated with a specific code of key transcription factors that are suspected to influence neuronal identity, linking the fate of neuronal survival with its specification. This paper addresses a wide scope of phenotypes related to the same factors, thus providing an intriguing demonstration of how the nervous system is constructed by context-specific changes in key developmental regulators.

      The bulk of conclusions drawn by the authors are supported by careful experimental evidence, and the findings are a useful addition to an important topic in developmental neuroscience.

      I could not summarize the paper better.

      Strengths:

      A major strength is the use of a genetic labeling tool that allows the authors to specifically analyze and manipulate one neuronal lineage. This allows for simultaneous study of both the progenitors and post-mitotic progeny. As a result, the paper conveys a lot of useful information for this particular neuronal lineage. Furthermore, addressing the association of cell fate specification, taking advantage of this lab's extensive prior work in the system, with developmentally regulated programmed cell death is an important contribution to the field.

      Beyond Imp/Syp, additional characterization of this model system is provided in characterizing a previously unrecognized death of a hemilineage in early-born neurons.

      Thanks!

      Weaknesses:

      The main observation that distinguishes this study from others that have investigated Imp/Syp in the fly nervous system is the role played in late-born post-mitotic neurons to regulate programmed cell death. This is an important and plausible (based on the presented findings) newly discovered role for these proteins. However, the precision of the experiments is not particularly strong, which limits the authors' claims. The genetic strategy used to manipulate Imp/Syp or the TF code appears to be applied throughout the entire lineage, or all neuronal progeny, and not restricted to only the late-born cells. Can the authors rule out survival of the early-born hemilineage normally fated to die? Therefore statements such as this:

      "To further investigate this possibility, we used the MARCM technique to change the TF code of last-born MNs without affecting the expression of Imp and Syp" should be qualified to specify that the result is obtained by misexpressing these factors throughout the entire lineage.

      We agree that our genetic manipulations affect the entire lineage or all neuronal progeny. We do not have genetic tools to gain such precision. We have changed our descriptions to specify the entire lineage or all neuronal progeny. As the reviewer raised, we were also concerned about the possibility that the overexpression of Imp or knockdown of Syp could induce the survival of the early-born hemilineage. We have two experiments that rule out this possibility:

      (1) In late LL3 larvae, Imp OE or syp MARCM clones do not change the number of cells (see Guan et al., 2022), indicating that the hemilineage that dies by PCD is not affected. If Imp or Syp played a role in the survival of the hemilineage, we would see at least a 50% increase in the number of MNs at this stage.

      (2) The MARCM experiment using the VGlut driver to overexpress P35 or Imp allows us to manipulate only elav+ VGlut+ neurons. The hemilineage removed by PCD is elav- VGlut- and is not affected by this experiment. Consequently, the increase in MNs in adults with genetic manipulation can only be the result of the survival of the other hemilineage (elav+, VGlut+). Moreover, this experiment shows an increase in the number of neurons in the adult but not in LL3, demonstrating that the hemilineage (elav- VGlut-) is still removed by PCD with this genetic manipulation.

      The authors make an observation that differs from other systems in which Imp/Syp have been studied: that the expression of the two proteins appears to be independent and not influenced by cross-regulation. However there is a lack of investigation as to what effect this may have on how Imp/Syp regulate temporal identity. A key implication of the previously observed cross-regulation in the fly mushroom body is that the ratio of Imp/Syp could change over the life of the NB which would permit different neuronal identities. Without cross-regulation, do the authors still observe a gradient in the expression pattern of time? Because the data is presented with Imp and Syp stained in different brain samples, and without quantification across different stages, this is unclear. The authors use the term 'gradient' but changes in levels of these factors are not evident from the presented data.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time using smFISH. We have also quantified the relative expression of Imp and Syp protein in the NB over time by co-immunostaining. Additionally, we quantified the relative expression of Imp and Syp protein in postmitotic neurons as a function of their birth order in late LL3 larvae. All these data show an opposite temporal gradient of Imp and Syp in the NB and an opposite spatial gradient in immature neurons according to their birth order (Figure 4). How these gradients are established in our system remains to be elucidated.

      Reviewer #3 (Public Review):

      This study by Guan and co-workers focuses on a model neuronal lineage in the developing Drosophila nervous system, revealing interesting aspects about: a) the generation of supernumerary cells, later destined for apoptosis; and, b) new insights into the mechanisms that regulate this process. The two RNA-binding proteins, Imp and Syp, are shown to be expressed in temporally largely complementary patterns, their expression defining early vs later born neurons in this lineage, and thus also regulating the apoptotic elimination. Moreover, neuronal 'fate' transcription factors that are downstream of Imp and signatures of early-born neurons, can also be sufficient to convert later born cells to an earlier 'fate', including survival.

      The authors provide solid evidence for most of their statements, including the temporal windows during which the early and the later-born motoneurons are generated by this model lineage, how this relates to patterns of cell death by apoptosis and that mis-expression of early-born transcription factors in later-born cells can be sufficient to block apoptosis (part of, and perhaps indicative of the late-born identity).

      Other studies have previously outlined analogous, mutually antagonistic roles for Imp and Syp during nervous system development in Drosophila, in different parts and at different stages, with which the working model of this study aligns.

      Overall, this study adds to and extends current working models and evidence on the developmental mechanisms that underlie temporal cell fate decisions.

      I could not summarize the paper better.

      Reviewer #1 (Recommendations For The Authors):

      While this is an interesting topic, I raised two issues in my original review.

      (1) Against the backdrop of numerous previous studies linking many developmental regulators, including tTFs, to programmed cell death in the developing CNS, which in several cases have involved identifying key PCD genes and decoding the molecular regulatory interplay between regulators and PCD genes, this study does not provide any new insight into the regulation of developmental PCD in the CNS.

      The authors have not added any new data to address this shortcoming.

      I agree with the reviewer that we did not attempt to link Imp/Syp with the temporal transcription factor (tTF) cascade or spatial selectors such as Hox genes. However, this decision was intentional as our primary focus was on studying immature MNs. It is worth noting that the decommissioning of NBs by autophagic cell death or terminal differentiation, which is mediated by Imp/Syp in other lineages, has not been correlated with tTFs or spatial selectors. Although we have not directly examined the involvement of the hb + sv > kr > pdm > cas > cas-svp > Grh cascade in the decommissioning of the Lin A neuroblast, our preliminary data indicate that Hb, Sv, Pdm, and Cas are not expressed in the Lin A NB, while Grh is consistently expressed in the NB (Wenyue et al., 2022). Thus, it is unlikely that this particular tTF cascade is implicated in Lin A neuroblast decommissioning. In contrast, spatial selectors, such as the Hox gene Antp, play an opposing role compared to HOX transcription factors in abdominal NBs. In the Lin A lineage, Antp promotes survival (Baek, Enriquez, & Mann, 2013). Here, to avoid repeating what has already been described in the literature, we focused on the role of Imp/Syp in postmitotic neurons and revealed that the precise elimination of MNs is linked to the control of TFs expressed in the MNs.

      (2) I raised the issue that it is unclear if Imp/Syp acts in the NB, and/or in IMC/GMC, and/or in the daughter cells generated from these.

      I agree with the reviewer's concern regarding the unclear function of Imp/Syp, i.e., whether it acts in the NB, IMC/GMC, or daughter cells. To address this, one possible approach would be to attempt rescuing Imp and Syp mutants by transgenic expression in specific cell types, such as NBs, IMC/GMC, or GB/daughter cells. However, we have not conducted such experiments as we were skeptical about the outcome. Previously published work has used drivers expressed in NBs, IMC/GMC, or postmitotic neurons to decipher the function of a gene in a specific cell type, but the results of these experiments must be interpreted with caution. Using NB/GMC drivers to study gene function can lead to effects not only in the NB but also in its progeny, including GMCs or postmitotic neurons, due to the perdurance and stability of the Gal4 and UAS-gene expression system. For instance, dpn-Gal4 UAS-GFP not only labels the NB but also many of its progeny, even though Dpn is only expressed in NBs. Likewise, elav-Gal4 is expressed in the NB and GMCs.

      However, our overexpression of Imp in immature neurons using VGlut demonstrates that Imp promotes cell survival through an autonomous function in these neurons. This driver is only expressed in postmitotic neurons (elav+) and not in the NB, IMC/GMC, or in the hemilineage eliminated by cell death (elav- VGlut-).

      Reviewer #2 (Recommendations For The Authors):

      Oddly knockdown of Imp in the neuroblast (Fig. 5D) only led to death at 8h APF, when Imp is no longer expressed. Do the authors have an explanation as to how the stem cell can survive until this point? A discussion would be helpful.

      The simple explanation is the efficiency of RNAi. The imp-/- MARCM clones (Guan et al., 2022) lead to a stronger reduction of MNs in LL3.

      A simple experiment I would recommend is to repeat the antibody stainings of staged larvae/pupae (Fig. 4) with the anti-Imp/Syp antibodies in the same brain sample, and perhaps to quantify the ratio in the NB. Given that the species in which the antibodies were raised seem compatible, this should be feasible. As it stands now, there is no indication of whether the ratio of Imp vs Syp changes over time.

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time and quantified the relative expression of Imp and Syp proteins in postmitotic neurons as a function of their birth order in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor errors/suggestions:

      Fig 4. Time legend at the top goes A, B, C, E, F (no D). So it doesn't match the panels below

      Yes, we have made the corrections.

      Sentence repeated in Intro:

      The process of terminating NB neurogenesis through autophagic cell death or terminal differentiation is commonly referred to as decommissioning.

      Yes, corrections have been made.

      In Figure 1 they say 'Type Ib' and in Figure 2 they say 'Type 1b'.

      We have changed it to type 1b.

      In Fig. 2A, it is hard to see the lack of Elav, and in Fig. 2G, it is hard to see the presence of Dcp1. The panels could be adjusted to emphasize these results.

      We have increased the size of the panels and made two separate panels where only the elav and Dcp1 signals are present.

      Observations that the result is equivalent in all thoracic segments is expected, since all legs need the same number of neurons. This is nice to have but can be in the supplement.

      Overall, the number of figures seems excessive, especially considering that many of the results included (particularly the NB results) are findings consistent with previous papers and some are characterization of the system that does not fit well with the main focus regarding Imp/Syp (i.e., death of one hemilineage).

      Figure 5 and 6 can be joined as one.

      We have combined Figures 5 and 6, showing only the T1 segments.

      There is some discrepancy between graphs Fig7F and K: At LL3 the number of neurons is different for the control in 7F and the count in K

      Yes, because the genetic backgrounds are not the same and we are not counting the same type of cells. In 7F, we are counting the elav+ and VGlut+ cells, whereas in Figure 7K, we are counting all the elav+ in Lin A, including those elav+ VGlut-. VGlut expression arrives a bit later after elav+, which is why we have fewer elav+ cells in 7F. In other words, VGlut MARCM clones do not label all Lin A elav+ cells. I have clarified this in the figure.

      Reviewer #3 (Recommendations For The Authors):

      Main comment: on the notion of Imp and Syp gradients:

      p. 5, related to figure 4 - there are clearly distinct windows for predominantly (if not exclusively) Imp, and later, Syp expression in lineage 15, with a phase of co-expression.

      However, based on the data shown, it is unclear whether these windows represent gradients, as repeatedly stated. If the notion of gradients is derived from other studies, on other lineages, then this would be good to clarify. Alternatively, the idea of temporally opposing gradients of Imp and Syp would need to be demonstrated for this lineage.

      For example, a more accurate way to describe this study's data is given on p.7 "In conclusion, our findings demonstrate that the opposite expression pattern of Imp and Syp in postmitotic neurons precisely shapes the size of Lin A/15 lineage by controlling the pattern of PCD in immature MNs (Fig. 8)."

      We have now quantified the transcriptional activity of Imp and Syp in the NB over time. We have also quantified the relative expression of Imp and Syp proteins in the NB over time, as well as in postmitotic neurons as a function of their birth order in late LL3 larvae. How these gradients are established in our system still remains to be identified.

      Minor points:

      p.6, related to figure 7: Are numbers of EDU- early born and EDU+, late born, MNs expressed as means in the main text? As written, it suggests absence of any variability, which one would expect and which is shown in Fig.7 data.

      Yes, we have added averages in the text.

      Methods: the author name 'Lacin' has been mis-spelled

      Sorry about that, it's been corrected.

    1. eLife assessment

      This study by Nandy and colleagues examined relationships between behavioral state, neural activity, and trial-by-trial variability in the ability to detect weak visual stimuli. They present useful findings indicating that certain changes in arousal and eye-position stability, along with patterns of synchrony in the activity of neurons in different layers of cortical area V4, can show modest correspondences to changes in the ability to correctly detect a stimulus. At present, however, the findings are based on data and analyses that are somewhat incomplete but could be improved with further revisions.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Nandy and colleagues examine neural, physiological and behavioral correlates of perceptual variability in monkeys performing a visual change detection task. They used a laminar probe to record from area V4 while two macaque monkeys detected a small change in stimulus orientation that occurred at a random time in one of two locations, focusing their analysis on stimulus conditions where the animal was equally likely to detect (hit) or not-detect (miss) a briefly presented orientation change (target). They discovered two behavioral and physiological measures that are significantly different between hit and miss trials - pupil size tends to be slightly larger on hits vs. misses, and monkeys are more likely to miss the target on trials in which they made a microsaccade shortly before target onset. They also examined multiple measures of neural activity across the cortical layers and found some measures that are significantly different between hits and misses.

      Strengths:

      Overall the study is well executed and the analyses are appropriate (though several issues still need to be addressed as discussed in Specific Comments).

      Weaknesses:

      My main concern with this study is that, with the exception of the pre-target microsaccades, the correlates of perceptual variability (differences between hits and misses) appear to be weak, potentially unreliable and disconnected. The GLM analysis of predictive power of trial outcome based on the behavioral and neural measures is only discussed at the end of the paper. This analysis shows that some of the measures have no significant predictive power, while others cannot be examined using the GLM analysis because these measures cannot be estimated in single trials. Given these weak and disconnected effects, my overall sense is that the current results provide limited advance to our understanding of the neural basis of perceptual variability.

    3. Reviewer #2 (Public Review):

      Strengths:

      The experiments were well-designed and executed with meticulous control. The analyses of both behavioural and electrophysiological data align with the standards in the field.

      Weaknesses:

      Many of the findings appear to be subtle differences and incremental compared to previous literature, including the authors' own work. While incremental findings are not necessarily a problem, the manuscript lacks clear statements about the extent to which the dataset, analysis, and findings overlap with the authors' prior research. For example, one of the main findings, which suggests that V4 neurons exhibit larger visual responses in hit trials (as shown in Fig. 3), appears to have been previously reported in their 2017 paper.

      Furthermore, the manuscript does not explore potentially interesting aspects of the dataset. For instance, the authors could have investigated instances where monkeys made 'false' reports, such as executing saccades towards visual stimuli when no orientation change occurred, which would allow for a broader analysis that considers the perceptual component of neural activity over pure sensory responses. Overall, the manuscript lacks broad interest in its current form.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):  

      Summary: 

      In this study, Nandy and colleagues examine neural and behavioral correlates of perceptual variability in monkeys performing a visual change detection task. They used a laminar probe to record from area V4 while two macaque monkeys detected a small change in stimulus orientation that occurred at a random time in one of two locations, focusing their analysis on stimulus conditions where the animal was equally likely to detect (hit) or not-detect (miss) a briefly presented orientation change (target). They discovered two behavioral measures that are significantly different between hit and miss trials - pupil size tends to be slightly larger on hits vs. misses, and monkeys are more likely to miss the target on trials in which they made a microsaccade shortly before target onset. They also examined multiple measures of neural activity across the cortical layers and found some measures that are significantly different between hits and misses. 

      Strengths: 

      Overall the study is well executed and the analyses are appropriate (though multiple issues do need to be addressed). 

      We thank the reviewer for their enthusiasm and their constructive comments which we address below.

      Weaknesses: 

      My main concern with this study is that with the exception of the pre-target microsaccades, the physiological and behavioral correlates of perceptual variability (differences between hits and misses) appear to be very weak and disconnected. Some of these measures rely on complex analyses that are not hypothesis-driven and where statistical significance is difficult to assess. The more intuitive analysis of the predictive power of trial outcomes based on the behavioral and neural measures is only discussed at the end of the paper. This analysis shows that some of the significant measures have no predictive power, while others cannot be examined using the predictive power analysis because these measures cannot be estimated in single trials. Given these weak and disconnected effects, my overall sense is that the current results do not significantly advance our understanding of the neural basis of perceptual variability. 

      Reviewer #1 (Recommendations For The Authors): 

      (1) Most of the effects are very small. For example, the difference in pupil size between hits and misses is ~0.08 z-score units. The differences in firing rates between hits and misses are on the order of 1-2% of normalized firing rates. While these effects may be significant, their contribution to perceptual variability could be negligible, as suggested by the analysis of predictive power at the end of the results section. On a related note, it would be useful to mention the analysis of predictive power earlier in the paper. The finding that some of the measures do not have significant predictive power with respect to behavioral outcome raises questions regarding their importance. Finally, it would strengthen the paper if the authors could come up with methods to assess the predictive power of the PPC and interlaminar SSC. Without such analyses, it is difficult to assess the importance of these measures.

      We expect that relatively small differences in early to intermediate sensory areas could cumulatively result in large differences in higher areas and contribute to the binary distinction between hits and misses. We certainly do not claim that these results completely explain state-dependent differences that determine the outcome of these trials. Instead, we have focused on neural signatures at the level of the V4 columnar microcircuit that might ultimately contribute to the variability in perception.

      We would like to emphasize that, based on the reviewer’s recommendation, we have now analyzed our results separately for each animal (see below). The consistency and significance of our findings across both animals give us confidence that what we have reported here are important neural signatures underlying perceptual variability at threshold.

      We would also like to note that SSC and PPC are now part of the standard toolkit of systems neuroscience and have been employed in numerous studies to our knowledge. While all measures come with their set of caveats and limitations, these two measures provide a frequency-resolved metric of the relationship between two temporal processes (point or continuous), which we believe provide insights into the interlaminar flow of information that we report here.

      Unfortunately, limitations in the GLM method and the reliability of these analyses with limited data make it impossible for these two measures to be included. The GLM requires all variables to be defined for each trial in the input. SSC and PPC can be undefined at low firing rates and require a substantial amount of data to be reliably calculated. While we did consider imputing data or estimating SSC and PPC using multiple trials, we ultimately did not pursue this idea as the purpose of the GLM was to use simultaneous measurements from single trials. 
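
      To illustrate why a measure such as PPC cannot be defined on a single trial with few spikes, the following is a minimal sketch of pairwise phase consistency as it is commonly defined (the mean cosine of the phase difference over all pairs of spike phases). It is not the authors' code; the function name and the example phases are assumptions made for illustration only.

```python
import cmath
import math


def pairwise_phase_consistency(phases):
    """Mean cosine of the phase difference over all pairs of spike phases.

    Uses the identity sum_{i != j} cos(t_i - t_j) = |sum_k exp(i t_k)|^2 - N,
    so the average over the N(N-1) ordered pairs needs one pass over the data.
    Returns NaN when fewer than two spikes are available, because no pair
    exists and the measure is undefined.
    """
    n = len(phases)
    if n < 2:
        return float("nan")
    resultant = sum(cmath.exp(1j * theta) for theta in phases)
    return (abs(resultant) ** 2 - n) / (n * (n - 1))


# Hypothetical spike phases (radians) relative to an LFP oscillation.
locked = [0.7] * 5                                     # perfectly phase-locked
spread = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # evenly spread phases
single = [0.3]                                         # one spike only

ppc_locked = pairwise_phase_consistency(locked)
ppc_spread = pairwise_phase_consistency(spread)
ppc_single = pairwise_phase_consistency(single)
```

      With a single spike the function returns NaN, which is the situation described above: a trial-wise GLM input cannot be built from a quantity that is undefined on low-spike-count trials.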

      (2) What is the actual predictive power of the GLM model (i.e., what is the accuracy of predicting whether a given held-out trial will lead to a hit or a miss)? How much of this predictive power is accounted for by the effect of microsaccades? 

      As the GLM is not a decoder, it does not classify whether a given left-out trial will be a hit or a miss. However, the GLM was highly predictive compared to a constant model. This information has been added to Table 3. The deviance of the GLM with and without microsaccades as a variable was not significantly different (p > 0.9).
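
      The comparison against a constant model can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis: it assumes a logistic GLM with a single hypothetical predictor, and the simulated trials, slope, and function names are all invented for the example. The drop in deviance relative to the intercept-only (constant) model is the quantity one would then compare against a chi-square distribution.

```python
import math
import random


def fit_logistic(x, y, n_iter=25):
    """Newton's method for a one-predictor logistic GLM.

    Returns (intercept, slope). The 2x2 Hessian is inverted by hand.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def deviance(y, probs):
    """Binomial deviance: -2 times the Bernoulli log-likelihood."""
    return -2.0 * sum(
        yi * math.log(p) + (1 - yi) * math.log(1.0 - p) for yi, p in zip(y, probs)
    )


# Synthetic trials: one hypothetical predictor and a hit (1) / miss (0) outcome.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(400)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-1.5 * xi)) else 0 for xi in x]

b0, b1 = fit_logistic(x, y)
p_full = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
p_const = [sum(y) / len(y)] * len(y)  # constant model: overall hit rate

dev_full = deviance(y, p_full)
dev_const = deviance(y, p_const)
deviance_drop = dev_const - dev_full  # larger drop = more predictive model
```

      The same pattern, fitting the model with and without one variable and comparing deviances, is what a statement such as "not significantly different with and without microsaccades" refers to.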

      (3) The role of stimulus contrast is not explained clearly. Are all the analyses and figures restricted to a single contrast level? Was the contrast the same on both sides? If multiple contrasts are used, could contrast account for some of the observed neural-behavioral covariations? 

      All of the analyses include stimuli of all tested contrast levels. Stimulus contrasts were the same at both locations (attended and unattended). We have added a more detailed description of the contrast in hit and miss trials (Lines 289-296), reproduced here:

      “Non-target stimulus contrasts were slightly different between hits and misses (mean: 33.1% in hits, 34.0% in misses, permutation test, 𝑝 = 0.02), but the contrast of the target was higher in hits compared to misses (mean: 38.7% in hits, 27.7% in misses, permutation test, 𝑝 = 1.6e-31). Firing rates were normalized by contrast in Figure 3. In all other figures, we considered only non-target stimuli, which had very minor differences in contrast (<1%) across hits and misses. While we cannot completely rule out any other effects of stimulus contrast, the normalization in Figure 3 and minor differences for non-target stimuli should minimize them.”
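
      For readers unfamiliar with the permutation test quoted above, a minimal sketch of a two-sided test on the difference in group means follows. It is a generic illustration, not the authors' implementation; the function name, the number of permutations, and the contrast values are all made up for the example.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference in means.

    Group labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose difference is at least as large as the observed one
    (with the usual +1 correction so that p is never exactly zero).
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)


# Made-up target contrasts (%) for illustration only.
hits = [38.0 + 0.1 * i for i in range(20)]
misses = [27.0 + 0.1 * i for i in range(20)]

p_separated = permutation_test(hits, misses)
p_identical = permutation_test([1, 2, 3], [1, 2, 3], n_perm=100)
```

      Because the smallest reportable p-value is 1 / (n_perm + 1), a value as small as the one quoted for target contrast implies either a very large number of permutations or a parametric approximation of the tail; the source does not say which.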

      (4) Do the animals make false alarms (i.e., report seeing a target in non-target epochs)?

      If not, then it is not clear that the animals are performing near their perceptual threshold. If the false-alarm rate is non-zero, it should be reported and analyzed for neural/behavioral correlates. Does the logistic regression fit allow for a false alarm rate? More generally, it would be useful to see a summary of behavioral performance, such as distribution of thresholds, lower and upper asymptotes, and detection rates on foil trials vs. matched target trials. 

      The logistic regression does allow for a false alarm rate. We have reported additional behavioral parameters in Figure 1-figure supplement 3A-G.  

      (5) As far as I can tell, all the analyses in the paper are done on data combined across the two animals. Given that these effects are weak and that the analyses are complex, it is important to demonstrate for each analysis/figure that the results hold for each animal separately before combining the data across animals. This can be done in supplementary figures. 

      We have updated the paper to include all main results plotted separately for each animal as supplementary figures. 

      - Figure 2-figure supplement 2

      - Figure 3-figure supplement 1

      - Figure 3-figure supplement 2

      - Figure 4-figure supplement 1

      - Figure 5-figure supplement 2

      - Figure 7-figure supplement 1

      All the results except for the canonical correlation analysis were present, consistent, and significant when we analyzed them in each monkey independently.

      (6) The selection of the temporal interval used for the various analyses appears somewhat post hoc and is not explained clearly. Some analyses are restricted to the period immediately before or during target onset (e.g., 400 ms before target onset for analysis of the effect of microsaccade, 60 ms before stimulus onset for the analysis of the effect of neural variability). Other analyses are done on non-target rather than target stimuli. What is the justification for selecting these particular periods for these analyses? The differences in firing rates between hits and misses are restricted to the target epoch and are not present in the non-target epochs. Given these results, it seems important to compare the effects in target and non-target epochs in other analyses as well.

      Restricting the analysis of the Fano Factor to 60 ms before non-target onset seems odd. Given that the duration of the interval between stimulus presentations is random, how could this pre-stimulus effect be time-locked to target onset? 

      We selected a 200 ms time window during the pre-stimulus or stimulus-evoked period for almost all our analyses. The results relating to microsaccade occurrence were robust to narrower time windows more consistent with the other pre-stimulus windows we used, but we chose to use the 400 ms window to capture a larger fraction of trials with microsaccades.

      Only the Fano factor time window was selected post hoc based on the traces in Figure 4A, and the result is robust across animals (new Figure 4-figure supplement 1). The inter-stimulus intervals are random, and we do not believe the neural variability is time-locked to upcoming stimuli, but rather that lower variability in this pre-stimulus window is characteristic of hits.
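
      As a reference for the quantity being discussed, a minimal sketch of a windowed Fano factor (variance of spike counts across trials divided by their mean) is given below. It is not the authors' code; the function name, the spike times, and the choice of seconds relative to stimulus onset are assumptions made for illustration.

```python
from statistics import mean, variance


def fano_factor(spike_times_per_trial, window):
    """Variance-to-mean ratio of spike counts across trials in a time window.

    spike_times_per_trial: list of lists of spike times, one list per trial,
        in seconds relative to stimulus onset (assumed convention).
    window: (start, stop) in the same units; start is inclusive, stop exclusive.
    Returns NaN if fewer than two trials are given or no spikes fall in the window.
    """
    start, stop = window
    counts = [sum(start <= t < stop for t in trial) for trial in spike_times_per_trial]
    if len(counts) < 2 or mean(counts) == 0:
        return float("nan")
    return variance(counts) / mean(counts)  # sample variance (n - 1 denominator)


# Hypothetical trials; the window covers the 60 ms before stimulus onset.
trials = [
    [-0.100, -0.050, 0.010],          # 1 spike in window
    [-0.040, -0.010, 0.020],          # 2 spikes in window
    [-0.055, -0.030, -0.002, 0.015],  # 3 spikes in window
]
ff = fano_factor(trials, (-0.060, 0.0))
```

      Counts of 1, 2, and 3 give a mean of 2 and a sample variance of 1, so the ratio is 0.5. Comparing this ratio between hit and miss trials in the same window is the kind of contrast described in the response above.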

      We believe that the consistency of our results across both animals provides further evidence that our time window selection was appropriate. 

      We are interested in the extent to which these effects would remain consistent when applied only to target stimuli. However, restricting our analyses to only target stimuli substantially reduces the amount of neural data available for analysis. We plan to explore target stimulus representation more thoroughly in future studies.   

      (7) Can the measured neural response be used to discriminate between target and non-target stimuli? If so, is the discriminability between target and non-target higher in hits vs. misses?

      Thank you for raising this interesting point. We performed this analysis and find that target stimuli are more discriminable from non-targets in hits compared to misses. This has been added as a new Figure 3A.  

      (8) How many trials were performed per session? Did miss probability tend to increase over time over the session? If so, could this slow change in hit probability account for some of the observed neural and behavioral correlations with perceptual decisions? 

      Monkeys initiated a median of 905 trials (range of 651 to 1086). This has been added to the manuscript (Line 106). Approximately 1/8 of those trials were at perceptual threshold. Hit probability at threshold does not change substantially over the course of the session. We now report this in new Figure 1-figure supplement 3I (error bars show standard deviation).

      (9) Did miss probability depend on the time of the change within the trial? If so, do any of the behavioral/neural metrics share a similar within-trial time course? 

      Change times were not significantly different across hit and miss trials (p=0.15, Wilcoxon rank sum test). We now report this in new Figure 1-figure supplement 3H.

      (10) "Deep layer neurons exhibit reduced low-frequency phase-locking in hit trials than in misses (Figure 5B), suggesting an improvement in pooled signal-to-noise among this neural population." - why does this metric suggest improved SNR? Is there any evidence for improved SNR in the data? Why just in deep layers? 

      Thank you for raising this question. We agree this statement is not fully supported by the data and have removed it.  

      (11) I may have missed this but what were the sizes of the Gabor stimuli? 

      This has been added to the methods section (Line 454). The Gaussian halfwidth was 2 degrees.  

      Reviewer #2 (Public Review):  

      In this manuscript, the authors conducted a study in which they measured eye movements, pupil diameter, and neural activity in V4 in monkeys engaged in a visual attention task. The task required the monkeys to report changes in the orientation of Gabor visual stimuli. The authors manipulated the difficulty of the trials by varying the degree of orientation change and focused their analysis on trials of intermediate difficulty where the monkeys' hit rate was approximately 50%. Their key findings include the following: 1) Hit trials were preceded by larger pupil diameter, reflecting higher arousal, and by more stable eye positions; 2) V4 neurons exhibit larger visual responses in hit trials; 3) Superficial and deep layers exhibited greater coherence in hit trials during both the pre-target stimulus period and the non-target stimulus presentation period. These findings have useful implications for the field, and the experiments and analyses presented in this manuscript validly support the authors' claims.

      Strengths: 

      The experiments were well-designed and executed with meticulous control. The analyses of both behavioural and electrophysiological data align with the standards in the field. 

      We thank the reviewer for their enthusiasm about our study and their constructive comments which we address below.

      Weaknesses: 

      Many of the findings appear to be incremental compared to previous literature, including the authors' own work. While incremental findings are not necessarily a problem, the manuscript lacks clear statements about the extent to which the dataset, analysis, and findings overlap with the authors' prior research. For example, one of the main findings, which suggests that V4 neurons exhibit larger visual responses in hit trials (as shown in Fig. 3), appears to have been previously reported in their 2017 paper. Additionally, it seems that the entire Fig1-S1 may have been reused from the 2017 paper. These overlaps should have been explicitly acknowledged and correctly referenced. 

      While the raw data used in this paper overlaps entirely with Nandy et al. (2017), all the analyses and findings in this manuscript are new and have not been previously reported. Figure 1-figure supplement 1 is modified and reproduced from that paper only to allow readers to understand the recording methods used to collect the data without needing to go back to the previous paper. We have added an explicit acknowledgment of this to the figure caption.

      Previous studies have demonstrated that attention leads to decorrelation in V4 population activity. The authors should have discussed how and why the high coherence across layers observed in the current study can coexist with this decorrelation. 

      We have updated the discussion section (Lines 347-351) to further elaborate on this interpretation. 

      Furthermore, the manuscript does not explore potentially interesting aspects of the dataset. For instance, the authors could have investigated instances where monkeys made 'false' reports, such as executing saccades towards visual stimuli when no orientation change occurred. It would be valuable to provide the fraction of the monkeys' responses in a session, including false reports and correct rejections in catch trials, to allow for a broader analysis that considers the perceptual component of neural activity over pure sensory responses. 

      We appreciate this feedback. While we agree these are interesting directions, we decided to limit the scope of this study to only focus on trials at threshold with an orientation change, and are considering these directions for future studies. 

      Reviewer #2 (Recommendations For The Authors): 

      • Figure Design: Since eLife does not impose space limitations, it is advisable for the authors to avoid using very small font sizes. Consistency in font size throughout the figures is recommended. Some figures are challenging to discern, for example, the mean+-sem in Fig. 2B, and the alpha values of green and purple colours for superficial/deep layers are too high, making them too transparent or pale. 

      We have increased the size of some small fonts and improved font size consistency throughout the figures. We have changed the layer colors to improve legibility. 

      • Line 119: trail, 

      This has been fixed.

    1. eLife assessment

      This paper proposes a valuable new method for the assessment of the mean kurtosis for diffusional kurtosis imaging by utilizing a recently introduced sub-diffusion model. The evidence supporting the claims that this technique is robust and accurate in brain imaging is solid; however, there is a need to include a summary of the clear limitations.

    2. Reviewer #1 (Public Review):

      This study introduces an innovative method for assessing the mean kurtosis, utilizing the mathematical foundation of the sub-diffusion framework. In particular, a new fitting technique that incorporates two different diffusion times is proposed to estimate the parameters of the sub-diffusion model. The evaluation of this technique, which generates kurtosis maps based on the sub-diffusion framework, is conducted through simulations and the examination of data obtained from human subjects.

      The authors have revised the manuscript to address the initial critiques. However, there appears to be some confusion regarding the following responses.

      "The comment "... using the new sub-diffusion model -an approximation of the DKI-based signal expression..." is a bit misleading. In fact we propose that the reverse interpretation is the more suitable way to view the relationship: the DKI model is a degree-2 approximation of the sub-diffusion model, as in eq. (7).
      We appreciate the suggestion. However, unfortunately, it is not appropriate to generate data with the DKI model, as the maximum b-value is limited to 2000~3000s/mm^2 and hence the DKI model cannot represent diffusion MRI signals from a full spectrum of b-values. A key strength of our proposed model is that it removes this limitation. "

      The main motivation of this study is to investigate the feasibility of the sub-diffusion model, which was proposed in Yang et al., NeuroImage 2022, to provide fast and robust estimation of kurtosis model parameters. I understand that mathematically, the DKI model can be written as a degree two approximation of the sub-diffusion model. However, the hypothesis is that the proposed sub-diffusion model can be used to obtain practically useful mapping of mean kurtosis. Therefore, unless the authors use a different parameter or phenomenon as the "true" or "ground-truth kurtosis," this study examines whether the sub-diffusion model parameters can serve as an approximation to the conventional DKI parameters.

      With the current simulation study design, 1) the data is generated by the proposed sub-diffusion model, 2) the "ground-truth" or "true" D* and K* are computed based on the proposed equality (Eq.7); 3) and then the data is fit with the conventional DKI model and also with the proposed sub-diffusion model. Since the data is generated by the proposed model, and the ground truth (or true) values calculated by the proposed equality, as expected, the fitted kurtosis values by the sub-diffusion model match better with the simulated ones compared to the conventional DKI model.

      Furthermore, as the authors noted, the sub-diffusion model eliminates the restriction on b-value selection, allowing for DWI data acquisition with higher b-values. However, it is unclear how the new K* and D* values, calculated directly from the sub-diffusion model using a higher b-value DWI protocol, are superior to the K and D values from the conventional DKI model, which uses a DWI protocol limited to b-values of 2000-3000 s/mm². In clinical practice, b-values of 2000-3000 s/mm² are generally considered "high b-value."

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present an interesting technique for analysis of diffusion magnetic resonance images (dMRI) using a sub-diffusion model of the diffusion process. They show that the results of their technique when fitted to dMRI with two diffusion times provide robust diffusion coefficient and kurtosis measures.

      Strengths:

      The measures provided by the sub-diffusion technique are robust and can be reliably estimated from short dMRI data acquisitions. This is potentially useful in application to clinical studies.

      Weaknesses:

      The authors do not fully demonstrate that their D* and K* measures are not affected by diffusion time. Potential limitations of the technique are not considered.

      This reviewer suggests that the paper would benefit from considerations of the limitations of the applied techniques. This would include consideration of:
      (i) The use of the sub-diffusion model in the simulation studies - there are circular arguments that should be considered.
      (ii) The time dependence of D* and K*. This is because the human data provided in Tables 3 and 4 (for Δ=19ms and Δ=49ms) seem to show that D* and K* are time dependent.

      With respect to the second point this reviewer acknowledges the authors' argument that when the fitting is performed over the higher dimensional space that includes multiple diffusion times then this leads to a more robust estimation of sub-diffusion measures. However, the authors only include two diffusion times in their in-vivo human analysis (Δ=19ms and Δ=49ms) so it is not possible for them to show here that different pairs of diffusion times lead to invariant D* and K* values. This is a limitation of the study as the authors show there is time dependence of D* and K* in tables 3 and 4 (when the model is fitted to single diffusion times). Potentially the larger apparent time dependence of K* in white matter compared to grey matter (tables 3 and 4) could lead to the tissue specific differences in root mean squared error shown in Figure 7.

      This reviewer requests that the authors discuss their results more clearly with respect to these potential limitations and include some discussion of their single (and multiple) diffusion time results (for D_SUB and K*) in comparison with the time dependent DKI literature.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper proposes a valuable new method for the assessment of the mean kurtosis for diffusional kurtosis imaging by utilizing a recently introduced sub-diffusion model. The evidence supporting the claims that this technique is robust and accurate in brain imaging is incomplete. The work could be of interest in the research and clinical arena.

      We thank the editors for their assessment and the reviewers for their careful reading and feedback that helped to improve the manuscript. We have addressed all the reviewers’ concerns and would like to request an update of the assessment to reflect the revisions we have made.

      Below, we address the reviewers’ comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study introduces an innovative method for assessing the mean kurtosis, utilizing the mathematical foundation of the sub-diffusion framework. In particular, a new fitting technique that incorporates two different diffusion times is proposed to estimate the parameters of the sub-diffusion model. The evaluation of this technique, which generates kurtosis maps based on the sub-diffusion framework, is conducted through simulations and the examination of data obtained from human subjects.

      We thank Reviewer #1 for pointing out the novelty and innovation of our work.

      Strengths:

      The utilization of the sub-diffusion model for tissue characterization is a significant conceptual advancement for the field of diffusion MRI. This study adeptly harnesses this approach for an accurate estimation of the parameters of the widely employed diffusion model, DKI, leveraging their established analytical interconnection as evidenced in prior research. Notably, this approach not only proposes a robust, fast, and accurate technique for DKI parameter estimation but also underscores the viability of deploying the sub-diffusion model for tissue characterization, substantiated by both simulated and human subject analyses. The paper is very well written, well organized, and coherent. The simulation study included different aspects of water diffusion as captured by diffusion-weighted MRI such as varying diffusion times and different b-value subpopulations, resulting in a comprehensive and thorough discussion.

      We thank Reviewer #1 for highlighting the strengths of our work.

      Weaknesses:

      The primary objective of this study is to demonstrate a robust approach for estimating DKI parameters by directly calculating them using the parameters of the sub-diffusion model. This premise, however, relies on the assumption that the sub-diffusion model effectively characterizes the diffusion MRI signal and that its parameters are both robust and accurate. Throughout the manuscript, the term "ground truth kurtosis K" is frequently used to denote the "true K" value in the context of the simulation study. Nonetheless, given that the data is simulated using the new sub-diffusion model - an approximation of the DKI-based signal expression - this value cannot truly be considered the "ground truth K". The simulation study highlights the robustness and accuracy of D* and K*, but it inherently operates under the assumption that the observed data is in the form of the sub-diffusion model.

      It is correct that our study operates under the assumption that the observed data is in the form of the sub-diffusion model, and indeed one of the key outcomes of this work is to demonstrate the effectiveness of that assumption and the new possibilities it brings. Naturally, using any mathematical model at all carries assumptions. Over the past two decades, many mathematical and biophysical models have been proposed to characterise diffusion MRI signals. However, model validation remains an open challenge in the field. In this, as well as in our previous work (Yang et al, NeuroImage, 2022), we have shown that our proposed sub-diffusion model not only provides a much better fitting compared to the traditional DKI method, overcoming the major limitation of the traditional DKI method on the maximum b-value, but also generates brain maps with superior tissue contrast and elucidates previously unseen structure.

      We have replaced the term “ground truth kurtosis K” with “true kurtosis K”.

      The comment “… using the new sub-diffusion model – an approximation of the DKI-based signal expression…” is a bit misleading. In fact we propose that the reverse interpretation is the more suitable way to view the relationship: the DKI model is a degree-2 approximation of the sub-diffusion model, as in eq. (7).
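
      To make the "degree-2 approximation" relationship concrete, the sketch below assumes the sub-diffusion signal takes the Mittag-Leffler form S(b) = E_β(-b·D_SUB), whose series begins 1 - x/Γ(1+β) + x²/Γ(1+2β) with x = b·D_SUB. Matching the first two terms of ln S against the DKI expression ln S = -b·D + (1/6)·b²·D²·K gives D* = D_SUB/Γ(1+β) and K* = 6·Γ(1+β)²/Γ(1+2β) - 3. These expressions are reconstructed here from the series expansion rather than copied from the manuscript, so they should be checked against eqs. (7)-(9) there; the function name is hypothetical.

```python
from math import gamma


def dki_from_subdiffusion(d_sub, beta):
    """Map assumed sub-diffusion parameters (D_SUB, beta) to DKI-style (D*, K*).

    Derived by matching the second-order expansion of ln E_beta(-b * D_SUB)
    to the DKI form -b*D + (1/6)*b^2*D^2*K. This is an assumption-based
    reconstruction, not the authors' code.
    """
    if not 0 < beta <= 1:
        raise ValueError("beta is expected to lie in (0, 1]")
    g1 = gamma(1 + beta)
    d_star = d_sub / g1
    k_star = 6 * g1 ** 2 / gamma(1 + 2 * beta) - 3
    return d_star, k_star


# beta = 1 corresponds to ordinary (Gaussian) diffusion, so K* vanishes.
d_gauss, k_gauss = dki_from_subdiffusion(1e-3, 1.0)

# A smaller beta (more anomalous diffusion) yields a positive kurtosis.
d_anom, k_anom = dki_from_subdiffusion(1e-3, 0.5)
print(k_gauss, k_anom)
```

      Under this mapping K* depends only on β, while D* carries any diffusion-time dependence through D_SUB, which is consistent with the description of a time-dependent D* and a time-invariant K* elsewhere in this exchange.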

      Reviewer #2 (Public Review):

      Summary: The authors present a technique for fitting diffusion magnetic resonance images (dMRI) to a sub-diffusion model of the diffusion process within brain imaging. The authors suggest that their technique provides robust and accurate calculation of diffusional kurtosis imaging parameters from which high quality images can be calculated from short dMRI data acquisitions at two diffusion times.

      Strengths: If the authors can show that the dMRI signal in brain tissue follows a sub-diffusion model decay curve then their technique for accurately and robustly calculating diffusional kurtosis parameters from multiple diffusion times would be of benefit for tissue microstructural imaging in research and clinical arenas.

      In Figure 7, we showed that the diffusion MRI signals follow the sub-diffusion model decay curves.

      Weaknesses: The applied sub-diffusion model has two parameters that are invariant to diffusion time, D_β and β which are used to calculate the diffusional kurtosis measures of a diffusion time dependent D* and a diffusion time invariant K*. However, the authors do not demonstrate that the D_β, β and K* parameters are invariant to diffusion time in brain tissue.

      In our proposed sub-diffusion model, D_β and β are assumed to be time-independent parameters, which is a key strength of the approach. The goal is to characterise tissue-specific properties (D_β for diffusivity and β for the extent of tissue complexity) that do not rely on the diffusion time setting in diffusion MRI experiments. To extract such time-independent properties, we proposed a new sampling and fitting strategy – fitting at least two diffusion time data together.

      The authors' results visually show that there is time dependence of the K* measure (in Figure 6) that is more apparent in white matter with K* values being higher for diffusion times of ∆=49 ms than ∆ = 19 ms. The diffusion time dependence of K* indicates there is also diffusion time dependence of β.

      The discrepancies in the fitted K* for ∆ = 19 ms and ∆ = 49 ms separately do not necessarily imply that there is a true time dependence in these parameters. Rather, this can be explained by a deficiency of data when fitting a two-dimensional surface (S is a function of q and ∆) based on data along a single curve for a fixed value of ∆.  Without properly sampling the surface across two independent coordinates, one cannot expect a fully reliable fit.  Indeed, a great advantage of our proposed method is to allow fitting data with multiple values of ∆, and thereby getting a richer data set with which to fit the full signal surface S(q, ∆).  The results for fitting ∆ = 19 ms and ∆= 49 ms data together clearly show the benefits of this approach, with superior contrast achieved.

      Furthermore, Figure 7 shows that there is a tissue specific root mean squared error in model fitting over the two diffusion times which indicates greater deviation from the model fit in white matter than grey matter.

      Although the errors are not completely tissue-independent, please note the magnitude of the RMSE is very small. The quality of the fitting in both white and grey matter is shown in sub-figures (A)-(H) for several representative voxels.

      To show that the sub-diffusion model is robust and accurate (and consequently that K* is robust and accurate) the authors would have to demonstrate that there is no diffusion time-dependence in both D_β and β in application to brain imaging data for each diffusion time separately. Simulated data should not be used to demonstrate the robustness and accuracy of the sub-diffusion model or to determine optimization of dMRI acquisition parameters without first demonstrating that D_β and β are invariant to diffusion time. This is because simulated signals calculated by using the sub-diffusion characteristic equation of dMRI signal decay will necessarily have diffusion time invariant D_β and β parameters. Without further information demonstrating diffusion time invariance of D_β, β and K* it is not possible to determine whether the authors have achieved their aims or that their results support their conclusions.

      First, as explained above, the dMRI signal S is a function of q and ∆, i.e., a two-dimensional surface S(q, ∆), and hence fitting data sampled from single diffusion time (i.e., one curve on the surface) cannot provide reliable parameters, as seen in the discrepancies in K* in Figure 6 (bottom two rows). Our proposed new sampling and fitting strategy overcomes this issue. That is, to obtain a reliable fitting, one should fit data from at least two diffusion times together (i.e., sampling data from at least two curves on the signal surface).

      Second, to demonstrate that D_β and β are time invariant, one would require data at several diffusion times with high b values. Such data cannot be easily obtained. The data used in this current study is the MGH Connectome 1.0 human brain data, which only contains two diffusion times, ∆ = 19 ms and ∆ = 49 ms.

      Hence, we conducted numerical experiments to demonstrate our idea. In Figure 3, we showed that (i) the variability of the fitted parameters is significantly reduced when moving from fitting single diffusion time data to two diffusion time data, and (ii) the difference in fitting three diffusion times compared to two is very minor, indicating convergence towards the correct time-independent parameter values. The results from fitting human brain data (Figure 6 and Tables 2-4) agree with the expectations from our numerical experiments. Hence, we believe that we have provided sufficient evidence to support our proposed sub-diffusion model and its optimal fitting strategy.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      It is clear that the authors preferred generating the data by using sub-diffusion model's signal expression as it has many benefits, such as allowing different diffusion times to be incorporated, and hence investigation of the effect of the number of diffusion times on the accuracy of the parameter fitting. I recommend adding another simulation study by generating the data with the DKI model expression (as the goal of the study is to provide an accurate mapping of diffusional mean kurtosis), fitting the data to the sub-diffusion model's expression in Eq. (10), and then calculating K* and D* by Eqs. (8) and (9) only for a fixed diffusion time and one b-value subset.

      We appreciate the suggestion. However, unfortunately it is not appropriate to generate data with the DKI model, as the maximum b-value is limited to 2000~3000s/mm^2 and hence the DKI model cannot represent diffusion MRI signals from a full spectrum of b-values. A key strength of our proposed model is that it removes this limitation.

      There is a typo on Page 24, Line 581; "b<=2400" should be b>=2400.

      We have fixed this typo.

      Reviewer #2 (Recommendations For The Authors):

      As the authors state the sub-diffusion model has two parameters, D_β and β that are invariant to diffusion time, and give rise to a time-varying diffusion coefficient in mm^2s^-1 and a time invariant kurtosis. However, there is a need to be clearer and more specific about the implications of the sub-diffusion model. The manuscript would be improved by the authors:

      (a) Defining the time-varying diffusion coefficient that arises from the model, its functional form and properties.

      We refer Reviewer #2 to eq. (5) and eq. (8) for the definition of time-varying diffusion coefficients D* and D_SUB and their relationship.

      (b) Clearly discuss the implications of this with respect to other time-varying diffusion coefficient methods in the current literature.

      We refer Reviewer #2 to the section “Time-dependence of diffusivity and kurtosis” under “Discussions”.

      (c) Demonstrating that D_β and β do not vary with diffusion time when estimated from dMRI acquired on human participants.

      We have addressed this comment in the public review.

      The manuscript would benefit from increases in clarity in all sections and the authors identifying typographical errors.

      We have updated the relevant text in the revised manuscript to make it clearer, including fixing typos.

      Specific improvements to clarity in the methods and results section would include:

      Line 620: Why were parameter approximations for model fitting to simulated data restricted to the ranges D_β∈[10^(-4),10^(-3) ] and β∈[0.5,1] but in fitting to brain imaging data the ranges were D_β>0 and 0<β<=1?

      The parameter ranges for model fitting were the same for both the simulated and human data: D_β>0 and 0<β<=1. To generate simulated data, D_β and β ranges were restricted to reflect observations in human brain data. We have updated the text to make this clearer.

      Lines 622, 628 & 629: Which goodness of fit measure was used?

      The goodness of fit measure for all simulated results is the coefficient of determination, or R^2 value, as noted in the “Goodness-of-fit and region-based statistical analysis” section under Methods. We have updated the text to make this clearer.

      Line 666: The method for computation of R^2 within the coefficient of determination should be stated as there are several ways of calculating an R^2 value.

      The formula for computing R^2 has been added to the text.
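
      As the reviewer notes, R^2 can be computed in more than one way. The sketch below shows one common definition, 1 - SS_res/SS_tot, purely as an illustration; it is an assumption that this matches the formula added to the manuscript, and the function name is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: misfit between data and model prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the data around its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1 - ss_res / ss_tot


perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
print(perfect, mean_only)
```

      With this definition a model that predicts only the mean of the data scores zero, and a worse model can score below zero, which is one reason the choice of definition should be stated.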

      Line 685: A t-test is mentioned but it is not clear as to the inputs to this test, or where the results of this analysis are presented.

      We have updated the text to make this clearer. The results of this analysis are presented in Table 5. The entries identified in italic under the optimal b-value heading were found to be significantly different from the benchmark mean K* reported in Table 2.

      Line 696: It is not clear how the intra-class correlation coefficient histograms are computed from six subjects. This applies to results in Figure 10 that require greater clarity in the description.

      The formula for computing the intra-class correlation coefficient has been added to the sub-section “Scan-rescan analysis using intraclass correlation coefficient (ICC)” under “Methods”.

      It would be helpful if the authors primarily report results pertaining to the model parameters D_β and β. This is because D* and K* are calculated from D_β and β. Conditions for robust and accurate estimation of D_β and β will provide robust and accurate measures for D* and K*.

      Two new tables for the model parameters D_β and β have been added. Please see Tables 3 and 4 in the revised manuscript.

      The authors state that fitted model parameters are not affected by maximum b-value (paragraph beginning line 366). This statement is based on their model simulation results. Could the authors provide data to support this based on the application of their model to the human brain imaging data?

      We would like to clarify that our statement is indeed based on human brain imaging. As stated in the paragraph beginning line 366, both results in Table 2 (using full dataset) and Table 5 (using dataset with optimal b-value sampling) are generated from the Connectome human brain data. If maximum b-value dependence is present, benchmark (Table 2) versus optimal region-specific results (Table 5, or previously Table 3) should show some systematic difference.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      The study provides a wealth of interesting observations of behavior and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raises interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth considering and exploring further.

      After the initial reviewers' comments, the authors performed a welcome revision of the way the results are presented. Overall the study has been improved by the revision. However, one piece of new data is perplexing to me. The new Figure 7 presents the results of a model analysis of the strength of the EI caused by a second fish to localize when the focal fish is chirping. From my understanding of this type of model, EOD frequency is not a parameter in the model since it evaluates the strength of the field at a given point in time. Therefore the only thing that matters is the phase relationship and strength of the EOD. Assuming that the second fish's EOD is kept constant and the phases relationship is also the same, the only difference during a chirp that could affect the result of the calculation is the potential decrease in EOD amplitude during the chirp. It is indeed logical that if the focal fish decreased its EOD amplitude the target fish's EOD becomes relatively stronger. Where things are harder to understand is why the different types of chirps (e.g. type 1 vs type 2) lead to the same increase in signal even though they are typically associated with different levels of amplitude modulations. Also, it is hard to imagine that a type 2 chirps that is barely associated with any decrease in EOD amplitude (0-10% maybe), would cause doubling of the EI strength. There might be something I don't understand but the authors should provide a lot more details on how this result is obtained and convince us that it makes sense.

      We thank the Reviewer for the comments and we agree that the approach could have been better detailed. As anticipated by the Reviewer, the Boundary Element Method (BEM) model can be used simply to calculate the electric field and electric image at a specific point in time (instantaneously), regardless of EOD frequency. However, our model allows for the concatenation of consecutive instants and thus is able to render an entire sequence of electric fields - and resulting electric images - incorporating realistic EOD characteristics such as shape, duration, and frequencies (see Pedraja et al., 2014).

      Chirp-triggered EIs were modeled using real chirps produced by interacting fish. Each chirp was thus associated with its duration and peak parameters, as well as the fish positional information (distance and angle).

      However, since we did not know the beat phase at which chirps were produced, we computed electric images for each fish position and chirp scenario by simulating various phases (4 equally spaced values). These are phases of the sender EOD and simply refer to the initial offset between the two interacting EODs. Since our simulations were run over a time window of 500 msec, all phases are likely to be covered, with a different temporal order relative to the chirp (always centered within the 500 msec window).

      The simulation was run maintaining consistent timing for both chirp and non-chirp conditions, across approximately 800 body nodes. At each node, the current flow was calculated from the peak-to-peak of the EOD sum (i.e. the point-to-point difference between the positive and negative beat envelopes). Analyzing the EIs over this fixed time window enables us to assess the unitary changes of current flow induced by chirps over units of time (ΔI/Δt). From this, we can calculate a cumulative sum of current flow changes, expressed as delta(EI), and use it to show the effect of the chirps on the spatiotemporal EI (Figure 7C).

      One can express this cumulative change mapped onto the fish body (keeping the 800 points separated, as in Figure 7C) or further sum the current changes to obtain a single total (as shown in Figure 7D).

      One can check this with a rough estimate: summing over, for example, 500 of the 800 points (judging from the size of the blue areas in C, not all 800 points show a detectable change), each valued at 0.1 to 0.3 mA/s, gives circa 100 mA/s, which is what is shown in D. (is this what is happening ?)
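
      The rough estimate above can be written out explicitly. The following is only a back-of-envelope sketch using the figures quoted in this reply (about 500 changing nodes, 0.1 to 0.3 mA/s each); the function name is hypothetical and this is not the BEM model itself.

```python
# Back-of-envelope check of the summed delta(EI) reported in Figure 7D.
# Inputs are the approximate values quoted in the reply, not model output.

def summed_change(n_nodes: int, per_node_change: float) -> float:
    """Total change (mA/s) if n_nodes each contribute per_node_change (mA/s)."""
    return n_nodes * per_node_change


low = summed_change(500, 0.1)   # lower bound of the per-node range
mid = summed_change(500, 0.2)   # midpoint of the per-node range
high = summed_change(500, 0.3)  # upper bound of the per-node range

print(f"summed change between {low:.0f} and {high:.0f} mA/s, midpoint {mid:.0f} mA/s")
```

      Under these assumptions the total falls in a range centered on roughly 100 mA/s, consistent with the order of magnitude quoted for panel D.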

      We do not know why chirps of different types triggered similar effects. It is possible that, since EI measurements are pooled over several chirps produced at different angles and distances, when only a small number of chirps is available for a given type (as for rises, where it is very low), these measurements may fail to reveal more marked differences among types. In a publication we are currently working on, we are considering a larger dataset to better assess these results.

      The methods section has been edited to clarify the approach (not yet).

      Reviewer #2 (Public Review):

      Studying Apteronotus leptorhynchus (the weakly electric brown ghost knifefish), the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. Chirping is a behavior that has been well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation.

      Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that could have a great impact on the field.

      We thank the Reviewer for the extensive and constructive comments. We would like to add that, while it is true that many detailed studies have been published on the anatomy and physiology of the circuits implicated in the production and modulation of “electric chirps”, most of this research assumed, and focused exclusively on, their possible role in communication. In addition, most behavioral studies did the same, and a meta-analysis of the existing literature on chirping allows one to trace the communication idea back mainly to two studies: Hagedorn and Heiligenberg, 1985 (“Court and spark: electric signals in the courtship and mating of gymnotoid fish”) and Hopkins, 1974 (“Electric Communication: Functions in the Social Behavior of Eigenmannia Virescens”), among the main sources. Importantly, in these studies only contextual observations have been made (no playback experiment or other attempts to analyze more quantitatively the correlation of chirping with other behaviors).

      The authors do provide convincing evidence that chirps may function in homeoactive sensing. However, their evidence arguing against a role for chirps in communication is not as strong, and fails to sufficiently consider the evidence from a large body of existing research. Ultimately, the manuscript presents very interesting data that is sure to stimulate discussion and follow-up studies, but it suffers from dismissing evidence in support of, or consistent with, a communicative function for chirps.

      Although the tone of some statements present in our earlier draft may suggest otherwise, through our revisions we have made an effort to clarify that we do not intend to dismiss a function of chirps in communication. We only intend to debate and discuss a valid alternative hypothesis, advanced from reasonable considerations.

      Before writing this manuscript, we attempted to survey all the existing literature on chirps (including studies focused on behavior, peripheral sensory physiology as well as brain physiology). Although it is not unlikely that some studies have eluded our attention, an effort for a comprehensive review was made. Based on this survey we realized that none of the studies provided a clear and unambiguous piece of evidence to support the communication hypothesis (we refer here to the weak points highlighted in the discussion and mentioned in the previous comment), which in fact does not come without its weak points and contradictions (see later comments).

      What follows is a summary of the mentions made of the communication theory in the different sections of the manuscript, including several edits we have applied in response to the Reviewer’s concern:

      In the abstract we clearly state that we are considering an alternative that is only hypothetically complementary, not certainly so. Nonetheless, we have identified a couple of instances that could sound dismissive of the “communication hypothesis” in the following section.

      In the introduction we write in fact about the possibility of interference between communication signals and conspecific electrolocation cues, as they are both detected as beat perturbations. We did not mean to use “Interference” here as “reciprocal canceling”, rather we intended it as “partial or more or less conspicuous overlap” in the responses triggered in electroreceptors.

      Hoping to convey a clearer message, we have edited the related statement and changed it to “both types of information are likely to overlap and interact in highly variable ways”.

      We have also removed the statement: “According to this idea, beats and chirps are not only detected through the same input channel, but also used for the same purpose.” as at this point in the manuscript it may be too strong.

      In the results section we do not include statements that might be seen as dismissive of the communication hypothesis but only statements in support of the “probing with chirps” idea (which is the central hypothesis of the study).

      In the discussion paragraphs we elaborate on why the current functional view is either flawed or incomplete (first paragraph, “existing functional hypotheses”). Namely: 1) multiple triggering factors implied in chirp responses covary and need to be disentangled (e.g., DF/sex); 2) findings on brown ghosts and a few other gymnotiforms have been used to advance the hypothesis of “communication through chirps” in all weakly electric fish (including pulse species); 3) social encounters, in which chirps are recorded, also imply other behaviors (such as probing) which have not been considered so far (this point is related to the first one on covariates); 4) most studies referring to big chirps as courtship chirps were not done in reproductive animals (added now); and 5) no causal evidence has been provided so far to justify a role of chirps in social communication.

      We are discussing these points as challenges to the communication hypothesis, not to dismiss the hypothesis, but rather to motivate future studies addressing these challenges.

      We do not want to appear dismissive of the communication hypothesis and had therefore previously edited the manuscript to avoid the impression of exclusivity of the probing hypothesis. We have now gone over the manuscript once more and edited several sentences. Nevertheless, we want to point out again that - despite the large consensus - the communication hypothesis has, until now, never been investigated with the kind of rigor applied here.

      The authors do acknowledge that chirps could function as both a communication and homeoactive sensing signal, but it seems clear they wish to argue against the former and for the latter, and the evidence is not yet there to support this.

      In both rounds of revision we have made an effort to convey a more inclusive interpretation of our findings. We tried our best to express our ideas as hypothetical, not as proof that communication through chirps does not exist. The aim of this study is to propose an alternative view, and this cannot be done without underlining the weak points of an existing hypothesis while providing and supporting reasonable arguments in favor of the alternative we advance. The actual evidence for a role of chirping in communication is much less strong than appears from the pure number of articles that have discussed chirps in this context.

      Regarding the weak evidence against communication, here we can list a few additional important points related to the proposed interpretations of chirp function (more specific than those made earlier):

      (1) A formally sound assessment of signal value/meaning - as typically done in animal communication studies should involve: 

      a) the isolation of a naturally occurring signal and determination of the context in which it is produced 

      b) the artificial replication of the signal

      c) the observation that such mimic is capable of triggering reliable and stereotyped responses in a group of individuals (identified by sex and/or species) under the same conditions (conditioned, unconditioned, state-dependent, etc.). As discussed for instance in Bradbury and Vehrencamp, 2011; Laidre and Johnstone, 2013; Wyatt, 2015; Rutz et al., 2023.

      This approach has so far not been applied to weakly electric fish. The initial purpose of the present study was in fact to conduct this type of validation.

      (2) The hypothesis of chirps used for DF-sign discrimination - for “social purposes” - although plausible in the face of theoretical considerations,  does not seem to be reasonable in practice, when one considers emission rates of 150 chirps per minute. We do find a strong correlation of chirp type with DF, which is often very abrupt and sudden (as if the fish were tracking beat frequency to guess its value) but the consideration made above on chirp rates seems to discourage this interpretation.

      (3) The hypothesis of chirp-patterning (i.e. chirping may have meaning based on the sequence of chirps of different types, a bit like syllables in birdsongs) - assessed by only one study conducted in our group - has not been sufficiently substantiated by replication. We have surveyed all possible combinations of chirps produced by interacting pairs in different behavioral conditions using different values for chirp sequence size: 2, 3,... ,8 chirps (both considering the sender alone as well as sender+receiver together). In all cases we found no evidence for a context-dependent “modulation” of chirp types (i.e. no specific chirp type sequence in specific contexts).

      (4) The hypothesized role of “large chirps” as courtship signals could be easily criticized by noting the symmetrical distribution of these events around a DF of 0 Hz. One could invoke a failure to discriminate DF sign to explain this well-known pattern. However, we know from Walter Heiligenberg’s work and physiological considerations that such a task can be solved easily through t-units and … in principle even just by motion (which would change the EOD phase in frequency-dependent ways, thus potentially revealing the DF sign).

      Overall, these considerations made us think that chirping certainly occurs in a social context, but it is the meaning of this behavior that remains elusive. We noticed that environmental factors are also strongly involved … we then formulated an alternative hypothesis to explain chirping, but we do so without dismissing the communication idea.

      All this seems to us just a careful way to critically discuss our results and those of other studies, without considering the issue resolved.

      In the introduction, the authors state, "Since both chirps and positional parameters (such as size, orientation or motion) can only be detected as perturbations of the beat, and via the same electroreceptors, the inputs relaying both types of information are inevitably interfering." I disagree with this statement, which seems to be a key assumption. Both of these features certainly modulate the activity of electroreceptors, but that does not mean those modulations are ambiguous as to their source. You do not know whether the two types of modulations can be unambiguously decoded from electroreceptor afferent population activity.

      We thank the Reviewer for noting this imprecision. We have addressed the Reviewer’s concern in another reply (see above).

      My biggest issue with this manuscript is that it is much too strong in dismissing evidence that chirping correlates with context. In your behavioral observations, you found sex differences in chirping as well as differences between freely interacting and physically separated fish. Chirps tended to occur in close proximity to another fish. Your model of chirp variability found that environmental experience, social experience, and beat frequency (DF) are the most important factors explaining chirp variability. Are these not all considered behavioral or social context? Beat frequency (DF) in particular is heavily downplayed as being a part of "context" but it is a crucial part of the context, as it provides information about the identity of the fish you're interacting with. The authors show quite convincingly that the types of chirps produced do not vary with these contexts, but chirp rates do.

      We believe the “perceived claim” may be an issue of unclear writing. We have now tried to better clarify that “context” affects chirp rates, but it does not affect chirp types as much (except when beat frequency is high).  

      We have edited two statements possibly susceptible to misinterpretation: 

      (1) In the results: “It also indicates that chirp parameters such as duration and FM do not seem to be associated with any particular context in a meaningful way, other than being affected by beat frequency.”

      (2) In the discussion: the statement

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context (Figure S2) although the variance of chirp parameters appears to be significantly affected by this factor (Figure 2). This may suggest that the effect of behavioral context is mainly detectable in the number of chirps produced (Figure S1), rather than the type (Figure S2).”

      has been changed to:

      “Recordings from interacting fish pairs confirmed the absence of any significant correlation between chirp type choice and behavioral context, except for those cases characterized by higher beat frequencies  (Figure S2). This suggests that the effect of behavioral context highlighted in our factor analysis (Figure 2) is mainly due to the number of chirps produced (Figure S1), rather than their type (Figure S2).”

      Eventually, in the results we emphasize the relatively higher impact of previously unexplored factors on chirp variance: “The plot of individual chirps (Figure 2C) shows the presence of clustering around different categorical variables and it reveals that experience levels or swimming conditions are important factors affecting chirp distribution (note for instance the large central “breeding” cluster in which fish are divided and the smaller ones in which fish are free). Sender or receiver identity does not individuate any clear clustering relative to either sex (see the overlap of male_s/male_r and female_s/female_r) or social status (dominant/subordinate). Chirps labeled based on tank experience (i.e. resident vs intruder) are instead clearly separated.”.

      Further, in your playback experiments, fish responded differently to small vs. large DFs, males chirped more than females, type 2 chirps became more frequent throughout a playback, and rises tended to occur at the end of a playback. These are all examples of context-dependent behavior.

      We do note that male brown ghosts chirp more than females. But we do also say - and show in figure 8 - that males move more in proximity to and around conspecifics. We do acknowledge that chirp time-course may be different during playbacks in a type-dependent manner. But how this can support the communication hypothesis - or other alternatives - is unclear. This result could equally imply the use of different chirp types for different probing needs. Since we cannot be sure about either, we do not want to put too much emphasis on it. Ultimately, the fact that “context” (here meant broadly to define different experimental situations in which social but also physical and environmental parameters are altered) affects chirping is undeniable: cluttered and non-cluttered environments do represent different contexts which differently affect chirping in conspicuous ways.

      In the results, the authors state, "Overall, the majority of chirps were produced by male subjects, in comparable amounts regardless of environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) or social experience (novel or experienced; Figure S1D)." This is not what is shown in Figure S1. S1A shows clear differences between resident vs. intruder males, S1B shows clear differences between dominant vs. subordinate males, and S1D shows clear differences between naïve and experienced males. The analysis shown in Figure 2 would seem to support this. Indeed, the authors state, "Overall, this analysis indicated that environmental and social experience, together with beat frequency (DF) are the most important factors explaining chirp variability."

      The Reviewer is right in pointing out this imprecise reference and we are grateful for the spotting of this incongruence. The writing probably refers to an earlier version of the figure in which data were grouped and analyzed differently. We have now edited the text and changed it to: “Overall, the majority of chirps were produced by male subjects, at rates that seemed affected by environmental experience (resident, intruder or equal; Figure S1A,C), social status (dominant or subordinate; Figure S1B) and social experience (novel or experienced; Figure S1D).”

      The choice of chirp type varied widely between individuals but was relatively consistent within individuals across trials of the same experiment. The authors interpret this to mean that chirping does not vary with internal state, but is it not likely that the internal states of individuals are stable under stable conditions, and that individuals may differ in these internal states across the same conditions? Stable differences in communication signals between individuals are frequently interpreted as reflecting differences between those individuals in certain characteristics, which are being communicated by these signals.

      It seems here we have been unclear in the writing: while it is true that behavioral states are stable and can imply stable chirp patterning (if the two are related), since chirp types vary abruptly and in a reliable DF-dependent manner, different types of chirps are unlikely to be matched to different internal states following the same temporal order in such a reliable way (similarly repeated through consecutive trials).

      This would imply the occurrence of different internal states in rapid sequence, reliably triggered by repeated EOD ramps, regardless of whether the playback is 20 sec long or 180 sec long.

      We have edited this paragraph to better explain this: “The reliability by which the chirping response adapts to both the rate and direction of beat frequency is variable across individuals but rather stable across trials (relative to a given subject), further suggesting that chirp type variations may not reflect changes in internal states or in the animal motivation to specific behavioral displays (which are presumably subject to less abrupt variations and stereotypical patterning based on DF).”

      I am not convinced of the conclusion drawn by the analysis of chirp transitions. The transition matrices show plenty of 1-2 and 2-1 transitions occurring.

      The only groups in which 1-2 and 2-1 transitions are as frequent as 1-1 and 2-2 (1 and 2 being the numerical IDs of the two interacting fish) are F-F pairs. This is a result of the fact that in females chirp rates are so low that within-fish correlations end up being as low as between-fish correlations. We believe the impression of the Reviewer could be due to the fact that these are normalized maps (see legend of Figure 5A-B).

      Further, the cross-correlation analysis only shows that chirp timing between individuals is not phase-locked at these small timescales. It is entirely possible that chirp rates are correlated between interacting individuals, even if their precise timing is not.

      We agree with the Reviewer, this is a possibility. To address this point, we did edit the results section to acknowledge that what we see may be related to the time window chosen (i.e. 4 sec):

      “More importantly, they show that - at least in the social conditions analyzed here and within small-sized time windows - chirp time series produced by different fish during paired interactions are consistently independent of each other.”
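
      The dependence on window size can be made concrete with a cross-correlogram of chirp times. The following is a minimal sketch only, assuming a window of ±2 s (one possible reading of the 4 sec window mentioned above) and 0.5 s bins; the function name, bin width and example chirp times are hypothetical and do not reproduce the analysis code used in the study.

```python
# Cross-correlogram of chirp times from two fish, restricted to a short window.
# Assumption: lags are (time in fish B) - (time in fish A), in seconds.

def cross_correlogram(times_a, times_b, half_window=2.0, bin_width=0.5):
    """Histogram of pairwise lags that fall within [-half_window, +half_window)."""
    n_bins = int(round(2 * half_window / bin_width))
    counts = [0] * n_bins
    for ta in times_a:
        for tb in times_b:
            lag = tb - ta
            if -half_window <= lag < half_window:
                index = int((lag + half_window) // bin_width)
                counts[min(index, n_bins - 1)] += 1  # guard against rounding at the edge
    return counts


# Hypothetical chirp times (s) for two interacting fish.
fish_a = [1.0, 5.0]
fish_b = [1.25, 4.0, 9.0]
correlogram = cross_correlogram(fish_a, fish_b)
print(correlogram)
```

      A flat correlogram within this window would indicate no short-latency coupling of chirp timing; it says nothing about correlations in chirp rate over longer timescales, which would require a wider window or a rate-based comparison.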

      Further, it is not clear to me how "transitions" were defined. The methods do not make this clear, and it is not clear to me how you can have zero chirp transitions between two individuals when those two individuals are both generating chirps throughout an interaction.

      We thank the Reviewer for bringing up this unclear point. We have now clarified how transitions were calculated in the method section: “The number of chirp transitions present in each recording (dataset used for Figures 1, 2, 5) was measured by searching in a string array containing the 4 chirp types per fish pair, all their possible pairwise permutations (i.e. all possible permutations of 4+4=8 elements are: 1-1, 1-2, 1-3 … 7-6, 7-7, 7-8; considering the following legend 1 = fish1 type 1, 2 = fish 1 type 2, 3 = fish1 type 3 … 6 = fish2 type 2, 7 = fish2 type 3 and 8 = fish2 rise).”.

      Zero transitions are possible if two fish (or groups of fish) do not produce chirps of all types. Only transitions of produced types can be counted.
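
      The counting described above can be sketched as follows. This is an illustration only: it assumes that a “transition” is an ordered pair of consecutive chirps in the time-ordered sequence of a fish pair, and that labels 1 to 4 belong to fish 1 and 5 to 8 to fish 2 (labels 4 and 5 are inferred from the pattern of the legend, not stated in it). Function names and the example sequence are hypothetical.

```python
from collections import Counter

# Assumed labels: 1-4 = fish 1 chirp types, 5-8 = fish 2 chirp types.

def count_transitions(sequence):
    """Count ordered pairs (previous chirp, next chirp) of consecutive chirps."""
    return Counter(zip(sequence, sequence[1:]))


def transition_matrix(sequence, n_labels=8):
    """n_labels x n_labels table of counts; row = previous chirp, column = next chirp."""
    counts = count_transitions(sequence)
    labels = range(1, n_labels + 1)
    return [[counts.get((a, b), 0) for b in labels] for a in labels]


# Hypothetical recording in which each fish produces only type 2 chirps
# (label 2 for fish 1, label 6 for fish 2).
chirps = [2, 2, 6, 2, 2, 2, 6, 6]
pair_counts = count_transitions(chirps)
matrix = transition_matrix(chirps)

print(pair_counts[(2, 2)], pair_counts[(2, 6)], pair_counts[(6, 2)], pair_counts[(6, 6)])
```

      In this example every cell involving a chirp type that was never produced stays at zero, which is how zero transitions can arise even though both fish chirp throughout the interaction.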

      In the results, "Although all chirp types were used during aggressive interactions, these seemed to be rather less frequent in the immediate surround of the chirps (Figure 6A)." A lack of precise temporal correlation on short timescales does not mean there is no association between the two behaviors. An increased rate of chirping during aggression is still a correlation between the two behaviors, even if chirps and specific aggressive behaviors are not tightly time-locked.

      The Reviewer is right in pointing out the limited temporal scaling of our observations/analysis. We have now edited the last paragraph of the results related to figure 6 to include the possibility mentioned by the Reviewer: “The significantly higher extent of chirping during swimming and locomotion, consistently confirmed by 4 different approaches (PSTH, TM, CN, MDS), suggests that - although chirp-behavior correlations may exist at time-scales larger than those here considered - chirping may be linked more strongly with scanning and environmental exploration than with a particular motivational state, thus confirming findings from our playback experiments.”

      The Reviewer here raises an important point, yet, due to space limitations, we have considered only a sub-second scale. Most playback experiments in weakly electric fish implied the use of EOD mimics for a few tens of seconds - to avoid habituation in the fish behavioral responses - while inter-chirp intervals usually range between a few hundred milliseconds and seconds (depending on how often a fish would chirp). This suggested to us that a 4 second time window may not be a bad choice to start with.

      In summary, it is simply too strong to say that chirping does not correlate with context, or to claim that there is convincing evidence arguing against a communication function of chirps. Importantly, however, this does not detract from your exciting and well-supported hypothesis that chirping functions in homeoactive sensing. A given EOD behavior could serve both communication and homeoactive sensing. I actually suspect this is quite common in electric fish (both gymnotiforms and mormyrids), and perhaps in other actively sensing species such as echolocating animals. The two are not mutually exclusive.

      We agree with the Reviewer that context - broadly speaking - does affect chirping (as we mentioned above). We hope we have improved the writing and clarified that we do not dismiss communication functions of chirping, but we do lean towards electrolocation based on the considerations made above and our results.

      We do conclude the manuscript remarking that communication and electrolocation are not mutually exclusive: “probing cues could function simultaneously as proximity signals to signal presence, deter approaches, or coordinate behaviors like spawning, if properly timed (Henninger et al., 2018).” (see the conclusion paragraph of the discussion).

      Therein, we further add “These findings aim to stir the pot and initiate a discussion on possible alternative functions of chirps beyond their presumed communication role.”.

      With this, we hope we’ve made it clear how we intend our manuscript to be read.

      Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, without and with playback experiments. It applies state-of-the-art methods for reducing the dimensionality of the data and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that the traditionally assumed communication function of chirps may be secondary to its role in environmental assessment and exploration that takes social context into account. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats caused by other fish as well as objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a primary communication goal for most chirps. Rather, the key determinants of chirping are the difference frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. The paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-receiver chirp transitions beyond the known increase in chirp frequency during an interaction.

      These conclusions by themselves will be very useful to the field. They will also allow scientists working on other "communication" systems to perhaps reconsider and expand the goals of the probes used in those senses. A lot of data are summarized in this paper, with thorough referencing to past work.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization, and in this sense are self-directed signals. This led to their prediction that environmental complexity ("clutter") should increase chirp rate, which in fact was revealed by their new experiments. The authors also argue that waveform EODs have less power across high spatial frequencies compared to pulse-type fish, with a resulting relatively impoverished power of resolution. Chirping in wave-type fish could temporarily compensate for the lower frequency resolution while still being able to resolve EOD perturbations with a good temporal definition (which pulse-type fish lack due to low pulse rates).

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water. The paper provides a number of experimental avenues to pursue in order to validate the non-communication role of chirps.

      We thank the reviewer for the kind assessment.

      Weaknesses:

      My main criticism is that the alternative putative role for chirps as probe signals that optimize beat detection could be better developed. The paper could be clearer as to what that means precisely, especially since beating - and therefore detection of some aspects of beating due to the proximity of a conspecific - most often precedes chirping. One meaning the authors suggest, tentatively, is that the chirps could enhance electrosensory responses to the beat, for example by causing beat phase shifts that remediate blind spots in the electric field of view.

      We agree with the Reviewer that a better and more detailed explanation of how beat processing for conspecific electrolocation may be positively affected by chirps would be important to provide. We are currently working on a follow-up manuscript in which we intend to include these aspects. For space limitations and readability we had to discard from the current manuscript a lot of results that could further clarify these issues.

      A second criticism is that the study links the beat detection to underwater object localization. The paper does not significantly develop that line of thought given their data - the authors tread carefully here given the speculative aspect of this link. It is certainly possible that the image on the fish's body of an object in the environment will be slightly modified by introducing a chirp on the waveform, as this may enhance certain heterogeneities of the object in relation to its environment. The thrust of this argument derives mainly from the notion of Fourier analysis with pulse type fish EOD waveforms (see above, and radar theory more generally), where higher temporal frequencies in the beat waveform induced by the chirp will enable a better spatial resolution of objects. It remains to be seen whether experiments can show this to be significant.

      Perhaps the Reviewer refers to the last discussion paragraph before the conclusions in which we mention the performance of pulse or wave-type EODs in electrolocation (referring here to ideas illustrated in a recent review by Crampton, 2019). We added to this paragraph a statement which could better clarify that we do not propose that chirping could enhance object electrolocation. What we mean is that, in a context in which object electrolocation occurs through wave-type EODs - given the generally lower performance of such narrow-band signals in resolving the spatial features of any object, even a 3D electric field - chirping could improve beat detection during social encounters by increasing the amount of information obtained by the fish.

      The edited paragraph now reads: “While broadband pulse signals may be useful to capture highly complex environments rich in foliage, roots and other structures common in vegetation featuring the more superficial habitats in which pulse-type fish live, wave-type EODs may be a better choice in the relatively simpler river-bed environments in which many wave-type fish live (e.g., the benthic zone of deep river channels; Crampton, 2019). In this case, achieving a good spatial resolution is critical during social encounters, especially considering the limited utility of visual cues in these low-light conditions. In such habitats, social encounters may “electrically” be less “abrupt”, but spatially less “conspicuous” or blurred (as a 3D electric field may be). In such a scenario, chirps could serve as a means to supplement the spatial information acquired via the beat, accentuating these cues during periods of reduced resolution.”

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      None, my points in the original review have been properly addressed in this resubmission.

    2. eLife assessment

      This study addresses a question in sensory ethology and active sensing in particular. It links the production of a specific signal - electrosensory chirps - to various contexts and conditions to argue that the main function is to enhance conspecific localization rather than communication as previously believed. The study provides a lot of valuable data, but the methods section is incomplete making it difficult to evaluate the claims.

    3. Reviewer #1 (Public Review):

      The authors investigate the role of chirping in a species of weakly electric fish. They subject the fish to various scenarios and correlate the production of chirps with many different factors. They find major correlations between the background beat signals (continuously present during any social interactions) or some aspects of social and environmental conditions with the propensity to produce different types of chirps. By analyzing more specifically different aspects of these correlations they conclude that chirping patterns are related to navigation purposes and the need to localize the source of the beat signal (i.e. the location of the conspecific).

      The study provides a wealth of interesting observations of behavior and much of this data constitutes a useful dataset to document the patterns of social interactions in these fish. Some data, in particular the high propensity to chirp in cluttered environments, raises interesting questions. Their main hypothesis is a useful addition to the debate on the function of these chirps and is worth being considered and explored further.

      After the initial reviewers' comments, the authors performed a welcome revision of the way the results are presented. Overall the study has been improved by the revision. However, one piece of new data is perplexing to me. The new figure 7 presents the results of a model analysis of the strength of the EI caused by a second fish to localize when the focal fish is chirping. From my understanding of this type of model, EOD frequency is not a parameter in the model since it evaluates the strength of the field at a given point in time. Therefore the only thing that matters is the phase relationship and strength of the EOD. Assuming that the second fish's EOD is kept constant and the phase relationship is also the same, the only difference during a chirp that could affect the result of the calculation is the potential decrease in EOD amplitude during the chirp. It is indeed logical that if the focal fish decreased its EOD amplitude the target fish's EOD becomes relatively stronger. Where things are harder to understand is why the different types of chirps (e.g. type 1 vs type 2) lead to the same increase in signal even though they are typically associated with different levels of amplitude modulations. Also, it is hard to imagine that a type 2 chirp that is barely associated with any decrease in EOD amplitude (0-10% maybe), would cause a doubling of the EI strength. There might be something I don't understand but the authors should provide a lot more details on how this result is obtained and convince us that it makes sense.

      Finally, the reviewer is concerned about this sentence in the rebuttal - "The methods section has been edited to clarify the approach (not yet)". This section is unfinished, which suggests that it is difficult to explain the modeling results from a logical point of view. Thus the reviewer's major concern from the previous review remains unresolved. To summarize, the model calculates field strengths at an instant in time and integrates over time with a 500 ms window. This window is 10 times longer than the small chirps, while the longer chirps cover a much larger proportion of the window. Yet, the small chirps have a bigger impact on discriminability than the longer chirps. The authors should attempt to explain this seemingly contradictory result. This remains a major issue because this analysis was the most direct evidence that chirping could impact localization accuracy.

    4. Reviewer #2 (Public Review):

      Studying Apteronotus leptorhynchus (the weakly electric brown ghost knifefish), the authors provide evidence that 'chirps' (brief modulations in the frequency and amplitude of the ongoing wave-like electric signal) function in active sensing (specifically homeoactive sensing) rather than communication. Chirping is a behavior that has been well studied, including numerous studies on the sensory coding of chirps and the neural mechanisms for chirp generation. Chirps are largely thought to function in communication behavior, so this alternative function is a very exciting possibility that should have a great impact on the field.

      The authors provide convincing evidence that chirps may function in homeoactive sensing. In particular, the evidence showing increased chirping in more cluttered environments and a relationship between chirping and movement are especially strong and suggestive. Their evidence arguing against a role for chirps in communication is not as strong. However, based on an extensive review of the literature, the authors conclude, I think fairly, that the evidence arguing in favor of a communication function is limited and inconclusive. Thus, the real strength of this study is not that it conclusively refutes the communication hypothesis, but that it calls this hypothesis into question while also providing compelling evidence in favor of an alternative function.

      In summary, although the evidence against a role for chirps in communication is not as strong as the evidence for a role in active sensing, this study presents very interesting data that is sure to stimulate discussion and follow-up studies. The authors acknowledge that chirps could function as both a communication and homeoactive sensing signal, and the language arguing against a communication function is appropriately measured. A given electrical behavior could serve both communication and homeoactive sensing. I suspect this is quite common in electric fish (not just in gymnotiforms such as the species studied here, but also in the distantly related mormyrids), and perhaps in other actively sensing species such as echolocating animals.

    5. Reviewer #3 (Public Review):

      Summary:

      This important paper provides the best-to-date characterization of chirping in weakly electric fish using a large number of variables. These include environment (free vs divided fish, with or without clutter), breeding state, gender, intruder vs resident, social status, locomotion state and social and environmental experience, without and with playback experiments. It applies state-of-the-art methods for reducing the dimensionality of the data and finding patterns of correlation between different kinds of variables (factor analysis, K-means). The strength of the evidence, collated from a large number of trials with many controls, leads to the conclusion that the traditionally assumed communication function of chirps may be secondary to its role in environmental assessment and exploration that takes social context into account. Based on their extensive analyses, the authors suggest that chirps are mainly used as probes that help detect beats caused by other fish as well as objects.

      Strengths:

      The work is based on completely novel recordings using interaction chambers. The amount of new data and associated analyses is simply staggering, and yet, well organized in presentation. The study further evaluates the electric field strength around a fish (via modelling with the boundary element method) and how its decay parallels the chirp rate, thereby relating the above variables to electric field geometry. The BEM modelling also convincingly predicts how the electric image of a receiver conspecific on a sending fish is enhanced by a chirp.

      The main conclusions are that the lack of any significant behavioural correlates for chirping, and the lack of temporal patterning in chirp time series, cast doubt on a primary communication goal for most chirps. Rather, the key determinants of chirping are the difference in frequency between two interacting conspecifics as well as individual subjects' environmental and social experience. The paper concludes that there is a lack of evidence for stereotyped temporal patterning of chirp time series, as well as of sender-receiver chirp transitions beyond the known increase in chirp frequency during an interaction. The authors carefully submit that the new putative echolocation function of chirps is not mutually exclusive with a possible communication function.

      These conclusions by themselves will be very useful to the field. They will also allow scientists working on other "communication" systems to perhaps reconsider and expand the goals of the probes used in those senses. A lot of data are summarized in this paper, with thorough referencing to past work.

      The alternative hypotheses that arise from the work are that chirps are mainly used as environmental probes for better beat detection and processing and object localization, and in this sense are self-directed signals. This led to their prediction that environmental complexity ("clutter") should increase chirp rate, which in fact was revealed by their new experiments. The authors also argue that waveform EODs have less power across high spatial frequencies compared to pulse-type fish, with a resulting relatively impoverished power of resolution. Chirping in wave-type fish could temporarily compensate for the lower frequency resolution while still being able to resolve EOD perturbations with a good temporal definition (which pulse-type fish lack due to low pulse rates).

      The authors also advance the interesting idea that the sinusoidal frequency modulations caused by chirps are the electric fish's solution to the minute (and undetectable by neural wetware) echo-delays available to it, due to the propagation of electric fields at the speed of light in water. The paper provides a number of experimental avenues to pursue in order to validate the non-communication role of chirps.

    1. Reviewer #1 (Public Review):

      Summary:

      Insects inhabit diverse environments and have neuroanatomical structures appropriate to each habitat. Although the molecular mechanism of insect neural development has been mainly studied in Drosophila, the beetle, Tribolium castaneum has been introduced as another model to understand the differences and similarities in the process of insect neural development. In this manuscript, the authors focused on the origin of the central complex. In Drosophila, type II neuroblasts have been known as the origin of the central complex. Then, the authors tried to identify those cells in the beetle brain. They established a Tribolium fez enhancer trap line to visualize putative type II neuroblasts and successfully identified 9 of those cells. In addition, they also examined expression patterns of several genes that are known to be expressed in the type II neuroblasts or their lineage in Drosophila. They concluded that the putative type II neuroblasts they identified were type II neuroblasts because those cells showed characteristics of type II neuroblasts in terms of genetic codes, cell diameter, and cell lineage.

      Strengths:

      The authors established a useful enhancer trap line to visualize type II neuroblasts in Tribolium embryos. Using this tool, they have identified that there are 9 type II neuroblasts in the brain hemisphere during embryonic development. Since the enhancer trap line also visualized the lineage of those cells, the authors found that the lineage size of the type II neuroblasts in the beetle is larger than that in the fly. They also showed that several genetic markers are also expressed in the type II neuroblasts and their lineages as observed in Drosophila.

      Weaknesses:

      I recommend that the authors restructure the manuscript because several parts of the present version are not logical. For example, the authors should first examine the expression of dpn, a well-known marker of neuroblasts. Without examining the expression of at least one neuroblast marker, no one can say confidently that a cell is a neuroblast. The purpose of this study is to understand what produces the neuroanatomical differences between insects that are appropriate to their habitats. To obtain clues to this question, I think functional analyses are necessary as well as descriptive analyses.

    2. eLife assessment

      The study is a valuable contribution to the question of evolutionary shifts in neuronal proliferation patterns and the timing of developmental progressions. The authors present solid support for the presence of type-II NB lineages in the beetle Tribolium with the same molecular characteristics as the counterparts in the fly Drosophila, but differences in lineage size and number. While presenting a number of interesting observations, further evidence will be required to show that the observed differences are indeed responsible for the differences in developmental timing of the central complex in the two insect species.

    3. Reviewer #2 (Public Review):

      The authors address the question of differences in the development of the central complex (Cx), a brain structure mainly controlling spatial orientation and locomotion in insects, which can be traced back to the neuroblast lineages that produce the Cx structure. The lineages are called type-II neuroblast (NB) lineages and are assumed to be conserved in insects. While Tribolium castaneum produces a functional larval Cx that only consists of one part of the adult Cx structure, the fan-shaped body, in Drosophila melanogaster a non-functional neuropile primordium is formed by neurons produced by the embryonic type-II NBs which then enter a dormant state and continue development in late larval and pupal stages.

      The authors present a meticulous study demonstrating that type-II neuroblast (NB) lineages are indeed present in the developing brain of Tribolium castaneum. In contrast to type-I NB lineages, type-II NBs produce additional intermediate progenitors. The authors generate a fluorescent enhancer trap line called fez/earmuff which prominently labels the mushroom bodies but also the intermediate progenitors (INPs) of the type-II NB lineages. This is convincingly demonstrated by high-resolution images that show cellular staining next to large pointed labelled cells, a marker for type-II NBs in Drosophila melanogaster. Using these and other markers (e.g. deadpan, asense), the authors show that the cell type composition and embryonic development of the type-II NB lineages are similar to their counterparts in Drosophila melanogaster. Furthermore, the expression of the Drosophila type-II NB lineage markers six3 and six4 in subsets of the Tribolium type-II NB lineages (anterior 1-4 and 1-6 type-II NB lineages) and the expression of the Cx marker skh in the distal part of most of the lineages provide further evidence that the identified NB lineages are equivalent to the Drosophila lineages that establish the central complex. However, in contrast to Drosophila, there are 9 instead of 8 embryonic type-II NB lineages per brain hemisphere and the lineages contain more progenitor cells compared to the Drosophila lineages. The authors argue that the higher number of dividing progenitor cells supports the earlier development of a functional Cx in Tribolium.

      While the manuscript clearly shows that type-II NB lineages similar to Drosophila exist in Tribolium, it does not considerably advance our understanding of the heterochronic development of the Cx in these insects. First of all, the contribution of these lineages to a functional larval Cx is not clear. For example, how do the described type-II NB lineages relate to the DM1-4 lineages that produce the columnar neurons of the Cx? What is the evidence that the embryonically produced type-II NB lineage neurons contribute to a functional larval Cx? The formation of functional circuits could rely on larval neurons (like in Drosophila) which would make a comparison of embryonic lineages less informative with respect to understanding the underlying variations of the developmental processes. Furthermore, the higher number of progenitors (and consequently neurons) in Tribolium could simply reflect the demand for a higher number of cells required to build the fan-shaped body compared to Drosophila. In addition, the larger lineages in Tribolium, including the higher number of INPs could be due to a greater number of NBs within the individual clusters, rather than a higher rate of proliferation of individual neuroblasts, as suggested. What is the evidence that there is only one NB per cluster? The presented schemes (Fig. 7/12) and description of the marker gene expression and classification of progenitor cells are inconsistent but indicate that NBs and immature INPs cannot be consistently distinguished.

      The main difference between Tribolium and Drosophila Cx development with regard to the larval functionality might be that Drosophila type-II NB lineage-derived neurons undergo quiescence at the end of embryogenesis so that the development of the Cx is halted, while a developmental arrest does not occur in Tribolium. However, this needs to be confirmed (as the authors rightly observe).

    4. Reviewer #3 (Public Review):

      Summary:

      In this paper, Rethemeier et al capitalize on their previous observation that the beetle central complex develops heterochronically compared to the fly and try to identify the developmental origin of this difference. For this reason, they use a fez enhancer trap line that they generated to study the neuronal stem cells (INPs) that give rise to the central complex. Using this line and staining against Drosophila type-II neuroblast markers, they elegantly dissect the number and developmental progression of the beetle type II neuroblasts. They show that the NBs, INPs, and GMCs have a conserved marker progression by comparing to Drosophila marker genes, although the expression of some of the lineage markers (otd, six3, and six4) is slightly different. Finally, they show that the beetle type II neuroblast lineages are likely longer than the equivalent ones in Drosophila and argue that this might be the underlying reason for the observed heterochrony.

      Strengths:

      - A very interesting study system that compares a conserved structure that, however, develops in a heterochronic manner.

      - Identification of a conserved molecular signature of type-II neuroblasts between beetles and flies. At the same time, identification of transcription factors expression differences in the neuroblasts, as well as identification of an extra neuroblast.

      - Nice detailed experiments to describe the expression of conserved and divergent marker genes, including some lineaging looking into the co-expression of progenitor (fez) and neuronal (skh) markers.

      Weaknesses:

      - Comparing between different species is difficult as one doesn't know what the equivalent developmental stages are. How do the authors know when to compare the sizes of the lineages between Drosophila and Tribolium? Moreover, the fact that the authors recover more INPs and GMCs could also mean that the progenitors divide more slowly and, therefore, there is an accumulation of progenitors who have not undergone their programmed number of divisions.

      - The main conclusion that the earlier central complex development in beetles is due to the enhanced activity of the neuroblasts is very handwavy and is not the only possible conclusion from their data.

      - The argument for conserved patterns of gene expression between Tribolium and Drosophila type-II NBs, INPs, and GMCs is a bit circular, as the authors use Drosophila markers to identify the Tribolium cells.

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions: Based on the above, I believe that the authors, despite advancing significantly, fall short of identifying the reasons for the divergent timing of central complex development between beetle and fly.

    1. Reviewer #1 (Public Review):

      Summary:

      Hahn et al use bystander BRET, NanoBiT assays, and APEX2 proteomics to investigate endosomal signaling of CCR7 by two agonists, CCL19 and CCL21. The authors suggest that CCR7 signals from early endosomes following internalisation. They use spatial proteomics to try to identify novel interacting partners that may facilitate this signaling and use this data to specifically enhance a Rac1 signaling pathway. Many of the results in the first few figures showing simultaneous recruitment of Barr and G proteins by CCR7 have been shown previously (Laufer et al, 2019, Cell Reports), as has signaling from endomembranes, and Rac1 activation at intracellular sites. The new findings are the APEX2 proteomics studies, which could be useful to the scientific community. Unfortunately, the authors only follow up on a single finding, and the expansion of this section would improve the manuscript.

      Strengths:

      (1) The APEX2 resource will be valuable to the GPCR and immunology community. It offers many opportunities to follow up on findings and discover new biology. The resource could also be used to validate earlier findings in the current manuscript and in previous manuscripts. Was there enrichment of early endosomal markers, Barr and Gi as this would provide further evidence for their earlier claims regarding endosomal signaling? Previous studies have suggested signaling from the TGN, so it is possible that the different ligands also direct to different sites. This could easily be investigated using the APEX2 data.

      (2) The results section is well written and can be followed very easily by the reader.

      (3) Some findings verify previous studies (e.g. endomembrane signalling). This should be acknowledged as this shows the validity of the findings of both studies.

      Weaknesses:

      (1) The findings are interesting although the studies are almost all performed in HEK293 cells. I understand that these are commonly used in GPCR biology and are easy to transfect and don't express many GPCRs at high concentrations, but their use is still odd when there are many cell-lines available that express CCR7 and are more reflective of the endogenous state (e.g. they are polarised, they can perform chemotaxis/ migration). Some of the findings within the study should also be verified in more physiologically relevant cells. At the moment only the final figure looks at this, but findings need to be verified elsewhere.

      (2) The authors acknowledge that the kinetic patterns of the signals at the early endosome are not consistent with the rates of internalisation. They mention that this could be due to trafficking elsewhere. This could be easily looked at in their APEX2 data. Is there evidence of proximity to markers of other membranes? Perhaps this could be added to the discussion. Similarly, previous studies have shown that CCR7 signaling may involve the TGN. Was there enrichment of these markers? If not, this could also be an interesting finding and should be discussed. It is also possible that the Rab5 reporter is just not as efficient as the trafficking one, especially as in later figures the very convincing differences in the two ligands are not as robust as the differences in trafficking.

      (3) In the final sentence of paragraph 2 of the results the authors state that the internalisation is specific to CCR7 as there isn't recruitment to V2R. I'm not sure this is the best control. The authors can only really say it doesn't recruit to unrelated receptors. The authors could have used a different chemokine receptor which does not respond to these ligands to show this.

      (4) The miniGi-Barr1 and imaging showing co-localisation could be more convincing if it was also repeated in a more physiological cell line as in the final figure. Imaging of CCR7, miniGi, and Barr1 would also provide further evidence that the receptor is also present within the complex.

      (5) The findings regarding Rac1 are interesting, although an earlier paper found similar results (Laufer et al, 2019, Cell Reports), so perhaps following up on another APEX2-identified protein pathway would have been more interesting. The authors' statement that Rac1 is specifically activated, and RhoA and Cdc42 are not, is unconvincing from the current data. Only a single NanoBiT assay was used, and as raw values are not reported it is difficult for the reader to glean some essential information. The authors should show evidence that these reporters work well for other receptors (or cite previous studies) and also need evidence from an independent (i.e. non-NanoBiT or BRET) assay.

      (6) At present, the studies in Figure 7 do not go beyond those in the previous Laufer et al study in which they showed blocking endocytosis affected Rac1 signalling. The authors could show that Rac1 signalling is from early endosomes to improve this, otherwise, it could be from the TGN as previously reported.

    2. eLife assessment

      This is a valuable study that provides CCR7-APEX2 proximity labelling mass spectrometry data that is expected to provide new insights into CCR7 signalling partners and pathways. The study is technically solid and easy to follow, however, there are some concerns that many of the highlighted findings are repetitive of prior work and that this is not clearly acknowledged. It would increase the impact of the study if the confirmatory nature of some findings were acknowledged. This is of value to the community, and there are likely multiple opportunities to use the APEX2 data set to extend these findings, strengthen some claims, and even explore a new pathway identified in the APEX2 data set.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript describes a comprehensive analysis of signalling downstream of the chemokine receptor CCR7. A comprehensive dataset supports the authors' hypothesis that G protein and beta-arrestin signalling can occur simultaneously at CCR7 with implications for continued signalling following receptor endocytosis.

      Strengths:

      The experiments are well controlled and executed, employing a wide range of assays using - in the main - CCR7 transfectants. Data are well presented, with the authors' claims supported by the data. The paper also has an excellent narrative which makes it relatively easy to follow. I think this would certainly be of interest to the readership of the journal.

      Weaknesses:

      Since the authors show a differential enrichment of RhoGTPases by CCR7 stimulation with CCL19 versus CCL21, I think that they also need to show that the Gi/o coupling of HEK-293-CCR7-APEX2 cells to both CCL19 and CCL21 is not perturbed by the modification. Currently, the authors only show data for CCL19 signalling, which leaves the potential for a false negative finding in terms of CCL21 signalling being selectively impaired. This should be relatively easy to do and should strengthen the authors' conclusions.

      The authors conclude the discussion by suggesting that their findings highlight endosomal signalling as a general mechanism for chemokine receptors in cell migration. I think this is an overreach. The authors chose several studies of CXC chemokine receptors to support their argument that C-terminal truncation or mutation of the C-terminal phosphorylation sites impairs endocytosis and chemotaxis (refs 40-42). However, in some instances, e.g. at the related chemokine receptor CCR4, C-terminal removal of these sites impairs endocytosis but promotes chemotaxis (Nakagawa et al, 2014; Anderson et al, 2020). I therefore think that either the final statement needs to be tempered or the counterargument discussed a little.

      References:

      Anderson, C. A. et al. A degradatory fate for CCR4 suggests a primary role in Th2 inflammation. J Leukocyte Biol 107, 455-466 (2020).

      Nakagawa, M. et al. Gain-of-function CCR4 mutations in adult T cell leukaemia/lymphoma. Journal of Experimental Medicine 211, 2497-2505 (2014).

    1. eLife assessment

      This important study correlates the size of various prefrontal brain regions in primate species with socioecological variables like foraging distance and population density. The evidence presented is solid but the approach and conclusions are limited to primates with well-defined gyri.

    2. Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS.

      I have read the revised version of the manuscript with interest. I commend the authors for including the requested additional analyses. I believe these highlight some of the major debates in the field, such as the relationship between absolute and relative brain size of areas. Providing a full description of the data will help this field be more open about these issues. All too often, debates between different groups focus on narrow anatomical or statistical arguments, and having all the data here is important.

      I do not agree with some of the statements of the other reviewers regarding development. Clearly, evolution works for a large part by tinkering (forgive the sense of agency) with development, but that does not mean that looking at the end result cannot provide insights. Ultimately, we will look at both phylogeny and ontogeny within the same framework, but the field is not quite there yet.

      As I said before, I do believe this is a positive study. I am happy that we as a field are using imaging data to answer wider phylogenetic questions. Combining detailed anatomy, big data, and phylogenetic statistical frameworks is an important approach.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The present study provides a phylogenetic analysis of the size of prefrontal areas in primates, aiming to investigate whether the relative size of the rostral prefrontal cortex (frontal pole) and dorsolateral prefrontal cortex volume vary according to known ecological or social variables.

      I am very much in favor of the general approach taken in this study. Neuroimaging now allows us to obtain more detailed anatomical data in a much larger range of species than ever before and this study shows the questions that can be asked using these types of data. In general, the study is conducted with care, focusing on anatomical precision in definition of the cortical areas and using appropriate statistical techniques, such as PGLS.

      I have read the revised version of the manuscript with interest. I agree with the authors that a focus on ecological vs laboratory variables is a good one, although it might have been useful to reflect that in the title.

      I am happy to see that the authors included additional analyses using different definitions of FP and DLPFC in the supplementary material. As I said in my earlier review, the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital.

      We thank the reviewer for these positive remarks and for these very useful suggestions on the previous version of this article.

      I am sorry the authors are so dismissive of the idea of looking at models where brain size and area size are directly compared, rather preferring to run separate models on brain size and area size. This seems to me a sensible suggestion.

      We agree with reviewer 1, and the response of reviewer 3 also made clear to us why this was an important issue. We have therefore addressed it more thoroughly this time.

      First, we have added a new analysis, with whole brain volume included as covariate in the model accounting for regional volumes, together with the socio-ecological variables of interest. As expected given the very strong correlation across all brain measures (>90%), the effects of all socio-ecological factors disappear for both FP and DLPFC volumes when ‘whole brain’ is included as covariate. This is coherent with our previous analysis showing that the same combination of socio-ecological variables could account for the volume of FP, DLPFC and the whole brain. Nevertheless, the interpretation of these results remains difficult, because of the hidden assumptions underlying the analysis (see below).
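
      The structure of such a covariate model can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses the standard generalized least squares estimator that underlies PGLS, with z-scored predictors so that beta weights are comparable. The six "species", the predictor values, the covariance matrix `C`, and the names `zscore` and `gls_fit` are all hypothetical; a real analysis would derive `C` from a primate phylogeny and would typically estimate a phylogenetic signal parameter as well.

```python
import numpy as np

def zscore(a):
    """Standardize each column to mean 0 and unit standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)

def gls_fit(X, y, C):
    """Generalized least squares: beta = (X' C^-1 X)^-1 X' C^-1 y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Ci_X = np.linalg.solve(C, X)   # C^-1 X without forming the inverse
    Ci_y = np.linalg.solve(C, y)   # C^-1 y
    return np.linalg.solve(X.T @ Ci_X, X.T @ Ci_y)

# Toy data for 6 hypothetical species (NOT the study's measurements).
rng = np.random.default_rng(0)
n = 6
predictors = zscore(rng.normal(size=(n, 2)))   # stand-ins for two socio-ecological variables
whole_brain = zscore(rng.normal(size=(n, 1)))  # stand-in for the whole-brain covariate
X = np.column_stack([np.ones(n), predictors, whole_brain])

# Hypothetical phylogenetic covariance: equal shared history for every species pair.
C = 0.5 * np.ones((n, n)) + 0.5 * np.eye(n)

true_beta = np.array([1.0, 0.3, -0.2, 0.9])
y = X @ true_beta              # noise-free response, so the fit should recover true_beta
beta_hat = gls_fit(X, y, C)
print(beta_hat)
```

      In this toy setting the covariate is independent of the other predictors. In real data, where regional and whole-brain volumes are almost perfectly correlated, the covariate would be expected to absorb most of the shared variance, which is the behaviour described above.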

      Second, we have clarified the theoretical reasons that made us choose absolute vs relative measures of brain volumes. In short, we understand the notion of specificity associated with relative measures, but 1) the interpretation of relative measures is confusing and 2) we have alternative ways to evaluate the specificity of the effects (which are complementary to the idea of adding whole brain volume as covariate). 

      Our goal here was to evaluate the influence of socio-ecological factors on specific brain regions, based on their known cognitive functions in laboratory conditions (working memory for the DLPFC and metacognition for the frontal pole). Thus, the null hypothesis is that socio-ecological challenges supposed to mobilize working memory and metacognition do not affect the size of the brain regions associated with these functions (respectively DLPFC and FP). This is what our analysis is testing, and from that perspective, it seems to us that direct measures are better, because within regions (across species), volumes provide a good index of neural counts (since densities are conserved), which are indicative of the amount of computational resources available for the region. This is not the case when using relative measures, or when using the whole brain as covariate, since densities are heterogeneous across brain regions (e.g. Herculano-Houzel, 2011; 2017, but see below for further details on this).

      Quantitatively, the theoretical level of specificity of the relation between brain regions and socio-ecological factors is difficult to evaluate, given that our predictions are based on the cognitive functions associated with DLPFC and FP, namely working memory and metacognition, and that each of these cognitive functions also involves other brain regions. We would actually predict that other brain regions associated with the same cognitive functions as DLPFC or FP also show a positive influence of the same socio-ecological variables. Given that the functional mapping of cognitive functions in the brain remains debated, it is extremely difficult to evaluate quantitatively how specific the influence of the socio-ecological factors should be on DLPFC and FP compared to the rest of the brain, in the frame of our hypothesis.

      Critically, given that FP and DLPFC show a differential sensitivity to population density, a proxy for social complexity, and that this difference is in line with laboratory studies showing a stronger implication of the FP in social cognition, we believe that there is indeed some specificity in the relation between specific regions of the PFC and socioecological variables. Thus, our results as a whole seem to indicate that the relation between prefrontal cortex regions and socio-ecological variables shows a small but significant level of specificity. We hope that the addition of the new analysis and the corresponding modifications of the introduction and discussion section will clarify this point.

      Similarly, the debate about whether area volume and number of neurons can be equated across the regions is an important one, of which they are a bit dismissive.

      We are sorry that the reviewer found us a bit dismissive on this issue, and there may have been a misunderstanding.

      Based on the literature, it is clearly established that for a given brain region, area volume provides a good proxy for the number of neurons, and it is legitimate to generalize this relation across species if neuronal densities are conserved for the region of interest (see for example Herculano-Houzel 2011, 2017 for review). It seems to be the case across primates because cytoarchitectonic maps are conserved for FP and DLPFC, at least in humans and laboratory primates (Petrides et al, 2012; Sallet et al, 2013; Gabi et al, 2016; Amiez et al, 2019). But we make no claim about the difference in number of neurons between FP and DLPFC, and we never compared regional volumes across regions (we only compared the influence of socio-ecological factors on each regional volume), so their difference in cellular density is not relevant here. As long as the neuronal density is conserved across species but within a region (DLPFC or FP), the difference in volume for that region, across species, does provide a reliable proxy for the influence of the socioecological regressor of interest (across species) on the number of neurons in that region.

      Our claims are based on the strength of the relation between 1) cross-species variability in a set of socio-ecological variables and 2) cross-species variability in neural counts in each region of interest (FP or DLPFC). Since the effects of interest relate to inter-specific differences, within a region, our only assumption is that the neural densities are conserved across distinct species for a given brain region. Again (see previous paragraph), there is reasonable evidence for that in the literature. Given that assumption, regional volumes (across species, for a given brain region) provide a good proxy for the number of neurons. Thus, the influence of a given socio-ecological variable on the interspecific differences in the volume of a single brain region provides a reliable estimate of the influence of that socio-ecological variable on the number of neurons in that region (across species), and potentially of the importance of the cognitive function associated with that region in laboratory conditions. None of our conclusions are based on direct comparison of volumes across regions, and we only compared the influence of socioecological factors (beta weights, after normalization of the variables).

      Note that this is yet another reason for not using relative measures and not including whole brain as covariate in the regression model: Given that whole brain and any specific region have a clear difference in density, and that this difference is probably not conserved across species, relative measures (or covariate analysis) cannot be used as proxies for neuronal counts (e.g. Herculano-Houzel, 2011). In other words, using the whole brain to rescale individual brain regions relies upon the assumption that the ratios of volumes (specific region/whole brain) are equivalent to the ratios of neural counts, which is not valid given the differences in densities.
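
      The density argument can be written out explicitly. The notation below is introduced here for illustration only and does not appear in the manuscript: V is a volume, ρ a mean neuronal density, N a neuron count, r a region of interest (FP or DLPFC), WB the whole brain, and s a species (A and B being two of them).

```latex
% Within one region, density conserved across species:
\[
N_{r,s} = \rho_r \, V_{r,s}
\quad\Longrightarrow\quad
\frac{N_{r,A}}{N_{r,B}} = \frac{V_{r,A}}{V_{r,B}}
\]

% Relative measure within one species (region versus whole brain):
\[
\frac{V_{r,s}}{V_{WB,s}}
  = \frac{N_{r,s}/\rho_r}{N_{WB,s}/\rho_{WB,s}}
  = \frac{N_{r,s}}{N_{WB,s}} \cdot \frac{\rho_{WB,s}}{\rho_r}
\]
```

      Under these assumptions, the ratio of volumes equals the ratio of neuron counts only if the whole-brain density matches the regional density, and it tracks that ratio across species only if the density ratio ρ_WB,s / ρ_r is itself constant across species.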

      Nevertheless, I think this is an important study. I am happy that we are using imaging data to answer wider phylogenetic questions. Combining detailed anatomy, big data, and phylogenetic statistical frameworks is an important approach.

      We really thank the reviewer for these positive remarks, and we hope that this study will indeed stimulate others using a similar approach.

      Reviewer #2 (Public Review):

      In the manuscript entitled "Linking the evolution of two prefrontal brain regions to social and foraging challenges in primates" the authors measure the volume of the frontal pole (FP, related to metacognition) and the dorsolateral prefrontal cortex (DLPFC, related to working memory) in 16 primate species to evaluate the influence of socio-ecological factors on the size of these cortical regions. The authors select 11 socio-ecological variables and use a phylogenetic generalized least squares (PGLS) approach to evaluate the joint influence of these socio-ecological variables on the neuro-anatomical variability of FP and DLPFC across the 16 selected primate species; in this way, the authors take into account the phylogenetic relations across primate species in their attempt to discover the influence of socio-ecological variables on FP and DLPFC evolution.

      The authors run their studies on brains collected from 1920 to 1970 and preserved in formalin solution. Also, they obtained data from the Muséum National d'Histoire Naturelle in Paris and from the Allen Brain Institute in California. The main findings consist in showing that the volume of the FP, the DLPFC, and the Rest of the Brain (ROB) across the 16 selected primate species is related to three socio-ecological variables: body mass, daily traveled distance, and population density. The authors conclude that metacognition and working memory are critical for foraging in primates and that FP volume is more sensitive to social constraints than DLPFC volume.

      The topic addressed in the present manuscript is relevant for understanding human brain evolution from the point of view of primate research, which, unfortunately, is a shrinking field in neuroscience. But the experimental design has two major weak points: the absence of lissencephalic primates among the selected species and the delimitation of FP and DLPFC. Also, a general theoretical and experimental frame linking evolution (phylogeny) and development (ontogeny) is lacking.

      We are sorry that the reviewer still believes that these two points are major weaknesses.

      - We have added a point on lissencephalic species in the discussion. In short, we acknowledge that our work may not be applied to lissencephalic species because they cannot be studied with our method, but on the other hand, based on laboratory data there is no evidence showing that the functional organization of the DLPFC and FP in lissencephalic primates is radically different from that of other primates (Dias et al, 1996; Roberts et al, 2007; Dureux et al, 2023; Wong et al, 2023). Therefore, there is no a priori reason to believe that not including lissencephalic primates prevents us from drawing conclusions that are valid for primates in general. Moreover, as explained in the discussion, including lissencephalic primates would require using invasive functional studies, only possible in laboratory conditions, which would not be compatible with the number of species (>15) necessary for phylogenetic studies (in particular PGLS approaches). Finally, as pointed out by the reviewer, our study is also relevant for understanding human brain evolution, and as such, including lissencephalic species should not be critical to this understanding.

      - In response to the remarks of reviewer 1 on the first version of the manuscript, we had included a new analysis in the previous version of the manuscript, to evaluate the validity of our functional maps given another set of boundaries between FP and DLPFC. But one should keep in mind that our objective here is not to provide a definitive definition of what the regions usually referred to as DLPFC and FP should be from an anatomical point of view. Rather, as our study aims at taking into account the phylogenetic relations across primate species, we chose landmarks that enable a comparison of the volume of cortex involved in metacognition (FP) and working memory (DLPFC) across species. We have also updated the discussion accordingly.

      We agree that this is a difficult point and we have always acknowledged that this was a clear limitation in our study. In the light of the functional imaging literature in humans and non-human primates, as well as the neurophysiological data in macaques, defining the functional boundary between FP and DLPFC remains a challenging issue even in very well controlled laboratory conditions. As mentioned by reviewer 1, “the precise delineation of the areas will always be an issue of debate in studies like this, so showing the effects of different decisions is vital”. Again, an additional analysis using different boundaries for FP and DLPFC was included in the supplementary material to address that issue. Now, we are not aware of solid evidence showing that the boundaries that we chose for DLPFC vs FP were wrong, and we believe that the comparison between 2 sets of measures as well as the discussion on this topic should be sufficient for the reader to assess both the strength and the limits of our conclusion. That being said, if the reviewer has any reference in mind showing better ways to delineate the functional boundary between FP and DLPFC in primates, we would be happy to include it in our manuscript.

      - The question of development, which is an important question per se, is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, major studies in the field do not mention development (e.g. Byrne, 2000; Kaas, 2012; Barton, 2012). De Casien et al (2022) even showed that developmental constraints are largely irrelevant (see Claim 4 of their article): [« The functional constraints hypothesis […] predicts more complex, ‘mosaic’ patterns of change at the network level, since brain structure should evolve adaptively and in response to changing environments. It also suggests that ‘concerted’ patterns of brain evolution do not represent conclusive evidence for developmental constraints, since allometric relationships between developmentally linked or unlinked brain areas may result from selection to maintain functional connectivity. This is supported by recent computational modeling work [81], which also suggests that the value of mosaic or concerted patterns may fluctuate through time in a variable environment and that developmental coupling may not be a strong evolutionary constraint. Hence, the concept of concerted evolution can be decoupled from that of developmental constraints »].

      Finally, when studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021). Therefore, development does not seem to be a critical issue, neither for our article nor for the field.

      Reviewer #3 (Public Review):

      This is an interesting manuscript that addresses a longstanding debate in evolutionary biology - whether social or ecological factors are primarily responsible for the evolution of the large human brain. To address this, the authors examine the relationship between the size of two prefrontal regions involved in metacognition and working memory (DLPFC and FP) and socioecological variables across 16 primate species. I recommend major revisions to this manuscript due to: 1) a lack of clarity surrounding model construction; and 2) an inappropriate treatment of the relative importance of different predictors (due to a lack of scaling/normalization of predictor variables prior to analysis).

      We thank the reviewer for his/her remarks, and for the clarification of his/her criticism regarding the use of relative measures. We are sorry to have missed the importance of this point in the first place. We also thank the reviewer for the cited references, which were very interesting and which we have included in the discussion. As reviewer 1 also shared these concerns, we wrote a detailed response above to explain how we addressed the issue.

      First, we did run a supplementary analysis where whole brain volume was added as covariate, together with socio-ecological variables, to account for the volume of FP or DLPFC. As expected given the very high correlation across all 3 brain measures, none of the socio-ecological variables remained significant. We have added a long paragraph in the discussion to tackle that issue. In short, we agree with the reviewer that the specificity of the effects (on a given brain region vs the rest of the brain) is a critical issue, and we acknowledge that since this is a standard in the field, it was necessary to address the issue and run this extra analysis. But we also believe that specificity could be assessed by other means: given the differential influence of ‘population density’ on FP and DLPFC, in line with laboratory data, we believe that some of the effects that we describe do show specificity. Also, we prefer absolute measures to relative measures because they provide a better estimate of the corresponding cognitive operation, and because standard allometric rules (i.e., body size or whole brain scaling) may not apply to the scaling and evolution of FP and DLPFC in primates. Indeed, given that we use these measures as proxies of functions (metacognition for FP and working memory for DLPFC), it is clear that other parts of the brain should show the same effect since these functions are supported by entire networks that include not only our regions of interest but also other cortical areas in the parietal lobe. Thus, the extent to which the relation with socio-ecological variables should be stronger in regions of interest vs the whole brain depends upon the extent to which other regions are involved in the same cognitive function as our regions of interest, and this is clearly beyond the scope of this study.

      More importantly, volumetric measures are taken as proxies for the number of neurons, but this is only valid when comparing data from the same brain region (across species), and not across brain regions, since neural densities are not conserved. Thus, using relative measures (scaling with the whole brain volume) would only work if densities were conserved across brain regions, but it is not the case. From that perspective, the interpretation of absolute measures seems more straightforward, and we hope that the specificity of the effects could be evaluated using the comparison between the 3 measures (FP, DLPFC and whole brain) as well as the analysis suggested by the reviewer. We hope that the additional analysis and the updated discussion will be sufficient to cover that question, and that the reader will have all the information necessary to evaluate the level of specificity and the extent to which our findings can be interpreted.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In my previous review of the present manuscript, I pointed out the fact that defining parts, modules, or regions of the primate cerebral cortex based on macroscopic landmarks across primate species is problematic because it prevents comparisons between gyrencephalic and lissencephalic primate species. The authors have rephrased several paragraphs in their manuscript to acknowledge that their findings do apply to gyrencephalic primates.

      I also said that "Contemporary developmental biology has shown that the selection of morphological brain features happens within severe developmental constraints. Thus, the authors need a hypothesis linking the evolutionary expansion of FP and DLPFC during development. Otherwise, the claims from the mosaic brain and modularity lack fundamental support". I insisted that the authors should clarify their concept of homology of cerebral cortex parts, modules, or regions across species (in the present manuscript, the frontal pole and the dorsolateral prefrontal cortex). Those are not trivial questions because any phylogenetic explanation of brain region expansion in contemporary phylogenetic and evolutionary biology must be rooted in evolutionary developmental biology. In this regard, the authors could have discussed their findings in the frame of contemporary studies of cerebral cortex evolution and development, but, instead, they have rejected my criticism just saying that they are "not relevant here" or "clearly beyond the scope of this paper".

      The question of development, which is an important question per se, is neither part of the hypothesis nor central for the field of comparative cognition in primates. Indeed, the major studies in the field do not mention development and some even showed that developmental constraints were not relevant (see De Casien et al., 2022 and details in our response to the public review). When studies on brain evolution and cognition mention development, it is generally to discuss energetic constraints rather than developmental mechanisms per se (Heldstab et al, 2022; Smaers et al, 2021; Preuss & Wise, 2021; Dunbar & Schutz, 2017; MacLean et al, 2012; Mars et al, 2018; 2021).

      If the other reviewers agree, the authors are free to publish in eLife their correlations in a vacuum of evolutionary developmental biology interpretation. I just disagree. Explanations of neural circuit evolution in primates and other mammalian species should tend to standards like the review in this link: https://royalsocietypublishing.org/doi/full/10.1098/rstb.2020.0522

      In this article, Paul Cisek (a brilliant neurophysiologist) speculates on potential evolutionary mechanisms for some primate brain functions, but there is surprisingly very little reference to the existing literature on primate evolution and cognition. There is virtually no mention of studies that involve a large enough number of species to address evolutionary processes and/or a comparison with fossils and/or an evaluation of specific socio-ecological evolutionary constraints. Most of the cited literature refers to laboratory studies on brain anatomy of a handful of species, and their relevance for evolution remains to be evaluated. These ideas are very interesting and they could definitely provide an original perspective on evolution, but they are mostly based on speculations from laboratory studies, rather than from extensive comparative studies. This paper is interesting for understanding developmental mechanisms and their constraints on neurophysiological processes in laboratory conditions, but we do not think that it would fit in the framework of our paper as it goes far beyond our main topic.

      Reviewer #3 (Recommendations For The Authors):

      Yes, I am suggesting that the authors also include analyses with brain size (rather than body size) as a covariate to evaluate the effects of other variables in the model over and above the effect on brain size. In a very simplified theoretical scenario: two species have the same body sizes, but species A has a larger brain and therefore a larger FP. In this case, species A has a larger FP because of brain allometric patterns, and models including body size as a covariate would link FP size and socioecological variables characteristic of species A (and others like it). However, perhaps the FP of species A is actually smaller than expected for its brain size, while the FP of species B is larger than expected for its brain size.

      As explained in our response to the public review, we did run this analysis and we agree with the reviewer’s point from a practical point of view: it is important to know the extent to which the relation with a set of socio-ecological variables is specific to the region of interest, vs less specific and present for other brain regions. Again, we are sorry not to have understood that earlier, and we acknowledge that since it is a standard in the field, it needs to be addressed thoroughly.

      We understand the scaling intuition, and the need to get a reference point for volumetric measures, but here the volume of each brain region is taken as a proxy for the number of neurons and therefore for the region’s computational capacities. Since, for a given brain region (FP or DLPFC), the neural densities seem to be well conserved across species, comparing regional volumes across species provides a good proxy for the contrast (across species) in neural counts for that region. All we predicted was that for a given brain region, associated with a given cognitive operation, the volume (number of neurons) would be greater in species for which socio-ecological constraints potentially involving that specific cognitive operation were greater. We do not understand how or why the rest of the brain would change this interpretation (of course, as discussed just above, beyond the question of specificity). And using whole brain volume as a scaling measure is problematic because the whole brain density is very different from the density of these regions of the prefrontal cortex (see above for further details). Again, we acknowledge that allometric patterns exist, and we understand how they can be interpreted, but we do not understand how they could prove or disprove our hypothesis (brain regions involved in specific cognitive operations are influenced by a specific set of socio-ecological variables). When using volumes as a proxy for computational capacities, the theoretical implications of scaling procedures might be problematic. For example, it implies that the computational capacities of a given brain region are scaled by the rest of the brain. All other things being equal, the computational capacities of a given brain region, taken as the number of neurons, should decrease when the size of the rest of the brain increases. But to our knowledge there is no evidence for that in the literature.

      Clearly these are very challenging issues, and our position was to take absolute measures because they do not rely upon hidden assumptions regarding allometric relations and their consequence on cognition.

      But since we definitely understand that scaling is a reference in the field, we have not only completed the corresponding analysis (including the whole brain as a covariate, together with socio-ecological variables) but also expanded the discussion to address this issue in detail. We hope that between this new analysis and the comparison of effects between non-scaled measures of FP, DLPFC and the whole brain, the reader will be able to judge the specificity of the effect.

      Models including brain (instead of body) size would instead link FP size and socioecological variables characteristic of species B (and others like it). This approach is supported by a large body of literature linking comparative variation in the relative size of specific brain regions (i.e., relative to brain size) to behavioral variation across species - e.g., relative size of visual/olfactory brain areas and diurnality/nocturnality in primates (Barton et al. 1995), relative size of the hippocampus and food caching in birds (Krebs et al. 1989).

      Barton, R., Purvis, A., & Harvey, P. H. (1995). Evolutionary radiation of visual and olfactory brain systems in primates, bats and insectivores. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 348(1326), 381-392.

      Krebs, J. R., Sherry, D. F., Healy, S. D., Perry, V. H., & Vaccarino, A. L. (1989). Hippocampal specialization of food-storing birds. Proceedings of the National Academy of Sciences, 86(4), 1388-1392. 

      We are grateful to the reviewer for mentioning these very interesting articles, and more generally for helping us to understand this issue and clarify the related discussion. Again, we understand the scaling principle but the fact that these methods provide interesting results does not make other approaches (such as ours) wrong or irrelevant. Since we have used both our original approach and the standard version as requested by the reviewer, the reader should be able to get a clear picture of the measures and of their theoretical implications. We sincerely hope that the present version of the paper will be satisfactory, not only because it is clearer, but also because it might stimulate further discussion on this complex question.

    1. eLife assessment

      This manuscript describes a creative approach using dual-component gRNAs to create a new class of molecular proximity sensors for genome editing. The authors demonstrate that this tool can be coupled with several different gene editing effectors, and the authors convincingly show that this functions as designed. This important study presents a first-of-its-kind technology with key baseline activity metrics ready for future developmental approaches.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Choi and co-authors presents "P3 editing", which leverages dual-component guide RNAs (gRNA) to induce protein-protein proximity. They explore three strategies for leveraging prime-editing gRNA (pegRNA) as a dimerization module to create a molecular proximity sensor that drives genome editing: splitting a pegRNA into two parts (sgRNA and petRNA), inserting self-splicing ribozymes within the pegRNA, and dividing the pegRNA at the crRNA junction. Among these, splitting at the crRNA junction proved the most promising, achieving significant editing efficiency. They further demonstrated the ability to control genome editing via protein-protein interactions and small molecule inducers by designing RNA-based systems that form active gRNA complexes. This approach was also adaptable to other genome editing methods like base editing and ADAR-based RNA editing.

      Strengths:

      The study demonstrates significant advancements in leveraging guide RNA (gRNA) as a dimerization module for genome editing, showcasing its high specificity and versatility. By investigating three distinct strategies (splitting pegRNA into sgRNA and petRNA, inserting self-splicing ribozymes within the pegRNA, and dividing the pegRNA at the repeat junction), the researchers present a comprehensive approach to achieving molecular proximity and reconstituting function. Among these methods, splitting the pegRNA at the repeat junction emerged as the most promising, achieving editing efficiencies up to 76% of the control, highlighting its potential for further development in CRISPR-Cas9 systems. Additionally, the study extends genome editing control by linking protein-protein interactions to RNA-mediated editing, using specific protein-RNA interaction pairs to regulate editing through engineered protein proximity. This innovative approach expands the toolkit for precision genome editing, demonstrating the feasibility of controlling genome editing with enhanced specificity and efficiency.

      Weaknesses:

      The initial experiments with splitting the pegRNA into sgRNA and petRNA showed low editing efficiency, less than 2%. Similarly, inserting self-splicing ribozymes within pegRNA was inefficient, achieving under 2% editing efficiency in all constructs tested, possibly hindered by the prime editing enzyme. The editing efficiency of the crRNA and petracrRNA split at the repeat junction varied, with the most promising configurations only reaching 76% of the control efficiency. The RNA-RNA duplex formation's inefficiency might be due to the lack of additional protein binding, leading to potential degradation outside the Cas9-gRNA complex. Extending the approach to control genome editing via protein-protein interactions introduced complexity, with a significant trade-off between efficiency and specificity, necessitating further optimization. The strategy combining RADARS and P3 editing to control genome editing with specific RNA expression events exhibited high background levels of non-specific editing, indicating the need for improved specificity and reduced leaky expression. Moreover, P3 editing efficiencies are exclusively quantified after transfecting DNA into HEK cells, a strategy that has resulted in past reproducibility concerns for other technologies. Overall, the various methods and combinations require further optimization to enhance efficiency and specificity, especially when integrating multiple synthetic modules.

    3. Reviewer #2 (Public Review):

      Choi et al. describe a new approach for enabling input-specific CRISPR-based genome editing in cultured cells. While CRISPR-Cas9 is a broadly applied system across all of biology, one limitation is the difficulty in inducing genome editing based on cellular events. A prior study, from the same group, developed ENGRAM - which relies on activity-dependent transcription of a prime editing guide RNA, which records a specific cellular event as a given edit in a target DNA "tape". However, this approach is limited to the detection of induced transcription and does not enable the detection of broader molecular events including protein-protein interactions or exposure to small molecules. As an alternative, this study envisioned engineering the reconstitution of a split prime editing guide RNA (pegRNA) in a protein-protein interaction (PPI)-dependent manner. This would enable location- and content-specific genome editing in a controlled setting.

      The authors explored three different design possibilities for engineering a PPI-dependent split pegRNA. First, they tried splitting pegRNA into a functional sgRNA and corresponding prime editing transRNA, incorporating reverse-complementary dimerization sequences on each guide half. This approach, however, resulted in low editing efficiency across 7 different designs with various complementary annealing template lengths (<2% efficiency). They also tried inserting a self-splicing ribozyme within the pegRNA, which produces a functional pegRNA post-transcriptionally. The incorporation of a split-ribozyme, dependent on a PPI, could have been used to reconstitute the split pegRNA in an event-controlled manner. However again, only modest levels of editing were observed with the self-splicing ribozyme design (<2%). Finally, they tried splitting the pegRNA at the repeat:anti-repeat junction that was used to join the original dual-guide system comprised of a crRNA and tracrRNA, into a single-guide RNA. They incorporated the prime editing features into the tracrRNA half, to create petracrRNA. Dimerization was initially induced by different complementary RNA annealing sequences. Using this design, they were able to induce an editing efficiency of ~28% (compared to 37% efficiency using a positive control epegRNA guide).

      Having identified a suitable split pegRNA system, they next sought to induce the reconstitution of the two halves in a PPI-dependent manner. They replaced the complementary RNA annealing sequences with two different RNA aptamers (MS2 and BoxB). MS2 binds the MCP protein, while BoxB binds the LambdaN protein. Close proximity between MCP and LambdaN would thus bring together the two split pegRNA halves, creating a functional pegRNA that would enable prime editing at a specific target site. They demonstrated that they could induce MCP-LambdaN proximity by fusing the two proteins to different dimerizing protein partners: 1) constitutive epitope-nanobody/antibody pairs such as scFv/GCN4 or NbALFA/ALFA-Tag; 2) split-GFP; or 3) chemically induced protein pairs such as FKBP/FRB or ABI/PYL. For all of these approaches, they could achieve between ~20% and ~60% normalized editing efficiency (relative to positive control editing levels with epegRNA). Additional mutation of the linkers between the RNA and aptamers could increase editing efficiency, but it also increased non-specific background editing even in the absence of an induced PPI.

      Additional applications of this overall strategy included incorporating the design with different DNA base editors, with the most promising examples shown with the base editors CBE4max and ABE8. It should be noted that these specific examples used a non-physiological LambdaN-MCP direct fusion protein as the "bait" that induced reconstitution of the two halves of the guideRNA, rather than relying on a true induced PPI. They also demonstrated that the recently reported RADARS strategy could be incorporated into their system. In this example, they used an ADAR-guide-RNA to drive the expression of a LambdaN-PCP fusion protein in the presence of a specific target RNA molecule, IL6. This induced LambdaN-PCP protein could then reconstitute the split peg-RNAs to drive prime editing. To enable this last application, they replaced the MS2 aptamer in their pegRNA with the PP7 aptamer that binds the PCP protein (this was to avoid crosstalk with RADARS, which also uses MS2/MCP interaction). Using this strategy, they observed a normalized editing efficiency of around 12% (but observed non-specific editing of around 8% in the absence of the target RNA).
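      The two summary figures used in this review, efficiency normalized to the positive control and induced editing relative to background, can be sketched as simple ratios. This is a minimal illustration using the approximate percentages quoted above; the function names are hypothetical and the calculation is an assumption about how such values are typically derived, not the authors' analysis pipeline.

```python
def normalized_efficiency(edit_pct: float, control_pct: float) -> float:
    """Editing efficiency expressed as a percentage of the positive control."""
    if control_pct <= 0:
        raise ValueError("control efficiency must be positive")
    return 100.0 * edit_pct / control_pct


def fold_over_background(induced_pct: float, background_pct: float) -> float:
    """Ratio of induced editing to editing without the inducing input."""
    if background_pct <= 0:
        raise ValueError("background efficiency must be positive")
    return induced_pct / background_pct


# Approximate values quoted in the review text (not raw data).
split_vs_control = normalized_efficiency(28.0, 37.0)  # crRNA/petracrRNA split vs epegRNA
radars_fold = fold_over_background(12.0, 8.0)         # RADARS-coupled editing vs no target RNA

print(f"{split_vs_control:.0f}% of control")
print(f"{radars_fold:.1f}-fold over background")
```

      On these numbers, the RADARS-coupled design reaches only about 1.5 times its own background, which is the specificity concern raised in the reviews.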

      Strengths:

      The strengths of this paper include an interesting concept for engineering guide RNAs to enable activity-dependent genome editing in living cells in the future, based on discrete protein-protein interactions (either constitutively, spatially, or chemically induced). Important groundwork is laid down to engineer and improve these guide RNAs in the future (especially the work describing altering the linkers in Supplementary Figure 3, which provides a path forward).

      Weaknesses:

      In its current state, the editing efficiency appears too low to be applied in physiological settings. Much of the latter work in the paper relies on a LambdaN-MCP direct fusion protein, rather than two interacting protein pairs. Further characterization in the future, especially varying the transfection amounts and durations of the various components of the system, would help improve the system. It will also be important to demonstrate editing at additional sites, to characterize how long the PPI must be active to enable efficient prime editing, and to determine how reversible the reconstitution of the split pegRNA is.

    1. eLife assessment

      This is a valuable study providing solid evidence that the Mediator kinase module mediates an elevated inflammatory response, manifested by heightened cytokine levels, associated with Down syndrome (DS) via transcriptional changes impacting cell signaling and metabolism, which has significance for the treatment of DS and other chronic inflammatory conditions. Particular strengths of the study include the combined experimental approaches of transcriptomics, untargeted metabolomics, cytokine screens, and the use of sibling-matched cell lines (trisomy 21 vs disomy 21) from various donors. Less certain is that the Mediator kinase plays a meaningful role in regulating mRNA splicing. Further evidence that nuclear receptors are activated by changes in lipid levels and that mitochondrial function is substantially reduced on Mediator kinase inhibition would strengthen the work.

    2. Reviewer #1 (Public Review):

      Summary:

      The main conclusion of this manuscript is that Mediator kinases support the IFN response in Down syndrome cell lines, which represents an important addition to understanding the pathology of this affliction.

      Strengths:

      Mediator kinase stimulates cytokine production. Both RNAseq and metabolomics clearly demonstrate a stimulatory role for CDK8/CDK19 in the IFN response. The nature of this role, direct vs. indirect, is inferred from previous studies demonstrating that inflammatory transcription factors are CDK8/CDK19 substrates. The cytokine and metabolic changes are clear-cut and provide a potential avenue to mitigate these associated pathologies.

      Weaknesses:

      This study revealed a previously undescribed role for the CKM in splicing. The previous identification of splicing factors as substrates of CDK8/CDK19 is also intriguing. However, additional studies seem to be necessary in order to attach this new function to the CKM. As the authors point out, the changes in splicing patterns are relatively modest compared to other regulators. In addition, some indication that the proteins encoded by these genes exhibit reduced levels or activities would support their RNAseq findings.

      Seahorse analysis is normally calculated with specific units for oxygen consumption, ATP production, etc. It would be of interest to see the actual values of OCR between the D21 and T21 cell lines rather than standardizing the results. This will address the specific question about relative mitochondrial function between these cells. Reduced mitochondrial function has been associated with DS patients. Therefore, it would be important to know whether mitochondrial function is reduced in the T21 cells vs. the D21 control. Importantly for the authors' goal of investigating the use of CDK8/19 inhibitors in DS patients, does CA treatment reduce mitochondrial function to pathological levels?

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Cozzolino et al. demonstrate that inhibition of the Mediator kinase CDK8 and its paralog CDK19 suppresses hyperactive interferon (IFN) signaling in Down syndrome (DS), which results from trisomy of chromosome 21 (T21). Numerous pathologies associated with DS are considered direct consequences of chronic IFN pathway activation, and thus hyperactive IFN signaling lies at the heart of pathophysiology. The collective interrogation of transcriptomics, metabolomics, and cytokine screens in sibling-matched cell lines (T21 vs D21) allows the authors to conclude that Mediator kinase inhibition could mitigate chronic, hyperactive IFN signaling in T21. To probe the functional outcomes of Mediator kinase inhibition, the authors performed cytokine screens, transcriptomics, and untargeted metabolomics. This collective approach revealed that Mediator kinases establish IFN-dependent cytokine responses at least in part through transcriptional regulation of cytokine genes and receptors. Mediator kinase inhibition suppresses cell responses during hyperactive IFN signaling through inhibition of pro-inflammatory transcription factor activity (anti-inflammatory effect) and alteration of core metabolic pathways, including upregulation of anti-inflammatory lipid mediators, which served as ligands for specific nuclear receptors and downstream phenotypic outcomes (e.g., oxygen consumption). These data provided a mechanistic link between Mediator kinase activity and nuclear receptor function. Finally, the authors also disclosed that Mediator kinase inhibition alters splicing outcomes.

      Overall, this study reveals a mechanism by which Mediator kinases regulate gene expression and establishes that their inhibition antagonizes chronic IFN signaling through collective transcriptional, metabolic, and cytokine responses. The data have implications for DS and other chronic inflammatory conditions, as Mediator kinase inhibition could potentially mitigate pathological immune system hyperactivation.

      Strengths:

      (1) One major strength of this study is the mechanistic evidence linking Mediator kinases to hyperactive IFN signaling through transcriptional changes impacting cell signaling and metabolism.

      (2) Another major strength of this study is the use of sibling-matched cell lines (T21 vs D21) from various donors (not just one sibling pair), and further cross-referencing with data from large cohorts, suggesting that part of the data and conclusions are generalizable.

      (3) Another major strength of this study is the combined experimental approach including transcriptomics, untargeted metabolomics, and cytokine screens to define the mechanisms underlying suppression of hyperactive interferon signaling in DS upon Mediator kinase inhibition.

      (4) Another major strength of this study is the significance of the work to DS and its potential impact on other chronic inflammatory conditions.

      Weakness:

      (1) Genetic evidence linking the mentioned nuclear receptors to activation of an anti-inflammatory program upon Mediator kinase inhibition could improve the definition of the mechanism and overall impact of the work.

      (2) Page 5 states that "Mediator kinases broadly regulate cholesterol and fatty acid biosynthesis and this was further confirmed by the metabolomics data", but a clear mechanistic explanation was lacking. Likewise, the data suggest, but do not prove, that altered lipid metabolites influence the function of nuclear receptors to regulate an anti-inflammatory program in response to Mediator kinase inhibition (p. 6), despite the fact that the gene expression changes elicited by Mediator kinase inhibition tracked with downstream metabolic changes.

      (3) The figures are outstanding but dense.

      (4) Figure 6 (PRO-Seq). The authors refer to pro-inflammatory TFs (e.g. NF-kB/RelA). It is not clear whether the authors have specifically examined TF binding at enhancers or more broadly at every region occupied by the interrogated TFs.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This manuscript presents useful findings on several phage from deep sea isolates of Lentisphaerae strains WC36 and zth2 that further our understanding of deep sea microbial life. The manuscript's primary claim is that phage isolates augment polysaccharide use in Pseudomonas bacteria via auxiliary metabolic genes (AMGs). However, the strength of the evidence is incomplete and does not support the primary claims. Namely, no data are presented to rule out phage contamination in the polysaccharide stock solution, AMGs are potentially misidentified, and evidence of successful infection is missing.

      Thanks for the Editor’s and Reviewers’ positive and constructive comments, which helped us improve the quality of our manuscript entitled “Deep-sea bacteriophages facilitate host utilization of polysaccharides” (paper#eLife-RP-RA-2023-92345). The comments are valuable; we have studied them carefully and made corresponding revisions according to the suggestions. We removed some uncertain results and strengthened other parts of the manuscript, which improved the accuracy and impact of the revised version. Revised portions are marked in blue in the modified manuscript. Please find the detailed responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: This manuscript describes the identification and isolation of several phage from deep sea isolates of Lentisphaerae strains WC36 and zth2. The authors observe induction of several putative chronic phages with the introduction of additional polysaccharides to the media. The authors suggest that two of the recovered phage genomes encode AMGs associated with polysaccharide use. The authors also suggest that adding the purified phage to cultures of Pseudomonas stutzeri 273 increased the growth of this bacterium due to augmented polysaccharide use genes from the phage. While the findings were of interest and relevance to the field, it is my opinion that several of the analyses fall short of supporting the key assertions presented.

      Thanks for your comments. We removed some uncertain results and strengthened other parts of the manuscript, which improved the accuracy and impact of the revised version. Please find the detailed responses below.

      Strengths: Interesting isolate of deep sea Lentisphaerae strains which will undoubtedly further our understanding of deep sea microbial life.

      Thanks for your positive comments.  

      Weaknesses:

      (1) Many of the findings are consistent with a phage contamination in the polysaccharide stock solution. 

      Thanks for your comments. We are very sure that the phages are specifically derived from the Lentisphaerae strain WC36 and not from the polysaccharide stock solution. The reasons are as follows: (1) the polysaccharide stock solution was strictly sterilized to remove any phage contamination; (2) we performed multiple TEM checks of the rich medium supplemented with 10 g/L laminarin alone (Supplementary Fig. 1A) or with 10 g/L starch alone (Supplementary Fig. 1B) and found no phage-like structures, which confirmed that the polysaccharides (laminarin/starch) we used were not contaminated with any phage-like structures; in addition, we also observed the polysaccharides (laminarin/starch) directly by TEM and did not find any phage-like structures (Supplementary Fig. 2); (3) the polysaccharide (starch) alone could not promote the growth of Pseudomonas stutzeri 273, whereas starch supplemented together with the extracted Phages-WC36 effectively facilitated the growth of Pseudomonas stutzeri 273 (Author response image 1). The above results clearly indicate that the phages were derived from the Lentisphaerae strain WC36 and not from the polysaccharide stock solution.

      Author response image 1.

      Growth curve and status of Pseudomonas stutzeri 273 cultivated in basal medium, basal medium supplemented with 20 μl/mL Phages-WC36, basal medium supplemented with 5 g/L starch, basal medium supplemented with 5 g/L starch and 20 μl/mL Phages-WC36. 

       

      (2) The genes presented as AMGs are largely well known and studied phage genes which play a role in infection cycles.

      Thanks for your comments. Indeed, these AMGs may be common only in virulent phages and have never been reported in chronic phages. In virulent phages, these genes typically act as lysozymes, facilitating the release of virions from the host cell upon lysis, or the injection of viral DNA upon infection. However, chronic phages do not lyse the host. Therefore, the persistence of these genes in chronic phages may be due to their ability to assist the host in metabolizing polysaccharides. Finally, according to your suggestions, we have weakened the role of AMGs and added “potential” in front of the term. The detailed information is shown below.

      (3) The evidence that the isolated phage can infect Pseudomonas stutzeri 273 is lacking, putting into question the dependent results.

      Thanks for your comments. We selected many marine strains (Pseudomonadota, Planctomycetes, Verrucomicrobia, Fusobacteria, and Tenericutes isolates) to investigate whether Phages-WC36 could assist them in the degradation and utilization of polysaccharides, and found that Phages-WC36 could only promote the growth of strain 273. It has been reported that filamentous phages can recognize and bind to the host pili, which causes the pili to shrink and brings the filamentous phages closer to, and possibly through, the outer membrane of host cells. Other chronic phages may be released without breaking the host because they are enclosed in a lipid membrane and leave the host cells in a nonlytic manner. Thus, these chronic phages may have a wider host range. However, we were unable to further reveal the infection mechanism because we lacked the necessary techniques. Therefore, according to your suggestions, we have deleted this section in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      I have previously reviewed this manuscript as a submission to another journal in 2022. My recommendations here mirror those of my prior suggestions, now with further added details.

      Thanks for your great efforts in reviewing our manuscript and for your valuable suggestions on both the previous and the current versions.

      Specific comments:

      Comment 1: Line 32. Rephrase to "polysaccharides cause the induction of multiple temperate phages infecting two strains of Lentisphaerae (WC36 and zth2) from the deep sea."

      Thanks for your positive suggestion. We have modified this description as “Here, we found for the first time that polysaccharides induced the production of multiple temperate phages infecting two deep-sea Lentisphaerae strains (WC36 and zth2).” in the revised manuscript (Lines 31-33). 

      Comment 2: Line 66. "Chronic" infections are not "lysogenic" as described here, suggesting the former is a subcategory of the latter. If you are going to introduce lifecycles you need a brief sentence distinguishing "chronic" from "lysogenic"

      Thanks for your positive suggestion. We added this sentence as “Currently, more and more attention has been paid to chronic life cycles where bacterial growth continues despite phage reproduction (Hoffmann Berling and Maze, 1964), which was different from the lysogenic life cycle that could possibly lyse the host under some specific conditions.” in the revised manuscript (Lines 66-69).

      Comment 3: Line 72. Please avoid generalized statements like "a hand-full" (or "plenty" line 85). Try to be at least somewhat quantitative regarding how many chronic phages are known. This is a fairly common strategy among archaeal viruses. 

      Thanks for your suggestion. Given that some filamentous phages also have a chronic life cycle that is not explicitly reported, we cannot accurately estimate their numbers. According to your suggestions, we have modified these descriptions as “however, to our best knowledge, only few phages have been described for prokaryotes in the pure isolates up to date (Roux et al., 2019; Alarcón-Schumacher et al., 2022; Liu et al., 2022).” in the revised manuscript (Lines 73-75). In addition, the number of chronic phages in the biosphere cannot be accurately estimated, according to the latest report (Chevallereau et al., 2022), which showed that “a large fraction of phages in the biosphere are produced through chronic life cycles”. Therefore, we have modified this description as “Therefore, a large percentage of phages in nature are proposed to replicate through chronic life cycles” in the revised manuscript (Lines 87-88). 

      Comment 4: Line 93. While Breitbart 2012 is a good paper to cite here, there have been several, much more advanced analysis of the oceans virome. https://doi.org/10.1016/j.cell.2019.03.040 is one example, but there are several others. A deeper literature review is required in this section.  

      Thanks for your valuable suggestions. We have added several references and modified this description as “A majority of these viruses are bacteriophages, which exist widely in oceans and affect the life activities of microbes (Breitbart, 2012; Roux et al., 2016; Gregory et al., 2019; Dominguez-Huerta et al., 2022).” in the revised manuscript (Lines 94-97).

      References related to this response:

      Roux, S., Brum, J.R., Dutilh, B.E., Sunagawa, S., Duhaime, M.B., Loy, A., Poulos, B.T., Solonenko, N., Lara, E., Poulain, J., et al. (2016) Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses. Nature 537:689-693. 

      Gregory, A.C., Zayed, A.A., Conceição-Neto, N., Temperton, B., Bolduc, B., Alberti, A., Ardyna, M., Arkhipova, K., Carmichael, M., Cruaud, C., et al. (2019) Marine DNA Viral Macro- and Microdiversity from Pole to Pole. Cell 177:1109-1123.e1114. 

      Dominguez-Huerta, G., Zayed, A.A., Wainaina, J.M., Guo, J., Tian, F., Pratama, A.A., Bolduc, B., Mohssen, M., Zablocki, O., Pelletier, E., et al. (2022) Diversity and ecological footprint of Global Ocean RNA viruses. Science 376:1202-1208.

      Comment 5: Line 137. I see the phage upregulation in Figure 1; however, in the text and figure it would be good to also elaborate on what the background expression generally looks like. Perhaps a transcriptomic read normalization and recruitment to the genome with a display of the coverage map, highlighting the prophage, would be helpful. Are the polysaccharides directly influencing phage induction or is there some potential for another cascading effect?

      Thanks for your comments. We have elaborated all expressions of phage-associated genes under different conditions in the Supplementary Table 1, which showed that the background expressions were very low. The numbers in Fig. 1C were the gene expressions (by taking log2 values) of strain WC36 cultured in rich medium supplemented with 10 g/L laminarin compared with the rich medium alone.

      In addition, our RT-qPCR results (Fig. 1D) also confirmed that these genes encoding phage-associated proteins were significantly upregulated when 10 g/L laminarin was added in the rich medium. According to your suggestions, we have modified this description as “In addition to the up-regulation of genes related to glycan transport and degradation, when 10 g/L laminarin was added in the rich medium, the most upregulated genes were phage-associated (e. g. phage integrase, phage portal protein) (Fig. 1C and Supplementary Table 1), which were expressed at the background level in the rich medium alone.” in the revised manuscript (Lines 136-140). Based on the present results, we speculate that polysaccharides might directly induce phage production, which needs to be verified by a large number of experiments in the future.
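      The log2 comparison described in this response (expression with laminarin relative to rich medium alone) can be sketched as follows. This is a minimal illustration only: the expression values are hypothetical, and the pseudocount is an assumption added here to avoid division by zero for genes at background level; it is not stated in the response.

```python
import math


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control expression.

    The pseudocount (an assumption of this sketch) keeps the ratio finite
    when a gene is barely expressed in the control condition.
    """
    if treated < 0 or control < 0:
        raise ValueError("expression values must be non-negative")
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical normalized expression values for one phage-associated gene.
laminarin_expr = 255.0   # rich medium + 10 g/L laminarin (made-up value)
rich_only_expr = 3.0     # rich medium alone, near background (made-up value)

lfc = log2_fold_change(laminarin_expr, rich_only_expr)
print(f"log2 fold change: {lfc:.1f}")
```

      Reporting the background values alongside the log2 ratios, as the reviewer requests, shows whether a large ratio reflects strong induction or simply a near-zero denominator.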

      Comment 6: Line 179. We need some assurance that phage was not introduced by your laminarin or starch supplement. Perhaps a check on the TEM/sequencing check of supplement itself would be helpful? This may be what is meant on Line 188 "without culturing bacterial cells" however this is not clearly worded if that is the case. Additional note, further reading reinforces this as a key concern. Many of the subsequent results are consistent with a contaminated starch stock. 

      Thanks for your comments. We are very sure that the phages are specifically derived from the Lentisphaerae strain WC36 and not from the polysaccharide stock solution. The reasons are as follows: (1) we performed multiple TEM checks of the rich medium supplemented with 10 g/L laminarin alone (Supplementary Fig. 1A) or with 10 g/L starch alone (Supplementary Fig. 1B) and found no phage-like structures, which confirmed that the polysaccharides (laminarin/starch) we used are not contaminated with any phage-like structures. In addition, we also observed the polysaccharides (laminarin/starch) directly by TEM and did not find any phage-like structures (Supplementary Fig. 2). According to your suggestions, we have modified this description as “We also tested and confirmed that there were not any phage-like structures in rich medium supplemented with 10 g/L laminarin alone (Supplementary Fig. 1A) or in 10 g/L starch alone (Supplementary Fig. 1B), ruling out the possibility of phage contamination from the polysaccharides (laminarin/ starch).” in the revised manuscript (Lines 158-162) and “Meanwhile, we also checked the polysaccharides (laminarin/ starch) in rich medium directly by TEM and did not find any phage-like structures (Supplementary Fig. 2).” in the revised manuscript (Lines 178-180). (2) The polysaccharide stock solution was strictly sterilized to remove any phage contamination. (3) The polysaccharide (starch) alone could not promote the growth of Pseudomonas stutzeri 273, whereas starch supplemented together with the extracted Phages-WC36 effectively facilitated the growth of Pseudomonas stutzeri 273 (Author response image 2). The above results clearly indicate that the phages were derived from the Lentisphaerae strain WC36 and not from the polysaccharide stock solution.

      In addition, given that polysaccharides are a critical energy source for most microorganisms, we asked whether polysaccharides also induce the production of bacteriophages in other deep-sea bacteria. To this end, we cultured deep-sea representatives from four other phyla (Chloroflexi, Tenericutes, Proteobacteria, and Actinobacteria) in medium supplemented with laminarin/starch, and checked the supernatant of the cell suspensions by TEM as described above. We could not find any phage-like structures in these cell suspensions (Author response image 3), which also confirmed that there was no phage contamination in the polysaccharides.

      Author response image 2.

      Growth curve and status of Pseudomonas stutzeri 273 cultivated in basal medium, basal medium supplemented with 20 μl/mL Phages-WC36, basal medium supplemented with 5 g/L starch, basal medium supplemented with 5 g/L starch and 20 μl/mL Phages-WC36.   

      Author response image 3.

      TEM observation of the supernatant of cell suspensions of a Chloroflexi strain, a Tenericutes strain, a Proteobacteria strain and an Actinobacteria strain cultivated in rich medium supplemented with 10 g/L laminarin and 10 g/L starch. No phage-like particles could be observed.

      Comment 7: Line 223. Correct generalized wording "long time". 

      Thanks for your comments. We have changed “after for a long time” to “after 30 days” in the revised manuscript (Line 197).

      Comment 8: Line 229. Please more explicitly describe what these numbers are (counts of virion like structures - filamentous and hexagonal respectively?), the units (per µL?), and how these were derived. The word "around" should be replaced with mean and standard deviation values for each count from replicates, without which these are not meaningful.

      Thanks for your comments. The average numbers per microliter (µL) of filamentous and hexagonal phages in each condition were respectively calculated by randomly choosing ten TEM images. According to your suggestions, we have modified this description as “Specifically, the average number per microliter of filamentous phages (9.7, 29 or 65.3) extracted from the supernatant of strain WC36 cultured in rich medium supplemented with 10 g/L laminarin for 5, 10 or 30 days was higher than that cultured in rich medium supplemented with 5 g/L laminarin (4.3, 13.7 or 35.3) (Fig. 3B). The average number per microliter of hexagonal phages (9, 30, 46.7) extracted from the supernatant of strain WC36 cultured in rich medium supplemented with 10 g/L laminarin for 5, 10 or 30 days was higher than that cultured in rich medium supplemented with 5 g/L laminarin (4, 11.3 or 17.7) (Fig. 3C).” in the revised manuscript (Lines 203-210).
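      The reviewer's request for a mean and standard deviation per condition, rather than an approximate count, can be met with a few lines of analysis. The sketch below assumes ten per-image particle counts for one condition; the counts are hypothetical placeholders, not values from the study, and the function name is illustrative.

```python
from statistics import mean, stdev


def summarize_counts(counts):
    """Return (mean, sample standard deviation) of per-image particle counts."""
    if len(counts) < 2:
        raise ValueError("need at least two images to estimate a standard deviation")
    return mean(counts), stdev(counts)


# Hypothetical counts of phage-like particles per microliter from ten TEM images.
example_counts = [4, 6, 5, 7, 3, 5, 6, 4, 5, 5]

avg, sd = summarize_counts(example_counts)
print(f"{avg:.1f} ± {sd:.1f} particles per µL (n = {len(example_counts)})")
```

      Reporting the number of images (n) alongside the mean and standard deviation lets readers judge whether differences between the 5 g/L and 10 g/L conditions exceed image-to-image variation.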

      Comment 9: Line 242. This section should be included in the discussion of Figure 2 - around line 194.

      Thanks. According to your suggestion, we have moved this section to the discussion corresponding to Figure 2 (Lines 183-191).

      Comment 10: Figure 3. Stay consistent in the types of figures generated per strain. Figure 3A should be a growth curve.

      Thanks for your comments. Figure 3A is in fact a growth curve; the corresponding description, “(A) Growth curve of strain WC36 cultivated in either rich medium alone or rich medium supplemented with 5 g/L or 10 g/L laminarin for 30 days.”, is given in the Figure 3A legend of the manuscript.

      Comment 11: Line 312. Move the discussion of AMGs to after the discussion of the phage genome identification.

      Thanks for your valuable comments. According to your suggestions, we have moved the discussion of AMGs to after the discussion of the phage genome identification.

      Comment 12: Line 312. It would be informative to sequence in-bulk each of your treatments as opposed to just sequencing the viral isolates (starch and no host included) to see what viruses can be identified in each. ABySS is also not a common assembler for viral analysis. Is there literature to support it as a sufficient tool in assembling viral genomes? What sequencing depths were obtained in your samples?

      Thanks for your comments. In previous studies, we sequenced the starch or laminarin alone (no host included) and did not detect any phage-related sequences. The ABySS software is described in the following publications (Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, Jahesh G, Khan H, Coombe L, Warren RL, Birol I. ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter. Genome Res. 2017 May;27(5):768-777; Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009 Jun;19(6):1117-23.), and it has also been used to assemble viral genomes in other studies (Guo Y, Jiang T. First Report of Sugarcane Mosaic Virus Infecting Goose Grass in Shandong Province, China. Plant Dis. 2024 Mar 21. doi: 10.1094/PDIS-11-23-2514-PDN; Tang M, Chen Z, Grover CE, Wang Y, Li S, Liu G, Ma Z, Wendel JF, Hua J. Rapid evolutionary divergence of Gossypium barbadense and G. hirsutum mitochondrial genomes. BMC Genomics. 2015 Oct 12;16:770.). The sequencing depths of the phages of strains WC36 and zth2 were 350x and 365x, respectively.

      Comment 13: Line 323. Replace "eventually" with more detail about what was done to derive the genomes. Were these the only four sequences identified as viral?

      Thanks for your comments. We used the ABySS software (http://www.bcgsc.ca/platform/bioinfo/software/abyss) to perform genome assembly with multiple-Kmer parameters. VIBRANT v1.2.1 (Kieft et al., 2020), DRAM-v (Shaffer et al., 2020), VirSorter v1.0.5 (with categories 1 (“pretty sure”) and 2 (“quite sure”)) (Roux et al., 2015) and VirFinder v1.1 (with statistically significant viral prediction: score > 0.9 and P-value < 0.05) (Ren et al., 2017) with default parameters were used to identify viral genomes from the assembled sequences by searching against both the cultured and non-cultured viral NCBI-RefSeq database (http://blast.ncbi.nlm.nih.gov/) and the IMG/VR database (Camargo et al., 2023). The GapCloser software (https://sourceforge.net/projects/soapdenovo2/files/GapCloser/) was subsequently applied to fill the remaining local inner gaps and correct single-base polymorphisms in the final assembly. All the detailed processes are described in the supplementary information. Only these four viral sequences had high scores, but they are not complete genomes. Shorter viral sequences with lower scores were excluded.
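
      The thresholds quoted in this response (VirFinder score > 0.9 with P-value < 0.05, VirSorter categories 1 and 2) can be sketched as a simple filter. The response does not say how predictions from the different tools were combined, so taking the union of the two tools is an assumption of this sketch. The `ContigPrediction` record and all function names are hypothetical and are not part of any tool's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContigPrediction:
    """Hypothetical per-contig summary of viral predictions (illustration only)."""
    contig_id: str
    virfinder_score: Optional[float] = None
    virfinder_pvalue: Optional[float] = None
    virsorter_category: Optional[int] = None


def passes_virfinder(p, min_score=0.9, max_pvalue=0.05):
    """True if both the score and P-value thresholds quoted above are met."""
    if p.virfinder_score is None or p.virfinder_pvalue is None:
        return False
    return p.virfinder_score > min_score and p.virfinder_pvalue < max_pvalue


def passes_virsorter(p, accepted_categories=(1, 2)):
    """True if the contig falls in VirSorter category 1 or 2."""
    return p.virsorter_category in accepted_categories


def select_viral_contigs(predictions):
    """Keep contigs accepted by either tool (assumed union, see lead-in)."""
    return [p.contig_id for p in predictions
            if passes_virfinder(p) or passes_virsorter(p)]


# Hypothetical predictions (NOT results from the study).
example_predictions = [
    ContigPrediction("contig_1", virfinder_score=0.95, virfinder_pvalue=0.01),
    ContigPrediction("contig_2", virfinder_score=0.95, virfinder_pvalue=0.20),
    ContigPrediction("contig_3", virsorter_category=2),
    ContigPrediction("contig_4", 0.50, 0.01, 3),
]
selected = select_viral_contigs(example_predictions)
print(selected)
```

      Stating the combination rule explicitly (union, intersection, or a vote across tools) would help readers understand why only four sequences were retained.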

      Comment 14: Line 328. We need some details about the host genomes here. How were these derived? What is their completeness/contamination? What is their size? If the bins are poor, these would not serve as a reliable comparison to identify integrated phage.

      Thanks for your comments. For genomic sequencing, strains WC36 and zth2 were grown in the liquid rich medium supplemented with 5 g/L laminarin and starch and harvested after one week of incubation at 28 °C. Genomic DNA was isolated by using the PowerSoil DNA isolation kit (Mo Bio Laboratories Inc., Carlsbad, CA). Thereafter, the genome sequencing was carried out with both the Illumina NovaSeq PE150 (San Diego, USA) and Nanopore PromethION platform (Oxford, UK) at the Beijing Novogene Bioinformatics Technology Co., Ltd. Library construction, sequencing, and assembly were performed as previously described (Zheng et al., 2021). We used seven databases to predict gene functions, including Pfam (Protein Families Database, http://pfam.xfam.org/), GO (Gene Ontology, http://geneontology.org/) (Ashburner et al., 2000), KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.genome.jp/kegg/) (Kanehisa et al., 2004), COG (Clusters of Orthologous Groups, http://www.ncbi.nlm.nih.gov/COG/) (Galperin et al., 2015), NR (Non-Redundant Protein Database), TCDB (Transporter Classification Database), and Swiss-Prot (http://www.ebi.ac.uk/uniprot/) (Bairoch and Apweiler, 2000). A whole genome Blast search (E-value less than 1e-5, minimal alignment length percentage larger than 40%) was performed against the above seven databases.

      The completeness of the genomes of strains WC36 and zth2 was 100%, as checked with CheckM v1.2.2. The genome sizes of strains WC36 and zth2 were 3,660,783 bp and 3,198,720 bp, respectively. The complete genome sequences of strains WC36 and zth2 presented in this study have been deposited in the GenBank database with accession numbers CP085689 and CP071032, respectively.

      Moreover, to verify the absence of microbial contamination in the phage sequencing results, we used the alignment algorithm BWA-MEM (version 0.7.15) to map the host WGS reads to these phage sequences. We found that none of the raw reads of the host strains (WC36 and zth2) mapped to these phage sequences (Author response image 3, shown as below). In addition, we evaluated the assembly graph underlying the host consensus assemblies. Clean reads were mapped to the complete bacterial genome sequences using Bowtie 2 (version 2.5.0), BWA (version 0.7.8) and SAMtools (version 0.1.18). The results showed that the total mismatch rates of strains WC36 and zth2 were almost 0% and 0.03%, respectively (Author response table 1, shown as below). We also collected the cells of strains WC36 and zth2 and sent them to another company for whole genome sequencing (named WC36G and ZTH, GenBank accession numbers CP151801 and CP119760, respectively). The completeness of the genomes of strains WC36G and ZTH was also 100%. The genome sizes of strains WC36G and ZTH were 3,660,783 bp and 3,198,714 bp, respectively. The raw reads of strains WC36G and zth2 also did not map to the phage sequences. Therefore, we can confirm that these bacteriophage genomes were completely outside of the host chromosomes.

      Author response image 4.

      The read mapping from WGS to phage sequences.

      Author response table 1.

      Sequencing depth and coverage statistics.
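
      The read-mapping check described in this response (host WGS reads aligned to the phage sequences) comes down to asking what fraction of reads align. A minimal sketch of that calculation from SAM-format alignment records follows. It relies only on the SAM FLAG bits for unmapped (0x4), secondary (0x100) and supplementary (0x800) records. The function name and the toy records are hypothetical and are not output from the study. In practice a tool such as `samtools flagstat` reports comparable totals.

```python
UNMAPPED = 0x4          # read did not align to the reference
SECONDARY = 0x100       # secondary alignment of a read already counted
SUPPLEMENTARY = 0x800   # supplementary (chimeric) alignment


def mapped_fraction(sam_lines):
    """Fraction of primary records in SAM text lines that are mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0


# Toy single-end SAM records (hypothetical, NOT data from the study).
toy_sam = [
    "@SQ\tSN:phage_contig\tLN:100",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read3\t0\tphage_contig\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
fraction = mapped_fraction(toy_sam)
print(f"{fraction:.2%} of primary records mapped")
```

      A mapped fraction of zero for host reads against the phage contigs is what the response reports. The reverse mapping (phage reads against the host assembly) would be needed to address integration from the other direction.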

      References related to this response:

      Zheng, R., Liu, R., Shan, Y., Cai, R., Liu, G., and Sun, C. (2021b) Characterization of the first cultured free-living representative of Candidatus Izemoplasma uncovers its unique biology ISME J 15:2676-2691. 

      Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium Nat Genet 25:25-29. 

      Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. (2004) The KEGG resource for deciphering the genome Nucleic Acids Res 32:D277-280. 

      Galperin, M.Y., Makarova, K.S., Wolf, Y.I., and Koonin, E.V. (2015) Expanded microbial genome coverage and improved protein family annotation in the COG database Nucleic Acids Res 43:D261-269. 

      Bairoch, A., and Apweiler, R. (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000 Nucleic Acids Res 28:45-48.

      Comment 15: Line 333. This also needs some details. What evidence do you have that these are not chromosomal? If not chromosomal where can they be found? Sequencing efforts should also be able to yield extrachromosomal elements such as plasmids etc... If you were to sequence your purified isolate cultures from the rich media alone and include all assemblies (not just those binned for example) as a reference, would you be able to recruit viral reads? The way this reads suggests that Chevallereau et al., worked specifically with these phage, which is not the case - please rephrase.

      Thanks for your comments. We carefully compared the bacteriophage genomes with those of the corresponding hosts (strains WC36 and zth2) using Galaxy Version 2.6.0 (https://galaxy.pasteur.fr/) (Afgan et al., 2018) with the NCBI BLASTN method, and used the BWA-MEM software to map reads from host whole genome sequencing (WGS) to these bacteriophages. Both analyses showed that the bacteriophage genomes are completely outside of the host chromosomes. Therefore, we hypothesized that the phage genomes might exist in the host in a form similar to that of a plasmid.

      Comment 16: Line 335. More to the point here that we need confirmation that these phages were not introduced in the polysaccharide treatment

      Thanks for your comments. Please find our answers for this concern in the responses for comment 1 of “weakness” part and comment 6 of “Recommendations For The Authors” part.

      Comment 17: Line 342. Lacking significant detail here. Phylogeny based on what gene(s), how were the alignments computed/refined, what model used etc..?

      Thanks for your comments. According to your suggestions, all the related information is now given in the “Materials and methods” section of this manuscript. The maximum likelihood phylogenetic tree of Phage-WC36-2 and Phage-zth2-2 was constructed based on the terminase large subunit protein (terL). The proteins used to construct the phylogenetic trees were all obtained from the NCBI databases. All the sequences were aligned by MAFFT version 7 (Katoh et al., 2019) and manually corrected. The phylogenetic trees were constructed using the W-IQ-TREE web server (http://iqtree.cibiv.univie.ac.at) with the “GTR+F+I+G4” model (Trifinopoulos et al., 2016). Finally, we used the online tool Interactive Tree of Life (iTOL v5) (Letunic and Bork, 2021) to edit the tree.

      Comment 18: Line 346. How are you specifically defining AMGs in this study? Most of these are well-known and studied phage genes with specific life cycle functions and could not be considered as polysaccharide processing AMGs even though in host cells many do play a role in polysaccharide processing systems. A substantially deeper literature review is needed in this section, which would ultimately eliminate most of these from the potential AMG pools. Further, the simple HMM/BLASTp evalues are not sufficient to support the functional annotation of these genes. At a minimum, catalytic/conserved regions should be identified, secondary structures compared, and phylogenetic analysis (where possible) developed etc... My recommendation is to eliminate this section entirely from the manuscript. 

      Categorically:

      - Glycoside hydrolase (various families), glucosaminidases, and transglycosylase are all very common to phage and operate generally as a lysins, facilitating the release of virions from the host cell upon lysis, or injection of viral DNA upon infection https://doi.org/10.3389/fmicb.2016.00745 (and citations therein) https://doi.org/10.1016/j.cmi.2023.10.018 etc... In order to confirm these as distinct AMGs we would need a very detailed analysis indicating that these are not phage infection cycle/host recognition related, however I strongly suspect that under such interrogation, these would prove to be as such.

      -TonB related systems including ExbB are well studied among phages as part of the trans-location step in infection. These could not be considered as AMGs. https://doi.org/10.1128/JB.00428-19. Other TonB dependent receptors play a role in host recognition.

      -Several phage acetyltransferases play a role in suppressing host RNA polymerase in order to reserve host cell resources for virion production, including polysaccharide production. https://doi.org/10.3390/v12090976. Further it has been shown that the E. coli gene neuO (O-acetyltransferase) is a homologue of lambdoid phage tail fiber genes https://doi.org/10.1073/pnas.0407428102. I suspect the latter is also the case here and this is a tail fiber gene.

      Thanks for your valuable comments. According to your suggestions, we have reanalyzed these AMGs and made some modifications (the new version Fig. 5A, shown as below). Genes encoding proteins associated with polysaccharide transport and degradation may be common only in virulent phages, and have never been reported in chronic phages. In virulent phages, these genes typically act as lysozymes, facilitating the release of virions from the host cell upon lysis, or the injection of viral DNA upon infection. Chronic phages, by contrast, do not lyse the host. It has been reported that filamentous phages can recognize and bind to the host pili, which causes the pili to shrink and brings the filamentous phages closer to and possibly through the outer membrane of host cells (Riechmann et al., 1997; Sun et al., 1987). A possible mechanism by which other chronic phages are released without breaking the host is that they are enclosed in a lipid membrane and released from the host cells in a nonlytic manner. It has recently been reported that tailless Caudoviricetes phage particles are enclosed in a lipid membrane and released from the host cells in a nonlytic manner (Liu et al., 2022), and that prophage induction contributes to the production of membrane vesicles by Lacticaseibacillus casei BL23 during cell growth (da Silva Barreira et al., 2022). Therefore, the persistence of these genes in chronic phages may be due to their ability to assist the host in metabolizing polysaccharides.

      Finally, according to your suggestions, we have weakened the role of AMGs and added “potential” in front of it.

      References related to this response:

      Riechmann L, Holliger P. (1997) The C-terminal domain of TolA is the coreceptor for filamentous phage infection of E. coli Cell 90:351-60.

      Sun TP, Webster RE. (1987) Nucleotide sequence of a gene cluster involved in entry of E colicins and single-stranded DNA of infecting filamentous bacteriophages into Escherichia coli J Bacteriol 169:2667-74. 

      Liu Y, Alexeeva S, Bachmann H, Guerra Martínez J.A, Yeremenko N, Abee T et al. (2022) Chronic release of tailless phage particles from Lactococcus lactis Appl Environ Microbiol 88: e0148321.

      da Silva Barreira, D., Lapaquette, P., Novion Ducassou, J., Couté, Y., Guzzo, J., and Rieu, A. (2022) Spontaneous prophage induction contributes to the production of membrane vesicles by the gram-positive bacterium Lacticaseibacillus casei BL23 mBio 13:e0237522.

      Comment 19: Line 354. To make this statement that these genes are missing from the host, we would need to know that these genomes are complete.

      Thanks for your comments. The completeness of the genomes of strains WC36 and zth2 was 100%, as checked with CheckM v1.2.2. The genome sizes of strains WC36 and zth2 were 3,660,783 bp and 3,198,720 bp, respectively. The complete genome sequences of strains WC36 and zth2 presented in this study have been deposited in the GenBank database with accession numbers CP085689 and CP071032, respectively. In addition, we collected the cells of strains WC36 and zth2 and sent them to another company for whole genome sequencing (named WC36G and ZTH, GenBank accession numbers CP151801 and CP119760, respectively). The completeness of the genomes of strains WC36G and ZTH was also 100%. The genome sizes of strains WC36G and ZTH were 3,660,783 bp and 3,198,714 bp, respectively. Therefore, the genomes of strains WC36 and zth2 are complete and circular.

      Comment 20: Figure 5. Please see https://peerj.com/articles/11447/ and https://doi.org/10.1093/nar/gkaa621 for a detailed discussion on vetting AMGs. Several of these should be eliminated according to the standards set in the field. More specifically, and by anecdotal comparison with other Inoviridae genomes, for Phage-WC36-1 and Phage-zth2-1, I am not convinced that the transcriptional regulator and glycoside hydrolase are a part of the phage genome. The phage genome probably ends at the strand switch.

      Thanks for your comments. According to your suggestions, we have read these two articles carefully and revised the genomes of Phage-WC36-1 and Phage-zth2-1 by anecdotal comparison with other Inoviridae genomes. As you said, the transcriptional regulator and glycoside hydrolase are not a part of the phage genome.

      The new version Fig. 5A was shown.

      References related to this response:

      Shaffer, M., Borton, M.A., McGivern, B.B., Zayed, A.A., La Rosa, S.L., Solden, L.M., Liu, P., Narrowe, A.B., Rodríguez-Ramos, J., Bolduc, B., et al. (2020) DRAM for distilling microbial metabolism to automate the curation of microbiome function Nucleic Acids Res 48:8883-8900

      Pratama, A.A., Bolduc, B., Zayed, A.A., Zhong, Z.P., Guo, J., Vik, D.R., Gazitúa, M.C., Wainaina, J.M., Roux, S., and Sullivan, M.B. (2021) Expanding standards in viromics: in silico evaluation of dsDNA viral genome identification, classification, and auxiliary metabolic gene curation PeerJ 9:e11447

      Comment 21: Line 380. This section needs to start with detailed evidence that this phage can even infect this particular strain. Added note, upon further reading the serial dilution cultures are not sufficient to prove these phage infect this Pseudomonas. We need at a minimum a one-step growth curve and wet mount microscopy. It is much more likely that some carry over contaminant is invading the culture and influencing OD600. With the given evidence, I am not at all convinced that these phages have anything to do with Pseudomonas polysaccharide use and I recommend either drastically revising this section or eliminating it entirely.

      Line 386-389. Could this be because you are observing your added phage in the starch enriched media while no phage were introduced with the "other types of media" so none would be observed? This could have nothing to do with infection dynamics. Further, this would also be consistent with your starch solution being contaminated by phage.

      Line 399. Again consistent with the starch media being contaminated.

      Line 401-408. This is more likely to do with the augmentation of the media with an additional carbon source and not involving the phage. 

      Line 410. I am not convinced that these viruses infect the Pseudomonas strain. Extensive further evidence of infection is needed to make these assertions.  Figure 6A. We need confirmation that the isolate culture remains pure and there are no other contaminants introduced with the phage.

      Thanks for your comments. As shown above, the polysaccharides (laminarin/starch) were not contaminated with any phages. In fact, we selected many marine strains (Pseudomonadota, Planctomycetes, Verrucomicrobia, Fusobacteria, and Tenericutes isolates) to investigate whether Phages-WC36 could assist them in the degradation and utilization of polysaccharides, and found that Phages-WC36 could only promote the growth of strain 273. Filamentous phages and hexagonal phages were detected in the supernatant of strain 273 cultured in basal medium supplemented with 5 g/L starch and 20 μl/mL Phages-WC36. After 3 passages of serial cultivation in basal medium supplemented with 5 g/L starch, we found that filamentous phages and hexagonal phages were still present in basal medium supplemented with starch, but not in the basal medium alone, which may mean that Phages-WC36 could infect strain 273 and that starch is an important inducer. In addition, the Phages-WC36 used in the growth assay of strain 273 were purified multiple times and finally suspended in SM buffer (0.01% gelatin, 50 mM Tris-HCl, 100 mM NaCl and 10 mM MgSO4). Thus, the phages provided do not contain extracellular enzymes and/or nutrients. We also set up three control groups in the growth assay of strain 273: basal medium, basal medium supplemented with Phages-WC36 and basal medium supplemented with starch. If the Phages-WC36 preparation contained extracellular enzymes and/or nutrients, strain 273 would also grow well in the basal medium supplemented only with Phages-WC36. However, the poor growth of strain 273 cultivated in the basal medium supplemented with Phages-WC36 further confirmed that the phage preparation did not contain extracellular enzymes and/or nutrients.

      Finally, a possible mechanism of chronic phage release without breaking the host is that the phage is enclosed in a lipid membrane and released from the host cells in a nonlytic manner. Thus, these chronic phages may have a wider host range. However, we were unable to further disclose the infection mechanism in this paper. Therefore, according to your suggestions, we have deleted this section entirely.

      Comment 27: Line 460. Details about how these genomes were reconstructed is needed here.  

      Thanks for your comments. According to your suggestions, we have added the detailed information about the genome sequencing, annotation, and analysis as “Genome sequencing, annotation, and analysis of strains WC36 and zth2 For genomic sequencing, strains WC36 and zth2 were grown in the liquid rich medium supplemented with 5 g/L laminarin and starch and harvested after one week of incubation at 28 °C. Genomic DNA was isolated by using the PowerSoil DNA isolation kit (Mo Bio Laboratories Inc., Carlsbad, CA). Thereafter, the genome sequencing was carried out with both the Illumina NovaSeq PE150 (San Diego, USA) and Nanopore PromethION platform (Oxford, UK) at the Beijing Novogene Bioinformatics Technology Co., Ltd. A complete description of the library construction, sequencing, and assembly was performed as previously described (Zheng et al., 2021b). We used seven databases to predict gene functions, including Pfam (Protein Families Database, http://pfam.xfam.org/), GO (Gene Ontology, http://geneontology.org/) (Ashburner et al., 2000), KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.genome.jp/kegg/) (Kanehisa et al., 2004), COG (Clusters of Orthologous Groups, http://www.ncbi.nlm.nih.gov/COG/) (Galperin et al., 2015), NR (Non-Redundant Protein Database databases), TCDB (Transporter Classification Database), and Swiss-Prot (http://www.ebi.ac.uk/uniprot/) (Bairoch and Apweiler, 2000). A whole genome Blast search (E-value less than 1e-5, minimal alignment length percentage larger than 40%) was performed against above seven databases.” in the revised manuscript (Lines 333-351).
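
      The annotation cutoffs quoted above (E-value less than 1e-5, alignment length percentage larger than 40%) can be sketched as a hit filter. The response does not state which length the percentage is measured against, so using the query length as the denominator is an assumption of this sketch, and the function name `keep_hit` is hypothetical.

```python
def keep_hit(evalue, alignment_length, query_length,
             max_evalue=1e-5, min_alignment_pct=40.0):
    """Return True if a BLAST hit passes both cutoffs quoted in the response.

    Assumption: the alignment length percentage is computed relative to the
    query length; the source text does not specify the denominator.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    alignment_pct = 100.0 * alignment_length / query_length
    return evalue < max_evalue and alignment_pct > min_alignment_pct


# Hypothetical hits (NOT results from the study):
# (E-value, alignment length, query length)
example_hits = [
    (1e-30, 250, 300),  # strong hit covering most of the query
    (1e-30, 90, 300),   # significant but covers only 30% of the query
    (1e-3, 280, 300),   # long alignment but E-value too weak
]
kept = [hit for hit in example_hits if keep_hit(*hit)]
print(kept)
```

      Only the first hypothetical hit passes, because each of the other two fails one of the two cutoffs.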

      Comment 28: Line 462. Accession list of other taxa in the supplement would help here.  

      Thanks for your comments. The accession numbers of these strains are displayed after the strain names in Figure 1A. According to your suggestions, we have added an accession list of these taxa (Supplementary Table 6) in the revised manuscript.

      Comment 29: Line 463. Is there any literature to support that these are phylogenetically informative genes for Inoviridae?  

      Thanks for your comments. Several studies (Zeng et al., 2021; Evseev et al., 2023) support that these are phylogenetically informative genes for Inoviridae. We have added these references to the revised manuscript.

      References related to this response:

      Zeng, J., Wang, Y., Zhang, J., Yang, S., and Zhang, W. (2021) Multiple novel filamentous phages detected in the cloacal swab samples of birds using viral metagenomics approach Virol J 18:240

      Evseev, P., Bocharova, J., Shagin, D., and Chebotar, I. (2023) Analysis of Pseudomonas aeruginosa isolates from patients with cystic fibrosis revealed novel groups of filamentous bacteriophages. Viruses 15: 2215

      Reviewer #2 (Public Review):

      Summary: This paper investigates virus-host interactions in deep-sea bacteriophage systems which employ a seemingly mutualistic approach to viral replication in which the virus aids host cell polysaccharide import and utilization via metabolic reprogramming. The hypothesis being tested is supported with solid and convincing evidence and the findings are potentially generalizable with implications for our understanding of polysaccharide-mediated virus-host interactions and carbon cycles in marine ecosystems more broadly.

      Thanks for your positive comments.

      Strengths: This paper synthesizes sequencing and phylogenetic analyses of two Lentisphaerae bacteria and three phage genomes; electron microscopy imaging of bacterial/phage particles; differential gene expression analyses; differential growth curve analyses; and differential phage proliferation assays to extract insights into whether laminarin and starch can induce both host growth and phage proliferation. The data presented convincingly demonstrate that both host culture density and phage proliferation increase as a result of having host, phage, and polysaccharide carbon source together in culture.

      Thanks for your positive comments.  

      Weaknesses (suggestions for improvement): 

      (1) The article would be strengthened by the following additional experiment: providing the phage proteins hypothesized to be aiding host cell growth (red genes from Figure 5...TonB system energizer ExbB, glycosidases, etc) individually or in combination on plasmids rather than within the context of the actual phage itself to see if such additional genes are necessary and sufficient to realize the boosts in host cell growth/saturation levels observed in the presence of the phages tested.

      Thanks for your valuable comments. It is a really good idea to express these polysaccharide-degradation proteins individually or in combination on plasmids to see their effects in the host cell. However, at present, we have not been able to construct a genetic and expression system for the strictly anaerobic strain WC36, which hinders further detailed investigation of the functions of these polysaccharide-degradation proteins. In our lab, we are trying our best to build a genetic and expression system for strain WC36. We will definitely test your idea in the future.

      (2) The paper would also benefit from additional experiments focused on determining how the polysaccharide processing, transport, and metabolism genes are being used by the phages to either directly increase viral infection/replication or else to indirectly do so by supporting the growth of the host in a more mutualistic manner (i.e. by improving their ability to import, degrade, and metabolize polysaccharides).  

      Thanks for your valuable comments. Indeed, because the chronic phage genome is not within the chromosome of the host, it is very hard to disclose the exact auxiliary process and mechanism of chronic phages. At present, we are trying to construct a genetic manipulation system for the strictly anaerobic host WC36, and we will gradually reveal this auxiliary mechanism in the future. In addition, in line with reviewer 1’s suggestions, the revised manuscript now emphasizes that polysaccharides induce deep-sea bacteria to release chronic phages, and most of the content on phages assisting host metabolism of polysaccharides has been deleted.

      (3) The introduction would benefit from a discussion of what is known regarding phage and/or viral entry pathways that utilize carbohydrate anchors during host entry. The discussion could also be improved by linking the work presented to the concept of "selfishness" in bacterial systems (see for instance Giljan, G., Brown, S., Lloyd, C.C. et al. Selfish bacteria are active throughout the water column of the ocean. ISME COMMUN. 3, 11 (2023) https://doi.org/10.1038/s43705-023-00219-7). The bacteria under study are gram negative and it was recently demonstrated (https://www.nature.com/articles/ismej201726) that "selfish" bacteria sequester metabolizable polysaccharides in their periplasm to advantage. It is plausible that the phages may be hijacking this "selfishness" mechanism to improve infectivity and ENTRY rather than helping their hosts to grow and proliferate so they can reap the benefits of simply having more hosts to infect. The current work does not clearly distinguish between these two distinct mechanistic possibilities. The paper would be strengthened by at least a more detailed discussion of this possibility as well as the authors' rationale for interpreting their data as they do to favor the "mutualistic" interpretation. In the same light, the paper would benefit from a more careful choice of words which can also help to make such a distinction more clear/evident/intentional. As currently written the authors seem to be actively avoiding giving insights wrt this question.

      Thanks for your valuable comments. According to your suggestions, we have added the related discussion as “Moreover, it was recently demonstrated that selfish bacteria, which were common throughout the water column of the ocean, could bind, partially hydrolyze, and transport polysaccharides into the periplasmic space without loss of hydrolysis products (Reintjes et al., 2017; Giljan et al., 2023). Based on our results, we hypothesized that these chronic phages might also enter the host through this “selfishness” mechanism while assisting the host in metabolizing polysaccharides, thus not lysing the host. On the other hand, these chronic phages might hijack this “selfishness” mechanism to improve their infectivity and entry, rather than helping their hosts to grow and proliferate, so they could reap the benefits of simply having more hosts to infect. In the future, we need to construct a genetic operating system of the strictly anaerobic host strain WC36 to detailedly reveal the relationship between chronic phage and host.” in the revised manuscript (Lines 305-316). 

      References related to this response:

      Reintjes, G., Arnosti, C., Fuchs, B.M., and Amann, R. (2017) An alternative polysaccharide uptake mechanism of marine bacteria ISME J 11:1640-1650

      Giljan, G., Brown, S., Lloyd, C.C., Ghobrial, S., Amann, R., and Arnosti, C. (2023) Selfish bacteria are active throughout the water column of the ocean ISME Commun 3:11

      (4) Finally, I would be interested to know if the author’s sequencing datasets might be used to inform the question raised above by using bacterial immunity systems such as CRISPR/Cas9. For example, if the phage systems studied are truly beneficial/mutualistic for the bacteria then it’s less likely that there would be evidence of targeted immunity against that particular phage that has the beneficial genes that support polysaccharide metabolism.

      Thanks for your comments. According to your suggestions, we have carefully analyzed the genome of strain WC36 and found no CRISPR/Cas9-related genes. Considering our result that the number of chronic phages increased with longer culture time, we speculate that the host might have no targeted immunity against these chronic phages.

      Reviewer #2 (Recommendations For The Authors):

      There are some minor grammatical errors and unclear statements (lines 99-100, 107-109, 163, 222, 223, 249-250, 254) which should also be fixed before final publication. 

      Thanks for your valuable comments. We have fixed these minor grammatical errors and unclear statements in the revised manuscript.

      Lines 99-100: we have modified this description as “For instance, AMGs of marine bacteriophages have been predicted to be involved in photosynthesis (Mann et al., 2003), nitrogen cycling (Ahlgren et al., 2019; Gazitúa et al., 2021), sulfur cycling (Anantharaman et al., 2014; Roux et al., 2016), phosphorus cycling (Zeng and Chisholm, 2012), nucleotide metabolism (Sullivan et al., 2005; Dwivedi et al., 2013; Enav et al., 2014), and almost all central carbon metabolisms in host cells (Hurwitz et al., 2013).” in the revised manuscript (Lines 100-105).

      Lines 107-109: we have modified this description as “However, due to the vast majority of deep-sea microbes cannot be cultivated in the laboratory, most bacteriophages could not be isolated.” in the revised manuscript (Lines 110-111).

      Line 163: we have modified this description as “Based on the growth curve of strain WC36, we found that the growth rate of strictly anaerobic strain WC36 was relatively slow.” in the revised manuscript (Lines 149-151).

      Lines 222-223: we have modified this description as “Regardless of whether the laminarin was present, the bacterial cells kept their cell shape intact, indicating they were still healthy after 30 days” in the revised manuscript (Lines 195-197).

      Lines 249-250: we have modified this description as “However, the entry and exit of the hexagonal phages into the WC36 cells were not observed.” in the revised manuscript (Lines 190-191).

      Line 254: we have modified this description as “To explore whether the production of bacteriophages induced by polysaccharide is an individual case, we further checked the effect of polysaccharides on another cultured deep-sea Lentisphaerae strain zth2.” in the revised manuscript (Lines 213-215).

    2. eLife assessment

      This manuscript presents valuable findings on two isolates of deep sea Lentisphaerae strains, which further our understanding of deep sea microbial life. The manuscript's primary claim is that phage isolates augment polysaccharide use in Pseudomonas bacteria, with preliminary evidence for the potential auxiliary metabolic genes in chronic phage infection and/or host proliferation. The strength of the evidence is overall solid and there are only minor weaknesses regarding the mechanism of polysaccharide use by the phages and the evidence for chronic infection. Overall, the data on Lentisphaerae strains will deepen our understanding of microbial life in the deep sea.

    3. Reviewer #1 (Public Review):

      Summary:

      I have previously reviewed this manuscript as a submission to another journal in 2022. My recommendations here mirror those of my prior suggestions, now with further added details.

      This manuscript describes the identification and isolation of several phage from deep sea isolates of Lentisphaerae strains WC36 and zth2. The authors observe induction of several putative chronic phages with the introduction of additional polysaccharides to the media. The authors suggest that two of the recovered phage genomes encode AMGs associated with polysaccharide use. The authors also suggest that adding the purified phage to cultures of Pseudomonas stutzeri 273 increased the growth of this bacterium due to augmented polysaccharide use genes from the phage.

      Strengths:

      Interesting isolate of deep sea Lentisphaerae strains which will undoubtedly further our understanding of deep sea microbial life.

      The revisions have addressed the weaknesses raised in the previous review.

    4. Reviewer #2 (Public Review):

      Summary:

      This paper investigates deep-sea bacteriophage systems which appear to employ a chronic replication mechanism that is induced or enhanced by polysaccharide addition. Some preliminary evidence for the potential role of auxiliary metabolic genes in aiding phage and/or host proliferation is also provided. The hypothesis being tested is fully supported with solid and convincing evidence and the findings are potentially generalizable with implications for our understanding of polysaccharide-mediated virus-host interactions and carbon cycling in marine ecosystems more broadly.

      Strengths:

      This paper synthesizes sequencing and phylogenetic analyses of two Lentisphaerae bacteria and three phage genomes; electron microscopy imaging of bacterial/phage particles; differential gene expression analyses; differential growth curve analyses; and differential phage proliferation assays to extract insights into whether laminarin and starch can induce both host growth and phage proliferation. The data presented convincingly demonstrate that both host culture density and phage proliferation increase as a result of having host, phage, and polysaccharide carbon source together in culture.

      Weaknesses:

      The AMG-centered elements of the article would be strengthened by more "mechanistic" experiments focusing on identifying "HOW" the polysaccharide processing, transport, and metabolism genes are being used by the phages to either directly increase viral infection/replication or else to indirectly do so by supporting the growth of the host (via mutualism). The concept of "selfishness" in bacterial systems and its potential role in viral life cycles could be more developed. Selfish bacteria are active throughout the water column of the ocean (ISME Commun. 3, 11 (2023); see for instance https://doi.org/10.1038/s43705-023-00219-7), and such "selfish" bacteria sequester metabolizable polysaccharides in their periplasm to their advantage (https://www.nature.com/articles/ismej201726). It is plausible that phages may be either hijacking such polysaccharide sequestration mechanisms to improve infectivity and ENTRY or else helping their hosts to grow and proliferate so they can reap the benefits of simply having more hosts to infect. The current work does not clearly distinguish between these two distinct mechanistic possibilities. The paper would be strengthened by a more detailed/clear discussion of this possibility.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      Thank you for your review and pointing out multiple things to be discussed and clarified! Below, we go through the various limitations you pointed out and refer to the places where we have tried to address them.

      (1) It's important to keep in mind that this work involves simplified models of the motor system, and often the terminology for 'motor cortex' and 'models of motor cortex' are used interchangeably, which may mislead some readers. Similarly, the introduction fails in many cases to state what model system is being discussed (e.g. line 14, line 29, line 31), even though these span humans, monkeys, mice, and simulations, which all differ in crucial ways that cannot always be lumped together.

      That is a good point. We have clarified this in the text (Introduction and Discussion), to highlight the fact that our model isn’t necessarily meant to just capture M1. We have also updated the introduction to make it more clear which species the experiments which motivate our investigation were performed in.

      (2) At multiple points in the manuscript thalamic inputs during movement (in mice) are used as a motivation for examining the role of preparation. However, there are other more salient motivations, such as delayed sensory feedback from the limb and vision arriving in the motor cortex, as well as ongoing control signals from other areas such as the premotor cortex.

      Yes – the motivation for thalamic inputs came from the fact that those have specifically been shown to be necessary for accurate movement generation in mice. However, it is true that the inputs in our model are meant to capture any signals external to the dynamical system modeled, and as such are likely to represent a mixture of sensory signals, and feedback from other areas. We have clarified this in the Discussion, and have added this additional motivation in the Introduction.

      (3) Describing the main task in this work as a delayed reaching task is not justified without caveats (by the authors' own admission: line 687), since each network is optimized with a fixed delay period length. Although this is mentioned to the reader, it's not clear enough that the dynamics observed during the delay period will not resemble those in the motor cortex for typical delayed reaching tasks.

      Yes, we completely agree that the terminology might be confusing. While the task we are modeling is a delayed reaching task, it does differ from the usual setting since the network has knowledge of the delay period, and that is indeed a caveat of the model. We have added a brief paragraph just after the description of the optimal control objective to highlight this limitation.

      We have also performed additional simulations using two different variants of a model-predictive control approach that allow us to relax the assumption that the go-cue time is known in advance. We show that these modifications of the optimal controller yield results that remain consistent with our main conclusions, and can in fact in some settings lead to preparatory activity plateaus during the preparation epoch as often found in monkey M1 (e.g. in Elsayed et al. 2016). We have modified the Discussion to explain these results and their limitations, which are summarized in a new Supplementary Figure (S9).

      (4) A number of simplifications in the model may have crucial consequences for interpretation.

      a) Even following the toy examples in Figure 4, all the models in Figure 5 are linear, which may limit the generalisability of the findings.

      While we agree that linear models may be too simplistic, many prior analyses of M1 data suggest that they are often good enough to capture key aspects of M1 dynamics; for example, the generative model underlying jPCA is linear, and Sussillo et al. (2015) showed that the internal activity of nonlinear RNN models trained to reproduce EMG data aligned best with M1 activity when heavily regularized; in this regime, the RNN dynamics were close to linear. Nevertheless, this linearity assumption is indeed convenient from a modeling viewpoint: the optimal control problem is more easily solved for linear network dynamics and the optimal trajectories are more consistent across networks. Indeed, we had originally attempted to perform the analyses of Figure 5 in the nonlinear setting, but found that while the results were overall similar to what we report in the linear regime, iLQR was occasionally trapped in local minima, resulting in more variable results, especially for inhibition-stabilized networks at the strongly connected end of the spectrum. Finally, Figure 5 is primarily meant to explore to what extent motor preparation can be predicted from basic linear control-theoretic properties of the Jacobian of the dynamics; in this regard, it made sense to work with linear RNNs (for which the Jacobian is constant).
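
      The remark that the Jacobian is constant for linear RNNs can be checked numerically. The sketch below is illustrative only: the dynamics dx/dt = -x + W x (linear) and dx/dt = -x + W tanh(x) (nonlinear), the connectivity W, and the helper numerical_jacobian are assumptions made for this example and are not taken from the manuscript.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f evaluated at x."""
    n = x.size
    J = np.zeros((f(x).size, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        J[:, i] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
n = 5
# Hypothetical random connectivity, not the networks used in the paper.
W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))

def linear_rnn(x):
    # Assumed linear dynamics: dx/dt = -x + W x
    return -x + W @ x

def tanh_rnn(x):
    # Assumed nonlinear dynamics: dx/dt = -x + W tanh(x)
    return -x + W @ np.tanh(x)

# Two arbitrary states at which to evaluate the Jacobian.
x_a = rng.normal(size=n)
x_b = rng.normal(size=n)

J_lin_a = numerical_jacobian(linear_rnn, x_a)
J_lin_b = numerical_jacobian(linear_rnn, x_b)
J_tanh_a = numerical_jacobian(tanh_rnn, x_a)
J_tanh_b = numerical_jacobian(tanh_rnn, x_b)

# Linear case: the Jacobian is W - I at every state.
linear_is_constant = np.allclose(J_lin_a, J_lin_b, atol=1e-6)
linear_matches_W = np.allclose(J_lin_a, W - np.eye(n), atol=1e-6)
# Nonlinear case: the Jacobian depends on where it is evaluated.
tanh_is_constant = np.allclose(J_tanh_a, J_tanh_b, atol=1e-6)
```

      In the linear case a single matrix summarizes the dynamics everywhere, so control-theoretic quantities derived from it do not depend on the trajectory; in the nonlinear case they would have to be re-evaluated along each trajectory.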

      b) Crucially, there is no delayed sensory feedback in the model from the plant. Although this simplification is in some ways a strength, this decision allows networks to avoid having to deal with delayed feedback, which is a known component of closed-loop motor control and of motor cortex inputs and will have a large impact on the control policy.

      This comment resonates well with Reviewer 3's remark regarding the autonomous nature (or not) of M1 during movement. Rather than thinking of our RNN models as anatomically confined models of M1 alone, we think of them as models of the dynamics which M1 implements possibly as part of a broader network involving “inter-area loops and (at some latency) sensory feedback”, and whose state appears to be near-fully decodable from M1 activity alone. We have added a paragraph of Discussion on this important point.

      (5) A key feature determining the usefulness of preparation is the direction of the readout dimension. However, all readouts had a similar structure (random Gaussian initialization). Therefore, it would be useful to have more discussion regarding how the structure of the output connectivity would affect preparation, since the motor cortex certainly does not follow this output scheme.

      We agree with this limitation of our model — indeed one key message of Figure 4 is that the degree of reliance on preparatory inputs depends strongly on how the dynamics align with the readout. However, this strong dependence is somewhat specific to low-dimensional models; in higher-dimensional models (most of our paper), one expects that any random readout matrix C will pick out activity dimensions in the RNN that are sufficiently aligned with the most controllable directions of the dynamics to encourage preparation.

      We did consider optimizing C away (which required differentiating through the iLQR optimizer, which is possible but very costly), but the question inevitably arises what exactly should C be optimized for, and under what constraints (e.g. fixed norm or not). One possibility is to optimize C with respect to the same control objective that the control inputs are optimized for, and constrain its norm (otherwise, inputs to the M1 model, and its internal activity, could become arbitrarily small as C can grow to compensate). We performed this experiment (new Supplementary Figure S7) and obtained a similar preparation index; there was one notable difference, namely that the optimized readout modes led to greater observability compared to a random readout; thus, the same amount of “muscle energy” required for a given movement could now be produced by a smaller initial condition. In turn, this led to smaller control inputs, consistent with a lower control cost overall.

      Whilst we could have systematically optimized C away, we reasoned that (i) it is computationally expensive, and (ii) the way M1 affects downstream effectors is presumably “optimized” for much richer motor tasks than simple 2D reaching, such that optimizing C for a fixed set of simple reaches could lead to misleading conclusions. We therefore decided to stick with random readouts.
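
      The link between observability and the size of the initial condition mentioned above can be illustrated with a finite-horizon observability Gramian. This is a minimal sketch under stated assumptions: discrete-time autonomous dynamics x(k+1) = A x(k) with readout y(k) = C x(k), and randomly drawn A and C; none of these matrices or function names come from the manuscript.

```python
import numpy as np

def observability_gramian(A, C, horizon):
    """Finite-horizon observability Gramian: sum_k (A^k)^T C^T C A^k."""
    n = A.shape[0]
    Q = np.zeros((n, n))
    Ak = np.eye(n)
    for _ in range(horizon):
        Q += Ak.T @ C.T @ C @ Ak
        Ak = A @ Ak
    return Q

def output_energy(A, C, x0, horizon):
    """Sum of ||y(k)||^2 along the autonomous trajectory started at x0."""
    x = x0.copy()
    energy = 0.0
    for _ in range(horizon):
        y = C @ x
        energy += float(y @ y)
        x = A @ x
    return energy

def required_norm(Q, direction, target_energy):
    """Norm of the initial condition along `direction` that yields target_energy."""
    v = direction / np.linalg.norm(direction)
    return float(np.sqrt(target_energy / (v @ Q @ v)))

rng = np.random.default_rng(1)
n, m, horizon = 6, 2, 50
# Hypothetical stable dynamics: a scaled orthogonal matrix (spectral radius 0.9).
A = 0.9 * np.linalg.qr(rng.normal(size=(n, n)))[0]
# Hypothetical random readout.
C = rng.normal(size=(m, n)) / np.sqrt(n)
x0 = rng.normal(size=n)

Q = observability_gramian(A, C, horizon)
energy_direct = output_energy(A, C, x0, horizon)
energy_gramian = float(x0 @ Q @ x0)  # same quantity, written as a quadratic form

# Most and least observable unit directions (eigh sorts eigenvalues ascending).
eigvals, eigvecs = np.linalg.eigh(Q)
most_observable = eigvecs[:, -1]
least_observable = eigvecs[:, 0]
norm_most = required_norm(Q, most_observable, target_energy=1.0)
norm_least = required_norm(Q, least_observable, target_energy=1.0)
```

      Because the output energy is the quadratic form x0ᵀ Q x0, an initial condition aligned with a more observable direction needs a smaller norm to produce the same output energy, which is the sense in which a readout with greater observability permits smaller initial conditions.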

      Additional comments:

      (1) The choice of cost function seems very important. Is it? For example, penalising the square of u(t) may produce very different results than penalising the absolute value.

      Yes, the choice of cost function does affect the results, at least qualitatively. The absolute value of the inputs is a challenging cost to use, as iLQR relies on a local quadratic approximation of the cost function. However, we have included additional experiments in which we penalized the squared derivative of the inputs (Supplementary Figure S8; see also our response to Reviewer 3's suggestion on this topic), and we do see differences in the qualitative behavior of the model (though the main takeaway, i.e. the reliance on preparation, continues to hold). This is now referred to and discussed in the Discussion section.

      (2) In future work it would be useful to consider the role of spinal networks, which are known to contribute to preparation in some cases (e.g. Prut and Fetz, 1999).

      (3) The control signal magnitude is penalised, but not the output torque magnitude, which highlights the fact that control in the model is quite different from muscle control, where co-contraction would be a possibility and therefore a penalty of muscle activation would be necessary. Future work should consider the role of these differences in control policy.

      Thank you for pointing us to this reference! Regarding both of these concerns, we agree that the model could be greatly improved and made more realistic in future work (another avenue for this would be to consider a more realistic biophysical model, e.g. using the MotorNet library). We hope that the current Discussion, which highlights the various limitations of our modeling choices, makes it clear that a lot of these choices could easily be modified depending on the specific assumptions/investigation being performed.

      Reviewer 2:

      Thank you for your positive review! We very much agree with the limitations you pointed out, some of which overlapped with the comments of the other reviewers. We have done our best to address them through additional discussion and new supplementary figures. We briefly highlight below where those changes can be found.

      (1) Though the optimal control theory framework is ideal to determine inputs that minimize output error while regularizing the input norm, it however cannot easily account for some other varied types of objectives especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders.

      Yes, that is a good point! We have incorporated further Discussion related to this point. We have additionally included a new example in which we regularize the temporal complexity of the inputs (see also our response to Reviewer 3's suggestion on this topic), which leads to more slowly varying inputs, and may indeed represent a more realistic constraint and lead to simpler inputs that can more easily be interpolated between. We also agree that uncertainty about the upcoming go cue may play an important role in the strategy adopted by the animals. While we have not performed an extensive investigation of the topic, we have included a Supplementary Figure (S9) in which we used Model Predictive Control to investigate the effect of planning under uncertainty about the go cue arrival time. We hope that this will give the reader a better sense of what sort of model extensions are possible within our framework.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into one objective. For instance, previous work (Kaufman et al. eNeuro 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations.

      Yes, our model does indeed treat inputs in a very general way, and does not distinguish between the different types of processes they may be composed of. This is partly because we do not explicitly model where the inputs come from, such that our inputs likely encompass multiple processes. We have added discussion related to this point.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising to find a tight match between the data and the authors' inputs. Further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      Regarding the exact shape of the occupancy plots, it is important to note that some of the more qualitative aspects (e.g. the relative height of the two peaks) will change if we change the parameters of the cost function. Right now, we have chosen the parameters to ensure that both reaches would be performed at roughly the same speed (as a way to very loosely constrain the parameters based on the observed behavior). However, small changes to the hyperparameters can lead to changes in the model output (e.g. one of the two consecutive reaches being performed using greater acceleration than the other), and since our biophysical model is fairly simple, changes in the behavior are directly reflected in the network activity. Essentially, what this means is that while the double occupancy is a consistent feature of the model, the exact shape of the peaks is more sensitive to hyperparameters, and we do not wish to draw any strong conclusions from them, given the simplicity of the biophysical model. However, we do agree that our model exhibits some differences with the data. As discussed above, we have included additional discussion regarding the potential existence of separate inputs for planning vs triggering the movement in the context of single reaches.

      Overall, we are excited about the suggestions made by the Reviewer here about using our approach to analyze other motor sequence datasets, but we think that in order to do this properly, one would need to adopt a more realistic musculo-skeletal model (such as one provided by MotorNet).

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the authors' cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and therefore it might be difficult for the brain to learn to produce the inputs that it finds. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      We agree that our choice of iLQR has limitations: while it offers the advantage of convergence guarantees, it does indeed restrict the choice of cost function and dynamics that we can use. We have now included extensive discussion of how the modeling choices affect our results.

      We do not view the lack of biological plausibility of iLQR as an issue, as the results are agnostic to the algorithm used for optimization. However, we agree that any structure imposed on the inputs (e.g. by enforcing them to be the output of a self-contained dynamical system) would likely alter the results. A potentially interesting extension of our model would be to do just what the reviewer suggested, and try to learn a network that can generate the optimal inputs. However, this is outside the scope of our investigation, as it would then lead to new questions (e.g. what brain region would that other RNN represent?).

      (5) Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. It is therefore an open question whether the objective and network characteristics suggested by the authors could also explain the presence of preparatory activity before e.g. grasping movements that are thought to be more sensory-driven (Meirhaeghe et al., Cell Reports 2023).

      It is true that we aren’t currently modeling sensory signals explicitly. However, some of the optimal inputs we infer may be capturing upstream information which could encompass some sensory information. This is currently unclear, and would likely depend on how exactly the model is specified. We have added new discussion to emphasize that our dynamics should not be understood as just representing M1, but more general circuits whose state can be decoded from M1.

      Reviewer #2 (Recommendations For The Authors):

      Additionally, thank you for pointing out various typos in the manuscript, we have fixed those!

      Reviewer 3:

      Thank you very much for your review, which makes a lot of very insightful points, and raises several interesting questions. In summary, we very much agree with the limitations you pointed out. In particular, the choice of input cost is something we had previously discussed, but we had found it challenging to decide on what a reasonable cost for “complexity” could be. Following your comment, we have however added a first attempt at penalizing “temporal complexity”, which shows promising behavior. We have only included those additional analyses as supplementary figures, and we have included new discussion, which hopefully highlights what we meant by the different model components, and how the model behavior may change as we vary some of our choices. We hope this can be informative for future models that may use a similar approach. Below, we highlight the changes that we have made to address your comments.

      The main limitation of the study is that it focuses exclusively on one specific constraint - magnitude - that could limit motor-cortex inputs. This isn't unreasonable, but other constraints are at least as likely, if less mathematically tractable. The basic results of this study will probably be robust with regard to such issues - generally speaking, any constraint on what can be delivered during execution will favor the strategy of preparing - but this robustness cuts both ways. It isn't clear that the constraint used in the present study - minimizing upstream energy costs - is the one that really matters. Upstream areas are likely to be limited in a variety of ways, including the complexity of inputs they can deliver. Indeed, one generally assumes that there are things that motor cortex can do that upstream areas can't do, which is where the real limitations should come from. Yet in the interest of a tractable cost function, the authors have built a system where motor cortex actually doesn't do anything that couldn't be done equally well by its inputs. The system might actually be better off if motor cortex were removed. About the only thing that motor cortex appears to contribute is some amplification, which is 'good' from the standpoint of the cost function (inputs can be smaller) but hardly satisfying from a scientific standpoint.

      The use of a term that punishes the squared magnitude of control signals has a long history, both because it creates mathematical tractability and because it (somewhat) maps onto the idea that one should minimize the energy expended by muscles and the possibility of damaging them with large inputs. One could make a case that those things apply to neural activity as well, and while that isn't unreasonable, it is far from clear whether this is actually true (and if it were, why punish the square if you are concerned about ATP expenditure?). Even if neural activity magnitude is an important cost, any costs should pertain not just to inputs but to motor cortex activity itself. I don't think the authors really wish to propose that squared input magnitude is the key thing to be regularized. Instead, this is simply an easily imposed constraint that is tractable and acts as a stand-in for other forms of regularization / other types of constraints. Put differently, if one could write down the 'true' cost function, it might contain a term related to squared magnitude, but other regularizing terms would be very likely to dominate. Using only squared magnitude is a reasonable way to get started, but there are also ways in which it appears to be limiting the results (see below).

      I would suggest that the study explore this topic a bit. Is it possible to use other forms of regularization? One appealing option is to constrain the complexity of inputs; a long-standing idea is that the role of motor cortex is to take relatively simple inputs and convert them to complex time-evolving inputs suitable for driving outputs. I realize that exploring this idea is not necessarily trivial. The right cost-function term is not clear (should it relate to low-dimensionality across conditions, or to smoothness across time?) and even if it were, it might not produce a convex cost function. Yet while exploring this possibility might be difficult, I think it is important for two reasons.

      First, this study is an elegant exploration of how preparation emerges due to constraints on inputs, but at present that exploration focuses exclusively on one constraint. Second, at present there are a variety of aspects of the model responses that appear somewhat unrealistic. I suspect most of these flow from the fact that while the magnitude of inputs is constrained, their complexity is not (they can control every motor cortex neuron at both low and high frequencies). Because inputs are not complexity-constrained, preparatory activity appears overly complex and never 'settles' into the plateaus that one often sees in data. To be fair, even in data these plateaus are often imperfect, but they are still a very noticeable feature in the response of many neurons. Furthermore, the top PCs usually contain a nice plateau. Yet we never get to see this in the present study. In part this is because the authors never simulate the situation of an unpredictable delay (more on this below) but it also seems to be because preparatory inputs are themselves strongly time-varying. More realistic forms of regularization would likely remedy this.

      That is a very good point, and it mirrors several concerns that we had in the past. While we did focus on the input norm for the sake of simplicity, and because it represents a very natural way to regularize our control solutions, we agree that a “complexity cost” may be better suited to models of brain circuits. We have addressed this in a supplementary investigation. We chose to focus on a cost that penalizes the temporal complexity of the inputs, as ||u(t+1) - u(t)||^2. Note that this required augmenting the state of the model, making the computations quite a bit slower; while it is doable if we only penalize the first temporal derivative, it would not scale well to higher orders.
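
      The state augmentation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used for the paper: it assumes discrete-time linear dynamics x(t+1) = A x(t) + B u(t), and the names augment, A, B, and U are hypothetical. Storing the previous input in the state turns the increment u(t) - u(t-1) into the control variable, so the temporal-complexity penalty becomes an ordinary quadratic control cost.

```python
import numpy as np

def augment(A, B):
    """Augmented dynamics with state z = [x; u_prev] and control du = u - u_prev."""
    n, m = B.shape
    A_aug = np.block([[A, B], [np.zeros((m, n)), np.eye(m)]])
    B_aug = np.vstack([B, np.eye(m)])
    return A_aug, B_aug

def temporal_complexity_cost(U):
    """Sum over t of ||u(t+1) - u(t)||^2 for inputs U of shape (T, m)."""
    dU = np.diff(U, axis=0)
    return float(np.sum(dU ** 2))

rng = np.random.default_rng(2)
n, m, T = 4, 2, 20
# Hypothetical system matrices and input sequence, for illustration only.
A = 0.5 * rng.normal(size=(n, n)) / np.sqrt(n)
B = rng.normal(size=(n, m))
U = rng.normal(size=(T, m))
x0 = rng.normal(size=n)

# Rollout of the original system driven by the inputs u(t).
x = x0.copy()
for t in range(T):
    x = A @ x + B @ U[t]
x_final_original = x

# Rollout of the augmented system driven by input increments.
# The stored previous input starts at zero, so the first increment equals u(0).
A_aug, B_aug = augment(A, B)
dU = np.diff(U, axis=0, prepend=np.zeros((1, m)))
z = np.concatenate([x0, np.zeros(m)])
for t in range(T):
    z = A_aug @ z + B_aug @ dU[t]
x_final_augmented = z[:n]
u_last_augmented = z[n:]

# In augmented coordinates the penalty is a plain squared-norm cost on the control
# (the first increment is excluded because it has no predecessor in U).
cost_augmented = float(np.sum(dU[1:] ** 2))
cost_original = temporal_complexity_cost(U)
```

      The augmented state has dimension n + m instead of n, which is one way to see why the computation becomes slower; penalizing higher-order differences would require storing further past inputs in the state.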

      Interestingly, we did find that the activity in that setting was somewhat more realistic (see new Supplementary Figure S8), with more sustained inputs and plateauing activity. While we have kept the original model for most of the investigations, the somewhat more realistic nature of the results under that setting suggests that further exploration of penalties of that sort could represent a promising avenue to improve the model.

      We also found the idea of a cost that would ensure low-dimensionality of the inputs across conditions very interesting. However, it is challenging to investigate with iLQR as we perform the optimization separately for each condition; nevertheless, it could be investigated using a different optimizer.

      At present, it is also not clear whether preparation always occurs even with no delay. Given only magnitude-based regularization, it wouldn't necessarily have to be. The authors should perform a subspace-based analysis like that in Figure 6, but for different delay durations. I think it is critical to explore whether the model, like monkeys, uses preparation even for zero-delay trials. At present it might or might not. If not, it may be because of the lack of more realistic constraints on inputs. One might then either need to include more realistic constraints to induce zero-delay preparation, or propose that the brain basically never uses a zero delay (it always delays the internal go cue after the preparatory inputs) and that this is a mechanism separate from that being modeled.

      I agree with the authors that the present version of the model, where optimization knows the exact time of movement onset, produces a reasonably realistic timecourse of preparation when compared to data from self-paced movements. At the same time, most readers will want to see that the model can produce realistic looking preparatory activity when presented with an unpredictable delay. I realize this may be an optimization nightmare, but there are probably ways to trick the model into optimizing to move soon, but then forcing it to wait (which is actually what monkeys are probably doing). Doing so would allow the model to produce preparation under the circumstances where most studies have examined it. In some ways this is just window-dressing (showing people something in a format they are used to and can digest) but it is actually more than that, because it would show that the model can produce a reasonable plateau of sustained preparation. At present it isn't clear it can do this, for the reasons noted above. If it can't, regularizing complexity might help (and even if this can't be shown, it could be discussed).

      In summary, I found this to be a very strong study overall, with a conceptually timely message that was well-explained and nicely documented by thorough simulations. I think it is critical to perform the test, noted above, of examining preparatory subspace activity across a range of delay durations (including zero) to see whether preparation endures as it does empirically. I think the issue of a more realistic cost function is also important, both in terms of the conceptual message and in terms of inducing the model to produce more realistic activity. Conceptually it matters because I don't think the central message should be 'preparation reduces upstream ATP usage by allowing motor cortex to be an amplifier'. I think the central message the authors wish to convey is that constraints on inputs make preparation a good strategy. Many of those constraints likely relate to the fact that upstream areas can't do things that motor cortex can do (else you wouldn't need a motor cortex) and it would be good if regularization reflected that assumption. Furthermore, additional forms of regularization would likely improve the realism of model responses, in ways that matter both aesthetically and conceptually. Yet while I think this is an important issue, it is also a deep and tricky one, and I think the authors need considerable leeway in how they address it. Many of the cost-function terms one might want to use may be intractable. The authors may have to do what makes sense given technical limitations. If some things can't be done technically, they may need to be addressed in words or via some other sort of non-optimization-based simulation.

      Specific comments

      As noted above, it would be good to show that preparatory subspace activity occurs similarly across delay durations. It actually might not, at present. For a zero ms delay, the simple magnitude-based regularization may be insufficient to induce preparation. If so, then the authors would either have to argue that a zero delay is actually never used internally (which is a reasonable argument) or show that other forms of regularization can induce zero-delay preparation.

      Yes, that is a very interesting analysis to perform, which we had not considered before! When investigating this, we found that the zero-delay strategy does not rely on preparation in the same way as is seen in the monkeys. This seems to be a reflection of the fact that our “Go cue” corresponds to an “internal” go cue which would likely come after the true, “external go cue” – such that we would indeed never actually be in the zero delay setting. This is not something we had addressed (or really considered) before, although we had tried to ensure we referred to “delta prep” as the duration of the preparatory period but not necessarily the delay period. We have now included more discussion on this topic, as well as a new Supplementary Figure S10.

      I agree with the authors that prior modeling work was limited by assuming the inputs to M1, which meant that prior work couldn't address the deep issue (tackled here) of why there should be any preparatory inputs at all. At the same time, the ability to hand-select inputs did provide some advantages. A strong assumption of prior work is that the inputs are 'simple', such that motor cortex must perform meaningful computations to convert them to outputs. This matters because if inputs can be anything, then they can just be the final outputs themselves, and motor cortex would have no job to do. Thus, prior work tried to assume the simplest inputs possible to motor cortex that could still explain the data. Most likely this went too far in the 'simple' direction, yet aspects of the simplicity were important for endowing responses with realistic properties. One such property is a large condition-invariant response just before movement onset. This is a very robust aspect of the data, and is explained by the assumption of a simple trigger signal that conveys information about when to move but is otherwise invariant to condition. Note that this is an implicit form of regularization, and one very different from that used in the present study: the input is allowed to be large, but constrained to be simple. Preparatory inputs are similarly constrained to be simple in the sense that they carry only information about which condition should be executed, but otherwise have little temporal structure. Arguably this produces slightly too simple preparatory-period responses, but the present study appears to go too far in the opposite direction. I would suggest that the authors do what they can to address these issues via simulations and/or discussion. I think it is fine if the conclusion is that there exist many constraints that tend to favor preparation, and that regularizing magnitude is just one easy way of demonstrating that. Ideally, other constraints would be explored. But even if they can't be, there should be some discussion of what is missing - preparatory plateaus, a realistic condition-invariant signal tied to movement onset - under the present modeling assumptions.

      As described above, we have now included two additional figures. In the first one (S8, already discussed above), we used a temporal smoothness prior, and we indeed get slightly more realistic activity plateaus. In a second supplementary figure (S9), we have also considered using model predictive control (MPC) to optimize the inputs under an uncertain go cue arrival time. There, we found that removing the assumption that the delay period is known came with new challenges: in particular, it requires the specification of a “mental model” of when the Go cue will arrive. While it is reasonable to expect that monkeys will have a prior over the go cue arrival time that will be shaped by the design of the experiment, some assumptions must be made about the utility functions that should be used to weigh this prior. For instance, if we imagine that monkeys carry a model of the possible arrival time of the go cue that is updated online, they could nonetheless act differently based on this information, for instance by either preparing so as to be ready for the earliest go cue possible or alternatively to be ready for the average go cue. This will likely depend on the exact task design and reward/penalty structure. Here, we added simulations with those two cases (making simplifying assumptions to make the problem tractable/solvable using model predictive control), and found that the “earliest preparation” strategy gives rise to more realistic plateauing activity, while the model where planning is done for the “most likely go time” does not. We suspect that more realistic activity patterns could be obtained by, e.g., combining this framework with the temporal smoothness cost. However, the main point we wished to make with this new supplementary figure is that it is possible to model the task in a slightly more realistic way (although here it comes at the cost of additional model assumptions). We have now added more discussion related to those points. Note that we have kept our analyses on these new models to a minimum, as the main takeaway we wish to convey from them is that most components of the model could be modified/made more realistic. This would impact the qualitative behavior of the system and match to data but – in the examples we have so far considered – does not appear to modify the general strategy of networks relying on preparation.

      On line 161, and in a few other places, the authors cite prior work as arguing for "autonomous internal dynamics in M1". I think it is worth being careful here because most of that work specifically stated that the dynamics are likely not internal to M1, and presumably involve inter-area loops and (at some latency) sensory feedback. The real claim of such work is that one can observe most of the key state variables in M1, such that there are periods of time where the dynamics are reasonably approximated as autonomous from a mathematical standpoint. This means that you can estimate the state from M1, and then there is some function that predicts the future state. This formal definition of autonomous shouldn't be conflated with an anatomical definition.

      Yes, that is a good point, thank you for making it so clearly! Indeed, as in previous work, we do not think of our “M1 dynamics” as being internal to M1; they may instead include sensory feedback / inter-area loops, which we summarize into the connectivity, chosen so that the dynamics qualitatively resemble data. We have now incorporated more discussion regarding what exactly the dynamics in our model represent.

      Round 2 of reviews

      Reviewer 3:

      My remaining comments largely pertain to some subtle (but to me important) nuances at a few locations in the text. These should be easy for the authors to address, in whatever way they see fit.

      Specific comments:

      (1) The authors state the following on line 56: "For preparatory processes to avoid triggering premature movement, any pre-movement activity in the motor and dorsal pre-motor (PMd) cortices must carefully exclude those pyramidal tract neurons."

      This constraint is overly restrictive. PT neurons absolutely can change their activity during preparation in principle (and appear to do so in practice). The key constraint is looser: those changes should have no net effect on the muscles. E.g., if d is the vector of changes in PT neuron firing rates, and b is the vector of weights, then the constraint is that b'd = 0. d = 0 is one good way of doing this, but only one. Half the d's could go up and half could go down. Or they all go up, but half the b's are negative. Put differently, there is no reason the null space has to be upstream of the PT neurons. It could be partly, or entirely, downstream. In the end, this doesn't change the point the authors are making. It is still the case that d has to be structured to avoid causing muscle activity, which raises exactly the point the authors care about: why risk this unless preparation brings benefits? However, this point can be made with a more accurate motivation. This matters, because people often think that a null-space is a tricky thing to engineer, when really it is quite natural. With enough neurons, preparing in the null space is quite simple.

      That is a good point – we have now reformulated this sentence to instead say “to avoid triggering premature movement, any pre-movement activity in the motor and dorsal premotor (PMd) cortices must engage the pyramidal tract neurons in a way that ensures their activity patterns will not lead to any movement”.
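
      The constraint b'd = 0 raised in comment (1) can be illustrated with a short numerical sketch. The number of PT neurons and the weights below are hypothetical, chosen only to show that a large, nonzero change d in firing rates can have no net effect on a muscle.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pt = 50                          # hypothetical number of PT neurons
b = rng.standard_normal(n_pt)      # hypothetical weights onto a single muscle

# An arbitrary, nonzero change in PT firing rates ...
d_raw = rng.standard_normal(n_pt)
# ... with its component along b removed, so that b'd = 0.
d = d_raw - (b @ d_raw) / (b @ b) * b

net_effect = b @ d                                        # zero up to rounding error
fraction_kept = np.linalg.norm(d) / np.linalg.norm(d_raw) # share of the change that survives
```

      With a single readout vector, only one of the n_pt dimensions is excluded, so most of an arbitrary pattern of rate changes survives the projection. This is in line with the remark that, with enough neurons, preparing in the null space is quite simple.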

      (2) Line 167: 'near-autonomous internal dynamics in M1'.

      It would be good if such statements, early in the paper, could be modified to reflect the fact that the dynamics observed in M1 may depend on recurrence that is NOT purely internal to M1. A better phrase might be 'near-autonomous dynamics that can be observed in M1'. A similar point applies on line 13. This issue is handled very thoughtfully in the Discussion, starting on line 713. Obviously it is not sensible to also add multiple sentences making the same point early on. However, it is still worth phrasing things carefully, otherwise the reader may have the wrong impression up until the Discussion (i.e. they may think that both the authors, and prior studies, believe that all the relevant dynamics are internal to M1). If possible, it might also be worth adding one sentence, somewhere early, to keep readers from falling into this hole (and then being stuck there till the Discussion digs them out).

      That is a good point: we have now edited the text after line 170 to make it clear that the underlying dynamics may not be confined to M1, and have referenced the later discussion there.

      (3) The authors make the point, starting on line 815, that transient (but strong) preparatory activity empirically occurs without a delay. They note that their model will do this but only if 'no delay' means 'no external delay'. For their model to prepare, there still needs to be an internal delay between when the first inputs arrive and when movement generating inputs arrive.

      This is not only a reasonable assumption, but is something that does indeed occur empirically. This can be seen in Figure 8c of Lara et al. Similarly, Kaufman et al. 2016 noted that "the sudden change in the CIS [the movement triggering event] occurred well after (~150 ms) the visual go cue... (~60 ms latency)". Behavioral experiments have also argued that internal movement-triggering events tend to be quite sluggish relative to the earliest they could be, causing RTs to be longer than they should be (Haith et al. Independence of Movement Preparation and Movement Initiation). Given this empirical support, the authors might wish to add a sentence indicating that the data tend to justify their assumption that the internal delay (separating the earliest response to sensory events from the events that actually cause movement to begin) never shrinks to zero.

      While on this topic, the Haith and Krakauer paper mentioned above is good to cite because it does ponder the question of whether preparation is really necessary. By showing that they could get RTs to shrink considerably before behavior became inaccurate, they showed that people normally (when not pressured) use more preparation time than they really need. Given Lara et al., we know that preparation does always occur, but Haith and Krakauer were quite right that it can be very brief. This helped -- along with neural results -- change our view of preparation from something more cognitive that had to occur, to something more mechanical that was simply a good network strategy, which is indeed the authors' current point. Working a discussion of this into the current paper may or may not make sense, but if there is a place where it is easy to cite, it would be appropriate.

      This is a nice suggestion, and we thank the reviewer for pointing us to the Haith and Krakauer paper. We have now added this reference and extended the paragraph following line 815 to briefly discuss the possible decoupling between preparation and movement initiation that is shown in the Haith paper, emphasizing how this may affect the interpretation of the internal delay and comparisons with behavioral experiments.

    2. eLife assessment

      This important study provides a new perspective on why preparatory activity occurs before the onset of movement. The authors report that when there is a cost on the inputs, the optimal inputs should start before the desired network output for a wide variety of recurrent networks. The authors present compelling evidence by combining mathematically tractable analyses in linear networks and numerical simulation in nonlinear networks.

    3. Reviewer #1 (Public Review):

      In this work, the authors investigate an important question - under what circumstances should a recurrent neural network optimised to produce motor control signals receive preparatory input before the initiation of a movement, even though it is possible to use inputs to drive activity just-in-time for movement?

      This question is important because many studies across animal models have shown that preparatory activity is widespread in neural populations close to motor output (e.g. motor cortex / M1), but it isn't clear under what circumstances this preparation is advantageous for performance, especially since preparation could cause unwanted motor output during a delay.

      They show that networks optimised under reasonable constraints (speed, accuracy, lack of pre-movement) will use the input to seed the state of the network before movement and that these inputs reduce the need for ongoing input during the movement. By examining many different parameters in simplified models they identify a strong connection between the structure of the network and the amount of preparation that is optimal for control - namely, that preparation has the most value when nullspaces are highly observable relative to the readout dimension and when the controllability of readout dimensions is low. They conclude by showing that their model predictions are consistent with the observation in monkey motor cortex that even when a sequence of two movements is known in advance, preparatory activity only arises shortly before movement initiation.

      Overall, this study provides valuable theoretical insight into the role of preparation in neural populations that generate motor output, and by treating input to motor cortex as a signal that is optimised directly this work is able to sidestep many of the problematic questions relating to estimating the potential inputs to motor cortex.

    4. Reviewer #2 (Public Review):

      This work clarifies neural mechanisms that can lead to a phenomenology consistent with motor preparation in its broader sense. In this context, motor preparation refers to activity that occurs before the corresponding movement. Another property often associated with preparatory activity is a correlation with global movement characteristics such as reach speed (Churchland et al., Neuron 2006), reach angle (Sun et al., Nature 2022), or grasp type (Meirhaeghe et al., Cell Reports 2023). Such activity has notably been observed in premotor and primary motor cortices, and it has been hypothesized to serve as an input to a motor execution circuit. The timing and mechanisms by which such 'preparatory' inputs are made available to motor execution circuits remain, however, unclear in general, especially in light of the presence of a 'trigger-like' signal that appears to relate to the transition from preparatory dynamics to execution activity (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021).

      The preparatory inputs have been hypothesized to fulfill one or several (non-mutually-exclusive) possible objectives. Two notable hypotheses are that these inputs could be shaped to maximize output accuracy under regularization of the input magnitude; or that they may help the flexible re-use of the neural machinery involved in the control of movements in different contexts.

      Here, the authors investigate in detail how the former hypothesis may be compatible with the presence of early inputs in recurrent network models driving arm movements, and compare models to data.

      Strengths:

      The authors are able to deploy an in-depth evaluation of inputs that are optimized for producing an accurate output at a pre-defined time while using a regularization term on the input magnitude, in the case of movements that are thought to be controlled in a quasi-open loop fashion such as reaches.

      First, the authors have identified that optimal control theory is a great framework to study this question as it provides methods to find and analyze exact solutions to this cost function in the case of models with linear dynamics. The authors not only use this framework to get an exact assessment of how much pre-movement input arises in large recurrent networks, but also give insight into the mechanisms by which it happens by dissecting in detail low-dimensional networks. The authors find that two key network properties - observability of the readout's nullspace and limited controllability - give rise to optimal inputs that are large before the start of the movement (while the corresponding network activity lies in the nullspace of the readout). Further, the authors numerically investigate the timing of optimized inputs in models with nonlinear dynamics, and find that pre-movement inputs can also arise in these more general networks. The authors also explore how some variations on their model's constraints - such as penalizing the input roughness or changing task contingencies about the go cue timing - affect their results. Finally, the authors point out some coarse-grained similarities between the pre-movement activity driven by the optimized inputs in some of the models they studied, and the phenomenology of preparation observed in the brain during single reaches and reach sequences. Overall, the authors deploy an impressive arsenal of tools and a very in-depth analysis of their models.
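
      For readers less familiar with the two network properties named above, the following sketch shows one way they could be quantified in a toy discrete-time linear network using Gramians. It is an illustration under stated assumptions, not the authors' analysis: the summary measures (average Gramian energy along an orthonormal basis), the function names, and the random network are all hypothetical.

```python
import numpy as np

def observability_gramian(A, C):
    """Q = sum_k (A^T)^k C^T C A^k, from A^T Q A - Q + C^T C = 0 (needs spectral radius < 1)."""
    n = A.shape[0]
    lhs = np.eye(n * n) - np.kron(A.T, A.T)
    Q = np.linalg.solve(lhs, (C.T @ C).ravel()).reshape(n, n)
    return 0.5 * (Q + Q.T)

def controllability_gramian(A, B):
    """P = sum_k A^k B B^T (A^T)^k, from A P A^T - P + B B^T = 0 (needs spectral radius < 1)."""
    n = A.shape[0]
    lhs = np.eye(n * n) - np.kron(A, A)
    P = np.linalg.solve(lhs, (B @ B.T).ravel()).reshape(n, n)
    return 0.5 * (P + P.T)

# Hypothetical toy network: random stable recurrence, 2-D readout, full-rank input.
rng = np.random.default_rng(0)
n, p = 12, 2
W = rng.standard_normal((n, n)) / np.sqrt(n)
A = 0.9 * W / np.max(np.abs(np.linalg.eigvals(W)))
B = np.eye(n)
C = rng.standard_normal((p, n)) / np.sqrt(n)

Q = observability_gramian(A, C)
P = controllability_gramian(A, B)

# Orthonormal bases for the readout row space and its nullspace, from the SVD of C.
_, _, Vt = np.linalg.svd(C)
readout_basis, null_basis = Vt[:p].T, Vt[p:].T

# Illustrative summaries (assumed definitions): average output energy evoked by
# nullspace initial states, and average input-driven energy along readout directions.
null_observability = np.trace(null_basis.T @ Q @ null_basis) / (n - p)
readout_controllability = np.trace(readout_basis.T @ P @ readout_basis) / p
```

      In this picture, a state in the readout nullspace produces no output at the moment it is set, but the recurrent dynamics can later rotate it into the readout; the observability Gramian measures how much output energy results.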

      Outstanding questions that could lead to interesting follow-up work:

      Like all great pieces of research, this article makes it clear where current limitations lie and therefore opens up opportunities for future work.

      (1) Though the optimal control theory framework is ideal for determining inputs that minimize output error while regularizing the input norm or other simple input features, it cannot easily account for some other varied types of objectives - especially those that may lead to a complex optimization landscape. For instance, the reusability of parts of the circuit, sparse use of additional neurons when learning many movements, and ease of planning (especially under uncertainty about when to start the movement), may be alternative or additional reasons that could help explain the preparatory activity observed in the brain. It is interesting to note that inputs that optimize the objective chosen by the authors arguably lead to a trade-off in terms of other desirable objectives. Specifically, the inputs the authors derive are time-dependent, so a recurrent network would be needed to produce them and it may not be easy to interpolate between them to drive new movement variants. In addition, these inputs depend on the desired time of output and therefore make it difficult to plan, e.g. in circumstances when timing should be decided depending on sensory signals. Finally, these inputs are specific to the full movement chain that will unfold, so they do not permit reuse of the inputs e.g. in movement sequences of different orders. Of note, the authors have pointed out in the discussion how their framework may be extended in future work to account for some additional objectives, such as inputs' temporal smoothness or some strategies for dealing with go cue timing uncertainty.

      (2) Relatedly, if the motor circuits were to balance different types of objectives, the activity and inputs occurring before each movement may be broken down into different categories that may each specialize into their own objective. For instance, previous work (Kaufman et al. eNeuron 2016, Inagaki et al., Cell 2022, Zimnik and Churchland, Nature Neuroscience 2021) has suggested that inputs occurring before the movement could be broken down into preparatory inputs 'stricto sensu' - relating to the planned characteristics of the movement - and a trigger signal, relating to the transition from planning to execution - irrespective of whether the movement is internally timed or triggered by an external event. The current work does not address which type(s) of early input may be labeled as 'preparatory' or may be thought of as a part of 'planning' computations, or whether these inputs may come from several different source circuits. Future research could investigate these questions using a different approach, for instance, by including structural constraints from brain architecture into a neural network model.

      (3) While the authors rightly point out some similarities between the inputs that they derive and observed preparatory activity in the brain, notably during motor sequences, there are also some differences. For instance, while both the derived inputs and the data show two peaks during sequences, the data reproduced from Zimnik and Churchland show preparatory inputs that have a very asymmetric shape that really plummets before the start of the next movement, whereas the derived inputs have larger amplitude during the movement period - especially for the second movement of the sequence. In addition, the data show trigger-like signals before each of the two reaches. Finally, while the data show a very high correlation between the pattern of preparatory activity of the second reach in the double reach and compound reach conditions, the derived inputs appear to be more different between the two conditions. Note that the data would be consistent with separate planning of the two reaches even in the compound reach condition, as well as the re-use of the preparatory input between the compound and double reach conditions. Therefore, different motor sequence datasets - notably, those that would show even more coarticulation between submovements - may be more promising for finding a tight match between the data and the authors' inputs. In the future, further analyses in these datasets could help determine whether the coarticulation could be due to simple filtering by the circuits and muscles downstream of M1, planning of movements with adjusted curvature to mitigate the work performed by the muscles while permitting some amount of re-use across different sequences, or - as suggested by the authors - inputs fully tailored to one specific movement sequence that maximize accuracy and minimize the M1 input magnitude.

      (4) Though iLQR is a powerful optimization method to find inputs optimizing the authors' cost function, it also has some limitations. First, given that it relies on a linearization of the dynamics at each timestep, it has a limited ability to leverage potential advantages of nonlinearities in the dynamics. Second, the iLQR algorithm is not a biologically plausible learning rule and does not account for biological constraints affecting the circuits that produce and process these inputs. Therefore, it might be difficult for the brain to learn to produce the inputs that it finds. Consequently, when observing differences between model and data, this can confound the question of whether it comes from a difference of assumed objective or a difference of optimization procedure or circuit implementation. It remains unclear whether using alternative algorithms with different limitations - for instance, using variants of BPTT to train a separate RNN to produce the inputs in question - could impact some of the results.

      (5) Under the objective considered by the authors, the amount of input occurring before the movement might be impacted by the presence of online sensory signals for closed-loop control. Even if the inputs include some sensory activity and/or the RNN activity could represent all general variables (e.g. sensory) whose states can be decoded from M1, the model does not currently include mechanisms that process imperfect (delayed, noisy) sensory feedback to adapt the output in a trial-specific manner. The information related to such sensory feedback cannot be anticipated, and therefore the related input would have to reach the motor cortex after preparation. Thus, it is an open question whether the objective and network characteristics suggested by the authors could also explain the presence of large preparatory activity before e.g. grasping movements that are thought to be more sensory-feedback-driven (Meirhaeghe et al., Cell Reports 2023).

      (6) More broadly, with the type of objectives that the authors assume the inputs fulfill, some M1 properties that lead to strong preparation - notably, limited readout controllability - may not be favorable for control in general, so it would be interesting if other objectives and assumptions could robustly lead to strong preparation under more general M1 properties.

    5. Reviewer #3 (Public Review):

      I remain enthusiastic about this study. The manuscript is well-written, logical, and conceptually clear. To my knowledge, no prior modeling study has tackled the question of 'why prepare before executing, why not just execute?' Prior studies have simply assumed, to emulate empirical findings, that preparatory inputs precede execution. They never asked why. The authors show that, when there are constraints on inputs, preparation becomes a natural strategy. In contrast, with no constraint on inputs, there is no need for preparation as one could get anything one liked just via the inputs during movement. For the sake of tractability, the authors use a simple magnitude constraint: the cost function punishes the integral of the squared inputs. Thus, if small inputs before movement can reduce the size of the inputs needed during movement, preparation is a good strategy. This occurs if (and only if) the network has strong dynamics (otherwise feeding it preparatory activity would not produce anything interesting). All of this is sensible and clarifying.
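
      The logic of the magnitude constraint can be sketched numerically. The example below is a deliberately simplified stand-in, not the authors' model: it assumes discrete-time linear dynamics, a quadratic penalty on output error (including any output before movement onset), and a penalty lam on the summed squared inputs, and it solves the resulting least-squares problem in closed form. All names, sizes, and weights are hypothetical. It compares the optimum when inputs are allowed before movement onset with the optimum when they are not.

```python
import numpy as np

def input_to_output_map(A, B, C, T):
    """Matrix G with y = G @ u for x_0 = 0, where x_{t+1} = A x_t + B u_t and y_t = C x_t;
    u stacks u_0..u_{T-1} and y stacks y_1..y_T."""
    m, p = B.shape[1], C.shape[0]
    G = np.zeros((T * p, T * m))
    for t in range(1, T + 1):          # output time
        for s in range(t):             # input time (s < t)
            G[(t - 1) * p:t * p, s * m:(s + 1) * m] = (
                C @ np.linalg.matrix_power(A, t - 1 - s) @ B
            )
    return G

def solve_inputs(G, y_target, lam, free):
    """Minimise ||G u - y_target||^2 + lam * ||u||^2 over the entries of u where
    `free` is True; all other entries are held at zero."""
    Gf = G[:, free]
    u = np.zeros(G.shape[1])
    u[free] = np.linalg.solve(Gf.T @ Gf + lam * np.eye(Gf.shape[1]), Gf.T @ y_target)
    cost = np.sum((G @ u - y_target) ** 2) + lam * np.sum(u ** 2)
    return u, float(cost)

# Hypothetical toy setup (every number here is an illustrative assumption).
rng = np.random.default_rng(0)
n, p, T_prep, T_move, lam = 20, 2, 10, 20, 0.1
T = T_prep + T_move
W = rng.standard_normal((n, n)) / np.sqrt(n)
A = 0.95 * W / np.max(np.abs(np.linalg.eigvals(W)))   # stable recurrent dynamics
B = np.eye(n)                                          # inputs can reach every unit
C = rng.standard_normal((p, n)) / np.sqrt(n)           # low-dimensional readout
m = B.shape[1]

G = input_to_output_map(A, B, C, T)

# Target output: zero before movement onset, a smooth transient afterwards.
t_out = np.arange(1, T + 1)
move = t_out > T_prep
y_target = np.zeros((T, p))
y_target[move, 0] = np.sin(np.pi * (t_out[move] - T_prep) / T_move)
y_target[move, 1] = np.sin(2 * np.pi * (t_out[move] - T_prep) / T_move)
y_target = y_target.ravel()

is_prep_input = np.repeat(np.arange(T) < T_prep, m)   # inputs applied before onset
u_full, cost_full = solve_inputs(G, y_target, lam, np.ones(T * m, dtype=bool))
u_noprep, cost_noprep = solve_inputs(G, y_target, lam, ~is_prep_input)

# Share of the optimal input energy that is spent before movement onset.
prep_fraction = np.sum(u_full[is_prep_input] ** 2) / np.sum(u_full ** 2)
print(f"cost, preparation allowed:   {cost_full:.4f}")
print(f"cost, preparation forbidden: {cost_noprep:.4f}")
print(f"fraction of input energy before onset: {prep_fraction:.3f}")
```

      By construction, the cost with preparation allowed can never exceed the cost with preparation forbidden, since the second problem is a restriction of the first. How large the gap is, and how much input energy is spent before onset, depends on the recurrent dynamics, which is the dependence the manuscript analyses.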

      As discussed in the prior round of reviews, the central constraint that the authors use is a mathematically tractable stand-in for a range of plausible (but often trickier to define and evaluate) constraints, such as simplicity of inputs (or inputs being things that other areas could provide). The manuscript now embraces this fact more explicitly and also gives some results showing that other constraints (such as on the derivative of activity, which is one component of complexity) can have the same effect. The manuscript also now discusses and addresses a modest weakness of the previous manuscript: the preparatory activity in their simulations is often overly complex temporally, lacking the (rough) plateau typically seen for data. Depending on your point of view, this is simply 'window dressing', but from my perspective it was important to know that their approach could yield more realistic-looking preparatory activity.

      The most recent version of the manuscript also has a useful section in the Discussion on the topic of preparation when there is no external delay, which I found helpful given prior behavioral and physiological studies arguing that preparation can 1) be very brief, but 2) is always present. These findings mesh nicely with the authors' central result that preparation is a good network strategy, and that it would thus be normative for there to be at least a brief interval of preparation even when not imposed externally.

    1. eLife assessment

      This is a potentially useful study that shows changes in the chromatin landscape of GABAergic neurons in induced pluripotent stem cells (iPSCs) derived from both Dravet Syndrome (DS) patients and healthy donors. The strength of the evidence is currently incomplete because the authors compared iPSCs from different individuals, rather than isogenic controls. A strategy for minimizing variability across cell lines is used, but the explanation is not complete. The revised manuscript adds RNAseq and qPCR measurements of the expression of the gene SCN1A, however these do not appear to agree, perhaps because of the way the qPCR measurements are normalized, and there is no measurement of Nav1.1, the gene product thought to be responsible for the majority of DS cases. Hence the evidence that there is reduced expression of SCN1A or its gene product is not complete and therefore it is difficult to evaluate whether or not the observed epigenetic changes are causal. The work would potentially be of interest to scientists who study development, developmental disorders, and epigenetic contributions to disease.

    2. Reviewer #2 (Public Review):

      Summary:

      Overall this is an interesting innovative study that examines chromatin accessibility in an inhibitory iPSC model of Dravet Syndrome. The authors detect a potential intriguing development defect in the patient-specific neurons, however the correlation with gene expression or protein abundance is not compelling and the variability of the data is still difficult to determine.

      Strengths:

      (1) This is a novel and interesting study that aims to investigate the epigenetic changes that occur in a sodium channel model of epilepsy; these changes are often ignored, but they are also an interesting area for future therapeutics.

      (2) The paper is well written with good graphics and flow.

      (3) With caveats noted below, there is an intriguing developmental defect in GABAergic neuron differentiation in this model. It would be interesting to see how this correlated with the expression of SCN1A, and I was surprised this was not addressed in the manuscript via RNA/protein abundance, nor how the absence of a sodium channel can accelerate differentiation when a priori I might expect the opposite (as less 'neuronal' signal)

      (4) There is exploratory analysis that VPA alters chromatin accessibility at an individual-specific level. Though it was not noted if any of the DS patients,

      Weaknesses addressed:

      (1) Representative images for cell-identity markers are now shown for D19 and D65.

      (2) The methods now state that three differentiations were performed.

      (3) The authors address a possible role for cell death in data obtained from their cultures by assessing viability with trypan blue staining.

      (4) Some features of ATAC signal normalization and enrichment analysis have been better documented.

      (5) Some of the variability in key results is better documented.

      Weaknesses poorly or not addressed:

      (1) Although the authors include prior RNAseq data and report on qPCR measurements for SCN1A (Supp Fig 1), these do not on the surface appear to agree, with the RNAseq showing little apparent difference between patients and controls, while the qPCR seems to show a two-fold difference at D65. This is likely a misleading artifact of normalizing PCR expression to that at D0, when the gene is not expressed but has mildly different low levels in patients and controls. No measurement of the protein product or its function is included. This is a major weakness that casts doubt on the core hypothesis that epigenetic changes play a key causal role in Dravet syndrome.

      (2) Although some QC on ATAC is described, QC performed on iPSC lines, i.e. karyotype/CNV analysis and confirmation of genotypes is not described in the paper.

      (3) The authors describe a method for trying to diminish variability but do not adequately explain this method or how much variability remains in many of their measures.

      (4) Given that VPA would be administered in patients with fully mature inhibitory neurons, it is difficult to determine the biological relevance of these findings.
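
      The normalization concern in weakness (1) can be illustrated with a small worked example. This is a sketch that assumes the common 2^-ΔΔCt calculation (the manuscript's exact method is not stated here), and all Ct values are hypothetical. Both groups are given identical SCN1A levels at D65, yet a one-cycle difference in the near-background D0 signal produces an apparent two-fold difference after normalization to D0.

```python
def fold_change(ct_target, ct_ref, ct_target_base, ct_ref_base):
    """Relative expression by 2^-ddCt against a baseline sample."""
    d_ct = ct_target - ct_ref                 # sample of interest
    d_ct_base = ct_target_base - ct_ref_base  # baseline sample (here D0)
    return 2.0 ** -(d_ct - d_ct_base)


# Hypothetical Ct values, not taken from the manuscript.
# D65: identical target and reference Ct in controls and patients.
# D0: target Ct near background, differing by one cycle between groups.
fc_control = fold_change(26.0, 18.0, 35.0, 18.0)  # D65 vs D0, control
fc_patient = fold_change(26.0, 18.0, 34.0, 18.0)  # D65 vs D0, patient

# Apparent patient/control ratio after D0 normalization.
ratio = fc_patient / fc_control
print(fc_control, fc_patient, ratio)
```

      Comparing ΔCt values at D65 directly (target minus reference gene) avoids dividing by a near-zero baseline.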

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Review: 

      This study used ATAC-Seq to characterize chromatin accessibility during stages of GABAergic neuron development in induced pluripotent stem cells (iPSCs) derived from both Dravet Syndrome (DS) patients and healthy donors. The authors report accelerated GABAergic maturation to a point, followed by further differentiation into a perturbed chromatin profile, in the cells from patients. In a preliminary analysis, valproic acid, an anti-seizure medication commonly used in patients with DS, increased open chromatin in both patient and control iPSCs in a nonspecific manner, and to different degrees in cultures derived from different patients. These findings provide new information about DS-associated changes in chromatin, and provide further evidence for developmental abnormalities in interneurons with DS. 

      Strengths:

      This is a novel study that aims to investigate the epigenetic changes that occur in a sodium channel model of epilepsy; these changes are often ignored but may be an interesting area for future therapeutics. In general, the flow of the paper is good, and the figures are well-designed.  Reply: Thank you for your positive feedback about our work. 

      Weaknesses:

      The most substantial weakness relates to the observation that DS is often viewed as a monogenic form of epilepsy. It is directly linked to SCN1A gene haploinsufficiency (Yu et al, 2006; Ogiwara et al, 2007). The gene product is Nav1.1, the alpha subunit of voltage-gated sodium channel type I that regulates neuronal excitability. Yet, analysis was conducted at time points of GABAergic interneuron differentiation in which SCN1A is likely not expressed. The paper would be strengthened if SCN1A expression and Nav1.1 protein were examined across the experimental time course. If SCN1A is not yet expressed, this would complicate any explanation of how the observed epigenetic changes might arise. It also seems counterintuitive that the absence of a sodium channel can accelerate differentiation, when, a priori, one might expect the opposite (a 'less neuronal' signal). 

      Thanks, this is an important point! In our revised manuscript, we have incorporated data on the expression of SCN1A at d19 and d65 of GABAergic development in both the control and patient groups. We first retrieved data from our previous RNA-Seq analysis, showing SCN1A gene expression in our cells at both d19 and d65. We have now updated our text on the SCN1A gene expression in the revised manuscript (Revised Supplementary Figure 1A, revised text Line 108-109). Second, we confirmed the dynamics of SCN1A expression by real-time quantitative RT/PCR analysis at four time points of GABAergic development (d0, d19, d35 and d65). Notably, expression of SCN1A was detected by qRT-PCR from d19 and the expression increased with differentiation. We have now included this information in the revised manuscript (Revised Supplementary Figure 1B, revised text Line 112). 

      Related to this, another important limitation of the study is that the controls are cells derived from healthy individuals and not from isogenic lines. The usage of isogenic lines is extremely relevant for every study in which iPSC-derived somatic cells are used to model a disease, but specifically in diseases like DS, in which the genetic background has an ascertained impact on disease phenotype (Cetica et al, 2017 and others). This serious limitation should be considered.

      Yes, we fully agree that isogenic and edited patient-derived iPSC would have been the ideal controls. At an early stage we therefore invested considerable time and effort in order to generate isogenic lines from patient-derived iPSC. However, editing of the SCN1A variants in patient-derived iPSC turned out to be unsuccessful after several trials and modifications, so we finally turned to iPSC from healthy donors. This is now discussed together with other limitations of our study in the revised manuscript (end of discussion section, lines 499-506).

      In addition, the authors should provide data on variability across cell lines and differentiations to help convince the reader that the results can be attributed to genetic defects, rather than variability across individuals. 

      This is a valuable point. In the revised manuscript, we have now added plots and IF staining from individual samples to give the readers a complete picture on how they are distributed (Revised Supplementary Figure 1C, Revised Supplementary Figure 2, and Revised Supplementary Figure 4).

      In the revised manuscript, we explain the strategy used to compare the two groups (cases vs. controls) in more detail. In our analysis, we first compared the dynamic changes of chromatin accessibility cell line by cell line across differentiation. We then extracted the common changes from different cell lines at each time point (Revised text line 152-155, line 226-228). Using this strategy, we extracted the common changes confined to the control and patient groups, respectively. With this approach we avoid capturing the variability across individuals.
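
      The comparison strategy described above can be sketched as a set intersection. This is a minimal illustration under stated assumptions, not the authors' pipeline: the region identifiers are hypothetical, and matching regions by exact identifier stands in for whatever peak-overlap rule was actually used. Only the cell line names come from the response.

```python
from functools import reduce


def common_changes(per_line_changes):
    """Return the regions called as changed in every cell line of a group."""
    if not per_line_changes:
        return set()
    return reduce(set.intersection, (set(v) for v in per_line_changes.values()))


# Hypothetical differentially accessible regions per control line at one time point.
control = {
    "Ctl1B": {"region_1", "region_2", "region_3"},
    "Ctl7C": {"region_2", "region_3", "region_4"},
    "Ctl8F": {"region_2", "region_3", "region_5"},
}

# Regions changed in all three lines; line-specific regions are dropped.
common = common_changes(control)
print(sorted(common))
```

      Intersecting per-line calls is conservative: a change that is real but missed in one line is discarded, which is one way residual variability can still shape the result.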

      Additionally, the authors acknowledge the variability of the differentiations and cell lines, which is commendable, and they attribute this to "possibly reflecting cell line specific and endogenous differences reported previously", but could also have to do with cell death. This is a large confounding factor for ATAC-seq. Certainly, Sup Fig 1C shows lower FrIP scores, consistent with cell death, and there seems to be a lot of death in the representative images. Moreover, the iGABA neurons are very difficult to keep alive, especially to 65 days, without co-culturing with glia and/or glutamatergic neurons. The authors should comment on how much these factors may have influenced their results. 

      With this point in mind, we re-examined QC of our ATAC-Seq across all samples. As shown in revised Supplementary Figure 2C and Supplementary Figure 4C, our cutoff for FRiP is 15%, and all samples have a FRiP of more than 15%. At the later time points (d35 or d65), we did not observe a FRiP <15%. We therefore feel confident that the quality of ATAC-Seq is good enough for downstream analysis and data interpretation.  
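
      For readers unfamiliar with the metric, FRiP (fraction of reads in peaks) is the number of reads overlapping called peaks divided by the total number of mapped reads. The sketch below applies the 15% cutoff mentioned above; the sample names and read counts are hypothetical.

```python
FRIP_CUTOFF = 0.15  # cutoff stated in the response


def frip(reads_in_peaks, total_reads):
    """Fraction of mapped reads that fall inside called peaks."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_peaks / total_reads


# Hypothetical (reads_in_peaks, total_mapped_reads) per sample.
samples = {
    "sample_A": (4_000_000, 20_000_000),
    "sample_B": (2_000_000, 20_000_000),
}

scores = {name: frip(*counts) for name, counts in samples.items()}
passing = [name for name, score in scores.items() if score > FRIP_CUTOFF]
print(scores, passing)
```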

      Regarding the differentiation protocol, we follow a directed protocol of iPSC differentiation towards interneurons. The protocol is described in detail by Maroof et al (reference 34) and slightly modified in our lab (described in reference 13). With our modified protocol, GABAergic cells are viable beyond day 65 without the need for co-cultures with astrocytes or microglia. This is also reflected by the electrophysiological activity of interneurons at d65 and at later time points (reference 13). Additionally, our ambition was to obtain a homogeneous cell population for further analysis. Adding other cell types to the cultures would have interfered with downstream processes and created a need for cell sorting. Using our protocol, we obtain viable GABA interneurons after up to 100 days in culture. To assess the viability of our cells at the point of sampling (other than by morphological assessment), we used Trypan blue staining and an automated cell counter. Only samples with a viability >90%, a commonly used cut-off for cell viability, were processed for ATAC-seq. We have now modified the method section in the revised version to describe the GABAergic differentiation and sampling (line 519-529).

      Finally, changes in gene expression are only inferred, as no RNA levels were measured. If RNA-seq was not possible it would have been good to see at least some of the key genes/findings corroborated with RNA/protein levels vs chromatin accessibility alone, particularly given that these molecular readouts do not always correlate. 

      In our revised manuscript, we include our recently published RNA-seq performed at d19 and d65. We also correlated the RNAseq and ATACseq data obtained from the same samples. The Pearson correlations between gene expression and chromatin accessibility were within the range 0.49-0.57 (Revised Supplementary Figure 2G, Revised Supplementary Figure 4G), which is acceptable according to standard criteria. The results confirmed that the quality of ATAC-Seq is good enough for analysis of expression levels and chromatin openness in key genes. We also added gene expression levels from RNA-seq (d19 and d65) in our revised manuscript (Revised Figure 1G, Revised Figure 2G). Finally, we performed qRT-PCR analysis of key genes in each cluster and the results are now included in the revised version (Revised Supplementary Figure 3E, Revised Supplementary Figure 5E).
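
      The correlation reported above is a per-gene Pearson coefficient between two paired measurements. The sketch below shows the calculation on toy numbers; how expression and accessibility were paired per gene (for example, log-transformed expression against promoter accessibility) is an assumption here and is not specified in the response.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_x * ss_y)


# Toy per-gene values, not data from the study.
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
accessibility = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(expression, accessibility)
print(round(r, 3))
```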

      Additional Points:

      (1) Representative images for cell-identity markers for only D65 are shown, and not D0, D19, and D35 though it is stated in the text that this was performed. At a minimum, these representative images should be shown for all lines. 

      As suggested, we have now added images for cell identity markers of all iPSC lines in the revised version (Revised Supplementary Figure 1C).

      (2) What QC was performed on iPSC lines, i.e. karyotype/CNV analysis and confirmation of genotypes?

      All iPSC lines used in this study have been fully characterized according to standard and state-of-the-art procedures: Expression of pluripotency and stemness genes has been shown by immunostaining, flow cytometry and scorecard analysis; integrity of the genome has been assessed by karyotyping using g-banding; differentiation capacity was characterized using an embryoid body assay in combination with scorecard analysis; and genotypes were verified by Sanger sequencing. Please see the following publications for full datasets: Schuster et al, Neurobiol Dis 2019, Schuster et al Stem Cell Res 2019, Sobol et al Stem Cells and Development 2015. In our lab, the integrity of iPSC lines is routinely verified using flow cytometry (expression for TRA-1-60 and SSEA4), immunostaining (expression of NANOG, SOX2 and OCT4), Sanger sequencing (targeting variants in SCN1A gene), cell morphology analysis and analysis of mycoplasma by MycoAlert® (Lonza).

      (3) Were all experiments performed on a single differentiation? Or multiples? Were the differentiations performed with the same type? If not, was batch considered in the analysis? 

      Thank you for raising this question. The text in Materials and Methods has been modified as follows, to better describe the differentiation and sampling procedure:

      “GABAergic interneuron differentiation from iPSCs was performed as previously described (reference 13). The protocol utilizes DUAL SMAD inhibition to induce neurogenesis towards neural stem cells for 10 days, followed by patterning with high levels of sonic hedgehog for nine days towards cortically fated neuronal progenitor cells (NPC) and subsequent maturation for 46 days, i.e. a total of 65 days (Figure 1A). Neuronal cells at day 65 and onwards are healthy and viable as judged by morphological assessment by light microscopy. Differentiation was performed at least 3 times per cell line.  

      Cell cultures were sampled at days 0 (D0), D19, D35 and D65, respectively, by harvesting cells with TryplE and centrifugation (300 x g, 3 min). Harvested cells were counted and assessed for viability using trypan blue staining and an automated EVE cell counter (Nano Entek). Samples with a viability of >90% were chosen for ATAC-Seq library preparation (see below).”.  

      I also assume that technical replicates were merged, and then all three biological replicates were kept for each analysis and outliers were not removed, e.g. Control_D19_8F seems like an example of an outlier. 

      This is a valuable point. We agree that there is variability across the three healthy donors and the patients, respectively, but the quality of ATAC-Seq is good after multiple QC assessments (Revised Supplementary Figure 2B-D). The color code in Supplementary Figure 1C may be misleading, as the Pearson correlation of all samples was displayed. Overall, the correlations among ATAC-seq replicates are over 0.8. At the same time, we observed that samples at d0 are clustered together, but not at the later time points. We interpret this as related to the cell-line specific plasticity of chromatin dynamics during differentiation. The observation agrees with our results from PCA (Revised Supplementary Figure 2F).  

      (4) In Figure 1C, it is intriguing that the ATACseq signal gets stronger in imN. One might expect it to be strongest in the iPSCs which are undifferentiated and have the highest levels of open chromatin. Is this a function of sequencing depth, or are all the Y-axes normalized across all time points? 

      This is another valuable point. Figure 1C presents the average chromatin openness for cluster-specific regions, not the chromatin openness of the entire genome, which is one reason why the chromatin openness at D35 is higher than at other time points. The genome-wide chromatin openness is presented in revised Supplementary Figure 2D and we have now updated the figure legend to avoid any potential misunderstanding. 

      The sequencing depth for each sample is extracted in a similar range. To give the readers a complete picture, we also present the depth of sequencing reads for each sample (Revised Supplementary Figure 2A and Revised Supplementary Figure 4A). The Y-axes of genome browser tracks were normalized, and we added the normalized value in the figures. 

      (5) In Figure 1F, are these all enriched terms, or were they prioritized somehow? 

      Yes, the enriched terms are prioritized based on biological meanings, and we have now clarified this in the updated legend of the manuscript. In addition, all enriched terms are now included in revised Supplementary Table 2 and Supplementary Table 4. 

      (6) In Figure 1G (also the same plots in Fig 2/3), are all these images normalized i.e. there is no scale bar for each track, and do they represent and aggregate BAM/bigwig?

      Yes, the genome browser tracks were normalized and we have now revised the figures by adding scale bars.

      It would be good to show in supplement the variability across cell lines/diffs - particularly given the variability in the heatmap/PCA - and demonstrate the rigor/reproducibility of these results. This comment applies to all these plots across the 3 figures, particularly as in some instances the samples appear to cluster by individual first and then time point (Sup Fig 3B). 

      Thanks. We have now revised the figure with plots showing individual samples. 

      How confident are the authors that these effects are driven by genotype and not a single cell line? In the Fig 3D representation of NANOG, it is very difficult to see any difference between patient and control. 

      In Figure 3D, we showed common chromatin dynamics in the control and patient groups. To avoid any misunderstanding, we have now updated our legend in the revised manuscript. 

      (7) For the changes in occupancy annotation (UTR/exon/intron etc), are these differences still significant after correcting for variability from cell line to cell line at each time point? I.e. rather than average across all three samples, what is the range?  Reply: Revised accordingly. 

      (8) The VPA timepoint is not well-justified. Given that VPA would be administered in patients with fully mature inhibitory neurons, it is difficult to determine the biological relevance. I appreciate that this is a limitation of the model, but this should at least be addressed in the manuscript. 

      We agree that our model system of GABAergic interneuron development has limitations and that cells may not fully recapitulate the development and physiology in vivo. Obvious factors to consider in our system are the directed protocol to enrich for GABAergic interneurons and the differentiation time-line restricted to 65d. This is now discussed (lines 499-506).

      Recommendations for the authors:

      (1) The term 'mutation' has been replaced with the term ' pathogenic variant' or likely pathogenic variant depending on the context, please see PMID: 25741868 

      Thank you for pointing this out. We have replaced all instances of “mutation” with “pathogenic variant” throughout the manuscript.

      (2) It is unclear what the nomenclature for sample labelling is in Supplementary Figure 1, e.g. 7C, 8F, 1B.  

      We apologize for this confusion. These are cell line names. We labeled all data and images according to cell line name, i.e. control lines: Ctl1B, Ctl7C and Ctl8F; patient lines: DD1C, DD4A, DD5A. To avoid any potential confusion, we have added a note in the revised legend of Supplementary Figure 1B.

      (3) Can the authors confirm that the Deseq2 FDR values are Benjamini-Hochberg procedure corrected per default settings? If so, this should ideally be added to methods or legend for clarity 

      Yes, the DESeq2 FDR values were computed with default settings (Benjamini-Hochberg correction); this has been added to the methods section of the revised manuscript. 

      (4) While it makes sense that the authors present the data in the order of Figure 1, and Figure 2, this actually makes it quite difficult to compare the two datasets, especially for the functional enrichment in the "F" figures. It may be helpful to consider re-organizing the figure order. For instance, for the long-term potentiation signal in the DS-iPSCs, what does this mean in terms of biological relevance? Or maybe Figure 2 needs to be supplementary given that Figure 3 is a more direct comparison.  

      Thank you for the suggestions. We attempted to reorganize during our revision. We still believe it is easier for the audience to grasp the main message if we organize it according to our current workflow—first presenting an individual differential landscape for controls and patients, and then comparing the common and unique aspects among them.
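
      Regarding recommendation (3) above: DESeq2's `results` function adjusts p-values with the Benjamini-Hochberg method by default (`pAdjustMethod = "BH"`). The sketch below shows the plain BH adjustment on toy p-values. It is an illustration of the procedure only and does not reproduce DESeq2's independent filtering step, which changes which tests enter the correction.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values, not data from the study.
pvals = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(pvals)
print(padj)
```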

    1. eLife assessment

      This important study identifies the anti-inflammatory function of PEGylated PDZ peptides that are derived from the ZO-1 protein. Results from cellular and in vivo experiments tracking key inflammatory markers are compelling. Although the present study would benefit from investigating chronic inflammation conditions using microbe and protein data, the work provides a proof of concept for developing novel strategies against acute inflammatory conditions such as sepsis.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors investigated systemic inflammation induced by LPS in various tissues and also examined immune cells of the mice using a tight junction protein-based PDZ peptide. They explored the mechanism of anti-systemic inflammatory action of PDZ peptides, which enhanced M1/M2 polarization and induced the proliferation of M2 macrophages. Additionally, they proposed a physiological mechanism in which the peptide inhibits the production of ROS in mitochondria, thereby preventing systemic inflammation.

      Strengths:

      In the absence of specific treatments for septic shock or sepsis, the study demonstrating that tight junction-based PDZ peptides inhibit systemic inflammation caused by LPS is highly commendable. Whereas previous research focused on antibiotics, this study proves that modifying parts of intracellular proteins can significantly suppress symptoms caused by septic shock. The authors expanded the study of localized inflammation caused by LPS or PM2.5 in the respiratory tract to systemic inflammation, presenting promising results. They not only elucidated the physiological mechanism by identifying the transcriptome through RNA sequencing but also demonstrated that PDZ peptides inhibit the production of ROS in mitochondria and prevent mitochondrial fission. This research is highly regarded as an excellent study with potential as a treatment for septic shock or sepsis.

      Weaknesses:

      (1) They focused intensively on acute inflammation for a short duration instead of chronic inflammation.

      (2) LPS was used to induce septic shock, but administering actual microbes such as E. coli would yield more accurate results.

      (3) The authors used pegylated peptides, but future research should utilize the optimized peptides to derive the optimal peptide, and further, PK/PD studies are also necessary.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      (1) Peptides were synthesized with fluorescein isothiocyanate (FITC) and Tat tag, and then PEGylated with methoxy PEG Succinimidyl Succinate.

      I have two concerns about the peptide design. First, FITC was intended "for monitoring" (line 129), but was never used in the manuscript. Second, PEGylation targets the two lysine sidechains on the Tat, which would alter its penetration property.

      (1) We conducted an analysis of the cellular trafficking of FITC-tagged peptides following their permeabilization into cells.

      Author response image 1.

      However, we did not include it in the main text because it is a basic result.

      (2) As can be seen in the figure above, after pegylation and permeabilization, the cells were stained with FITC. It appears that this does not affect the ability to penetrate into the cells.

      (2) "Superdex 200 increase 10/300 GL column" (line 437) was used to isolate mono/di PEGylated PDZ and separate them from the residual PEG and PDZ peptide. "m-PEG-succinimidyl succinate with an average molecular weight of 5000 Da" (lines 133 and 134).

      To my knowledge, the Superdex 200 increase 10/300 GL column is not suitable and is unlikely to produce traces shown in Figure 1B.

      As Superdex 200 increase 10/300 GL features a fractionation range of 10,000 to 600,000 Da, we used it to fractionate PEGylated products, including DiPEGylated PDZ (approx. 15 kDa) and MonoPEGylated PDZ (approx. 10 kDa), from residuals (PDZ and PEG), demonstrating successful isolation of PEGylated products (Figure 1C). Considering that the molecular weights of PDZ and PEG are approximately 4.1 kDa and 5.0 kDa, respectively, the late eluting peaks from SEC were likely to represent a mixed absorbance of PDZ and PEG at 215 nm.

      However, as the reviewer pointed out, it could be unreasonable to annotate peaks representing PDZ and PEG, respectively, from mixed absorbance detected in a region (11-12 min) beyond the fractionation range.

      In our revised manuscript, therefore, multiple peaks in the late eluting volume (11-12 min) were labeled as 'Residuals' all together. As a reference, the revised figure 1B includes a chromatogram of pure PDZ-WT under the same analytic condition.

      We have therefore replaced Fig. 1B with the new results.

      (3) "the in vivo survival effect of LPS and PDZ co-administration was examined in mice. The pretreatment with WT PDZ peptide significantly increased survival and rescued compared to LPS only; these effects were not observed with the mut PDZ peptide (Figure 2a)." (lines 159-160).

      Fig 2a is the weight curve only. The data is missing in the manuscript.

      We added the survival curve to Fig. 2A.

      (4) Table 1, peptide treatment on ALT and AST appears minor.

      In mice treated with LPS, levels of ALT and AST in the blood are elevated, but these levels decrease upon treatment with WT PDZ. However, the use of mut PDZ does not result in significant changes. Figure 3A shows inflammatory cells within the central vein, yet no substantial hepatotoxicity is observed during the 5-day treatment with LPS. Normally, the ranges of ALT and AST in C57BL6 mice are 16 ~ 200 U/L and 46 ~ 221 U/L, respectively, according to UCLA Diagnostic Labs. Therefore, the values in all experiments fall within these normal ranges. In summary, a 5-day treatment with LPS induces inflammation in the liver but is too short a duration to induce hepatotoxicity, resulting in lower values.

      (5) MitoTracker Green FM shouldn't produce red images in Figure 6.

      We replaced Figs 6A and B with new results (green images).

      (6) Figure 5. Comparison of mRNA expression in PDZ-treated BEAS-2B cells. Needs a clearer and more detailed description both in the main text and figure legend. The current version is very hard to read.

      We replaced Fig. 5A with a new version that is easier to understand, and added more detailed results and a more detailed figure legend.

      Results Section in Figure 5:

      we performed RNA sequencing analysis. The results of RNA-seq analysis showed the expression pattern of 24,424 genes according to each comparison combination, of which the results showed the similarity of 51 genes overlapping in 4 gene categories and the similarity between each comparison combination (Figure 5a). As a result, compared to the control group, it was confirmed that LPS alone, WT PDZ+LPS, and mut PDZ+LPS were all upregulated above the average value in each gene, and when LPS treatment alone was compared with WT PDZ+LPS, it was confirmed that they were averaged or downregulated. When comparing LPS treatment alone and mut PDZ+LPS, it was confirmed that about half of the genes were upregulated. Regarding the similarity between comparison combinations, the comparison combination with LPS…

      Figure 5 Legend Section:

      Figure 5. Comparison of mRNA expression in PDZ-treated BEAS-2B cells.

      BEAS-2B cells were treated with wild-type PDZ or mutant PDZ peptide for 24 h and then incubated with LPS for 2 h, after which RNA sequencing analysis was performed. (a) The heat map shows the general regulation pattern of about 51 inflammation-related genes that are differentially expressed when WT PDZ and mut PDZ are treated with LPS, an inflammatory substance. All samples are RED = upregulated and BLUE = downregulated relative to the gene average. Each row represents a gene, and the columns represent the values of the control group treated only with LPS and the WT PDZ and mut PDZ groups with LPS. This was used by converting each log value into a fold change value. All genes were adjusted to have the same mean and standard deviation, the unit of change is the standard deviation from the mean, and the color value range of each row is the same. (b) Significant genes were selected using Gene category chart (Fold change value of 2.00 and normalized data (log2) value of 4.00). The above pie chart shows the distribution of four gene categories when comparing LPS versus control, WT PDZ+LPS/LPS, and mut PDZ+LPS/LPS. The bar graph below shows RED=upregulated, GREEN=downregulated for each gene category, and shows the number of upregulated and downregulated genes in each gene category. (c) The protein-protein interaction network constructed by the STRING database differentially displays commonly occurring genes by comparing WT PDZ+LPS/LPS, mut PDZ+LPS/LPS, and LPS. These nodes represent proteins associated with inflammation, and these connecting lines denote interactions between two proteins. Different line thicknesses indicate types of evidence used in predicting the associations.
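
      The row scaling described in legend (a), where each gene is adjusted to the same mean and standard deviation and colored in units of standard deviations from its own mean, corresponds to a per-row z-score. The sketch below is a minimal illustration with hypothetical gene names and values; the use of the sample standard deviation (n - 1) is an assumption, since the legend does not say which estimator was used.

```python
import math


def row_zscores(values):
    """Scale one gene's values to mean 0 and sample standard deviation 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples per gene")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        return [0.0] * n  # constant row: no deviation from its own mean
    return [(v - mean) / sd for v in values]


# Hypothetical log2 expression across three conditions
# (LPS, WT PDZ+LPS, mut PDZ+LPS); not data from the study.
expr = {
    "GENE_A": [4.0, 6.0, 8.0],
    "GENE_B": [10.0, 10.0, 10.0],
}

scaled = {gene: row_zscores(vals) for gene, vals in expr.items()}
print(scaled)
```

      Because each row is scaled independently, colors are comparable within a gene but do not convey absolute expression differences between genes.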

      Reviewer #2:

      (1) In this paper, the authors demonstrated the anti-inflammatory effect of PDZ peptide by inhibition of NF-kB signaling. Are there any results on the PDZ peptide-binding proteins (directly or indirectly) that can regulate LPS-induced inflammatory signaling pathway? Elucidation of the PDZ peptide-its binding partner protein and regulatory mechanisms will strengthen the author's hypothesis about the anti-inflammatory effects of PDZ peptide.

      As mentioned in the Discussion section, we believe it is crucial to identify proteins that directly interact with PDZ and regulate it. This direct interaction can modulate intracellular signaling pathways, so we plan to express GST-PDZ and induce binding with cellular lysates, then characterize it using the LC-Mass/Mass method. We intend to further research these findings and submit them for publication.

      (2) The authors presented interesting insights into the therapeutic role of the PDZ motif peptide of ZO-1. PDZ domains are protein-protein interaction modules found in a variety of species. It has been thought that many cellular and biological functions, especially those involving signal transduction complexes, are affected by PDZ-mediated interactions. What is the rationale for selecting the core sequence that regulates inflammation among the PDZ motifs of ZO-1 shown in Figure 1A?

      The rationale for selecting the core sequence that regulates inflammation among the PDZ motifs of ZO-1, as shown in Figure 1A, is grounded in the specific roles these motifs play in signal transduction pathways that are crucial for inflammatory processes. PDZ domains are recognized for their ability to function as scaffolding proteins that organize signal transduction complexes, crucial for modulating cellular and biological functions. The chosen core sequence is particularly important because it is conserved across ZO-1, ZO-2, and ZO-3, indicating a fundamental role in maintaining cellular integrity and signaling pathways. This conservation suggests that the sequence’s involvement in inflammatory regulation is not only significant in ZO-1 but also reflects a broader biological function across the ZO family.

      (3) In Figure 3, the authors showed the representative images of IHC, please add the quantification analysis of Iba1 expression and PAS-positive cells using Image J or other software. To help understand the figure, an indication is needed to distinguish specifically stained cells (for example, a dotted line or an arrow).

      We added the semi-quantitative results into Figs. 3d,e,f.

      Result section: The specific physiological mechanism by which WT PDZ peptide decreases LPS-induced systemic inflammation in mice and the signal molecules involved remain unclear. These were confirmed by a semi-quantitative analysis of Iba-1 immunoreactivity and PAS staining in liver, kidney, and lung, respectively (Figures 4d, e, and f). To examine whether WT PDZ peptide can alter LPS-induced tissue damage in the kidney, a cell toxicity assay was performed (Figure 3g). LPS induced cell damage in the kidney; however, WT PDZ peptide significantly alleviated the toxicity, whereas mut PDZ peptide did not. Because cytotoxicity caused by LPS is frequently due to ROS production in the kidney (Su et al., 2023; Qiongyue et al., 2022), ROS production in the mitochondria was investigated in renal mitochondria cells harvested from kidney tissue (Figure 3h)......

      Figure legend section: Indicated scale bars were 20 μm. (d,e,f) Semi-quantitative analysis of the area positive for Iba-1 in liver and kidney, and of PAS-positive cells in lung, respectively. (g) After the kidneys were harvested, tissue lysates were used for MTT assay. (h) After.....

      (4) In Figure 6G, H, the authors confirmed the change in expression of the M2 markers by PDZ peptide using the mouse monocyte cell line Raw264.7. It would be good to add an experiment on changes in M1 and M2 markers caused by PDZ peptides in human monocyte cells (for example, THP-1).

      We thank you for your comments. To determine whether PDZ peptide regulates M1/M2 polarization in human monocytes, we examined changes in M1 and M2 gene expression in THP-1 cells. Wild-type PDZ significantly suppressed the expression of M1 marker genes (hIL-1β, hIL-6, hIL-8, hTNF-α), while increasing the expression of M2 marker genes (hIL-4, hIL-10, hMRC-1). However, mutant PDZ did not affect M1/M2 polarization. These results suggest that PDZ peptide can suppress inflammation by regulating M1/M2 polarization of human monocyte cells. These results are for the reviewer's reference only and will not be included in the main content.

      Author response image 2.

      Minor point:

      The use of language is appropriate, with good writing skills. Nevertheless, a thorough proofread would eliminate small mistakes such as:

      • line 254, " mut PDZ+LPS/LPS (45.75%) " → " mut PDZ+LPS/LPS (47.75%) "

      • line 296, " Figure 6f " → " Figure 6h "

      We have corrected these points in the manuscript.

    1. eLife assessment

      This important study presents a novel pipeline for the large-scale genomic prediction of members of the non-ribosomal peptide group of pyoverdines based on a dataset from nearly 2000 Pseudomonas genomes. The advance presented in this study is based on convincing evidence. This study of bacterial siderophores has broad theoretical and practical implications beyond a singular subfield.

    2. Reviewer #1 (Public Review):

      The manuscript introduces a bioinformatic pipeline designed to enhance the structure prediction of pyoverdines, revealing an extensive and previously overlooked diversity in siderophores and receptors. Utilizing a combination of feature sequence and phylogenetic approaches, the method aims to address the challenging task of predicting structures based on dispersed gene clusters, particularly relevant for pyoverdines.

      Predicting structures based on gene clusters is still challenging, especially pyoverdines as the gene clusters are often spread to different locations in the genome. The revised manuscript has much improved in clarity and reproducibility. I believe that the method is not yet applicable to all NRPS in general and that there is a clear scalability issue when talking about Big Data. However, the method is highly useful for specific NRPS families such as the pyoverdines, so the manuscript presents a useful bioinformatic pipeline for pyoverdine structure prediction, showcasing a commendable exploration of siderophore diversity.

    3. Reviewer #2 (Public Review):

      Pyoverdines, siderophores produced by many Pseudomonads, are one of the most diverse groups of specialized metabolites and frequently used as model systems. Thousands of Pseudomonas genomes are available, but large scale analyses of pyoverdines are hampered by the biosynthetic gene clusters (BGCs) being spread across multiple genomic loci and existing tools' inability to accurately predict amino acid substrates of the biosynthetic adenylation (A) domains. The authors present a bioinformatics pipeline that identifies pyoverdine BGCs and predicts the A domain substrates with high accuracy. They tackled a second challenging problem by developing an algorithm to differentiate between outer membrane receptor selectivity for pyoverdines versus other siderophores and substrates. The authors applied their dataset to thousands of Pseudomonas strains, producing the first comprehensive overview of pyoverdines and their receptors and predicting many new structural variants.

      The A domain substrate prediction is impressive, including the correction of entries in the MIBiG database. Their high accuracy came from a relatively small training dataset of A domains from 13 pyoverdine BGCs. The authors acknowledge that this small dataset does not include all substrates, and correctly point out that new sequence/structure pairs can be added to the training set to refine the prediction algorithm. The workflow unfortunately cannot differentiate between different variants of Asp and OHOrn. To validate their predictions, they elucidated structures of several new pyoverdines, and their predictions performed well. The authors tested their workflow on Burkholderiales A domains and had good results, suggesting it can be used on other taxa. Skimming through the source code and data, the algorithm itself appears to be sound and a clear improvement over existing tools for pyoverdine BGC annotation.

      Predicting outer membrane receptor specificity is likewise a challenging problem and the authors have made a promising achievement by finding specific gene regions that differentiate the pyoverdine receptor FpvA from FpvB and other receptor families. Their predictions were not tested experimentally, but the finding that only predicted FpvA receptors were proximate to the biosynthesis genes lends credence to the predictive power of the workflow. The authors find predicted pyoverdine receptors across an impressive 468 genera, an exciting finding for expanding the role of pyoverdines as public goods beyond Pseudomonas. However, whether or not these receptors can actually recognize pyoverdines (and if so, which structures!) remains to be investigated.

      In all, the authors have assembled a rich dataset that will enable large scale comparative genomic analyses. This dataset could be used by a variety of researchers, including those studying natural product evolution, public good eco/evo dynamics, and NRPS engineering.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study presents a novel pipeline for the large-scale genomic prediction of members of the non-ribosomal peptide group of pyoverdines based on a dataset from nearly 2000 Pseudomonas genomes. The advance presented in this study is largely based on solid evidence, although some main claims are only incompletely supported. This study on bacterial siderophores has broad theoretical and practical implications beyond a singular subfield.

      Thank you for the supportive and encouraging words. We appreciate the editor’s and reviewers’ careful and professional assessment of this manuscript. The reviewers’ scrutiny has helped us to improve the presentation and discussion of our work. We have now carefully revised the manuscript following their instructive suggestions and comments. Please find below our detailed responses (marked in blue) to each of the comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      The manuscript introduces a bioinformatic pipeline designed to enhance the structure prediction of pyoverdines, revealing an extensive and previously overlooked diversity in siderophores and receptors. Utilizing a combination of feature sequence and phylogenetic approaches, the method aims to address the challenging task of predicting structures based on dispersed gene clusters, particularly relevant for pyoverdines.

      Predicting structures based on gene clusters is still challenging, especially pyoverdines as the gene clusters are often spread to different locations in the genome. An improved method would indeed be highly useful, and the diversity of pyoverdine gene clusters and receptors identified is impressive.

      However, so far the method basically aligns the structural genes and domains involved in pyoverdine biosynthesis and then predicts A domain specificity to predict the encoded compounds. Both methods are not particularly new as they are included in other tools such as PRISM (10.1093/nar/gkx320) or Sandpuma (https://doi.org/10.1093/bioinformatics/btx400) among others. The study claims superiority in A domain prediction compared to existing tools, yet the support is currently limited, relying on a comparison solely with AntiSMASH. A more extensive and systematic comparison with other tools is needed.  

      Thanks for pointing this out. In the revised manuscript, we have included a comprehensive comparative analysis, in which we compared our pipeline to six commonly used methods: NP.searcher, PRISM4, AdenPredictor, SeMPI2, SANDPUMA, and antiSMASH5 (see Supplementary_table 6 for details, and lines 281-286). These approaches either consist of a single specific algorithm or integrate several methods. Our approach performs best (see table below), demonstrating a clear improvement over previous tools. The improvements are due to several methodological differences inherent to our approach. Additionally, while exploring existing prediction tools, we found that some had not been maintained for years. For instance, we were unable to access NRPSsp (www.nrpssp.com) and NRPSpredictor2 (http://nrps.informatik.uni-tuebingen.de/). Below, we briefly explain these differences, particularly in relation to PRISM and SANDPUMA, as highlighted by the reviewer.

      Author response table 1.

      PRISM annotates biosynthetic gene clusters (BGCs) and reconstructs the linear structures of NRPS synthetases, with this function depending on proper annotations of open reading frames. This pipeline can have difficulties in assembling the linear structure into a final product. In our approach, we found that the annotations of NRPS genes are frequently truncated because of sequencing errors and annotation issues. Our method fixes this problem by rescanning all possible reading frames of the BGC to rebuild complete pyoverdine synthetase genes.
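
      To illustrate what rescanning all reading frames can involve, the sketch below lists the stop-free stretches in all six frames of a DNA sequence. It is a generic six-frame scan written for illustration, under the assumption of an uppercase A/C/G/T input and the standard stop codons. It is not the pipeline's own code, and the names `open_stretches` and `min_codons` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def open_stretches(seq, min_codons=1):
    """Yield (strand, frame, start, end) for stop-free stretches in six frames.

    Coordinates are 0-based, end-exclusive, and refer to the scanned strand
    (for "-" they index the reverse complement, not the input sequence).
    """
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = frame
            for pos in range(frame, len(s) - 2, 3):
                if s[pos:pos + 3] in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        yield (strand, frame, start, pos)
                    start = pos + 3
            # Stretch running to the end of the sequence without a stop codon.
            end = frame + ((len(s) - frame) // 3) * 3
            if (end - start) // 3 >= min_codons:
                yield (strand, frame, start, end)


# Toy sequence, not taken from any genome.
for hit in open_stretches("ATGAAATAGATGCCC", min_codons=2):
    print(hit)
```

      In a real workflow, stretches recovered this way would still need to be matched against NRPS domain profiles before being joined into a synthetase gene; that step is outside the scope of this sketch.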

      SANDPUMA and our approach are based on similar ideas (using the prediCAT algorithm) to predict A domain substrates, namely by using the closest annotated reference A domain. However, our method uses a self-adaptive feature extraction step to reduce the confounding influence of phylogeny. This small adjustment significantly improves the performance of our approach and even works well for small training sets (101 experimentally validated A domains with our approach as opposed to 494 A domains used by SANDPUMA from MIBiG).
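
      The closest-reference idea can be sketched as a nearest-neighbor lookup over feature sequences. The example below is a simplified illustration, not the authors' algorithm. The mismatch-fraction distance, the `max_distance` cutoff for reporting an unknown substrate, and the toy reference sequences are all assumptions, and the self-adaptive feature extraction step is not reproduced here.

```python
def feature_distance(a, b):
    """Fraction of mismatched positions between two equal-length feature sequences."""
    if len(a) != len(b):
        raise ValueError("feature sequences must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def predict_substrate(query, references, max_distance=0.5):
    """Return (substrate, distance) of the closest reference A domain.

    references: list of (feature_sequence, substrate) pairs.
    If even the closest reference is farther than max_distance, the
    substrate is reported as "unknown" (a hypothetical cutoff).
    """
    best_seq, best_substrate = min(
        references, key=lambda ref: feature_distance(query, ref[0])
    )
    distance = feature_distance(query, best_seq)
    if distance > max_distance:
        return ("unknown", distance)
    return (best_substrate, distance)


# Toy reference set; the sequences are placeholders, not real A domain features.
refs = [("AAAA", "Ser"), ("CCCC", "Asp")]
print(predict_substrate("AAAC", refs))
print(predict_substrate("GGGG", refs))
```

      A distance cutoff of this kind is one simple way to flag A domains whose substrate may not be represented in the reference set, so that they can be checked experimentally instead of being forced onto the nearest known label.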

      Additionally, in contradiction to the authors' claims, the method's applicability seems constrained to well-known and widely distributed gene clusters. The absence of predictions for new amino acids raises concerns about its generalizability to NRPS beyond the studied cases.

      We thank the reviewers for this comment. We acknowledge that our method cannot directly predict new amino acids. Nevertheless, for several reasons we believe that our approach is not constrained and can be widely applied in the future.

      First, our method can identify A domains that select new unknown amino acid substrates. In fact, three of the four unresolved cases in our experimental verification analysis (Fig. 3d) represent new amino acids. Obviously, experimental verification is required to characterize the unknown substrate. Once verified, the new A domains and their substrates can expand the reference dataset, allowing targeted improvement of our phylogeny-focused prediction technique. We now discuss this aspect in lines 634-645.

      Second, although the overall substrate diversity in NRPS is high across the microbial kingdom, our analysis suggests that the number of amino acids used for a specific group of secondary metabolites quickly reaches a saturation point. The discovery rate of new amino acids was 1.7% for our experimental Pseudomonas data set (Fig. 3d) and 0.0% for the Burkholderiales data set. This suggests that as the database expands, the discovery rate of novel amino acid substrates is expected to drop rapidly.

      Third, we acknowledge that the inability to predict the substrates of unknown domains is a common limitation among all knowledge-guided learning algorithms, including ours. However, we have made significant improvements in prediction accuracy. As the database grows, we expect the rate of unknown substrates to decrease, and the prediction accuracy to increase.

      The manuscript lacks clarity on how the alignment of structural genes operates when dealing with multiple NRPS gene clusters on different genome contigs. How would the alignment of each BGC work?

      We thank the reviewers for this comment. The pyoverdine molecules consist of a conserved fluorescent chromophore (Flu) and a peptide chain (Pep), both synthesized by NRPS enzymes. In most instances (over 90%), Flu and Pep are produced by two separate biosynthetic gene clusters (BGCs). In these cases, we merge the two BGCs by positioning Flu at the head and Pep at the tail. For the remaining less than 10%, there are two scenarios: 1. Flu and Pep are located on the same BGC, which eliminates any issues with BGC alignment. 2. In very rare cases, Flu and Pep are synthesized by three BGCs. Here, Flu is still synthesized by one BGC at the head, while Pep is produced by two BGCs. We put the BGC containing the Thioesterase (TE) domain as the tail and the BGC not containing the TE domain in the middle.

      (see lines 165-169).
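
      The ordering rule described in this response can be written down as a short function. This is a sketch under stated assumptions: the `Cluster` record and its two boolean fields are hypothetical, and real BGC annotations would need to be parsed into this form first.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    has_chromophore: bool  # encodes the Flu (chromophore) synthetase
    has_te_domain: bool    # contains the thioesterase (TE) domain


def order_clusters(clusters):
    """Order BGCs as head (Flu), optional middle (Pep without TE), tail (Pep with TE)."""
    head = [c for c in clusters if c.has_chromophore]
    if len(head) != 1:
        raise ValueError("expected exactly one chromophore-containing BGC")
    rest = [c for c in clusters if not c.has_chromophore]
    if not rest:
        # Flu and Pep sit on the same BGC, so nothing needs to be merged.
        return head
    tail = [c for c in rest if c.has_te_domain]
    middle = [c for c in rest if not c.has_te_domain]
    if len(tail) != 1:
        raise ValueError("expected exactly one TE-containing peptide BGC")
    return head + middle + tail


# Rare three-BGC case, supplied in arbitrary order.
found = [
    Cluster("pep_with_TE", has_chromophore=False, has_te_domain=True),
    Cluster("flu", has_chromophore=True, has_te_domain=False),
    Cluster("pep_without_TE", has_chromophore=False, has_te_domain=False),
]
print([c.name for c in order_clusters(found)])
```

      The same function covers the common two-BGC case, where the middle list is simply empty.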

      Another critical concern is that a main challenge in NRPS structure prediction is not the backbone prediction but rather the prediction of tailoring reactions, which is not addressed in the manuscript at all, and this limitation extensively restricts the applicability of the method.

      While we thank the reviewer for this comment, we only partly agree with it. Peptide backbone predictions are still a significant challenge. This challenge is clearly visible in our new analysis comparing prediction accuracies of different pipelines, such as antiSMASH5, PRISM4, AdenPredictor, SeMPI2, NP.searcher, and SANDPUMA. Unresolved and wrong substrate predictions are still common, highlighting the importance of our contribution in developing a new approach with improved accuracy.

      However, we agree with the reviewer that our current algorithm does not predict tailoring reactions (now discussed on lines 680-685). Although tailoring reactions are important for predicting the final NRPS product structure, none of the other existing pipelines address this issue either, and it remains a challenge for future work. For our study, it is important to note that the specificity of pyoverdines is primarily determined by the backbone composition, whereas tailoring reactions seem to play a minor role.

      The manuscript presents a potentially highly useful bioinformatic pipeline for pyoverdine structure prediction, showcasing a commendable exploration of siderophore diversity. However, some of the claims made remain unsubstantiated. Overall, while the study holds promise, further validation and refinement are required to fulfill its potential impact on the field of bioinformatic structure prediction.

      Thank you for the supportive and encouraging words. We deeply appreciate your constructive comments and suggestions. 

      Reviewer #2 (Public Review):

      Pyoverdines, siderophores produced by many Pseudomonads, are one of the most diverse groups of specialized metabolites and are frequently used as model systems. Thousands of Pseudomonas genomes are available, but large-scale analyses of pyoverdines are hampered by the biosynthetic gene clusters (BGCs) being spread across multiple genomic loci and existing tools' inability to accurately predict amino acid substrates of the biosynthetic adenylation (A) domains. The authors present a bioinformatics pipeline that identifies pyoverdine BGCs and predicts the A domain substrates with high accuracy. They tackled a second challenging problem by developing an algorithm to differentiate between outer membrane receptor selectivity for pyoverdines versus other siderophores and substrates. The authors applied their dataset to thousands of Pseudomonas strains, producing the first comprehensive overview of pyoverdines and their receptors and predicting many new structural variants.

      The A domain substrate prediction is impressive, including the correction of entries in the MIBiG database. Their high accuracy came from a relatively small training dataset of A domains from 13 pyoverdine BGCs. The authors acknowledge that this small dataset does not include all substrates, and correctly point out that new sequence/structure pairs can be added to the training set to refine the prediction algorithm. 

      The authors could have been more comprehensive in finding their training set data. For instance, the authors claim that histidine "had not been previously documented in pyoverdines", but the sequenced strain P. entomophila L48, incorporates His (10.1007/s10534-009-9247-y). 

      Thank you for highlighting this issue. We agree that stating histidine has not been reported before in pyoverdine was incorrect. We have reviewed the full text and made the necessary corrections.

      The primary reason for excluding the sequenced strains P. syringae 1448a (10.1186/1471-2180-11-218) and P. entomophila L48 (10.1007/s10534-009-9247-y) from the training set is that the pyoverdine structures of these strains were not determined solely through experimental methods. In these works, the pyoverdine structures were predicted based on the synthetic gene sequence using bioinformatic analysis, followed by structural analysis experiments based on this predicted structure. We found that pre-prediction has probably introduced biases into downstream analyses. Specifically, in the case of Pseudomonas entomophila L48, we discovered inaccuracies in the annotation of certain domains (see figures below). For example, the third A domain of the peptide chain in P. entomophila L48 pyoverdine was initially annotated with Dab specificity. However, upon closer examination, it appears to differ significantly from other Dab references (top) or Dab from our experimentally validated (right) domains (left panel in the figure below). By analyzing the interface (I) domain (10.1073/pnas.1903161116) in its predicted site, we suggested that it should actually recognize OHHis. The OHAsp domain of P. entomophila L48 reported in the paper is actually close in sequence similarity to the OHAsp domain (left panel in the figure below), while the Ala domain reported is more similar to the Ser domain (right panel in the figure below). For these reasons, we did not include this supervised pyoverdine structure analysis strain in the training set data.

      Author response image 1.

      The workflow cannot differentiate between different variants of Asp and OHOrn, and it's not clear if this is a limitation of the workflow, the training data, or both. 

      Thanks for pointing this out. It is generally challenging to differentiate between variants of the same amino acid (for all the algorithms existing to date). In this sense, it is a limitation of our workflow but also of all others. Nonetheless, we wish to stress that we observed feature sequence divergence (using the A motif4-5 region), which helped us to separate some (but not all) of the Asp and Orn variants. For example, separations between Asp variants are distinct (left panel in the figure below). To be on the conservative side, we only differentiated between OHAsp and Asp for our predictions, but differentiation between DOHAsp and OHAsp would also be possible. In the case of Orn variants, there was a clear separation between Orn and the OHOrn variants (right panel). In contrast, it was difficult to differentiate between the subgroups of OHOrn variants. We believe that no A domain prediction tool will be able to solve this issue. Instead, it would be important to include information on substrate-modifying enzymes in future approaches.

      Author response image 2.

      The prediction workflow holds up well in Burkholderiales A domains, however, they fail to mention in the main text that they achieved these numbers by adding more A domains to their training set.

      We thank the reviewers for this comment. We apologize for not having mentioned the training data set in the main text, although we described it in detail in the methods section (lines 714-732). We have now provided more details on the analysis procedure in the main text (lines 307-313). It is important to note that we did not add more A domains to the training data set but built a new, independent data set for Burkholderiales. The aim was to mirror the analysis we performed for pyoverdines with a completely new data set, featuring 124 A domains for training and 178 A domains as the test set.

      To validate their predictions, they elucidated structures of several new pyoverdines, and their predictions performed well. However, the authors did not include their MS/MS data, making it impossible to validate their structures. In general, the biggest limitation of the submitted manuscript is the near-empty methods section, which does not include any experimental details for the 20 strains or details of the annotation pipeline (such as "Phydist" and "Syndist"). The source code also does not contain the requisite information to replicate the results or re-use the pipeline, such as the antiSMASH version and required flags. That said, skimming through the source code and data (kindly provided upon request) suggests that the workflow itself is sound and a clear improvement over existing tools for pyoverdine BGC annotation.

      Thank you for highlighting these issues. We agree that the methods section is short. This is because the entire paper is a step-by-step methodological introduction to our pipeline. We have now carefully revised the main text to add the information requested by the reviewer. Moreover, we have included a supplementary file with the MS/MS data of the experimentally analyzed pyoverdine structures. Finally, we include a link to a one-click online notebook that can be used to replicate the annotation and substrate prediction results, together with a more detailed explanation of the code. See: https://drive.google.com/drive/folders/1JsfyPUGDTFo8BDDZk8JLSvKry8emzMhr?usp=drive_link

      Predicting outer membrane receptor specificity is likewise a challenging problem and the authors have made a promising achievement by finding specific gene regions that differentiate the pyoverdine receptor FpvA from FpvB and other receptor families. Their predictions were not tested experimentally, but the finding that only predicted FpvA receptors were proximate to the biosynthesis genes lends credence to the predictive power of the workflow. The authors find predicted pyoverdine receptors across an impressive 468 genera, an exciting finding for expanding the role of pyoverdines as public goods beyond Pseudomonas. However, whether or not these receptors can recognize pyoverdines (and if so, which structures!) remains to be investigated.

      Thank you for the supportive and encouraging words. The bioinformatic analysis and experimental testing of pyoverdine-receptor matching is complicated and it is not part of this paper. We treated it in a separate manuscript in which we developed an experimentally verified co-evolution algorithm that matches pyoverdines to receptors. With this algorithm, we can identify self-receptors (i.e. receptors used to take up the self-produced pyoverdine), and therefore establish pyoverdine sharing and interaction networks across strains in communities.

      Please see DOI:10.1101/2023.11.05.565711 for details.

      In all, the authors have assembled a rich dataset that will enable large-scale comparative genomic analyses. This dataset could be used by a variety of researchers, including those studying natural product evolution, public good eco/evo dynamics, and NRPS engineering.

      Thank you for the supportive and encouraging words. We are grateful for the reviewers’ instructive suggestions and comments.

      Reviewer #3 (Public Review):

      Summary:

      Secondary metabolites are produced by numerous microorganisms and have important ecological functions. A major problem is that neither the function of a secondary metabolite enzyme nor the resulting metabolite can be precisely predicted from gene sequence data.

      In the current paper, the authors addressed this highly relevant question.

      The authors developed a bioinformatic pipeline to reconstruct the complete secondary metabolism pathway of pyoverdines, a class of iron-scavenging siderophores produced by Pseudomonas spp. These secondary metabolites are biosynthesized by a series of nonribosomal peptide synthetases and require a specific receptor (FpvA) for uptake. The authors combined knowledge-guided learning with phylogeny-based methods to predict with high accuracy encoding NRPSs, substrate specificity of A domains, pyoverdine derivatives, and receptors. After validation, the authors tested their pipeline with sequence data from 1664 phylogenetically distinct Pseudomonas strains and were able to determine 18,292 enzymatic A domains involved in pyoverdine synthesis, reliably predicted 97.8% of their substrates, identified 188 different pyoverdine molecule structures and 4547 FpvA receptor variants belonging to 94 distinct groups. All the results and predictions were clearly superior to predictions that are based on antiSMASH. Novel pyoverdine structures were elucidated experimentally by UHPLC-HR-MS/MS.

      To assess the extendibility of the pipeline, the authors chose Burkholderiales as a test case which led to the results that the pipeline consistently maintains high prediction accuracy within Burkholderiales of 83% which was higher than for antiSMASH (67%).

      Together, the authors concluded that supervised learning based on a few known compounds produced by species from the same genus probably outperforms generalized prediction algorithms trained on many products from a diverse set of microbes for NRPS substrate predictions. As a result, they also show that both pyoverdine and receptor diversity have been vastly underestimated.

      Strengths:

      The authors developed a very useful bioinformatic pipeline with high accuracy for secondary metabolites, at least for pyoverdines. The pipelines have several advantages compared to existing pipelines like the extensively used antiSMASH program, e.g. it can be applied to draft genomes, shows reduced erroneous gene predictions, etc. The accuracy was impressively demonstrated by the discovery of novel pyoverdines whose structures were experimentally substantiated by UHPLC-HR-MS/MS.

      The manuscript is very well written, and the data and the description of the generation of pipelines are easy to follow.

      Weaknesses:

      The only major comment I have is the uncertainty of whether the pipeline can be applied to more complex non-ribosomal peptides. In the current study, the authors only applied their pipeline to a very narrow field, i.e., pyoverdines of Pseudomonas and Burkholderia strains.

      Thanks for your positive and encouraging comment. Regarding your only major comment, we think that the design concept of our pipeline has the potential to be applied to more complex non-ribosomal peptides. Currently, our method is tailored to accurately predict the structural composition of the Pseudomonas siderophore pyoverdine (see also response 3). A key point emphasized in our article is the importance of considering phylogeny in developing substrate prediction algorithms for A domains. Currently, the main challenge in advancing these algorithms is the limited availability of data on A domains and their corresponding substrates. However, with the future accumulation of more reference data, we are confident that the design principles of our method will enable precise predictions of the structural compositions of all products synthesized by non-ribosomal peptide synthetases (see our discussions in lines 634-645).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I believe that the manuscript would benefit from focusing solely on the task of improving pyoverdine predictions. This aspect alone is significant, and robustly supporting this claim would strengthen the manuscript. The diversity analysis provided is valuable and would undoubtedly benefit the scientific community. However, additional systematic comparisons with other methods are necessary. Furthermore, clarification of certain terms, such as 'feature-based' (e.g., whether it refers to NRPS domains or CDS), would enhance clarity.

      Thank you for the supportive and encouraging words. We followed the reviewer’s suggestion and now provide the requested method comparison, see also response 2 for details. Furthermore, we have carefully checked the main text to clarify terms whenever needed. Specifically, we now define the terms “feature sequence” and “feature sequence distance” in lines 227-229.  

      Additionally, several minor points could be improved upon:

      In line 85, clarification is needed on how pyoverdine genes were identified.

      Thank you for your thorough review. In the introduction section, we provided a brief overview of our work, while the detailed methodology is outlined in the results section on lines 160-174.

      In line 382, it would be helpful to know the source of the sequences.

      We agree and have now carefully revised the manuscript following your suggestions (lines 403-405).

      Line 392 could be explained more clearly. Does it mean that the authors used an hmm search to search pHMMs against each reference sequence?

      Thanks for your comment. Yes, we used an hmm search to search pHMMs against each reference sequence. We have now revised the manuscript to improve explanations (lines 413-418).

      Reviewer #2 (Recommendations For The Authors):

      The authors state they "elucidated the chemical structure of the 20 pyoverdines using culture-based methods combined with UHPLC-HR-MS/MS", so I was alarmed to see that KR and LB already published several of those structures in the cited paper. I hope that this "double dipping" will be fixed in a revision process.

      Thank you for pointing this out. We agree that we have not explained clearly enough what steps were conducted in this study and which data were used from a previous paper (https://doi.org/10.1007/s00216-022-03907-w). The genomes of the 20 strains used for the verification analysis (Fig. 3d) were sequenced as part of this study (access code now provided). 14 out of the 20 pyoverdine structures were elucidated with UHPLC-HR-MS/MS in this study. For 6 out of the 20 pyoverdines, we had structural information already at hand from the previous paper. We have now clarified these details in our manuscript (lines 276-280). 

      Thank you for providing the source code and data, and I hope that the final non-redundant dataset will be uploaded to Zenodo or another repository. Please deposit the 20 newly sequenced genomes to GenBank or another public repository. Please also show the UHPLC-HR-MS/MS data, preferably in the form of raw data uploaded to GNPS.

      We have followed the reviewer’s advice and deposited our data:

      - The sequences of the 20 newly sequenced strains are available on ENA accession PRJEB76792.

      - The MS/MS plots of the 14 newly analyzed pyoverdines are shown in the Supplementary Materials.

      - We provide a one-click online notebook to allow readers to replicate the pyoverdine cluster annotation and substrate prediction of the 20 experimentally analyzed strains.

      I suggest adding "at least" or a similar qualifier when the 73 variants are mentioned unless the literature search was truly exhaustive. What were the criteria for inclusion of the 13 strains in Table S2? For instance, sequenced strains P. syringae 1448a (10.1186/1471-2180-11-218) and P. entomophila L48 (10.1007/s10534-009-9247-y) were not included.

      Thank you for your comment. We have now carefully revised the manuscript following your suggestions (lines 291-295). Regarding the criteria for including the 13 strains in Table S2, we aimed to select strains with high credibility for inclusion in the training set. The primary reason for excluding the two strains from the training set is that their siderophore structures were analyzed through supervised experiments. We wanted to avoid any form of bias that bioinformatic pre-predictions could introduce to downstream analyses (see Response 13 for details).

      OHAsp in pyoverdines has been reported to arise from hydroxylation of Asp after it's already been activated by the A domain (10.1073/pnas.1903161116). Was there a clear difference between A domains that lead to Asp and OHAsp? Conversely, acetylation and formylation of OHOrn occur before adenylation. Can your workflow be used to differentiate cOHOrn, fOHOrn, and AcOHOrn, which are currently difficult to predict through genome mining?

      Thank you for these considerations. We treated these aspects in our response 8.  

      Throughout, define non-proteinogenic AA substrate abbreviations (ex: Rsc, Dab).

      Revised as per suggestion (lines 329-333).

      Additional line comments:

      189: Mention PhyloPhlAn in the main text.

      Revised as per suggestion (line 189).

      191: Define these filtering/selection criteria.

      Thanks for your comment, we have added the criteria in the main text (line 196 and line 198). 

      309, 620: An A domain presumably loading histidine is present in sequenced strain P. entomophila L48 (10.1007/s10534-009-9247-y). Please also clarify that Val has previously been seen in a pyoverdine (it is in Table S1) albeit not sequenced.

      We have clarified these aspects as per suggestion (lines 314-315 and line 630).

      310: The pipeline can "highlight" new substrates, but not identify them.

      Revised as per suggestion (line 295).

      354: Please clarify "13 amino acid substrates form the core of all the 188 pyoverdine structures", considering that 279 A domain substrates couldn't be predicted.

      Thanks for your comments. We have now clarified “our analysis found that 13 amino acids form the main structural substrates of all the 188 pyoverdine structures.” (lines 360-363)

      630: "discovered" implies that there is experimental evidence. I suggest something like "here we predicted 151 putatively new variants".

      Revised as per suggestion (line 648).

      Reviewer #3 (Recommendations For The Authors):

      Weakness:

      The only major comment I have is the uncertainty of whether the pipeline can be applied to more complex non-ribosomal peptides. In the current study, the authors only applied their pipeline to a very narrow field, i.e., pyoverdines of Pseudomonas and Burkholderia strains.

      Thanks for your comment. Please see our Responses 3+13 above, where we treat this concern in detail. Moreover, we discussed the possibility of extension to other groups of secondary metabolites in our discussion. We believe that we deliver a balanced view on the applicability of our approach and the next steps to be taken.  

      Please comment on this aspect.

      Minor:

      (1)  When you speak about "synthesis" it is rather biosynthesis. Synthesis is chemical synthesis.

      Please replace all instances of the word synthesis with biosynthesis.

      Revised as per suggestion.

      (2)  Line 188: synthetase is rather synthetases

      Revised as per suggestion (line 191).

    1. Reviewer #2 (Public Review):

      Summary:

      One of the greatest challenges for the spliceosome is to repress the many cryptic splice sites that can occur in both the intronic and exonic sequences of genes. Although many studies have focused on cryptic signals in introns (because of their common involvement in disease), the question has remained open as to which factors repress cryptic splice sites in exons. Because exons are normally much shorter than introns, in many cases the problem does not exist. However, in human genes a significant proportion of exons can be considerably longer than the average 150 nt length, and this raises the question of how cryptic splicing can be prevented in long exons. To address this question, the authors have focused on the possible role played by an ancient mammalian RBD protein called RBMX. Using a combination of high-throughput and classic splicing methodologies, they have shown that there is a class of RBMX-dependent ultra-long exons in which the RBMX, RBMXL2 and RBMY paralogs have closely related functional activity in repressing cryptic splice site selection.

      Strengths:

      In general, the present work sheds light on what has been a rather understudied process in splicing research. The use of iCLIP and RNA-seq data has not only allowed the identification of the long exons where cryptic splicing is prevented by the RBMX proteins but has also revealed a network of genes, mostly involved in genome stability and transcriptional control, where these proteins seem to play a prominent role. This can therefore also provide additional information on the way splicing has shaped evolutionary processes in the mammalian lineage and will be of interest to many researchers in this field.

      Weaknesses:

      There are no major weaknesses, although some specific aspects of the findings could be addressed more in-depth in the recommendations to authors.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Point-by-point reply in response to the Reviewer’s comments

      Reviewer #1

      Public review:

      [1] (a) Given that only a fraction of the FAPs express BDNF after injury, the authors need to demonstrate the specificity of the Prrx1-Cre for FAPs. This is particularly important because muscle stem cells also express GDNF receptors (Fig. 3C & D) and myogenic progenitors/satellite cells produce BDNF after nerve injury (Griesbeck et al., 1995 (PMID 8531223); Omura et al., 2005 (PMID 16221288)). (b) Moreover, as the authors point out, there are multipotent mesenchymal precursor cells in the nerve that migrate into the surrounding tissue following nerve injury and contribute to regeneration (Carr et al, PMID 30503141). Therefore, there are multiple possible sources of BDNF, highlighting the need to clearly demonstrate that FAP-derived BDNF is essential.

      - (a) As the Reviewer noted, both GDNF receptor expression and increased BDNF expression in response to nerve injury are detectable in both FAPs and muscle stem cells (MuSCs). Therefore, we agree with the Reviewer that demonstrating the specificity of Prrx1-Cre in FAPs is crucial to support our claim. In our previous publication (Kim et al., 2022), using Prrx1-Cre; Rosa-eYFP mice, we showed that while most of the CD31-CD45-Vcam1-Sca1+ FAPs are eYFP+, CD31-CD45-Vcam1+Sca1- MuSCs do not express eYFP (Liu et al., 2015; Kim et al., 2022) (Author response image 1). Additionally, genomic DNA PCR using mononuclear cells sorted from our Prrx1Cre; Bdnffl/fl mice showed that DNA recombination in the floxed Bdnf gene could only be detected in FAPs and CD31-CD45-Vcam1-Sca1- cells, but not in MuSCs (Author response image 2). This is consistent with a previous report that showed Prrx1-Cre activity in FAPs, pericytes, vascular smooth muscle cells (vSMCs) and tenocytes (Leinroth et al., 2022), where pericytes, vSMCs and tenocytes are included in the CD31-CD45-Vcam1-Sca1- population (Giordani et al., 2019). Together, these results demonstrate that while Prrx1-Cre is active in FAPs, it is absent in MuSCs.

      Author response image 1.

      Expression of eYFP in muscle-resident, lineage-negative, live mononuclear cells isolated from Prrx1Cre;RosaeYFP mice. Supplemental Figure 3A from Kim et al., 2022. Lin-: lineage-negative (CD31-CD45-); Neg.: Vcam1-Sca1-.

      Author response image 2.

      Recombination of the floxed Bdnf gene in the mononuclear cells sorted from muscles of Prrx1Cre; Bdnffl/fl or Bdnffl/fl mice. Genotypes and cell types sampled for each lane is specified. P4, P5, and P6 indicate primers used for each PCR. Lin+: lineage(CD31/CD45)-positive; DN: CD31-CD45-Vcam1-Sca1-.

      - (b) We appreciate and agree with the Reviewer’s comment that additional experiments are needed to confirm that FAP-derived BDNF is indeed essential for nerve regeneration, considering other potential cellular sources of BDNF, such as nerve-resident mesenchymal precursor cells. One possible experiment that could demonstrate the requirement of FAP-derived BDNF in nerve regeneration would be to transplant wild-type FAPs into our Prrx1Cre; Bdnf fl/fl mice and test whether the delay in nerve regeneration and remyelination is rescued to a level similar to that in control mice. Unfortunately, since the genetic background of our Prrx1Cre; Bdnffl/fl mice is a mixture of B6, 129S4, and BALB/c, immune rejection of the transplanted cells may occur, which makes the experiment technically difficult. Another experimental approach could involve the use of a FAP-specific Cre mouse line, as we have mentioned in the Discussion of our original manuscript. However, such a line does not yet exist due to the lack of a marker gene that is expressed specifically in FAPs, but not in nerve-resident mesenchymal precursor cells. Overcoming such technical challenges and demonstrating the requirement of FAP-derived BDNF in nerve regeneration would significantly strengthen our report, though we regret that these methods are currently unavailable.

      [2] Similarly, the authors should provide some evidence that BDNF protein is produced by FAPs. All of their data for BDNF expression is based on mRNA expression and that appears to only be increased in a small subset of FAPs. Perhaps an immunostaining could be done to demonstrate up-regulation of BDNF in FAPs after injury.

      - We appreciate the Reviewer’s constructive comment. To demonstrate that BDNF protein is produced by FAPs upon nerve injury, we performed western blot analysis. FAPs were isolated from either sciatic nerve crush injury-affected muscles at 7 days post injury (dpi) or from the contralateral, uninjured muscles, and protein samples were prepared for SDS-PAGE and western blot using anti-BDNF, anti-PDGFRα and anti-GAPDH antibodies. As a result, while both nerve injury-affected and uninjured muscle-derived FAPs expressed PDGFRα, the mature form of BDNF protein was only detected in nerve injury-affected FAPs, showing that BDNF is indeed expressed in FAPs at the protein level after injury. We have added this new result as Figure 4F in the New Figure 4 with the experimental scheme as New Figure 4—figure supplement 1, and revised the Results section (lines 364-374) and the Materials and Methods section (lines 687-705) in our manuscript to include the new results in detail.

      [3] The suggestion that Schwann cell-derived GDNF is responsible for upregulation of BDNF in the FAPs is indirect, based largely on the data showing that injection of GDNF into the muscle is sufficient to up-regulate BDNF (Fig. 4F & G). However, to more directly connect the 2 observations in a causal way, the authors should inject a Ret/GDNF antagonist, such as a Ret-Fc construct, then measure the BDNF levels.

      - We appreciate the Reviewer’s constructive comment, and we agree that testing the necessity of GDNF/RET signaling in BDNF upregulation is crucial to link the expression of the two neurotrophic factors in a causal way. As a means to antagonize GDNF/RET signaling, we injected anti-GDNF antibodies into the tibialis anterior and gastrocnemius muscles following sciatic nerve crush injury to block the activity of intramuscular GDNF protein. As a result, although the differences were not statistically significant, we observed a tendency towards decreased Bdnf mRNA expression upon anti-GDNF injection compared to IgG controls. We have added this new result as New Figure 4—figure supplement 2, and revised our manuscript to include the details in both the Results section (lines 381-390) and the Materials and Methods section (lines 611-616). We have also changed the title of New Figure 4 (line 332) to encompass the new results. We are aware that further experiments, which may involve increasing the number of animals tested, increasing the antibody injection dosage or frequency, or implementing genetic models such as Plp1CreER; Gdnffl/fl, should be carried out to validate our hypothesis with statistical significance. Unfortunately, due to limited time, resources, and research funds, we were unable to perform such additional experiments. We hope that the Reviewer understands these limitations.

      [4] (a) In assessing the regeneration after nerve crush, the authors focus on remyelination, for example, assessing CMAP and g-ratios. However, they should also quantify axon regeneration, which can be done distal to the crush injury at earlier time points, before the 6 weeks scored in their study. Evaluating axon regeneration, which occurs prior to remyelination, would be especially useful because BDNF can act on both Schwann cells, to promote myelination, and axons, enhancing survival and growth. (b) They could also evaluate the stability of the neuromuscular junctions, particularly if a denervation was done with the conditional knock outs, although that may be a bit beyond the scope of this study.

      - (a) As the Reviewer mentioned, BDNF is known to act on both Schwann cells and axons, where it promotes myelination and axonal growth, respectively (Oudega and Hagg, 1998; Zhang et al., 2000; Chan et al., 2001; Xiao et al., 2009; English et al., 2013). We fully agree with the Reviewer’s comment that quantification of axon regeneration, which could be achieved through immunostaining of the distal part of the sciatic nerve at earlier time points after injury, would shed light on whether FAP-derived BDNF can also contribute to axon regeneration in addition to remyelination. Unfortunately, we could not perform such additional experiments within the limited time frame, since preparing enough numbers of control and conditional knockout mice that match the age groups used in this study (3-4 months old), followed by waiting for additional 2-4 weeks after nerve crush injury for sample collection, and subsequent immunostaining for quantification could take almost 6 months in total. We hope that the Reviewer understands this limitation.

      - (b) We appreciate the Reviewer’s constructive comment. Although the number of animals used for neuromuscular junction (NMJ) analyses was not sufficient, we had briefly examined the structure of NMJs at 4 weeks post nerve crush injury in control (Ctrl) and conditional knockout (cKO) mice as a preliminary experiment. As a result, no significant differences were observed between Ctrl and cKO mice in terms of NMJ morphology and innervation (Author response image 3). 

      Author response image 3.

      Structures of neuromuscular junctions from Ctrl vs cKO mice at 4 weeks post nerve crush injury. Whole-mount immunostaining was done using the extensor digitorum longus muscles that were affected by sciatic nerve crush injury. Samples were stained with α-bungarotoxin (green), neurofilament (red), and synaptophysin (blue). Scale bar: 50 μm.

      Going back to part (a) of this Reviewer’s comment, considering the data presented in Author response image 3, where innervation of axons into acetylcholine receptor clusters was not significantly different between Ctrl versus cKO mice, FAP-derived BDNF may not be critical for the axonal growth upon nerve injury. Although we acknowledge that additional experiments are required to draw a meaningful conclusion on this point, we could not perform such additional experiments due to insufficient time and resources. We hope that the Reviewer understands our limitation.
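
      For readers less familiar with the g-ratio referred to in the Reviewer's comment [4](a), the following is a minimal sketch of how this remyelination metric is conventionally computed: the inner axon diameter divided by the outer diameter of the myelinated fiber. It is an illustration only, not the analysis code used in this study; the function names and the example measurements are hypothetical, and deriving diameters from traced cross-sectional areas via an equivalent circle is an assumption of the sketch.

```python
import math


def diameter_from_area(area_um2: float) -> float:
    """Diameter of the circle whose area equals a traced cross-section."""
    if area_um2 <= 0:
        raise ValueError("area must be positive")
    return 2.0 * math.sqrt(area_um2 / math.pi)


def g_ratio(axon_diameter_um: float, fiber_diameter_um: float) -> float:
    """Inner axon diameter divided by outer (axon plus myelin) fiber diameter."""
    if axon_diameter_um <= 0 or fiber_diameter_um <= 0:
        raise ValueError("diameters must be positive")
    if axon_diameter_um > fiber_diameter_um:
        raise ValueError("axon diameter cannot exceed fiber diameter")
    return axon_diameter_um / fiber_diameter_um


# Hypothetical traced areas (in square micrometres) for one myelinated fiber.
axon_d = diameter_from_area(12.6)
fiber_d = diameter_from_area(28.3)
print(f"g-ratio = {g_ratio(axon_d, fiber_d):.2f}")
```

      Under this definition, a thinner myelin sheath gives a value closer to 1, so a higher g-ratio in regenerated nerves is read as less complete remyelination.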

      Recommendations for the authors:

      [1] In citing the ability of BDNF to promote Schwann cell myelination the authors should include Chan et al., 2001 (PMID 11717413) in addition to the Zhang et al, 2000 and Xiao et al, 2009 references.

      - We apologize for omitting the reference mentioned by the Reviewer. We have added the suggested reference in our revised manuscript (lines 395, 425, and 517).

      Reviewer #2

      Public review:

      [1] Although I find the data the authors generated sufficient for their claims, I do see them as relatively poor, and (a) a complementary analysis of protein expression would strengthen the paper through immunostaining of the different genes mentioned for FAPs and Schwann cells. The model is entirely supported by measuring mRNA levels and negative regulation of gene expression in specific cells. Additionally, (b) what happens to the structure of the neuromuscular junction after regeneration when GDNF or BDNF expression is reduced? (c) The determination of decreasing levels of FAPs BDNF mRNA during aging is interesting; is the gain of BDNF expression in FAPs reverting the phenotype?

      - (a) We appreciate and agree with the Reviewer’s comment that validation of BDNF protein expression in FAPs and GDNF protein expression in Schwann cells upon nerve injury would strengthen this paper. Regarding GDNF protein expression in Schwann cells upon nerve injury, it has already been demonstrated by previous studies (Höke et al., 2002; Xu et al., 2013). For BDNF protein expression in FAPs upon nerve injury, we performed western blot analysis for validation, as mentioned in the response to Reviewer #1 Public review [2]. The results showed that while the mature form of BDNF protein could not be readily detected in FAPs isolated from uninjured muscles, it could be detected in FAPs isolated from sciatic nerve crush injury-affected muscles at 7 days post injury. We have added the new result as Figure 4F in the New Figure 4 with the experimental scheme as New Figure 4—figure supplement 1, and revised the Results section (lines 364-374) and the Materials and Methods section (lines 687-705) in our manuscript to include the new results in detail.

      - (b) Though the data are preliminary, we examined the structures of neuromuscular junctions (NMJs) from control and Prrx1Cre; Bdnf fl/fl mice at 4 weeks post injury in the extensor digitorum longus muscles, as mentioned in the response to Reviewer #1 Public review [4](b). As a result, we could not identify significant differences between control versus Prrx1Cre; Bdnf fl/fl mice, where BDNF expression is reduced specifically in Prrx1-expressing cells, including FAPs (Author response image 3). Since other cellular sources of BDNF, such as Schwann cells, exist, regeneration of the NMJs may not have been as significantly affected as remyelination in our Prrx1Cre; Bdnf fl/fl mice. However, further experiments with a sufficient number of mice and more observation time points are required to statistically validate this hypothesis in detail. Unfortunately, preparing samples for such additional analyses would take more than four months, as we need to produce sufficient numbers of control and Prrx1Cre; Bdnf fl/fl mice that match the age groups used in this study. We hope that the Reviewer understands our limitation.

      Regarding analyzing NMJ structures after regeneration affected by reduced GDNF levels, using genetic models such as Plp1CreER; Gdnffl/fl mice would be appropriate, as we have used the Prrx1Cre; Bdnffl/fl mice in this study to reduce BDNF levels produced by FAPs. Unfortunately, we do not have the Gdnffl mice, and obtaining these mice to produce Plp1CreER; Gdnffl/fl mice and performing the additional experiment would take too much time for this current revision. In a further study, we will try to perform the additional experiment by obtaining the required mouse line. We hope that the Reviewer understands our limitation.

      - (c) We appreciate the Reviewer for highlighting this point. In this paper, we have shown that BDNF expression upon nerve injury is decreased in aged FAPs compared to young adult FAPs, and suggested that this may be one of the causes of the delayed nerve regeneration phenotype in aged mice. Previously, it has been reported that while intramuscular injection of BDNF accelerates nerve regeneration, intramuscular injection of anti-BDNF antibodies delays the regeneration process (Zheng et al., 2016). This implies that intramuscular levels of active BDNF can significantly influence the speed of nerve regeneration. Therefore, the gain of BDNF expression in aged FAPs may contribute to reversing the delayed nerve regeneration phenotype in aged mice, since it would result in additional supply of active, intramuscular BDNF, which has previously been shown to accelerate nerve regeneration. Though experimental validation is required to support such claim, we could not obtain sufficient numbers of aged mice within the limited time frame. We hope that the Reviewer understands our limitation.

      Recommendations for the authors:

      [1] The authors should include the experimental design and several drawings in the leading figures indicating, for example, how remyelination after injury was quantified and how the response of regenerated sciatic nerve to a depolarizing stimulus was studied.

      - We apologize for any confusion caused by insufficient information provided in the leading figures. Unfortunately, due to limited space, we could not add experimental designs or drawings in the leading figures. Instead, to do our best to comply with the Reviewer’s comment, we have revised the figure legends in the leading figures so that the experimental designs or diagrams can be referred to in the figure supplements. We hope that the Reviewer understands this limitation.

      Reviewer #3

      Public review:

      [1] In Fig. 1 and 2 authors provide data on scRNA seq and this is important information reporting the finding of RET and GFRa1 transcripts in the subpopulation of FAP cells. However, authors provide no data on the expression of RET and GFRa1 proteins in FAP cells.

      - Reply for this comment by the Reviewer is in the Recommendations for the authors section below ([2]), as the same comment is repeated.

      [2] Another problem is the lack of information showing that GDNF secreted by Schwann cells can activate RET and its down-stream signaling in FAP cells. There is no direct experimental proof that GDNF activating GFRa1-RET signaling triggers BDNF upregulation in FAP cells. The data that GDNF signaling is inducing the synthesis and secretion of BDNF is also not conclusive.

      - Reply for this comment by the Reviewer is in the Recommendations for the authors section below ([3]), as the same comment is repeated.

      Recommendations for the authors:

      [1] Although this is a novel study and contains very well-performed parts, the GDNF section is preliminary and requires additional experimentation. In the introduction the authors describe FAPs well but do not even mention how GDNF signals. Moreover, the reader may get the impression that the Ras-MAPK pathway is the only or at least the main GDNF signaling pathway. In fact, for neurons the Akt and Src signaling pathways also play a crucial role.

      - We apologize for the missing content in the Introduction section of our manuscript and for any confusion caused by our misleading description of the GDNF signaling pathway. We have revised our manuscript to include the GDNF signaling pathway in the Introduction section, along with a description of other downstream signaling pathways of GDNF that are known to play crucial roles, as mentioned by the Reviewer (lines 115-130). Additionally, we changed the expression in the Results section to avoid making any misleading impressions (lines 318-319).

      [2] In Fig. 1 and 2 authors provide data on scRNA seq and this is important information reporting the finding of RET and GFRa1 transcripts in the subpopulation of FAP cells. However, authors provide no data on the expression of RET and GFRa1 proteins in FAP cells.

      - We appreciate the Reviewer for the constructive comment. Though we fully agree with the Reviewer that validating the expression of RET and GFRα1 proteins in FAPs is needed, we were unable to obtain the antibodies required for such experiments within the limited time frame for this revision. We hope that the Reviewer understands our limitation. Although we could not directly show the expression of those GDNF receptor genes at the protein level in FAPs, based on the result where intramuscular GDNF injection could sufficiently induce Bdnf expression in FAPs compared to PBS control in the absence of nerve damage, it is likely that GDNF receptors are indeed expressed at the protein level in FAPs, since if otherwise, FAPs would not have been able to respond to the injected GDNF protein. Nevertheless, in a future study, we will try to validate the protein-level expression of GDNF receptors in FAPs to comply with the Reviewer’s suggestion and to further support this study.

      [3] Another problem is the lack of information showing that GDNF secreted by Schwann cells can activate RET and its down-stream signaling in FAP cells. Authors can monitor activation of the MAPK pathway by detecting phospho-Erk and of the PI3 kinase-Akt pathway by measuring phospho-S6 using immunohistochemistry. We recommend using the following antibodies: pErk1/2 (1:300, Cell Signaling, Cat# 4370L RRID:AB_2297462), pS6 (1:300, Cell Signaling, Cat# 4858L RRID:AB_1031194). These experiments are crucial because RET and GFRa1 proteins may not be expressed at a sufficient level on the cell surface.

      - We sincerely appreciate the Reviewer’s constructive comment. In this study, we suggested that the GDNF-BDNF axis within FAPs would signal through the MAPK pathway based on the bioinformatic analysis of our single cell RNA-seq data and matching the results with the previously known pathways. We fully agree that monitoring the activation of the MAPK pathway and the PI3K-Akt pathway by immunohistochemistry would experimentally demonstrate whether GDNF can activate those pathways within FAPs through GFRα1/RET activation. Unfortunately, we could not obtain the antibodies suggested by the Reviewer for this revision due to insufficient research funds and limited time frame. We hope that the Reviewer understands our limitation. In future studies, we will try to validate the detailed molecular pathway that mediates the GDNF-BDNF axis in FAPs by incorporating the methodology suggested by the Reviewer, along with implementation of genetic models such as Plp1CreER; Gdnffl/fl, Prrx1Cre; Retfl/fl or Prrx1Cre; Gfra1fl/fl to validate whether Schwann cell-derived GDNF can actually signal through its canonical receptor RET/GFRα1 expressed in FAPs to induce expression of BDNF upon nerve injury.

      [4] (a) There is no direct experimental proof that GDNF activating GFRa1-RET signaling triggers BDNF upregulation in FAP cells. Authors can use GDNF blocking antibodies, siRNA or use RET or GFRa1 cKO mice to delete them from FAP cells. (b) The data that GDNF signaling is inducing the synthesis and secretion of BDNF is also not conclusive. Authors should show that GDNF injection is increasing BDNF protein levels in FAPs. To get sufficient material for ELISA detection of BDNF is perhaps problematic. However, authors can use BDNF antibodies from Icosagen company and use IHC.

      - (a) We appreciate the Reviewer for the critical comment. As mentioned in the reply for Reviewer #1 Public review [3], we used GDNF blocking antibodies to reduce GDNF signaling within the tibialis anterior and gastrocnemius muscles by intramuscular injection after sciatic nerve crush injury, and included the result as a new figure supplement in our revised manuscript (New Figure 4—figure supplement 2) with its details in both the Results section (lines 381-390) and the Materials and Methods section (lines 611-616). Though the results were not statistically significant, intramuscular injection of anti-GDNF antibodies showed a tendency toward reduced Bdnf expression in FAPs, compared to IgG controls. As mentioned in the reply for Reviewer #1 Public review [3], and as suggested by the Reviewer, using cKO mice such as Plp1CreER; Gdnffl/fl, Prrx1Cre; Retfl/fl, or Prrx1Cre; Gfra1fl/fl mice would further validate the GDNF-BDNF axis suggested in this study, likely with statistical significance. Unfortunately, obtaining these genetic models within the limited time frame of this current revision is not feasible. We will try to adopt such models in our future study to validate the role of Schwann cell-derived GDNF in inducing BDNF expression in FAPs via activation of RET/GFRα1.  

      - (b) We appreciate the Reviewer for the constructive comment. Though we fully agree that the experiment suggested by the Reviewer would validate the synthesis and secretion of BDNF protein by GDNF signaling in FAPs, we were not able to perform it due to a lack of research funds to obtain a sufficient amount of the GDNF protein. We hope that the Reviewer understands our limitation. Still, combining the results from New Figure 4H in this study with the New Figure 4F, where GDNF injection induced Bdnf mRNA expression in FAPs, and BDNF protein expression in FAPs in response to nerve injury was demonstrated via western blot, we anticipate that GDNF injection would increase BDNF protein levels in FAPs, though direct validation of this statement would require conducting the additional experiments mentioned by the Reviewer.

      References

      Chan JR, Cosgaya JM, Wu YJ, and Shooter EM (2001). Neurotrophins are key mediators of the myelination program in the peripheral nervous system. Proceedings of the National Academy of Sciences 98:14661-14668.

      English AW, Liu K, Nicolini JM, Mulligan AM, and Ye K (2013). Small-molecule trkB agonists promote axon regeneration in cut peripheral nerves. Proc Natl Acad Sci U S A 110:16217-22. doi:10.1073/pnas.1303646110

      Giordani L, He GJ, Negroni E, Sakai H, Law JY, Siu MM, Wan R, Corneau A, Tajbakhsh S, and Cheung TH (2019). High-dimensional single-cell cartography reveals novel skeletal muscle-resident cell populations. Molecular Cell 74:609-621.e6.

      Höke A, Gordon T, Zochodne D, and Sulaiman O (2002). A decline in glial cell-line-derived neurotrophic factor expression is associated with impaired regeneration after long-term Schwann cell denervation. Experimental Neurology 173:77-85.

      Kim J-H, Kang J-S, Yoo K, Jeong J, Park I, Park JH, Rhee J, Jeon S, Jo Y-W, and Hann S-H (2022). Bap1/SMN axis in Dpp4+ skeletal muscle mesenchymal cells regulates the neuromuscular system. JCI Insight 7:

      Leinroth AP, Mirando AJ, Rouse D, Kobayashi Y, Tata PR, Rueckert HE, Liao Y, Long JT, Chakkalakal JV, and Hilton MJ (2022). Identification of distinct non-myogenic skeletal-muscle-resident mesenchymal cell populations. Cell Reports 39:

      Liu L, Cheung TH, Charville GW, and Rando TA (2015). Isolation of skeletal muscle stem cells by fluorescence-activated cell sorting. Nature Protocols 10:1612-1624.

      Oudega M, and Hagg T (1998). Neurotrophins promote regeneration of sensory axons in the adult rat spinal cord. Brain Research 818:431-438. doi:10.1016/S0006-8993(98)01314-6

      Xiao J, Wong AW, Willingham MM, Kaasinen SK, Hendry IA, Howitt J, Putz U, Barrett GL, Kilpatrick TJ, and Murray SS (2009). BDNF exerts contrasting effects on peripheral myelination of NGF-dependent and BDNF-dependent DRG neurons. J Neurosci 29:4016-22. doi:10.1523/JNEUROSCI.3811-08.2009

      Xu P, Rosen KM, Hedstrom K, Rey O, Guha S, Hart C, and Corfas G (2013). Nerve injury induces glial cell line-derived neurotrophic factor (GDNF) expression in Schwann cells through purinergic signaling and the PKC-PKD pathway. Glia 61:1029-1040.

      Zhang JY, Luo XG, Xian CJ, Liu ZH, and Zhou XF (2000). Endogenous BDNF is required for myelination and regeneration of injured sciatic nerve in rodents. European Journal of Neuroscience 12:4171-4180. doi:10.1111/j.1460-9568.2000.01312.x

      Zheng J, Sun J, Lu X, Zhao P, Li K, and Li L (2016). BDNF promotes the axonal regrowth after sciatic nerve crush through intrinsic neuronal capability upregulation and distal portion protection. Neuroscience Letters 621:1-8.