  1. Nov 2024
    1. Reviewer #2 (Public review):

      The authors investigated the conformational dynamics and energetics of the SthK C-linker/CNBD fragment using both steady-state and time-resolved transition metal ion Förster resonance energy transfer (tmFRET) experiments. To do so, they engineered donor-acceptor pairs at specific sites of the CNBD (C-helix and β-roll) by incorporating a fluorescent noncanonical amino acid donor and metal ion acceptors. In particular, the authors employed two cysteine-reactive metal chelators (TETAC and phenM). This allowed them to coordinate three transition metals (Cu2+, Fe2+, and Ru2+) to measure both short (10-20 Å, Cu2+) and long distances (25-50 Å, Fe2+, and Ru2+). By measuring tmFRET with fluorescence lifetimes, the authors determined intramolecular distance distributions in the absence and presence of the full agonist cAMP or the partial agonist cGMP. The probability distributions between conformational states without and with ligands were used to calculate the changes in free energy (ΔG) and differences in free energy change (ΔΔG) in the context of a simple four-state model.
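
      As a reading aid for the ΔG and ΔΔG calculation mentioned above, the following is a minimal sketch and not the authors' analysis. It assumes a two-state conformational equilibrium (resting and active) in each ligand condition, so that ΔG = -RT ln(f_active / f_resting) and ΔΔG is the difference between the liganded and apo values. The function name `delta_g` and the fractions are hypothetical and are not taken from the manuscript.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(fraction_active: float, temperature_k: float = 298.15) -> float:
    """Free energy change (kcal/mol) of the resting -> active transition.

    Assumes only two conformational states, so the resting fraction is
    1 - fraction_active and the equilibrium constant is their ratio.
    """
    if not 0.0 < fraction_active < 1.0:
        raise ValueError("fraction_active must lie strictly between 0 and 1")
    equilibrium_constant = fraction_active / (1.0 - fraction_active)
    return -R_KCAL * temperature_k * math.log(equilibrium_constant)


# Hypothetical state fractions, for illustration only.
dg_apo = delta_g(0.10)      # mostly resting without ligand -> positive dG
dg_ligand = delta_g(0.90)   # mostly active with ligand -> negative dG
ddg = dg_ligand - dg_apo    # how much the ligand stabilizes the active state
```

      A negative `ddg` in this sketch means the ligand shifts the equilibrium toward the active state; comparing the value for a full agonist with that for a partial agonist is the kind of comparison the review describes.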

      Overall, the work is conducted in a rigorous manner, and it is well-written.

      In terms of methodology, this work provides further support for the steady-state and time-resolved tmFRET approaches previously developed by the authors to probe conformational rearrangements using a fluorescent noncanonical amino acid donor (Anap) and transition metal ion acceptors (Zagotta et al., eLife 2021; Gordon et al., Biophysical Journal 2024; Zagotta et al., Biophysical Journal 2024).

      Regarding cyclic nucleotide-binding domain (CNBD)-containing ion channels, the literature on this subject is vast, and the authors of the present work have contributed significantly to the understanding of the allosteric mechanism governing the ligand-induced activation of these channels, including a detailed description of the energetic changes induced by ligand binding. Particularly relevant are their works based on DEER spectroscopy. In DeBerg et al., JBC 2016, the authors described, in atomic detail, the conformational changes induced by different cyclic nucleotides on the HCN CNBD fragment and derived the energetics associated with ligand binding to the CNBD (ΔΔG). In Collauto et al., Phys Chem Chem Phys. 2017, they further detailed the ligand-CNBD conformational changes by combining DEER spectroscopy with microfluidic rapid freeze quench to resolve these processes and obtain both equilibrium constants and reaction rates, thus demonstrating that DEER can quantitatively resolve both the thermodynamics and the kinetics of ligand binding and the associated conformational changes.

      In the revised manuscript, the authors better framed their work in light of the literature by highlighting its novelty and limitations, in particular the decision to work with the isolated C-linker/CNBD fragment and not with the full-length protein.

    2. Reviewer #3 (Public review):

      Summary:

      The manuscript by Eggan et al provides insights into conformational transitions in the cyclic nucleotide binding domain of a cyclic nucleotide-gated (CNG) channel. The authors use transition metal FRET (tmFRET) which has been pioneered by this lab and previously led to detailed insights into ion channel conformational changes. Here, the authors not only use steady-state measurements but also time-resolved, fluorescence lifetime measurements to gain detailed insights into conformational transitions within a protein construct that contains the cytosolic C-linker and cyclic nucleotide binding domain (CNBD) of a bacterial CNG channel. The use of time-resolved tmFRET is a clear advancement of this technique and a strength of this manuscript.

      In summary, the present work introduces time-resolved tmFRET as a novel tool to study conformational distributions in proteins. This is a clear technological advance. The limitations of the truncated construct used in this study and how they relate to the energetics in full-length CNG channels are discussed. It will be interesting to see in the future how results compare to similar measurements on full-length channels, for example, reconstituted into nanodiscs.
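
      To illustrate how a fluorescence lifetime measurement relates to a distance, the following minimal sketch uses the textbook single-distance Förster relations: E = 1 - τ_DA/τ_D and r = R0 (1/E - 1)^(1/6). This is a simplification; the manuscript fits distance distributions rather than a single distance. The function names and the numerical values (lifetimes, R0) are hypothetical.

```python
def fret_efficiency_from_lifetimes(tau_da: float, tau_d: float) -> float:
    """FRET efficiency from donor lifetimes with (tau_da) and without (tau_d) acceptor."""
    if tau_d <= 0 or tau_da < 0:
        raise ValueError("lifetimes must be positive")
    return 1.0 - tau_da / tau_d


def distance_from_efficiency(efficiency: float, r0: float) -> float:
    """Donor-acceptor distance for a single fixed distance, in the units of r0."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Hypothetical numbers: the donor lifetime halves when the acceptor is bound,
# so the efficiency is 0.5 and the distance equals the Förster distance R0.
efficiency = fret_efficiency_from_lifetimes(tau_da=2.0, tau_d=4.0)
distance = distance_from_efficiency(efficiency, r0=15.0)
```

      Because efficiency falls off with the sixth power of r/R0, each acceptor is informative only near its own R0, which is consistent with the use of different metals for short and long distances.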

      Strengths:

      The results capture known differences in promoting the open state between different ligands (cAMP and cGMP) and are consistent across three donor-acceptor FRET pairs. The calculated distance distributions are further in agreement with predicted values based on available structures. The finding that the C-helix is conformationally more mobile in the closed state as compared to the open state quantitatively increases our understanding of conformational changes in these channels.

      Weaknesses:

      The results describe movements of the C-helix in CNBDs, but the detailed energetics, as calculated in this study, need to be limited to the truncated protein construct. This is a weakness that cannot be overcome easily as it will require future experiments using the full-length channel.

      The data only describe movements of the C-helix. Upon ligand binding, the C-helix moves upwards to coordinate the ligand. Thus, the results describe ligand-induced conformational changes (as the title states). Allosteric regulation usually involves remote locations in the protein, which is applicable only in a limited fashion here.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The sample size of the in-house dataset used for training the model was relatively small (34 patients), which might limit the generalizability of the findings.

      (2) The authors did not perform functional experiments to directly validate the roles of the identified key genes in radiotherapy sensitivity, relying instead on associations with immune features and signaling pathways.

      (3) The study did not discuss the potential limitations of using machine learning algorithms, such as the risk of overfitting and the need for larger, diverse datasets for more robust model development and validation.

      (1) Currently, we are actively expanding the dataset by incorporating additional patient samples to enhance the model's robustness and generalizability. Furthermore, we implement advanced statistical techniques, including cross-validation, during model development to mitigate the potential limitations associated with the small sample size on our results. This limitation has been comprehensively addressed in the discussion section of our manuscript.

      (2) Given the current resource limitations, our study predominantly employed bioinformatics analyses. We acknowledge the critical importance of experimental validation and are actively pursuing additional funding and collaborative opportunities to facilitate future experimental studies. Concurrently, we have enhanced the discussion section to comprehensively address the limitations of our approach and emphasize the necessity for future experimental validation.

      (3) We appreciate the reviewers' insightful comments regarding the potential limitations of machine learning algorithms, particularly the risk of overfitting. In response, we have incorporated a comprehensive discussion of these concerns, detailing the measures implemented to mitigate such risks, including the application of regularization techniques and the adoption of more rigorous cross-validation methodologies. We further acknowledge the necessity for larger and more diverse datasets to enhance model validity and generalizability, a concern we intend to address in our future research endeavors. The revised manuscript includes an expanded discussion on these critical points.
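
      For readers unfamiliar with the cross-validation mentioned in these responses, the following minimal sketch shows how k-fold splits are formed. It is not the authors' pipeline: the function name `k_fold_indices`, the choice of five folds, and the fixed seed are assumptions, and the sample count of 34 is taken from the reviewer's comment only to make the example concrete.

```python
import random


def k_fold_indices(n_samples: int, n_folds: int, seed: int = 0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Deal the shuffled indices into folds whose sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test_idx = folds[held_out]
        train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train_idx, test_idx


# Each of the 34 samples is held out exactly once across the five splits.
splits = list(k_fold_indices(n_samples=34, n_folds=5))
```

      With so few samples each held-out fold contains only six or seven patients, so per-fold performance estimates are noisy; this is one reason the reviewers ask for larger cohorts.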

      Here is the limitation section in the revised Manuscript:

      “This study primarily focuses on specific subtypes of nasopharyngeal carcinoma (NPC), potentially limiting its direct generalizability to other NPC subtypes or related head and neck malignancies. Furthermore, the limited sample size of our dataset may impact the model's generalizability and extrapolation capabilities. To mitigate the potential limitations associated with the small sample size, we employed advanced statistical methodologies, including cross-validation, to enhance the robustness and reliability of our findings. Nevertheless, we acknowledge the necessity for larger datasets and are actively collaborating with other research institutions to expand our sample size, thereby enhancing the robustness and broader applicability of our findings. Additionally, while our study utilizes bioinformatics approaches to identify and analyze key genes, we recognize that the absence of direct experimental functional validation represents a significant limitation. To address this limitation, we are actively pursuing additional funding and establishing collaborations with specialized laboratories to conduct crucial functional validation experiments, which will further elucidate the specific roles of these genes in radiotherapy response. Moreover, we acknowledge the potential risk of overfitting inherent in the application of machine learning algorithms to biomedical data analysis. To mitigate this risk, we implemented regularization techniques during model development and adopted a rigorous cross-validation strategy for model validation. These methodological approaches aim to ensure that our models maintain robust predictive performance on unseen data. Notwithstanding these limitations, our study offers novel insights into the molecular mechanisms underlying radiotherapy sensitivity in NPC and indicates promising avenues for future investigation. 
      Future research endeavors will prioritize expanding the dataset, conducting comprehensive experimental validation, and refining our predictive model to enhance its accuracy and clinical applicability.”

      Reviewer #2 (Public Review):

      (1) The study focuses on a specific type of nasopharyngeal carcinoma (NPC) and may not be generalizable to other subtypes or related head and neck cancers. The applicability of NPC-RSS to a broader range of patients and tumor types remains to be determined.

      (2) The study does not account for potential differences in radiotherapy protocols, doses, and techniques between the training and validation cohorts, which could influence the performance of the predictive model. Standardization of treatment parameters would be important for future validation studies.

      (3) The binary classification of patients into radiotherapy-sensitive and resistant groups may oversimplify the complex spectrum of treatment responses. A more granular stratification system that captures intermediate responses could provide more nuanced predictions and better guide personalized treatment decisions.

      (4) The study does not address the potential impact of other relevant factors, such as tumor stage, histological subtype, and concurrent chemotherapy, on the predictive performance of NPC-RSS. Incorporating these clinical variables into the model could enhance its accuracy and clinical utility.

      (1) We appreciate the reviewers' interest in the applicability of our study. This study specifically focuses on a particular subtype of nasopharyngeal carcinoma (NPC), which may limit its direct generalizability to other NPC subtypes or related head and neck malignancies. We have incorporated a detailed discussion of this limitation in the Discussion section and intend to investigate the applicability of NPC-RSS across a broader spectrum of tumor types and subtypes in subsequent studies.

      (2) We acknowledge the reviewers' emphasis on the significance of potential variations in radiotherapy regimens, doses, and techniques. In the current study, we did not sufficiently account for these factors, potentially impacting the model's generalizability and accuracy. We aim to improve data consistency and strengthen model validation by standardizing treatment parameters in future investigations.

      (3) We concur with the reviewers' assessment that binary categorization may oversimplify the intricate nature of treatment responses. Indeed, radiotherapy responses likely exist on a continuous spectrum. Consequently, we intend to develop more refined stratification systems to capture intermediate responses, thereby enhancing the accuracy of treatment outcome predictions and facilitating personalized treatment decisions.

      (4) We appreciate the reviewers' recommendation to incorporate clinical variables, including tumor stage, histological subtype, and concurrent chemotherapy, into the model. We acknowledge that these factors are crucial for enhancing the accuracy and clinical applicability of predictive models. We are presently compiling these additional data and intend to integrate these variables into subsequent model iterations.

      Reviewer #1 (Recommendations For The Authors):

      (1) The manuscript would benefit from a more comprehensive comparison of the NPC-RSS with existing prognostic models or biomarkers for nasopharyngeal carcinoma. This would help highlight the unique value and potential superiority of the NPC-RSS in predicting radiotherapy sensitivity.

      (2) The authors should consider expanding their discussion on the potential molecular mechanisms underlying the association between the key NPC-RSS genes and radiotherapy response. They could explore whether these genes have been previously implicated in radiotherapy resistance in other cancer types and discuss the potential functional roles of these genes in the context of nasopharyngeal carcinoma.

      (1) We appreciate your thorough review and valuable suggestions concerning our study. In response to the suggestion of comparing the Nasopharyngeal Carcinoma Radiotherapy Sensitivity Score (NPC-RSS) with existing prognostic models or biomarkers, we have carefully considered this proposal and determined that such a comparison is beyond the scope of our current study. The primary focus of our research is on the development and internal validation of the NPC-RSS model's accuracy and reliability. At present, we do not have access to the necessary external data to conduct a valid comparison, and the integration of such data extends beyond the parameters of this study. We intend to incorporate this comparative analysis in future studies to further validate the efficacy and explore the clinical application potential of the NPC-RSS model. We appreciate your understanding and continued support for our research endeavors.

      (2) In the revised manuscript, we have incorporated a comprehensive review of the functions of these key genes in various cancer types and explored their potential mechanisms of action in nasopharyngeal carcinoma (NPC). Through the citation of pertinent studies, we have elucidated the impact of these genes on radiotherapy sensitivity and resistance. Furthermore, we have proposed future research directions to elucidate the specific roles of these genes in the radiotherapy response of NPC.

      The following are new additions to the revised draft:

      “Previous studies have demonstrated that SMARCA2 significantly influences the radiotherapy response in non-small cell lung cancer (NSCLC). Depletion of SMARCA2 has been shown to enhance radiosensitivity, suggesting its potential as a therapeutic target for radiosensitization [30478150]. Additionally, the DMC1 gene has been incorporated into the radiosensitivity index (RSI) to evaluate radiotherapy sensitivity and prognosis, particularly in endometrial cancers. This inclusion provides valuable insights into the DNA damage repair process [38628740]. Studies on CD9 in glioblastoma multiforme (GBM) have revealed that post-radiotherapy increases in CD9 and CD81 levels in extracellular vesicles (EVs) are strongly correlated with the cytotoxic response to treatment. This finding suggests the potential of CD9 as a novel biomarker for monitoring radiotherapy efficacy [36203458]. In contrast, the association of PSG4 and KNG1 with radiotherapy resistance remains unexplored in the current literature.

      Future research should focus on analyzing the expression patterns of SMARCA2 in NPC patients and its correlation with radiotherapy efficacy using clinical samples. This analysis could elucidate its potential as a target for radiosensitization therapy. Investigating the correlation between DMC1 expression levels and radiotherapy sensitivity in NPC could potentially aid in predicting treatment efficacy and optimizing therapeutic regimens. Furthermore, analysis of extracellular vesicles, particularly those containing CD9, in post-radiotherapy NPC patients could assess their feasibility as biomarkers for monitoring treatment response. These proposed studies would not only contribute to a deeper understanding of the mechanisms underlying the role of these genes in NPC radiotherapy but could also potentially lead to the development of novel strategies for enhancing radiotherapy efficacy.”

      Minor Recommendations:

      (1) It is recommended that the authors share the code for the article on GitHub or a similar open-source platform.

      (2) The manuscript would benefit from a thorough review of the punctuation and sentence structure to improve readability and clarity.

      (1) You suggest sharing the code utilized in this study on GitHub or a comparable open-source platform to enhance the transparency and reproducibility of the research. I fully recognize the significance of this suggestion. However, due to the sensitivity of the data involved and the existing intellectual property agreement with my research team, we are unable to make the code publicly available at this time. We are actively seeking a method to safeguard the intellectual property of the project while also planning to share our tools and methodologies in the future. At this stage, we are open to collaborating with other researchers under appropriate frameworks and conditions to validate and replicate our findings by providing essential code execution snippets or assisting with data analysis.

      (2) Your suggestions are vital for enhancing the quality of the manuscript. I will perform a comprehensive linguistic and structural review of the manuscript to ensure that statements flow coherently and punctuation is employed correctly. We also intend to engage a professional scientific and technical writing editor to ensure that the manuscript adheres to the high standards required for academic publishing.

      Reviewer #2 (Recommendations For The Authors):

      (1) The manuscript would benefit from a more in-depth discussion of the potential clinical implications of the NPC-RSS. The authors should elaborate on how this score could be integrated into clinical decision-making and patient management.

      (2) The authors should consider including a section discussing the limitations of their study and potential areas for future research. This could include the need for prospective validation of the NPC-RSS in larger patient cohorts and the exploration of additional biological mechanisms.

      (1) We concur that a more comprehensive discussion regarding the application of the NPC-RSS in clinical decision-making would significantly enhance the practical value of this study. In the revised draft, we will include a section that elaborates on the integration of the NPC-RSS scoring system into daily clinical practice, detailing how it can assist physicians in developing individualized treatment plans and optimize patient management by predicting treatment responses.

      The following are new additions to the revised draft:

      “The incorporation of the NPC-RSS scoring system into clinical decision-making and patient management involves several key steps: first, establishing genetic testing as a standard component of nasopharyngeal cancer diagnosis and ensuring that physicians have prompt access to scoring results to guide treatment planning. Second, physicians should utilize the scoring results to tailor individualized treatment plans and engage in multidisciplinary discussions to optimize decision-making. Concurrently, physicians should elucidate the clinical significance of the scores and effectively communicate with patients to facilitate shared decision-making. Furthermore, continuous monitoring of the relationship between scoring and treatment outcomes, optimizing the scoring model based on empirical data, and ensuring the integration of technological platforms along with regulatory compliance are essential for safeguarding the effective operation of the scoring system and the protection of patient information.”

      (2) In light of the reviewers' valuable suggestions, we acknowledge the significance of prospective validation of the NPC-RSS scoring system in a broader patient population and the necessity for thorough exploration of the underlying biological mechanisms. Accordingly, we are incorporating a new section in the revised manuscript that elaborates on the limitations of the current study and outlines potential directions for future research. This encompasses plans to increase the sample size for validation and further investigations into the biological basis of the scoring system to enhance its predictive validity and clinical applicability. We believe that these additions will significantly enrich the depth and breadth of the study, thereby serving the scientific community and clinical practice more effectively.

      Minor Recommendations:

      (1) The authors should ensure that all abbreviations are defined at their first mention in the text.

      (2) The figure legends should be more descriptive and self-explanatory, allowing readers to understand the main findings without referring back to the main text.

      (1) You pointed out the need to define all acronyms at the first mention in the text and suggested that a comprehensive list of acronyms be included in the revised draft. We fully concur and have included a comprehensive list of acronyms in the revised text. Additionally, to enhance clarity, we have included the full name and definition of each acronym alongside its first occurrence in the text. This will assist readers in comprehending the study without the need to repeatedly refer to the glossary.

      (2) You recommended enhancing the descriptive quality of the figure legends to enable readers to discern the key findings from the figures without consulting the text. We have redesigned and refined all charts and legends to ensure they provide adequate information and are more descriptive. Each legend now outlines the experimental conditions, the variables employed, and the primary conclusions, ensuring that the charts themselves sufficiently convey the key findings of the study.

    2. eLife Assessment

      The authors have developed a robust machine learning approach to predict radiosensitivity in patients with NPC based on a defined gene signature. Some key aspects of this signature have been validated in vitro using relevant cell lines, which strengthens the conclusions of this important and convincing study. The publication will be of interest to clinicians working on this indication as well as a broader readership of scientists working on radiation biology and those with a bioinformatics/machine learning background.

    3. Reviewer #1 (Public review):

      Summary:

      In this study, the authors developed a novel radiotherapy sensitivity score (NPC-RSS) for nasopharyngeal carcinoma patients using machine learning algorithms. They identified 18 key genes associated with radiosensitivity and demonstrated that NPC-RSS could effectively predict radiotherapy response in both public and in-house datasets. Furthermore, they found that the key genes of NPC-RSS were closely related to immune characteristics, the expression of radiosensitivity-related genes, and signaling pathways involved in disease progression. The authors validated the consistency of expression of two key genes, SMARCA2 and CD9, with NPC-RSS in their own cell lines. They also showed that the radiosensitive group, classified by NPC-RSS, exhibited a more enriched and activated state of immune infiltration compared to the radioresistant group.

      Strengths:

      (1) The study employed a comprehensive approach by integrating multiple machine learning algorithms to develop a robust predictive model for radiotherapy sensitivity in nasopharyngeal carcinoma patients.

      (2) The predictive performance of NPC-RSS was validated using both public and in-house datasets, demonstrating its potential clinical applicability.

      (3) The authors conducted extensive analyses to investigate the biological mechanisms underlying the association between NPC-RSS and radiotherapy response, including immune characteristics, radiosensitivity-related gene expression, and relevant signaling pathways.

      (4) The consistency of key gene expression with NPC-RSS was validated in the authors' own cell lines, providing additional experimental evidence.

      Weaknesses:

      (1) The sample size of the in-house dataset used for training the model was relatively small (34 patients), which might limit the generalizability of the findings.

      (2) The authors did not perform functional experiments to directly validate the roles of the identified key genes in radiotherapy sensitivity, relying instead on associations with immune features and signaling pathways.

      (3) The study did not discuss the potential limitations of using machine learning algorithms, such as the risk of overfitting and the need for larger, diverse datasets for more robust model development and validation.

    4. Reviewer #2 (Public review):

      Summary:

      This article utilizes machine learning methods and transcriptomic data from nasopharyngeal carcinoma (NPC) patients to construct a biomarker called NPC-RSS that can predict the radiosensitivity of NPC patients. The authors further explore the biological mechanisms underlying the relationship between NPC-RSS and radiotherapy response in NPC patients. The main objective of this study is to guide the selection of radiotherapy strategies for NPC patients, thereby improving their clinical outcomes and prognosis.

      Strengths:

      (1) The combination of multiple machine learning algorithms and cross-validation was used to select the best predictive model for radiotherapy sensitivity from 71 differentially expressed genes, enhancing the robustness and reliability of the predictions.

      (2) Functional enrichment analysis revealed close associations between NPC-RSS key genes and immune characteristics, expression of radiotherapy sensitivity-related genes, and signaling pathways related to disease progression, providing a biological basis for NPC-RSS in predicting radiotherapy sensitivity.

      (3) Grouping NPC samples according to NPC-RSS showed that the radiotherapy-sensitive group exhibited a more enriched and activated state of immune infiltration compared to the radioresistant group. In single-cell samples, NPC-RSS was higher in the radiotherapy-sensitive group, with immune cells playing a dominant role. These results clarify the mechanism of NPC-RSS in predicting radiotherapy sensitivity from an immunological perspective.

      (4) The study used public datasets and in-house cohort data for validation, confirming the good predictive performance of NPC-RSS and increasing the credibility of the results.

      Limitations:

      (1) The study focuses on a specific type of nasopharyngeal carcinoma (NPC) and may not be generalizable to other subtypes or related head and neck cancers. The applicability of NPC-RSS to a broader range of patients and tumor types remains to be determined.

      (2) The study does not account for potential differences in radiotherapy protocols, doses, and techniques between the training and validation cohorts, which could influence the performance of the predictive model. Standardization of treatment parameters would be important for future validation studies.

      (3) The binary classification of patients into radiotherapy-sensitive and resistant groups may oversimplify the complex spectrum of treatment responses. A more granular stratification system that captures intermediate responses could provide more nuanced predictions and better guide personalized treatment decisions.

      (4) The study does not address the potential impact of other relevant factors, such as tumor stage, histological subtype, and concurrent chemotherapy, on the predictive performance of NPC-RSS. Incorporating these clinical variables into the model could enhance its accuracy and clinical utility.

    1. eLife Assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    2. Reviewer #1 (Public review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The evidence provided for durotaxis of non-adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

    3. Reviewer #2 (Public review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and a stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that are not dependent on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient. 

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The authors responded to all my comments and I have nothing to add. The evidence provided for durotaxis of non adherent (or low-adhering) cells is strong. I am particularly impressed by the fact that amoeboid cells can durotax even when not confined. I wish to congratulate the authors for the excellent work, which will fuel discussion in the field of cell adhesion and migration.

      We thank the reviewer for critically evaluating our work and giving kind suggestions. We are glad that the reviewer found our work to be of potential interest to the broad scientific community.

      Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.

      Strengths:

      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.

      Weaknesses:

      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      We thank the reviewer for this suggestion. We have investigated the compensation of myosin in NMIIA and NMIIB KD HL-60 cells using Western blot and added this result in our updated manuscript (Fig. S4B, C). The results showed that the level of NMIIB protein in NMIIA KD cells doubled while there was no compensatory upregulation of NMIIA in NMIIB KD cells. This is consistent with our conclusion that NMIIA rather than NMIIB is responsible for amoeboid durotaxis since in NMIIA KD cells, compensatory upregulation of NMIIB did not rescue the durotaxis-deficient phenotype. 

      (2) The expansion microscopy assay is not clearly described and some details are missed such as how the assay is performed on cells under confinement.

      We thank the reviewer for this comment. We have updated the details of the expansion microscopy assay in our revised manuscript (lines 481-485), including how the assay is performed on cells under confinement:

      Briefly, CD4+ Naïve T cells were seeded on a gradient PA gel with another upper gel providing confinement. 4% PFA was used to fix cells for 15 min at room temperature. After fixation, the upper gradient PA gel was carefully removed and the bottom gradient PA gel with seeded cells was immersed in an anchoring solution containing 1% acrylamide and 0.7% formaldehyde (Sigma, F8775) for 5 h at 37 °C.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      We thank the reviewer for this suggestion. Active nematic models have been employed to recapitulate many phenomena during cell migration (Nat Commun., 2018, doi: 10.1038/s41467-018-05666-8). The active nematic model describes the motion of cells using the orientation field, Q, and the velocity field, u. The director field n (with n = −n) is employed to represent the nematic state, which has head-tail symmetry. However, in our experiments, actin filaments are clearly polarized: they polymerize and flow towards the direction of cell migration. Therefore, we chose an active gel model, which describes the polarized actin field during cell migration. In the Discussion, we have provided a comparison between the active gel model and the motor-clutch model. We have also added a short comparison between the present model and the active nematic model in the main text (lines 345-347):

      The active nematic model employs active extensile or contractile agents to push or pull the fluid along their elongation axis to simulate cells flowing (61). 
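      The symmetry argument in this response can be made concrete with the usual textbook definitions of nematic and polar order. The block below is a minimal sketch in standard notation: the scalar order parameter S, the spatial dimension d, and the polarization vector p are symbols assumed here for illustration and are not taken from the manuscript, and prefactor conventions for Q vary between references.

```latex
% Nematic order: Q is quadratic in the director n, so reversing n changes nothing
\mathbf{Q} = S\left(\mathbf{n}\otimes\mathbf{n} - \tfrac{1}{d}\,\mathbf{I}\right),
\qquad
\mathbf{Q}(-\mathbf{n}) = \mathbf{Q}(\mathbf{n})

% Polar order: p is linear in the orientation, so reversal gives a different state
\mathbf{p} \;\longrightarrow\; -\mathbf{p} \neq \mathbf{p}
\quad (\mathbf{p} \neq \mathbf{0})
```

      Because Q cannot distinguish the front of a filament from its back, a description built on Q alone carries no information about which way polarized actin points, which is the reason given above for preferring a polar (active gel) description.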

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

      We thank the reviewer for this question. In our model, the polarization field is employed to couple actin and myosin together. Actin clearly accumulates at the front while myosin diffuses in the opposite direction. Therefore, we propose that actin and myosin flow in opposite directions, which is captured in the convection terms of the actin and myosin density fields.

    1. eLife Assessment

      This study represents a potentially useful tool for extracting quantitative data from intravital microscopy directed at in vivo cancer models. In general, this is an area of interest as accessible non-proprietary tools are needed and some evidence of the tool's utility is provided. However, the work in its current form is incomplete as it is heavily reliant on proprietary software to segment, track, and correct the data. In addition, there are significant reservations regarding the methods used to produce statistics in the software, limiting its applicability and the potential advance over other approaches.

    2. Reviewer #1 (Public review):

      Summary:

      Intravital microscopy (IVM) is a powerful tool that facilitates live imaging of individual cells over time in vivo in their native 3D tissue environment. Extracting and analysing multi-parametric data from IVM images however is challenging, particularly for researchers with limited programming and image analysis skills. In this work, Rios-Jimenez and Zomer et al have developed a 'zero-code' accessible computational framework (BEHAV3D-Tumour Profiler) designed to facilitate unbiased analysis of IVM data to investigate tumour cell dynamics (via the tool's central 'heterogeneity module') and their interactions with the tumour microenvironment (via the 'large-scale phenotyping' and 'small-scale phenotyping' modules). It is designed as an open-source modular Jupyter Notebook with a user-friendly graphical user interface and can be implemented with Google Colab, facilitating efficient, cloud-based computational analysis at no cost.

      To demonstrate the utility of BEHAV3D-TP, they apply the pipeline to timelapse IVM imaging datasets to investigate the in vivo migratory behaviour of fluorescently labelled DMG cells in tumour-bearing mice. Using the tool's 'heterogeneity module' they were able to identify distinct single-cell behavioural patterns (based on multiple parameters such as directionality, speed, displacement, and distance from tumour edge), which were used to group cells into distinct categories (e.g. retreating, invasive, static, erratic). They next applied the framework's 'large-scale phenotyping' and 'small-scale phenotyping' modules to investigate whether the tumour microenvironment (TME) may influence the distinct migratory behaviours identified. To achieve this, they combine TME visualisation in vivo during IVM (using fluorescent probes to label distinct TME components) or ex vivo after IVM (by large-scale imaging of harvested, immunostained tumours) to correlate different tumour behavioural patterns with the composition of the TME. They conclude that this tool has helped reveal links between TME composition (e.g. degree of vascularisation, presence of tumour-associated macrophages) and the invasiveness and directionality of tumour cells, which would have been challenging to identify when analysing single kinetic parameters in isolation.

      A key limitation of the pipeline is that it does not overcome the main challenges and bottlenecks associated with processing and extracting quantitative cellular data from timelapse and longitudinal intravital images. This includes correcting breathing-induced movement artifacts, automated registration of longitudinal images taken over days/weeks, and accurate, automated segmentation and tracking of individual cells over time. Indeed, there are currently no standardised computational methods available for IVM data processing and analysis, with most laboratories relying on custom-built solutions or manual methods. This isn't made explicit in the manuscript early on (described below), and the researchers rely on expensive software packages such as IMARIS for image processing and data extraction to feed the required parameters into their pipeline. This limitation unfortunately reduces the likely impact of BEHAV3D-TP on the IVM field.

      Nonetheless, this computational framework appears to represent a useful and comparatively user-friendly tool to analyse dynamic multi-parametric data to help identify patterns in cell migratory behaviours, and to assess whether these behaviours might be influenced by neighbouring cells and structures in their microenvironment. When combined with other methods, it, therefore, has the potential to be a valuable addition to a researcher's IVM analysis 'tool-box'.

      Strengths:

      (1) The figures are clearly presented, and the manuscript is easy to follow.

      (2) The pipeline appears to be intuitive and user-friendly for researchers with limited computational expertise. A detailed step-by-step video is also included to support its uptake.

      (3) The different computational modules have been tested using a relevant dataset.

      (4) All code is open source, and the pipeline can be implemented with Google Colab.

      (5) The tool combines multiple dynamic parameters extracted from time-lapse IVM images to identify single-cell behavioural patterns and to cluster cells into distinct groups sharing similar behaviours, and provides avenues to map these onto in vivo or ex vivo imaging data of the tumour microenvironment.

      Weaknesses:

      (1) As highlighted above, the tool does not facilitate the extraction of quantitative kinetic cellular parameters (e.g. speed, directionality, persistence, and displacement) from intravital images. Indeed, to use the tool researchers must first extract dynamic cellular parameters from their IVM datasets, requiring access to expensive software (e.g. IMARIS as used here) and/or above-average computational expertise to develop and use custom-made open-source solutions. This limitation is not made explicit or discussed in the text.

      (2) The number of cells (e.g. per behavioural cluster), and the number of independent mice, represented in each result figure, is not included in the figure legends and are difficult to ascertain from the methods.

      (3) The data used to test the pipeline in this manuscript is currently not available, making it difficult to assess its usability. It would be important to include this for researchers to use as a 'training dataset'.

      (4) Precisely how the BEHAV3D-TP large-scale phenotyping module can map large-scale spatial phenotyping data generated using LSR-3D imaging data and Cytomap to 3D intravital imaging movies is unclear. Further details in the text and methods would be beneficial to aid understanding.

      (5) The analysis provides only preliminary evidence in support of the authors' conclusions on DMG cell migratory behaviours and their relationship with components of the tumour microenvironment. Conclusions should therefore be tempered in the absence of additional experiments and controls.

    3. Reviewer #2 (Public review):

      Summary:

      The authors produce a new tool, BEHAV3D, to analyse tracking data and to integrate these analyses with large and small-scale architectural features of the tissue. This is similar to several other published methods to analyse spatiotemporal data; however, the connection to tissue features is a nice addition, as is the lack of requirement for coding. The tool is then used to analyse tracking data of tumour cells in diffuse midline glioma. They suggest that 7 clusters exist within these tracks and that they differ spatially. They ultimately suggest that these behaviours occur in distinct spatial areas as determined by CytoMAP.

      Strengths:

      (1) The tool appears relatively user-friendly and is open source. The combination with CytoMAP represents a nice option for researchers.

      (2) The identification of associations between cell track phenotype and spatial features is exciting and the diffuse midline glioma data nicely demonstrates how this could be used.

      Weaknesses:

      (1) The strength of democratizing this kind of analysis is undercut by the reliance upon Imaris for segmentation, so it would be nice if this was changed to an open-source option for track generation.

      (2) The main issue is with the interpretation of the biological data in Figure 3 where ANOVA was used to analyse the proportional distribution of different clusters. Firstly the n is not listed so it is unclear if this represents an n of 3 where each mouse is an individual or whether each track is being treated as a test unit. If the latter this is seriously flawed as these tracks can't be treated as independent. Also, a more appropriate test would be something like a Chi-squared test or Fisher's exact test. Also, no error bars are included on the stacked bar graphs making interpretation impossible. Ultimately this is severely flawed and also appears to show very small differences which may be statistically different but may not represent biologically important findings. This would need further study.

      (3) Figure 4 has similar statistical issues in that the n is not listed and, again, it is unclear whether they are treating each cell track as independent which, again, would be inappropriate. The best practice for this type of data would be the use of super plots as outlined in Lord et al. (2020), J Cell Biol - SuperPlots: Communicating reproducibility and variability in cell biology.

      (4) The main issue that this raises is that the large-scale phenotyping module and the heterogeneity module appear designed to produce these statistical analyses that are used in these figures and, if they are based on the assumption that each track is independent, then this will produce inappropriate analyses as a default.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Rios-Jimenez developed a computational tool, BEHAV3D Tumor Profiler, to analyze intravital imaging data and extract distinctive tumor cell migratory phenotypes based on the quantified 3D image data.

      Weaknesses:

      (1) The most challenging task of analyzing 3D time-lapse imaging data is to accurately segment and track the individual cells in 3D over a long time duration. BEHAV3D Tumor Profiler did not provide any new advancement in this regard, and instead relies on commercial software, Imaris, for this critical step. Imaris is known to have a very high error rate when used for analyzing 3D time-lapse data. In the Methods section, the authors themselves stated that "Tumor cell tracks were manually corrected to ensure accurate tracking". Based on our own experience of using Imaris, such manual correction is tedious and often required for every time step of the movie. Therefore, Imaris is not a satisfactory tool for analyzing 3D time-lapse data. Moreover, Imaris is expensive and many research labs probably can't afford to buy it. The fact that BEHAV3D Tumor Profiler critically depends on the faulty ImarisTrack module makes it unclear whether the BEHAV3D tool or the results are reliable.

      (2) The authors developed a "Heterogeneity module" to extract distinctive tumor migratory phenotypes from the cell tracks quantified by Imaris. The cell tracks of the individual tumor cells are all quite short, indicating relatively low motility of the tumor cells. It's unclear whether such short migratory tracks are sufficient to warrant the PCA analysis to identify the 7 distinctive migratory phenotypes shown in Figure 2d. It's also unclear whether these 7 migratory phenotypes correspond to unique functional phenotypes.

      (3) Using only motility to classify tumor cell behaviours in the tumor microenvironment (TME) is probably not sufficient to capture the tumor cell difference. There are also other non-tumor cell types in the TME. If the authors aim to develop a computational tool that can elucidate tumor cell behaviors in the TME, they should consider other tumor cell features, e.g., morphology, proliferation state, and tumor cell interaction with other cell types, e.g., fibroblasts and distinct immune cells.

      (4) The authors have already published two papers on BEHAV3D [Alieva M et al. Nat Protoc. 2024 Jul;19(7): 2052-2084; Dekkers JF, et al. Nat Biotechnol. 2023 Jan;41(1):60-69]. Although the previous two papers used BEHAV3D to analyze T cells, the basic pipeline and computational steps are similar, in particular regarding cell segmentation and tracking. The addition of a "Heterogeneity module" based on PCA analysis does not make a significant advancement in terms of image analysis and quantification.

    5. Author response:

      We want to thank the reviewers for their positive and constructive comments on the manuscript. We already addressed some of their concerns and are planning the following revisions to both BEHAV3D-TP and the corresponding manuscript to address the reviewers’ comments. Below, we provide a response to the most significant comments, followed by a detailed, point-by-point response:

      (1) We acknowledge the reviewer's suggestion to incorporate open-source segmentation and tracking functionalities, increasing its accessibility to a wider user base; however, these additions fall outside the primary scope of our current work and represent a substantial undertaking in their own right. This topic has been comprehensively explored in other studies (e.g. https://doi.org/10.4049/jimmunol.2100811 ; https://doi.org/10.7554/eLife.60547 ; https://doi.org/10.1016/j.media.2022.102358 ; https://doi.org/10.1038/s41592-024-02295-6), which we will cite in our revised manuscript as indicated in our responses to the reviewers’ comments. Instead, the goal of our manuscript is to provide an analytical framework for processing data generated by existing segmentation and tracking pipelines. In our analyses, we used data processed with Imaris, a commercial software that, despite its limitations, is widely used by the intravital microscopy community due to its user-friendly platform for 3D image visualization and analysis. Nevertheless, to enhance compatibility with tracking data from various pipelines, we have modified our tool to accept data formats, such as those generated by open-source Fiji plugins like TrackMate (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). These updates are available in our GitHub repository, and we will describe this feature in the revised manuscript to emphasize compatibility with segmented and tracked data from diverse open-source platforms.

      (2) We appreciate the reviewer’s suggestion to incorporate additional features into our analytical pipeline. In response, we have already updated the GitHub repository to allow users to input and select which features (dynamic, morphological, or spatial) they wish to include in the analysis (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#feature-selection ). In the revised manuscript, we will highlight this new functionality and provide examples using alternative datasets to demonstrate the application of these features.

      (3) We appreciate the constructive feedback of reviewers #1 and #2 regarding the statistical analysis and interpretation of the data presented in Figures 3 and 4. We understand the importance of clarity and rigor in data analysis and presentation, and we are committed to addressing the concerns raised in the revised version of the manuscript.

      (4) We appreciate Reviewer #1's suggestion regarding the inclusion of demo data, as we believe it would greatly enhance the usability of our pipeline. We acknowledge that this was an oversight on our part. To address this, we have now added demo data to our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0/demo_datasets). In the upcoming revised manuscript, we will also reference this addition. Additionally, we will provide both original and processed IVM movie samples to support users in navigating the complete pipeline effectively.

      (5) Finally, we agree with the reviewers to make some small changes to the manuscript based on their feedback.

      Below we provide a point-by-point response to the reviewers’ comments, along with proposed revisions.

      Reviewer #1:

      Comment: A key limitation of the pipeline is that it does not overcome the main challenges and bottlenecks associated with processing and extracting quantitative cellular data from timelapse and longitudinal intravital images. This includes correcting breathing-induced movement artifacts, automated registration of longitudinal images taken over days/weeks, and accurate, automated segmentation and tracking of individual cells over time. Indeed, there are currently no standardised computational methods available for IVM data processing and analysis, with most laboratories relying on custom-built solutions or manual methods. This isn't made explicit in the manuscript early on (described below), and the researchers rely on expensive software packages such as IMARIS for image processing and data extraction to feed the required parameters into their pipeline. This limitation unfortunately reduces the likely impact of BEHAV3D-TP on the IVM field.

      As highlighted above, the tool does not facilitate the extraction of quantitative kinetic cellular parameters (e.g. speed, directionality, persistence, and displacement) from intravital images. Indeed, to use the tool researchers must first extract dynamic cellular parameters from their IVM datasets, requiring access to expensive software (e.g. IMARIS as used here) and/or above-average computational expertise to develop and use custom-made open-source solutions. This limitation is not made explicit or discussed in the text.

      As mentioned previously, we agree with the reviewer that image processing steps, such as segmentation, tracking, and motion correction, present significant challenges in intravital microscopy (IVM) data processing. While these aspects are being addressed by other researchers, our publication centers on the analysis of acquired data rather than on the image processing itself. Our motivation, as outlined in the manuscript, arises from our own experience: despite the substantial effort invested in image processing, researchers often rely on simplistic analytical approaches, such as averaging single parameters and comparing them across conditions. These approaches tend to overlook potential tumor heterogeneity.

      Our work aimed to develop an analytical tool that provides a comprehensive framework for extracting more insights from processed IVM data, with a focus on two key aspects: capturing the heterogeneity of tumor behavior and examining the spatial distribution of these behaviors within the tumor microenvironment. In the revised manuscript, we will clarify the scope of our study, emphasizing its limitations as an analytical tool rather than an image-processing solution. Additionally, we will provide references to relevant literature on available (open-source) software options for image processing (e.g. Pizzagalli et al., J Immunol (2022); Joseph et al., eLife (2020); Molina-Moreno et al., Medical Image Analysis (2022); Hidalgo-Cenalmor et al., Nat Methods (2024); Ershov et al., Nat Methods (2022)).

      Regarding the reviewer’s comment on our use of Imaris, we acknowledge that Imaris is a costly commercial software. However, based on our experience, it is widely used by the intravital microscopy community due to its user-friendly interface for 3D image visualization and analysis. Despite its limitations in accuracy and the fact that it is not open-source, we believe that including data processed with Imaris will be valuable to the IVM community.

      However, to improve compatibility with data from other segmentation and tracking pipelines, we have already updated our tool to support formats generated by open-source Fiji plugins like TrackMate. These updates are available in our GitHub repository, and we will describe this functionality in detail in the revised manuscript to ensure compatibility with segmented and tracked data from various open-source platforms.
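      To illustrate the kind of per-track kinetic parameters discussed in this exchange (speed, displacement, directionality), the following is a minimal sketch of how they can be derived from tracked positions exported by any tracking pipeline. The input layout (a list of `(t, x, y, z)` tuples), the function name `track_features`, and the example values are assumptions made for illustration; they are not the BEHAV3D-TP input format or its implementation.

```python
import math


def track_features(points):
    """Summarise one cell track given as (t, x, y, z) tuples sorted by time."""
    if len(points) < 2:
        raise ValueError("a track needs at least two time points")
    # Path length: sum of step distances between consecutive positions
    path_length = sum(
        math.dist(a[1:], b[1:]) for a, b in zip(points, points[1:])
    )
    # Net displacement: straight-line distance from first to last position
    displacement = math.dist(points[0][1:], points[-1][1:])
    duration = points[-1][0] - points[0][0]
    return {
        "displacement": displacement,
        "path_length": path_length,
        "mean_speed": path_length / duration if duration > 0 else 0.0,
        # Directionality (straightness) ratio: 1.0 for a perfectly straight track
        "directionality": displacement / path_length if path_length > 0 else 0.0,
    }


# Hypothetical track: time in minutes, positions in micrometres
example = [(0, 0.0, 0.0, 0.0), (1, 3.0, 0.0, 0.0), (2, 3.0, 4.0, 0.0)]
features = track_features(example)
```

      Features of this kind form one row per track, which is the sort of table a downstream analysis of behavioural heterogeneity would take as input regardless of which software produced the tracks.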

      Comment: The number of cells (e.g. per behavioural cluster), and the number of independent mice, represented in each result figure, is not included in the figure legends and are difficult to ascertain from the methods.

      We appreciate the reviewer's constructive feedback regarding the clarity of the number and type of replicates used in our analyses. In the revised manuscript, we will include detailed information in the figure legends regarding the number of cells (e.g., per behavioral cluster) and the number of independent mice represented in each result figure to ensure transparency.

      Comment: The data used to test the pipeline in this manuscript is currently not available, making it difficult to assess its usability. It would be important to include this for researchers to use as a 'training dataset'.

      As stated above we acknowledge that this was an oversight on our part and thank the reviewer for pointing this out. To address this, we have now added demo data to our GitHub repository (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0/demo_datasets). In the upcoming revised manuscript, we will also make sure to reference this addition. Additionally, we intend to provide both original and processed IVM movie samples to support users in navigating the complete pipeline effectively.

      Comment: Precisely how the BEHAV3D-TP large-scale phenotyping module can map large-scale spatial phenotyping data generated using LSR-3D imaging data and Cytomap to 3D intravital imaging movies is unclear. Further details in the text and methods would be beneficial to aid understanding.

      We appreciate the reviewer’s comment and will provide additional details in the text and methods of the revised manuscript to clarify how the BEHAV3D-TP module maps LSR-3D and Cytomap data to 3D intravital imaging movies.

      Comment: The analysis provides only preliminary evidence in support of the authors' conclusions on DMG cell migratory behaviours and their relationship with components of the tumour microenvironment. Conclusions should therefore be tempered in the absence of additional experiments and controls.

      We appreciate the reviewer’s comment and acknowledge that our conclusions should be tempered due to the preliminary nature of our evidence. To be able to directly analyze the impact of the brain tumor microenvironment on cancer cell behavior, we will include a new set of analyses in the revised manuscript. Specifically, we will utilize BEHAV3D-TP to analyze existing IVM data from adult gliomas with and without macrophage depletion (Alieva et al, Scientific Reports, 2017; https://doi.org/10.1038/s41598-017-07660-4 ) to evaluate the differences in heterogeneous cell populations under these conditions. Since this analysis pertains to a different tumor type, we will revise our conclusions accordingly and emphasize the necessity for additional experiments and controls to further validate our findings on DMG cell migratory behaviors and their relationship with the tumor microenvironment.

      Reviewer #2:

      Comment: The strength of democratizing this kind of analysis is undercut by the reliance upon Imaris for segmentation, so it would be nice if this was changed to an open-source option for track generation.

      As noted in our previous response to Reviewer #1, we would like to point out that although Imaris is a commercial software, it is widely used in the intravital microscopy (IVM) community due to its user-friendly interface. One of its key advantages, which we also utilized, is semi-automated data tracking that allows for manual corrections in 3D—a process that can be more challenging in other open-source software with less effective data visualization.

      However, we recognize that enhancing our pipeline's compatibility with open-source options is important. To this end, we have already updated our tool to support data formats generated by open-source Fiji plugins like TrackMate, improving compatibility with various segmentation and tracking pipelines (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ). We will describe these updates in the revised manuscript to clarify our study's scope and the available image processing options.

      Comment: The main issue is with the interpretation of the biological data in Figure 3 where ANOVA was used to analyse the proportional distribution of different clusters. Firstly the n is not listed so it is unclear if this represents an n of 3 where each mouse is an individual or whether each track is being treated as a test unit. If the latter this is seriously flawed as these tracks can't be treated as independent. Also, a more appropriate test would be something like a Chi-squared test or Fisher's exact test. Also, no error bars are included on the stacked bar graphs making interpretation impossible. Ultimately this is severely flawed and also appears to show very small differences which may be statistically different but may not represent biologically important findings. This would need further study.

      We appreciate the reviewer’s insightful comments regarding the interpretation of the biological data in Figure 3. To clarify, each mouse serves as an independent unit in this analysis. We believe that ANOVA is the appropriate test for comparing the proportions of different behavioral signatures across the tumor microenvironment (TME) regions identified by large-scale phenotyping. However, we acknowledge that using a stacked bar plot may have been misleading. While a Chi-squared test could show differences in the distribution of behavioral signatures, it would not indicate which specific signatures are responsible for those differences. Therefore, in the revised manuscript, we will retain the ANOVA analysis but will represent the proportions using a bar chart that clearly illustrates multiple conditions for each behavioral cluster. We also appreciate the reviewer’s concern regarding the transparency of our data. In the revised manuscript, we will include the number of replicates for all figures to enhance clarity and understanding.

      Comment:  Figure 4 has similar statistical issues in that the n is not listed and, again, it is unclear whether they are treating each cell track as independent which, again, would be inappropriate. The best practice for this type of data would be the use of super plots as outlined in Lord et al. (2020) JCI - SuperPlots: Communicating reproducibility and variability in cell biology.

      We appreciate the reviewer’s comments and suggestions regarding Figure 4. In the revised manuscript, we will clarify the number of replicates used and our approach to treating cell tracks as independent units. We will implement super-plots where appropriate, to enhance the communication of reproducibility and variability in our data.

      Comment: The main issue that this raises is that the large-scale phenotyping module and the heterogeneity module appear designed to produce these statistical analyses that are used in these figures and, if they are based on the assumption that each track is independent, then this will produce inappropriate analyses as a default.

      We appreciate the reviewer’s comment, though we find ourselves unsure about the specific concern being raised. To clarify, each mouse is treated as an independent unit in our analyses. For each large-scale phenotyping region, we measure the proportion of tumor cells displaying a specific behavioral phenotype independently for each mouse. These proportions are then used for statistical analysis. We hope this explanation provides clarity, and we will adjust the manuscript to better convey this methodology.
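
      The aggregation step described above, in which tracks are first collapsed to one proportion per mouse and region, might be sketched as follows. The tuple layout, the function name `per_mouse_proportions`, and the toy records are assumptions made for illustration; they do not reproduce the pipeline's actual data structures.

```python
from collections import Counter, defaultdict


def per_mouse_proportions(tracks):
    """Collapse track-level labels into per-mouse, per-region proportions.

    `tracks` is an iterable of (mouse_id, region, cluster) tuples, one per
    cell track. Returns {(mouse_id, region): {cluster: proportion}}.
    """
    counts = defaultdict(Counter)
    for mouse_id, region, cluster in tracks:
        counts[(mouse_id, region)][cluster] += 1

    proportions = {}
    for key, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        proportions[key] = {c: n / total for c, n in cluster_counts.items()}
    return proportions


# Hypothetical track records
tracks = [
    ("mouse1", "core", "static"),
    ("mouse1", "core", "invading"),
    ("mouse1", "core", "static"),
    ("mouse1", "core", "static"),
    ("mouse2", "core", "invading"),
    ("mouse2", "core", "static"),
]
props = per_mouse_proportions(tracks)
```

      Only the resulting proportions, one per mouse, would then enter the statistical test, so the number of tracks per mouse does not inflate the sample size.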

      Reviewer #3:

      Comment: The most challenging task of analyzing 3D time-lapse imaging data is to accurately segment and track the individual cells in 3D over a long time duration. BEHAV3D Tumor Profiler did not provide any new advancement in this regard, and instead relies on commercial software, Imaris, for this critical step. Imaris is known to have a very high error rate when used for analyzing 3D time-lapse data. In the Methods section, the authors themselves stated that "Tumor cell tracks were manually corrected to ensure accurate tracking". Based on our own experience of using Imaris, such manual correction is tedious and often required for every time step of the movie. Therefore, Imaris is not a satisfactory tool for analyzing 3D time-lapse data. Moreover, Imaris is expensive and many research labs probably can't afford to buy it. The fact that BEHAV3D Tumor Profiler critically depends on the faulty ImarisTrack module makes it unclear whether the BEHAV3D tool or the results are reliable.

      If the authors want to "democratize the analysis of heterogeneous cancer cell behaviors", they should perform image segmentation and tracking using open-source codes (e.g., Cellpose, StarDist & 3DCellTracker) and not rely on the expensive and inaccurate ImarisTrack Module for the image analysis step of BEHAV3D.

      We appreciate the reviewer’s comments on the challenges of segmenting and tracking individual cells in 3D time-lapse imaging data. As mentioned previously, our primary focus is to develop an analytical tool for comprehensive data analysis rather than developing tools for image processing. To enhance accessibility, we have updated our tool to support data formats from open-source Fiji plugins, such as TrackMate, which will benefit users without access to commercial software (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler?tab=readme-ov-file#data-input ).

      While we recognize the limitations of Imaris, it remains widely used in the intravital microscopy community due to its user-friendly interface for 3D visualization and semi-automated segmentation capabilities. Since no perfect tracking method currently exists, we utilized Imaris for its ability to allow manual corrections of faulty tracks, ensuring the reliability of our results. This approach was the best available option when we began our analysis, allowing us to obtain accurate results efficiently.

      In the revised manuscript, we will clarify our methodology and provide information on both Imaris and alternative processing options to strengthen the reliability of our findings.

      Comment: The authors developed a "Heterogeneity module" to extract distinctive tumor migratory phenotypes from the cell tracks quantified by Imaris. The cell tracks of the individual tumor cells are all quite short, indicating relatively low motility of the tumor cells. It's unclear whether such short migratory tracks are sufficient to warrant the PCA analysis to identify the 7 distinctive migratory phenotypes shown in Figure 2d. It's also unclear whether these 7 migratory phenotypes correspond to unique functional phenotypes.

      For the 7 distinctive motility clusters, the authors should provide a more detailed analysis of the differences between them. It's unclear whether the difference in retreating, slow retreating, erratic, static, slow, slow invading, and invading correspond to functional difference of the tumor cells.

      While some tumor cells exhibit limited motility, indicated by short tracks, others demonstrate significant migratory capabilities. This variability in tumor cell behavior is a central focus of our analysis, and our tool is specifically designed to identify and distinguish these differences. Our PCA analysis effectively captures this variability, as illustrated in Figure 2 d-f. It differentiates between cells exhibiting varying degrees of migratory behavior, including both highly migratory and less migratory phenotypes, as well as their directionality relative to the tumor core and the persistence of their movements. Thus, we believe that our approach provides valuable insights into the distinct migratory phenotypes within the tumor microenvironment. We will clarify these aspects further in the revised manuscript to enhance the reader's understanding of our findings.
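
      The kinds of dynamic features mentioned here (degree of migration, persistence, and directionality relative to the tumor core) can be made concrete with a small sketch. This is not the feature set computed by BEHAV3D Tumor Profiler; the function name `track_features`, the use of a single core reference point, and the sign convention (positive meaning movement away from the core) are assumptions made for illustration.

```python
import math


def track_features(positions, dt, core):
    """Compute simple motility features for one cell track.

    positions: list of (x, y, z) coordinates in time order (at least two)
    dt:        time between consecutive frames
    core:      (x, y, z) reference point standing in for the tumor core
    """
    if len(positions) < 2:
        raise ValueError("a track needs at least two positions")

    # Distance covered between consecutive time points
    steps = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    path_length = sum(steps)
    net_displacement = math.dist(positions[0], positions[-1])
    duration = dt * (len(positions) - 1)

    return {
        "mean_speed": path_length / duration,
        # 1.0 for a straight path, near 0 for a track that returns to its start
        "persistence": net_displacement / path_length if path_length > 0 else 0.0,
        # Positive: ended farther from the core; negative: ended closer to it
        "radial_change": math.dist(positions[-1], core) - math.dist(positions[0], core),
    }


# Hypothetical short track moving straight away from a core at the origin
features = track_features([(1, 0, 0), (2, 0, 0), (3, 0, 0)], dt=1.0, core=(0, 0, 0))
```

      Even a short track yields distinct values for these features, which is why track length alone does not determine whether a dimensionality reduction such as PCA can separate phenotypes.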

      While our current manuscript does not provide explicit evidence linking each motility cluster to functional differences among the tumor cells, it is important to note that the state of the field supports the idea that cell dynamics can predict cell states and phenotypes. Research conducted by ourselves (Dekkers, Alieva et al., Nat Biotech, 2023) and others, such as Crainiciuc et al. (Nature, 2022) and Freckmann et al. (Nat Comm, 2022), has shown that variations in cell motility patterns are indicative of underlying functional characteristics. For instance, cell morphodynamic features have been shown to reflect differences in cell types, T cell targeting states, tumor metastatic potential, and drug resistance states. In the revised manuscript, we will reference relevant studies to underscore the biological significance of these behaviors. By doing so, we hope to clarify the potential implications of our findings and strengthen the overall narrative of our research.

      Comment: Using only motility to classify tumor cell behaviours in the tumor microenvironment (TME) is probably not sufficient to capture the tumor cell difference. There are also other non-tumor cell types in the TME. If the authors aim to develop a computational tool that can elucidate tumor cell behaviors in the TME, they should consider other tumor cell features, e.g., morphology, proliferation state, and tumor cell interaction with other cell types, e.g., fibroblasts and distinct immune cells.

      The authors should expand the scale of tumor behavior features to classify the tumor phenotype clusters, e.g., to include tumor morphology, proliferation state, and tumor cell interaction with other TME cell types.

      We believe that using dynamic features alone is sufficient to capture differences in tumor behavior, as demonstrated by our results in Figure 2. However, we appreciate the reviewer’s suggestion to consider additional features, such as cell morphology and interactions with other cell types, to fine-tune our analyses. To this end, we have adapted our pipeline to be compatible with various features present in the data (https://github.com/imAIgene-Dream3D/BEHAV3D_Tumor_Profiler/tree/BEHAV3D_TP-v2.0?tab=readme-ov-file#feature-selection ). We will emphasize this in the revised manuscript. However, we would like to point out that not all features are informative and that a wide range of features can instead introduce biologically irrelevant noise, making interpretation more challenging. For instance, in 3D microscopy, the z-axis resolution is typically lower, which can lead to artifacts like elongation in that direction. Adding morphological features that capture this may skew the analysis. Therefore, we believe that incorporating additional features should be approached with caution. We will clarify these considerations in the revised manuscript to better guide users in utilizing our computational tool effectively. We will also reference the use of unbiased feature selection techniques, such as bootstrapping methods, to identify biologically relevant features based on the conditions provided (D.G. Aragones et al, Computers in Biology and Medicine (2024)).

      Comment: The authors have already published two papers on BEHAV3D [Alieva M et al. Nat Protoc. 2024 Jul;19(7): 2052-2084; Dekkers JF, et al. Nat Biotechnol. 2023 Jan;41(1):60-69]. Although the previous two papers used BEHAV3D to analyze T cells, the basic pipeline and computational steps are similar, in particular regarding cell segmentation and tracking. The addition of a "Heterogeneity module" based on PCA analysis does not make a significant advancement in terms of image analysis and quantification.

      We want to emphasize that we have no intention of duplicating our previous publications. In this manuscript, we have consistently cited our foundational papers, where BEHAV3D was first developed for T cell migratory analysis in in vitro settings. In the introduction, we clearly state that our earlier work inspired us to adopt a similar approach for analyzing cell behavior in intravital microscopy (IVM) data, addressing the specific needs and complexities of analyzing tumor cell behaviors in the tumor microenvironment.

      Importantly, our new work provides several key advancements: 1) a pipeline specifically adapted for intravital microscopy (IVM) data; 2) integration of spatial characteristics from both large-scale and small-scale phenotyping; and 3) a zero-code approach designed to empower researchers without coding skills to effectively utilize the tool. We believe that these enhancements represent meaningful progress in the analysis of cell behaviors within the tumor microenvironment which will be valuable for the IVM community. We will ensure that these points are clearly articulated in the revised manuscript.

    1. eLife Assessment

      The study identifies the adhesion G-protein-coupled receptor A3 (ADGRA3) as a potential target for activating adaptive thermogenesis in white and brown adipose tissue, providing valuable information for scientists in the field of adipose tissue biology and metabolism. Although the authors have addressed some concerns raised by reviewers, the interpretations remain somewhat limited, and the work is deemed incomplete. The evidence supporting ADGRA3's role in thermogenesis is insufficient, necessitating more rigorous experiments to validate the receptor's relevance in adipose tissue. Additionally, the lack of experiments using primary cultures, despite feedback from multiple reviewers, highlights significant shortcomings.

    2. Reviewer #1 (Public review):

      Summary:

      This article identifies ADGRA3 as a candidate GPCR for mediating beige fat development. The authors use human expression data from the Human Protein Atlas and GTEx databases and combine this with experiments performed in mice and a murine cell line. They refer to a GPCR bioactivity screening tool, PRESTO-Salsa, with which it was found that hesperetin activates ADGRA3. From their experiments, the authors conclude that hesperetin activates ADGRA3, inducing a Gs-PKA-CREB axis resulting in adipose thermogenesis.

      Strengths:

      The authors analyze human data from public databases and perform functional studies in mouse models. They identify a new GPCR with a role in thermogenic activation of adipocytes.

      Considerations:

      Selection of ADGRA3 as a candidate GPCR relevant for mediating beiging in humans:

      The authors identify GPCRs that are expressed more highly in murine iBAT compared to iWAT in response to cold and assess which of these GPCRs are expressed in human subcutaneous or visceral adipocytes. Although this strategy will identify GPCRs that are expressed at higher levels in brown fat compared to beige and thus possibly more active in thermogenic function, the relevance in choosing GPCRs that also are expressed in unstimulated human white adipocytes should be considered. Thermogenic activity is not normally present in human white adipocytes. It would have strengthened the GPCR selection if the authors instead had assessed the intersection with human brown adipocytes that were activated with norepinephrine.

      Strategy to investigate the role of ADGRA3 in WAT beiging:

      Having identified ADGRA3 as their candidate receptor, the authors investigated the receptor in mouse models, the murine inguinal adipocyte cell line 3T3 and in human subcutaneous adipose progenitors (HAdsc) differentiated in vitro. Calling the human cells "beige" is a stretch as these cells are derived from a white adipose depot. The authors do observe regulation in UCP1 and abundance of mitochondria following modification of ADGRA3 in the cells. However, in future studies, it should be considered if the receptor rather plays a role in differentiation per se, and perhaps not specifically in thermogenic differentiation/activity.

      According to the Human Protein Atlas and GTEx databases, ADGRA3 is not only expressed in adipocytes, but also in other tissues and cell types. The authors address this by measuring the expression in a panel of these tissues, demonstrating a knockdown not only in the adipose tissue, but also in the liver and, less pronounced, in the muscle (Figure S2). It should thus be emphasized that the decreased TG levels in serum and liver in the mice might in fact depend on Adgra3 overexpression in the liver. Even though this might not have been the purpose of the experiment, it is important to highlight this as it could serve as hypothesis building for future studies of the function of this receptor.

    3. Reviewer #2 (Public review):

      Based on bioinformatics and expression analysis using mouse and human samples, the authors claim that the adhesion G-protein coupled receptor ADGRA3 may be a valuable target for increasing thermogenic activity and metabolic health. Genetic approaches to deplete ADGRA3 expression in vitro resulted in reduced expression of thermogenic genes including Ucp1, reduced basal respiration and metabolic activity as reflected by reduced glucose uptake and triglyceride accumulation. In line, nanoparticle delivery of shAdgra3 constructs is associated with increased body weight, reduced thermogenic gene expression in white and brown adipose tissue (WAT, BAT), and impaired glucose and insulin tolerance. On the other hand, ADGRA3 overexpression is associated with an improved metabolic profile in vitro and in vivo, which can be explained by increasing the activity of the well-established Gs-PKA-CREB axis. Notably, a computational screen suggested that ADGRA3 is activated by hesperetin. This metabolite is a derivative of the major citrus flavonoid hesperidin and has been described to promote metabolic health. Using appropriate in vitro and in vivo studies, the authors show that hesperetin supplementation is associated with increased thermogenesis, UCP1 levels in WAT and BAT, and improved glucose tolerance, an effect that was attenuated in the absence of ADGRA3 expression.

      Comments on revised version:

      In my opinion, the critical points I raised were not adequately addressed, neither in the revision nor in the response to the reviewer. Therefore, my initial assessment has not changed: the main claims are only partially supported by the data presented.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Zhao et al. explored the function of adhesion G protein-coupled receptor A3 (ADGRA3) in thermogenic fat biology.

      Strengths:

      Through both in vivo and in vitro studies, the authors found that gain of function of ADGRA3 leads to browning of white fat and ameliorates insulin resistance.

      Comments on revised version:

      The revised manuscript by Zhao et al. has limited improvement. The authors refused to perform revised experiments using primary cultures even though two reviewers pointed out the same weakness (3T3-L1 adipocytes are unsuitable). Using infrared thermography to measure body temperature is also problematic.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This article identifies ADGRA3 as a candidate GPCR for mediating beige fat development. The authors use human expression data from the Human Protein Atlas and GTEx databases and combine this with experiments performed in mice and a murine cell line. They refer to a GPCR bioactivity screening tool, PRESTO-Salsa, with which it was found that hesperetin activates ADGRA3. From their experiments, the authors conclude that hesperetin activates ADGRA3, inducing a Gs-PKA-CREB axis resulting in adipose thermogenesis.

      Strengths:

      The authors analyze human data from public databases and perform functional studies in mouse models. They identify a new GPCR with a role in the thermogenic activation of adipocytes.

      Weaknesses:

      (1) Selection of ADGRA3 as a candidate GPCR relevant for mediating beiging in humans:

      The authors identify genes upregulated in iBAT compared to iWAT in response to cold, and among these differentially expressed genes, they identify highly expressed GPCRs in human white adipocytes (visceral or subcutaneous). Finally, among these genes, they select a GPCR not previously studied in the literature.

      If the authors are interested in beiging, why do they not focus on genes upregulated in iWAT (the depot where beiging is described to occur in mice), comparing thermoneutral to cold-induced genes? I would expect that genes induced in iWAT in response to cold would be extremely relevant targets for beiging. With their strategy, the authors exclude receptors that are induced in the tissue where beiging is actually described to occur.

      Furthermore, the authors are comparing genes upregulated in cold in BAT (but not WAT) to highly expressed genes in human white adipocytes during thermoneutrality. Overall, the authors fail to discuss the logic behind their strategy and the obvious limitations of it.

      Thank you for your valuable advice. In this study, we focus on genes that exhibited higher expression in BAT compared to iWAT under cold stimulation conditions, as these genes might play a role in adipose thermogenesis. Regarding the genes upregulated in iWAT following cold stimulation, we have identified other intriguing targets among them in a separate ongoing study, which falls outside the scope of this work. Moreover, rather than making a comparison, we intersected the 27 GPCR-coding genes that were highly expressed in BAT compared to iWAT with genes that were highly expressed in human adipocytes (Figure 1C).

      With your suggestions, we realized that the description of the screening strategy in the manuscript was not clear enough, so we made the following supplement:

      “…dataset obtained from the Gene Expression Omnibus (GEO) database. Additionally, we utilized the human subcutaneous adipocytes dataset (Figure 1C, red) and human visceral adipocytes dataset (Figure 1C, purple) from the human protein atlas database to obtain genes that are highly expressed in human white adipocytes. The GSE118849 dataset comprises samples of brown adipose tissue (BAT) and inguinal white adipose tissue (iWAT) obtained from mice subjected to a 72-hour cold exposure at a temperature of 4℃.

      A total of 1134 differentially expressed genes (DEGs) that exhibited up-regulation in BAT compared to iWAT under cold stimulation were identified in the analysis, which might play a role in adipose thermogenesis. These DEGs were further screened to identify highly…”
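
      The screening logic described in this response amounts to a sequence of set intersections, which can be sketched as below. The function name `shortlist_candidates` and the toy gene lists are hypothetical, and matching mouse to human genes by upper-casing the symbol is a crude simplification assumed here only for illustration; it is not the authors' ortholog mapping.

```python
def shortlist_candidates(up_in_bat_vs_iwat, gpcr_genes, human_adipocyte_high):
    """Intersect mouse DEGs with a GPCR list and human adipocyte-expressed genes.

    All arguments are iterables of gene symbols. Symbols are upper-cased so
    that mouse-style (Adgra3) and human-style (ADGRA3) names match; this is
    an assumption of the sketch, not a real ortholog lookup.
    """
    degs = {g.upper() for g in up_in_bat_vs_iwat}
    gpcrs = {g.upper() for g in gpcr_genes}
    human = {g.upper() for g in human_adipocyte_high}

    gpcr_degs = degs & gpcrs              # GPCR-coding genes among the DEGs
    return sorted(gpcr_degs & human)      # also highly expressed in human adipocytes


# Toy inputs, not the actual gene lists from the study
candidates = shortlist_candidates(
    up_in_bat_vs_iwat={"Adgra3", "Ucp1", "Adrb3"},
    gpcr_genes={"ADGRA3", "ADRB3", "ADRB2"},
    human_adipocyte_high={"ADGRA3", "ADRB2"},
)
```

      Written this way, it is apparent that a receptor induced by cold only in iWAT never enters `up_in_bat_vs_iwat` and so cannot appear in the shortlist, which is the limitation the reviewer raises.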

      (2) Relevance of ADGRA3 and comparison to established literature:

      There has been a lot of literature and discussion about which receptor should be targeted in humans to recruit thermogenic fat. The current article unfortunately does not discuss this literature nor explain how it relates to their findings. For example, O'Mara et al (PMID: 31961826) demonstrated that chronic stimulation with the B3 adrenergic agonist, Mirabegron, resulted in the recruitment of thermogenic fat and improvement in insulin sensitivity and cholesterol. Later, Blondin et al (PMID: 32755608), highlighted the B2 adrenergic receptor as the main activation path of thermogenic fat in humans. There is also a recent report on an agonist activating B2 and B3 simultaneously (PMID: 38796310). Thus, to bring the literature forward, it would be beneficial if the current manuscript compared their identified activation path with the activation of these already established receptors and discussed their findings in relation to previous studies.

      Thank you for your suggestion. We have included a supplementary discussion on the relevant human adipose thermogenic receptors in the discussion section, as presented below:

      “The induction of beige fat has been investigated as a potentially effective therapeutic approach in combating obesity [23]. A clinical trial revealed that treatment with the chronic β3-AR agonist mirabegron leads to an increase in human brown fat, HDL cholesterol, and insulin sensitivity [24]. Subsequently, Blondin et al discovered that oral administration of mirabegron only elicits an increase in BAT thermogenesis when administered at the maximal allowable dose, indicating that human brown adipocyte thermogenesis is primarily driven by β2-adrenoceptor (β2-AR) stimulation [11]. Consistent with this finding, we found much higher levels of ADRB2 expression in human white adipose tissue than ADRB3 (Figure S1E). Furthermore, a recent study has demonstrated that simultaneous activation of β2-AR and β3-AR enhances whole-body metabolism through beneficial effects on skeletal muscle and BAT [25].”

      In Figures 1d and e, the authors show the expression of ADGRA3 in comparison to the expression of ADRB3. In human brown adipocytes, ADRB2 has been shown to be the main receptor through which adrenergic activation occurs (PMID: 32755608), thus authors should show the relative expression of this gene as well.

      We wholeheartedly endorse the proposal to augment the ADRB2 expression data in Figures 1D and E. However, it is regrettable to note that the pertinent databases (PRJNA66167 and PRJEB4337) are deficient in ADRB2 expression information. Fortunately, the GTEx database houses the ADRB2 expression data. Consequently, we have integrated these crucial data into Figure S1E.

      (3) Strategy to investigate the role of ADGRA3 in WAT beiging:

      Having identified ADGRA3 as their candidate receptor, the authors proceed with investigations of this receptor in mouse models and the murine inguinal adipocyte cell line 3T3.

      First of all, in Figure 1D, the authors show a substantially lower expression of ADGRA3 compared to ADRB3. It could thus be argued that a mouse would not be the best model system for studying this receptor. It would be interesting to see data from experiments in human adipocytes.

      Thanks for your helpful advice. We induced human adipose-derived mesenchymal stem cells (hADSCs) into adipocytes to evaluate the effect of ADGRA3 on human adipocytes (Figure 8).

      Moreover, if the authors are interested in inducing beiging, why do they show expression in iBAT and not iWAT?

      Maybe the description of this article wasn't clear enough, but we did show the expression and effects of ADGRA3 in iWAT and BAT (Author response image 1, Figure 3F-J and Figure 4F-J).

      Author response image 1.

      The authors perform in vivo experiments using intraperitoneal injections of shRNA or CMV-driven overexpression vectors and report effects on body temperature and glucose metabolism. It is important to note here that ADGRA3 is not uniquely expressed in adipocytes. A major advantage of databases like the Human Protein Atlas and GTEx is that they give an overview of gene expression across tissues and cell types. According to these databases, ADGRA3 is expressed in subcutaneous and visceral adipocytes. However, other cell types and tissues show even higher expression. In the Human Protein Atlas, the enhanced cell types are astrocytes and hepatocytes. In the GTEx database, the tissues with the highest expression are brain, liver, and thyroid.

      With this information in mind, IP injections for modification of ADGRA3 receptor expression could be expected to affect any of these tissues and cells.

      The manuscript reports changes in body temperature. However, temperature is regulated by the brain and also affected by thyroid activity. Did the authors measure the levels of circulating thyroid hormones? Gene expression changes in the brain? The authors report that Adgra3 overexpression decreased the TG level in serum and liver. The liver could be the primary targeted organ here, and the adipose effects might be secondary. The data would be easier to interpret if the authors reported the effects on the liver, thyroid, and brain, and the gene expression across tissues should be discussed in the article.

      Thank you for your valuable advice. We supplemented the results of the effect of local BAT injection of Adgra3 OE on thermogenic genes (Figures S5G-H), the levels of circulating thyroid hormones (Figures S2H, S4F and S5B) and the effects of Adgra3 overexpression/knockdown on Adgra3 expression levels (Figures S2A-B and S4B-C) in multiple tissues as well as discussed in the article, as follows:

      “Given that the non-targeted nanoparticle approach utilized in this study for modulating Adgra3 expression levels in vivo alters Adgra3 expression in tissues beyond adipose tissue (Figures S2A-B and S4B-C), notably the liver and skeletal muscle, the construction of Adgra3 adipose tissue-specific knockout/overexpression mouse models is imperative for a more nuanced understanding of the precise mechanisms underlying the influence of ADGRA3 on adipose thermogenesis. We will employ more sophisticated models in subsequent studies to further elucidate the effects of ADGRA3 on adipose thermogenesis and metabolic homeostasis. Nevertheless, our findings underlie a potential therapeutic feature of…”

      Finally, the identification of Hesperetin using the PRESTO-Salsa tool, and how specific the effect of Hesperetin is on ADGRA3, is currently unclear. This should be better discussed, and authors should consider measuring the established effects of Hesperetin in their model systems, including apoptosis.

      Thanks for your suggestion. We have further discussed the relevant content and added it in the discussion section as follows:

      “Previously, the influence of hesperetin on ADGRA3 has remained unreported. In this study, we identified hesperetin as a potential agonist for ADGRA3 by using the PRESTO-Salsa tool and confirmed through a series of experiments that hesperetin has an agonist effect on ADGRA3. This study focuses on the regulatory effect of hesperetin on adipose thermogenesis and explores whether this effect is dependent upon ADGRA3. As such, we refrained from conducting further investigations into other potential effects of hesperidin, including its potential antioxidant activity and its role in apoptosis.”

      Reviewer #2 (Public Review):

      Based on bioinformatics and expression analysis using mouse and human samples, the authors claim that the adhesion G-protein coupled receptor ADGRA3 may be a valuable target for increasing thermogenic activity and metabolic health. Genetic approaches to deplete ADGRA3 expression in vitro resulted in reduced expression of thermogenic genes including Ucp1, reduced basal respiration, and metabolic activity as reflected by reduced glucose uptake and triglyceride accumulation. In line, nanoparticle delivery of shAdgra3 constructs is associated with increased body weight, reduced thermogenic gene expression in white and brown adipose tissue (WAT, BAT), and impaired glucose and insulin tolerance. On the other hand, ADGRA3 overexpression is associated with an improved metabolic profile in vitro and in vivo, which can be explained by increasing the activity of the well-established Gs-PKA-CREB axis. Notably, a computational screen suggested that ADGRA3 is activated by hesperetin. This metabolite is a derivative of the major citrus flavonoid hesperidin and has been described to promote metabolic health. Using appropriate in vitro and in vivo studies, the authors show that hesperetin supplementation is associated with increased thermogenesis, UCP1 levels in WAT and BAT, and improved glucose tolerance, an effect that was attenuated in the absence of ADGRA3 expression.

      Overall, the data suggest that ADGRA3 is a constitutively active Gs-coupled receptor that improves metabolism by activating adaptive thermogenesis in WAT and BAT. The conclusions of the paper are partly supported by the data, but some experimental approaches need further clarification.

      (1) The in vivo approaches to modulate Adgra3 expression in mice are carried out using non-targeted nanoparticle-based approaches. The authors do not provide details of the composition of the nanomaterials, but it is highly likely that other metabolically active organs such as the liver are targeted. This is critical because Adgra3 is expressed in many organs, including the liver, adrenal glands, and gastrointestinal system. Therefore, many of the observed metabolic effects could be indirect, for example by modulating bile acids or corticosterone levels. Consistent with this, after digestion in the gastrointestinal tract, hesperetin is rapidly metabolized in intestinal and liver cells. Thus, hesperetin levels in the systemic circulation are likely to be insufficient to activate Adgra3 in thermogenic adipocytes/precursors. Overall, the authors need to repeat the key metabolic experiments in adipose-specific Adgra3 knockout/overexpression models to validate the reliability of the in vivo results. In addition, to validate the relevance of hesperetin supplementation for adaptive thermogenesis in BAT and WAT in vivo, the levels of hesperetin present in the systemic circulation should be quantified.

      Thank you for your valuable advice. Unfortunately, we could not perform quantitative determination of hesperetin concentration in the systemic circulation because we had used the serum of hesperetin-treated mice for the quantitative determination of serum insulin, fT4 and TG. According to your other suggestions, we supplemented the results of the effect of local BAT injection of Adgra3 OE on thermogenic genes (Figures S5G-H), the levels of circulating thyroid hormones (Figures S2H, S4F and S5B) and the effects of Adgra3 overexpression/knockdown on Adgra3 expression levels (Figures S2A-B and S4B-C) in multiple tissues as well as discussed in the article, as follows:

      “Given that the non-targeted nanoparticle approach utilized in this study for modulating Adgra3 expression levels in vivo alters Adgra3 expression in tissues beyond adipose tissue (Figures S2A-B and S4B-C), notably the liver and skeletal muscle, the construction of Adgra3 adipose tissue-specific knockout/overexpression mouse models is imperative for a more nuanced understanding of the precise mechanisms underlying the influence of ADGRA3 on adipose thermogenesis. We will employ more sophisticated models in subsequent studies to further elucidate the effects of ADGRA3 on adipose thermogenesis and metabolic homeostasis. Nevertheless, our findings underlie a potential therapeutic feature of…”

      (2) Standard measurements for energy balance are not presented. Quantitative data on energy expenditure, e.g. by indirect calorimetry, and food intake are missing and need to be included to validate the authors' claims.

      We are in full agreement with your proposal. Regrettably, owing to the constraints of our experimental facilities, we are presently unable to obtain quantitative data on the energy expenditure of the animals. However, we believe that the present results partially support the idea that ADGRA3 promotes energy metabolism. The effects of ADGRA3 on food intake are shown in Figure S2C and Figure S5A.

      (3) The thermographic images used to determine the BAT temperature are not very convincing. The distance and angle between the thermal camera and the BAT have a significant effect on the determination of the temperature, which is not taken into account, at least in the images presented.

      Thank you very much for pointing out the gap in our method description. Following published methods (Xia, Bo et al. PLoS biology. 2020. doi:10.1371/journal.pbio.3000688) and (Warner, Amy et al. PNAS. 2013. doi:10.1073/pnas.1310300110), all representative infrared images within a batch of mice were captured using a thermal imaging camera (FLIR ONE PRO) at the same distance, perpendicular to the plane on which the mice were located. We have added this description to the Materials and Methods section, as shown below:

      “2.20. Infrared Thermography.

      BAT temperature was measured at room temperature by infrared thermography according to previous publications [22, 23]. The same batch of representative infrared images of mice were all captured using a thermal imaging camera (FLIR ONE PRO), measured at the same distance perpendicular to the plane on which the mice were located. To quantify interscapular region temperature, the average surface temperature from a region of the interscapular BAT was taken with FLIR Tools software.”

      (4) The 3T3-L1 cell line is not an adequate cell culture model to study thermogenic adipocyte differentiation. To validate their results, the key experiments showing that ADGRA3 expression modulates thermogenic marker expression in a hesperetin-dependent manner need to be performed in a reliable model, e.g. primary murine adipocytes.

      Induction of the 3T3-L1 cell line into white adipocytes is indeed not suitable for studying thermogenic adipocyte differentiation. However, with reference to previous studies (Wei, Gang et al. Cell metabolism. 2021. doi:10.1016/j.cmet.2021.08.012) and (Bae IS, Kim SH. Int J Mol Sci. 2019. doi:10.3390/ijms20246128), the 3T3-L1 cell line was differentiated into beige-like adipocytes in this study, and many studies consider this method suitable for studying the thermogenic effect of adipocytes in vitro. In addition, we provided a more detailed description of the induction of beige-like adipocytes from 3T3-L1 cells in the Materials and Methods section and induced human adipose-derived stem cells (hADSCs) into adipocytes to evaluate the effect of ADGRA3 on human adipocytes (Figure 8).

      “…supplemented with 10% FBS. Confluent 3T3-L1 pre-adipocytes were induced into mature beige-like adipocytes with 0.5 mM isobutyl methylxanthine (IBMX), 1 μM dexamethasone, 5 μg/ml insulin, 1 nM 3,3',5-triiodo-L-thyronine (T3), 125 μM indomethacin and 1 μM rosiglitazone in high-glucose DMEM containing 10% FBS for 2 days, then treated with high-glucose DMEM containing 5 μg/ml insulin, 1 nM T3, 1 μM rosiglitazone and 10% FBS for 6 days and cultured with high-glucose DMEM containing 10% FBS for 2 days. hADSCs were seeded on plates coated with 0.1% gelatin and cultured and grown to confluence in human mesenchymal stem cells (hMSCs) specialized culture medium (ZQ-1320). Confluent hADSCs were induced into mature human adipocytes with adipogenic induction medium (PCM-I-004) according to the manufacturer’s instructions.”

      (5) The experimental setup only allows the measurement of basal cellular respiration. More advanced approaches are needed to define the contribution of ADGRA3 versus classical adrenergic receptors to UCP1-dependent thermogenesis.

      Thanks for your suggestion. The maximum oxygen consumption rate of the cells was also measured (Figures 2G and 2N) by adding FCCP, an uncoupler of oxidative phosphorylation (OXPHOS) in mitochondria.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Zhao et al. explored the function of adhesion G protein-coupled receptor A3 (ADGRA3) in thermogenic fat biology.

      Strengths:

      Through both in vivo and in vitro studies, the authors found that gain of function of ADGRA3 leads to browning of white fat and ameliorates insulin resistance.

      Weaknesses:

      There are several methodological weaknesses, such as the use of 3T3-L1 adipocytes and intraperitoneal (i.p.) injection of virus. Moreover, as the authors stated that ADGRA3 is constitutively active, how could the authors then identify a chemical ligand?

      (1) Primary cultured cells should be used to perform gain- and loss-of-function analysis of ADGRA3, instead of using 3T3-L1. It is impossible to detect Ucp1 expression in 3T3-L1 cells.

      It is indeed difficult to detect UCP1 expression when the 3T3-L1 cell line is induced into white adipocytes. However, with reference to previous studies (Wei, Gang et al. Cell metabolism. 2021. doi:10.1016/j.cmet.2021.08.012) and (Bae IS, Kim SH. Int J Mol Sci. 2019. doi:10.3390/ijms20246128), the 3T3-L1 cell line was differentiated into beige-like adipocytes in this study, and many studies consider this method suitable for studying the thermogenic effect of adipocytes in vitro. In addition, we provided a more detailed description of the induction of beige-like adipocytes from 3T3-L1 cells in the Materials and Methods section and induced human adipose-derived stem cells (hADSCs) into adipocytes to evaluate the effect of ADGRA3 on human adipocytes (Figure 8).

      “…supplemented with 10% FBS. Confluent 3T3-L1 pre-adipocytes were induced into mature beige-like adipocytes with 0.5 mM isobutyl methylxanthine (IBMX), 1 μM dexamethasone, 5 μg/ml insulin, 1 nM 3,3',5-triiodo-L-thyronine (T3), 125 μM indomethacin and 1 μM rosiglitazone in high-glucose DMEM containing 10% FBS for 2 days, then treated with high-glucose DMEM containing 5 μg/ml insulin, 1 nM T3, 1 μM rosiglitazone and 10% FBS for 6 days and cultured with high-glucose DMEM containing 10% FBS for 2 days. hADSCs were seeded on plates coated with 0.1% gelatin and cultured and grown to confluence in human mesenchymal stem cells (hMSCs) specialized culture medium (ZQ-1320). Confluent hADSCs were induced into mature human adipocytes with adipogenic induction medium (PCM-I-004) according to the manufacturer’s instructions.”

      (2) For virus treatment, the authors should consider performing local tissue injection, rather than IP injection. If it is IP injection, have the authors checked other tissues to validate whether the phenotype is fat-specific?

      Thank you for your valuable advice. We supplemented the results of the effect of local BAT injection of Adgra3 OE on thermogenic genes (Figures S5G-H) and the effects of Adgra3 overexpression/knockdown on Adgra3 expression levels (Figures S2A-B and S4B-C) in other tissues.

      (3) The authors should clarify how constitutively active GPCR needs further ligands.

      Thank you for your suggestion. In fact, we only identified hesperetin as a potential agonist of ADGRA3 rather than a ligand. The results also indicate that overexpression of ADGRA3 without additional hesperetin is sufficient to activate downstream PKA signaling pathways through constitutive activity (Figure 5). Recently, Chen et al identified oleic ethanolamine (OEA) as a potential endogenous agonist of GPR3, which is also a constitutively active GPCR. Overall, the high constitutive activity of constitutively active GPCRs arises from the combined effects of stimulation by endogenous agonists and their basal coupling with Gs.

      As for why we screened and identified potential agonists of ADGRA3, we hope to find more convenient pathways for its clinical application than gene overexpression, as described in the article:

      “Considering the difficulty of overexpressing ADGRA3 in clinical application, hesperetin was screened as a potential agonist of ADGRA3 by PRESTO-Salsa database (Figure 6A). The…”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments

      The title appears to be overstated as no clinical trials were performed and experiments were not even performed in human brown adipocytes.

      Thank you for your critical suggestion. We have added the experimental results from human adipocytes (Figure 8) and revised the title to “Constitutively active receptor ADGRA3 signaling induces adipose thermogenesis”.

      Please specify n-number and what are replicates or independent experiments. Please also state if any outliers were excluded and why.

      Thanks for your valuable suggestion. We have added the n-number to the Figure legends, and the number of independent experiments and the exclusion criteria for outliers to the Materials and Methods section, as follows:

      “…of tissue samples. Cohorts of ≥4 mice per genotype or treatment were assembled for all in vivo studies. All in vivo studies were repeated 2-3 independent times. All procedures related to…”

      “…μM H-89) was added to 3T3-L1 mature beige-like adipocytes for 48 hours. All in vitro studies were repeated 2-3 independent times.”

      “All data are presented as mean ± SEM. In this study, outliers that met the three-sigma rule were excluded from analysis, with the exception of those presented in Figure S1E. Given the possibility that the outliers in Figure S1E represent extreme expressions of the inherent variability within the population sample, we have chosen to retain these specific outliers for further analysis. Student’s t-test was used to compare two groups. One-way analysis of…”
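      For readers unfamiliar with the rule, the exclusion criterion and summary statistic quoted above can be sketched as follows. This is a minimal illustration only: the function names and the example readings are hypothetical, the sample standard deviation is assumed, and none of it is taken from the study's analysis code.

```python
import math
import statistics


def three_sigma_filter(values, k=3.0):
    """Split values into (kept, excluded) using a mean +/- k*SD cutoff."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator), an assumption
    kept = [v for v in values if abs(v - mean) <= k * sd]
    excluded = [v for v in values if abs(v - mean) > k * sd]
    return kept, excluded


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of values."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical readings with one extreme value (not data from the study).
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 10.0, 25.0]
kept, excluded = three_sigma_filter(readings)
mean, sem = mean_sem(kept)
print(f"excluded={excluded}, mean={mean:.2f}, SEM={sem:.3f}")
```

      One property of this sketch is worth noting: when the cutoff is computed from the sample standard deviation of the same group, the largest possible deviation is (n − 1)/√n standard deviations, so no value can exceed three sigma unless the group contains at least 11 observations.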

      Authors use Infrared Thermography to measure body temperature. Depending on the distance between the mouse and the camera, the mouse needs to be at the same spot.

      Thank you very much for pointing out the gap in our method description. Following published methods (Xia, Bo et al. PLoS biology. 2020. doi:10.1371/journal.pbio.3000688) and (Warner, Amy et al. PNAS. 2013. doi:10.1073/pnas.1310300110), all representative infrared images within a batch of mice were captured using a thermal imaging camera (FLIR ONE PRO) at the same distance, perpendicular to the plane on which the mice were located. We have added this description to the Materials and Methods section, as shown below:

      “2.20. Infrared Thermography.

      BAT temperature was measured at room temperature by infrared thermography according to previous publications [22, 23]. The same batch of representative infrared images of mice were all captured using a thermal imaging camera (FLIR ONE PRO), measured at the same distance perpendicular to the plane on which the mice were located. To quantify interscapular region temperature, the average surface temperature from a region of the interscapular BAT was taken with FLIR Tools software.”

      Please discuss the limitations of the experiments and discuss the relevant literature.

      Thanks for your recommendations. We discussed the limitations of the experiments and the relevant literature in the discussion section, as follows:

      “The induction of beige fat has been investigated as a potentially effective therapeutic approach in combating obesity [23]. A clinical trial revealed that treatment with the chronic β3-AR agonist mirabegron leads to an increase in human brown fat, HDL cholesterol, and insulin sensitivity [24]. Subsequently, Blondin et al discovered that oral administration of mirabegron only elicits an increase in BAT thermogenesis when administered at the maximal allowable dose, indicating that human brown adipocyte thermogenesis is primarily driven by β2-adrenoceptor (β2-AR) stimulation [11]. Consistent with this finding, we found much higher levels of ADRB2 expression in human white adipose tissue than ADRB3 (Figure S1E). Furthermore, a recent study has demonstrated that simultaneous activation of β2-AR and β3-AR enhances whole-body metabolism through beneficial effects on skeletal muscle and BAT [25].”

      “Given the consideration that the non-targeted nanoparticle approach utilized in this study for modulating Adgra3 expression levels in vivo alters Adgra3 expression in tissues beyond adipose tissue (Figures S2A-B and S4B-C), notably the liver and skeletal muscle, the construction of Adgra3 adipose tissue-specific knockout/overexpression mouse models is imperative for a more nuanced understanding of the precise mechanisms underlying the influence of ADGRA3 on adipose thermogenesis. We will employ more sophisticated models in subsequent studies to further elucidate the effects of ADGRA3 on adipose thermogenesis and metabolic homeostasis. Nevertheless, our findings underlie a potential therapeutic feature of…”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      Summary

      The mammalian Shieldin complex consisting of REV7 (aka MAD2L2, MAD2B) and SHLD1-3 affects pathway usage in DSB repair favoring non-homologous end-joining (NHEJ) at the expense of homologous recombination (HR) by blocking resection and/or priming fill-in DNA synthesis to maintain or generate near blunt ends suitable for NHEJ. While the budding yeast Saccharomyces cerevisiae does not have homologs to SHLD1-3, it does have Rev7, which was identified to function in conjunction with Rev3 in the translesion DNA polymerase zeta. Testing the hypothesis that Rev7 also affects DSB resection in budding yeast, the work identified a direct interaction between Rev7 and the Rad50-Mre11-Xrs2 complex by two-hybrid and direct protein interaction experiments. Deletion analysis identified that the 42 amino acid C-terminal region was necessary and sufficient for the 2-hybrid interaction. Direct biochemical analysis of the 42 aa peptide was not possible. Rev7 deficient cells were found to be sensitive to HU only in synergy with G2 tetraplex forming DNA. Importantly, the 42 aa peptide alone suppressed this phenotype. Biochemical analysis with full-length Rev7 and a C-terminal truncation lacking the 42 aa region shows G4-specific DNA binding that is abolished in the C-terminal truncation and with a substrate containing mutations to prevent G4 formation. Rev7 lacks nuclease activity but inhibits the dsDNA exonuclease activity of Mre11. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition suggesting the involvement of additional binding sites besides the 42 aa region. Also, the Mre11 ssDNA endonuclease activity is inhibited by Rev7 but not the degradation of linear ssDNA. Rev7 does not affect ATP binding by Rad50 but inhibits in a concentration-dependent manner the Rad50 ATPase activity. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition but significantly less than the full-length protein.

      Using an established plasmid-based NHEJ assay, the authors provide strong evidence that Rev7 affects NHEJ, showing a four-fold reduction in this assay. The mutations in the other Pol zeta subunits, Rev3 and Rev1, show a significantly smaller effect (~25% reduction). A strain expressing only the Rev7 C-terminal 42 aa peptide showed no NHEJ defect, while the truncation protein lacking this region exhibited a smaller defect than the deletion of REV7. The conclusion that Rev7 supports NHEJ mainly through the 42 aa region was validated using a chromosomal NHEJ assay. The effect on HR was assessed using a plasmid:chromosome system containing G4 forming DNA. The rev7 deletion strain showed an increase in HR in this system in the presence and absence of HU. Cells expressing the 42 aa peptide were indistinguishable from the wild type as were cells expressing the Rev7 truncation lacking the 42 aa region. The authors conclude that Rev7 suppresses HR, but the context appears to be system-specific and the conclusion that Rev7 abolished HR repair of DSBs is unwarranted and overly broad.

      Strength

      This is a well-written manuscript with many well-executed experiments that suggest that Rev7 inhibits MRX-mediated resection to favor NHEJ during DSB repair. This finding is novel and provides insight into the potential mechanism of how the human Shieldin complex might antagonize resection.

      We thank Reviewer 1 for their comprehensive summary of our work. The Reviewers' recognition that our manuscript is “well-written” with “many well-executed experiments” and our findings are “novel” is greatly appreciated.

      Weaknesses

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation. Additional controls for the ATPase and nuclease experiments to eliminate non-specific effects would be helpful. Evidence for an effect on resection in cells is lacking. The major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified, as only a highly specialized assay is used that does not warrant the broad conclusion drawn. Specifically, the results that the Rev7 C terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained. The effect of Rev7 on G4 metabolism is underdeveloped and distracts from the main results that Rev7 modulated MRX activity. The authors should consider removing this part and develop a more complete story on this later.

      We have addressed each point identified as “Weaknesses” by the reviewer, as described below:

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation.

      We acknowledge the Reviewer’s concern and apologize for not having been clear in our first submission. Several studies have demonstrated that Mre11 exhibits all three DNase activities, namely single-stranded endonuclease, double-stranded exonuclease and DNA hairpin opening, only in the presence of Mn²⁺ and not with other divalent cations such as magnesium or calcium (Paull and Gellert, Mol. Cell 1998; 2000; Usui et al., Cell 1998; Ghosal and Muniyappa, JMB, 2007; Arora et al., Mol Cell Biol. 2017). For this reason, Mn²⁺ was used as a cofactor for the Mre11 nuclease assays. We have clarified this in the revised manuscript. As a side note, Mg²⁺ serves as a cofactor for Rad50’s ATPase activity.

      Additional controls for the ATPase and nuclease experiments to eliminate non-specific effects would be helpful.

      We thank the Reviewer for raising this important point, as it led us to evaluate and confirm the specificity of Rev7 and exclude its potential non-specific effects. To this end, we have performed additional experiments, which showed that (a) the S. cerevisiae Dmc1 ATPase activity was not affected by Rev7, contrary to its inhibitory effect on Rad50 and (b) Rev7 had no discernible impact on the endonucleolytic activity of S. cerevisiae Sae2, whereas it inhibits the DNase activities of Mre11. Thus, the lack of inhibitory effects on the ATPase activity of Dmc1 and the nuclease activity of Sae2 confirms the specificity of Rev7 for the Mre11 and Rad50 subunits. We have included these new data in Figure 6H and 6J and in Figure 5—figure supplement 1, respectively, in the revised manuscript.

      Evidence for an effect on resection in cells is lacking. The major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified, as only a highly specialized assay is used that does not warrant the broad conclusion drawn.

      We agree with the Reviewer that in vivo evidence demonstrating the inhibitory effect of REV7 on DNA end resection was lacking in the first submission. Reviewers 2 and 3 have also raised this point. We have now measured the rate of DNA end resection using a qPCR-based assay (Mimitou and Symington, EMBO J. 2010; Gnugge et al., Mol. Cell 2023). The results revealed that deletion of REV7 led to an enhancement in the rate of DNA end resection at a DSB site inflicted by HO endonuclease (Figure 9—figure supplement 3), providing direct evidence that loss of REV7 increases DNA end resection at DSBs.
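      To illustrate how a qPCR-based resection readout of this kind is typically quantified, a minimal sketch follows. It assumes the common design in which genomic DNA is digested with a restriction enzyme whose site lies near the break, so that only resected (single-stranded) templates survive digestion and amplify. Under that assumption the digested-to-mock template ratio is x/(2 − x) for a resected fraction x, which rearranges to x = 2/(1 + 2^ΔCt). The function name, parameters and the correction for HO cutting efficiency are illustrative assumptions; the calculation actually used for Figure 9—figure supplement 3 is not stated in this response.

```python
import math


def fraction_resected(ct_digested, ct_mock, cut_efficiency=1.0):
    """Estimate the fraction of DSB ends resected past a restriction site.

    ct_digested    : Ct of the amplicon spanning the site after digestion
    ct_mock        : Ct of the same amplicon in the mock-digested sample
    cut_efficiency : fraction of cells in which HO cut the locus (0 < f <= 1),
                     a hypothetical correction term for incomplete cutting
    """
    if not 0.0 < cut_efficiency <= 1.0:
        raise ValueError("cut_efficiency must be in (0, 1]")
    delta_ct = ct_digested - ct_mock
    return 2.0 / ((1.0 + 2.0 ** delta_ct) * cut_efficiency)


# Hypothetical Ct values: digestion delays amplification by log2(3) cycles,
# which corresponds to half of the ends being single-stranded at this site.
print(round(fraction_resected(25.0 + math.log2(3), 25.0), 3))
```

      Comparing this fraction across time points and across distances from the break gives a resection rate, which is the quantity compared between wild-type and REV7-deleted cells in an assay of this type.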

      Specifically, the results that the Rev7 C-terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained.

      This is a fair point, and we thank the reviewer for raising it. Although the interaction of Rev7-C1 in the yeast two-hybrid assays was not apparent, it surprisingly suppressed HR partially (Figure 9). In line with this, biochemical assays showed that it exerts a partial inhibitory effect on the Mre11 nuclease (Figure 5) and Rad50 ATPase (Figure 6) activities compared with full-length Rev7. Consistent with the in vitro data, the AF2 models revealed that, in addition to the C-terminal 42-aa region, residues in the N-terminal region of Rev7 also interact with the Mre11 and Rad50 subunits (Figure 2—figure supplement 2).

      The effect of Rev7 on G4 metabolism is underdeveloped and distracts from the main results that Rev7 modulated MRX activity. The authors should consider removing this part and develop a more complete story on this later.

      We agree with the reviewer’s comment “that the effect of Rev7 on G4 DNA metabolism is underdeveloped and distracts” from the central theme of the present paper, and with the suggestion that we develop this part into a complete story later. This point has also been raised by Reviewers 2 and 3 and, therefore, the relevant figures and associated text have been removed from the revised version of the manuscript.

      Reviewer 2 (Public Review):

      In this study, Badugu et al investigate the Rev7 roles in regulating the Mre11-Rad50-Xrs2 complex and in the metabolism of G4 structures. The authors also try to make a conclusion that REV7 can regulate the DSB repair choice between homologous recombination and non-homologous end joining.

      The major observations of this study are:

      (1) Rev7 interacts with the individual components of the MRX complex in a two-hybrid assay and in a protein-protein interaction assay (microscale thermophoresis) in vitro.

      (2) Modeling using AlphaFold-Multimer also indicated that Rev7 can interact with Mre11 and Rad50.

      (3) Using a two-hybrid assay, a 42-amino-acid C-terminal domain in Rev7 responsible for the interaction with MRX was identified.

      (4) Rev7 inhibits Mre11 nuclease and Rad50 ATPase activities in vitro.

      (5) Rev7 promotes NHEJ in a plasmid cutting/religation assay.

      (6) Rev7 inhibits recombination between chromosomal ura3-1 allele and plasmid ura3 allele containing G4 structure.

      (7) Using an assay developed in V. Zakian's lab, it was found that rev7 mutants grow poorly when both G4 is present in the genome and yeast are treated with HU.

      (8) In vitro, purified Rev7 binds to G4-containing substrates.

      In general, a lot of experiments have been conducted, but the major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified.

      We thank Reviewer 2 for the comprehensive assessment of our manuscript and the insightful comments. However, we believe that the data (Figure 7-9) in our manuscript, together with the new data (Figure 9—figure supplement 2 and 3) in the revised manuscript, clearly demonstrate that Rev7 regulates the choice between HR and NHEJ.

      (1) Two stories that do not overlap (regulation of MRX by Rev7 and Rev7's role in G4 metabolism) are brought under one umbrella in this work. There is no connection unless the authors demonstrate that Rev7 inhibits the cleavage of G4 structures by the MRX complex.

      We agree with the reviewer’s point that the themes associated with the regulation of the functions of MRX subunits by Rev7 and its role in G4 DNA metabolism do not overlap. This concern has also been expressed by Reviewers 1 and 3. Following their suggestion, we have deleted all figures and text describing the role of Rev7 in G4 DNA metabolism from the revised manuscript.

      (2) The authors cannot conclude based on the recombination assay between G4-containing 2-micron plasmid and chromosomal ura3-1 that Rev7 "completely abolishes DSB-induced HR". First of all, there is no evidence that DSBs are formed at G4. Why is there no induction of recombination when cells are treated with HU? Second, as the authors showed, Rev7 binds to G4, therefore it is not clear if the observed effects are the result of Rev7 interaction with G4 or its impact on HR. The established HO-based assays where the speed of resection can be monitored (e.g., Mimitou and Symington, 2010) have to be used to justify the conclusion that Rev7 inhibits MRX nuclease activity in vivo.

      We thank the Reviewer for the insightful comments and for drawing our attention to the inference "completely abolishes DSB-induced HR". We have rephrased the conclusion and replaced it with “REV7 gene product plays an anti-recombinogenic role during HR”. The reviewer then refers to the lack of “evidence that DSBs are formed at G4”. Unfortunately, our attempts to identify DSBs at the G4 DNA site in the 2-micron plasmid did not provide a clear answer to this question. This might be related to the existence of myriad DNases in the cell and to technical issues associated with the isolation of low-abundance, linearized 2-micron plasmid molecules. For these reasons, we cannot provide any data on DSBs at the G4 site in the 2-micron plasmid.

      The reviewer then correctly points out “Why is there no induction of recombination when cells are treated with HU?” These findings are consistent with previous studies which have shown that Mre11-deficient cells are sensitive to HU, resulting in cell death (Tittel-Elmer et al., EMBO J. 28, 1142-1156, 2009; Hamilton and Maizels, PLoS One, 5, e15387, 2010). However, a novel finding of our study is that ura3-1 rev7Δ cells and ura3-1 cells expressing the Rev7 42-amino-acid peptide (to a limited extent) produce Ura3+ papillae. We have included this information in the Results section and adjusted the text to make this point clear to the reader.

      In the same paragraph, the Reviewer expresses a concern about the interaction of Rev7 with G4 DNA substrates and its impact on HR. As discussed above, in response to your comment (1) and a similar comment from Reviewers 1 and 3, we have deleted all figures and text describing the role of Rev7 in G4 DNA metabolism from the revised manuscript. The reviewer specifically refers to a study by Mimitou and Symington, 2010, in which the speed of DNA end resection at the HO endonuclease-inflicted DSB was quantified. We have carried out the suggested experiment and the results are presented in Figure 9—figure supplement 3.

      Reviewer 3 (Public Review):

      Summary:

      REV7 facilitates the recruitment of the Shieldin complex and thereby inhibits end resection and controls DSB repair choice in metazoan cells. Puzzlingly, Shieldin is absent in many organisms and it is unknown if and how Rev7 regulates DSB repair in these cells. The authors surmised that yeast Rev7 physically interacts with Mre11/Rad50/Xrs2 (MRX), the short-range resection nuclease complex, and tested this premise using yeast two-hybrid (Y2H) and microscale thermophoresis (MST). The results convincingly showed that the individual subunits of MRX interact robustly with Rev7. AlphaFold Multimer modelling followed by Y2H confirmed that the carboxy-terminal 42 amino acids are essential for interaction with MR and G4 DNA binding by REV7. The mutant rev7 lacking the binding interface (Rev7-C1) to MR shows moderate inhibition of the nuclease and the ATPase activity of Mre11/Rad50 in biochemical assays. Deletion of REV7 also causes a mild reduction in NHEJ using both plasmid and chromosome-based assays and increases mitotic recombination between chromosomal ura3-1 and the plasmid ura3 allele interrupted by G4. The authors concluded that Rev7 facilitates NHEJ and antagonizes HR even in budding yeast, but it achieves this by blocking Mre11 nuclease and Rad50 ATPase.

      Weaknesses

      There are many strengths to the studies and the broad types of well-established assays were used to deduce the conclusion. Nevertheless, I have several concerns about the validity of experimental settings due to the lack of several key controls essential to interpret the experimental results. The manuscript also needs a few additional functional assays to reach the accurate conclusions as proposed.

      We are happy that the Reviewer has found “many strengths” in our manuscript and further noted that “results convincingly showed that the individual subunits of MRX interact robustly with Rev7”. We greatly appreciate these encouraging words and the specific suggestions that helped us to improve the manuscript. As suggested, we have performed additional experiments including key controls, and the data are presented in the revised manuscript.

      (1) AlphaFold model predicts that Mre11-Rev7 and Rad50-Rev7 binding interfaces overlap and Rev7 might bind only to Mre11 or Rad50 at a time. Interestingly, however, Rev7 appears dimerized (Figure 1). Since the MR complex also forms with 2M and 2R in the complex, it should still be possible if REV7 can interact with both M and R in the MR complex. The author should perform MST using MR complex instead of individual MR components. The authors should also analyze if Rev7-C1 is indeed deficient in interaction with MR individually and with complex using MST assay.

      Thank you for the valuable suggestion. As requested, MST titration experiments have been performed to examine the affinity of purified GFP-tagged Rev7-C1 for Mre11, Rad50 and the MR complex. The results revealed that Rev7-C1 binds to the Mre11 and Rad50 subunits with about 3- and 8.8-fold reduced affinity, respectively, whereas it binds to the MR complex with ~5.6-fold reduced affinity compared with full-length Rev7. The data are shown in Figure 1—figure supplement 4A-C.

      (2) The nuclease and the ATPase assays require additional controls. Does Rev7 inhibit the other nuclease or ATPase non-specifically? Are these outcomes due to the non-specific or promiscuous activity of Rev7? In Figure 6, the effect of REV7 on the ATP binding of Rad50 could be hard to assess because the maximum Rad50 level (1 mM) was used in the experiments. The author should use the suboptimal level of Rad50 to check if REV7 still does not influence ATP binding by Rad50.

      We thank the Reviewer for these valuable comments (Reviewer 1 has raised similar issues). We performed additional control experiments, and the results indicate that (a) the ATPase activity of S. cerevisiae Dmc1 was not affected by Rev7 and (b) Rev7 does not inhibit the endonucleolytic activity of S. cerevisiae Sae2. The results are depicted in Figure 6H and 6J and Figure 5—figure supplement 1A-D, respectively.

      As suggested by the Reviewer, using suboptimal levels of Rad50 (0.2 mM), we carried out experiments to test the effect of varying concentrations of Rev7 on the ability of Rad50 to bind ATP and catalyse its hydrolysis. The results showed that Rev7 had no discernible effect on the ability of Rad50 to bind ATP, even at concentrations 30 times higher than the concentration of Rad50 (Figure 6B and 6D). However, Rev7 suppresses the ATPase activity of Rad50, but not that of Dmc1, in a concentration-dependent manner (Figure 6G, 6J).

      (3) The moderate deficiency in NHEJ using plasmid-based assay in REV7 deleted cells can be attributed to aberrant cell cycle or mating type in rev7 deleted cells. The authors should demonstrate that rev7 deleted cells retain largely normal cell cycle patterns and the mating type phenotypes. The author should also analyze the breakpoints in plasmid-based NHEJ assays in all mutants, especially from rev7 and rev7-C1 cells.

      We appreciate the Reviewer's critical and insightful comment. We monitored cell-cycle progression of both wild-type and rev7Δ cells over time using FACS. The results revealed that the cell cycle profiles and mating type phenotypes of rev7Δ cells were similar to those of wild-type cells. The data are presented in Figure 7-figure supplement 1. This indicates that rev7Δ cells do not possess aberrant cell cycle or mating type defects as compared with the wild-type cells.

      Although the second point raised by the Reviewer is intriguing, its relevance to the current study is unclear. In our view, identification of breakpoints using plasmid-based NHEJ assays in all the mutants will require a significant amount of time, and the insight that we may gain is unlikely to add to the central theme of this paper. Moreover, we know for sure that Rev7 has no DNA cleavage/nicking activity.

      (4) It is puzzling why the authors did not analyze end resection defects in rev7 deleted cells after a DSB. The author should employ the widely used resection assay after a HO break in rev3, rev7, and mre11 rev7 cells as described previously.

      Thank you for the suggestion. Reviewer 1 has also raised this point. As suggested, we have analysed end resection in the rev7Δ cells at an HO-inflicted DSB site using a qPCR assay (Mimitou and Symington, EMBO J. 2010; Gnugge et al., Mol. Cell 2023). The results revealed that deletion of REV7 led to an enhancement in the rate of DNA end resection at a DSB inflicted by HO endonuclease (Figure 9-figure supplement 3).
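
      To illustrate the arithmetic behind a restriction-enzyme-based qPCR resection readout, here is a minimal sketch. It assumes the commonly used scheme in which a restriction site near the break is cut only while the DNA is still double-stranded, so a resected (single-stranded) template survives digestion. The function name, argument names and the cut-efficiency correction are assumptions for illustration and may differ from the exact calculation used in the revised manuscript.

```python
def resected_fraction(ct_digested, ct_mock, cut_fraction=1.0):
    """Estimate the fraction of DSB ends resected past a restriction site.

    If x is the resected fraction, the mock sample carries 2 - x template
    strands per locus and the digested sample carries x, so
    2 ** delta_ct = (2 - x) / x, giving x = 2 / (1 + 2 ** delta_ct).
    cut_fraction is the fraction of cells in which the break was made;
    it is an assumed normalisation, not a value taken from the study.
    """
    if not 0.0 < cut_fraction <= 1.0:
        raise ValueError("cut_fraction must be in (0, 1]")
    delta_ct = ct_digested - ct_mock
    return 2.0 / ((1.0 + 2.0 ** delta_ct) * cut_fraction)
```

      With this formula, equal Ct values in the digested and mock samples correspond to complete resection past the site, and a larger Ct difference corresponds to less resection.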

      (5) Is it possible that Rev7 also contributes to NHEJ as the part of TLS polymerase complex? Although NHEJ largely depends on Pol4, the authors should not rule out that the observed NHEJ defect in rev7 cells is due at least partially to its TLS defect. In fact, both rev3 or rev1 cells are partially defective in NHEJ (Figure 7). Rev7-C1 is less deficient in NHEJ than REV7 deletion. These results predict that rev7-C1, rev3 should be as defective as the rev7 deletion. Additionally, the authors should examine if Rev7-C1 might be deficient in TLS. In this regard, does rev7-C1 reduce TLS and TLS-dependent mutagenesis? Is it dominant? The authors should also check if Rev3 or Rev1 are stable in Rev7 deleted or rev7-C1 cells by immunoblot assays.

      We agree with the possibility that Rev7 may play a role in translesion DNA synthesis and TLS-dependent mutagenesis. Accordingly, Rev7-C1 might be deficient in TLS. While we do not rule out such scenarios, we respectfully suggest that this is outside the scope of the current manuscript. This manuscript focuses on the role of Rev7 in NHEJ and HR pathways, not on translesion DNA synthesis. Nevertheless, we recognise the importance of this line of investigation, and we will certainly consider this suggestion in our future work. Thank you.

      (6) Due to the G4 DNA and G4 binding activity of REV7, it is not clear which class of events the authors are measuring in plasmid-chromosome recombination assay in Figure 9. Do they measure G4 instability or the integrity of recombination or both in rev7 deleted cells? Instead, the effect of rev7 deletion or rev7-C1 on recombination should be measured directly by more standard mitotic recombination assays like mating type switch or his3 repeat recombination.

      We thank the Reviewer for highlighting this important point and would like to take the opportunity to clarify the rationale behind the plasmid-chromosome recombination assay, as previously described (Paeschke et al., Cell 145, 678, 2011). In this assay, we are measuring the rate of Ura+ papillae formation arising from integration of the targeting plasmid into the genome at the ura3-1 locus of wild-type and rev7Δ cells. Analysis of PCR-generated DNA fragments indicates that the pFAT10-G4 plasmid integrates at the ura3-1 genomic locus of rev7Δ cells, but not in the wild-type cells (Figure 9-figure supplement 2). Further, we also measured the stability of G4 DNA and the results indicate that it is stable in rev7Δ cells.

      Recommendations for the authors:

      Reviewer 1 (Recommendations for the authors):

      (1) Title: The word 'choice' implies a regulator. Is that the model here? Alternatively, is it pathway properties that define the preference of usage?

      This is an excellent suggestion. In the revised submission, we rephrased the title as “Saccharomyces cerevisiae Rev7 promotes non-homologous end-joining by inhibiting Mre11 nuclease and Rad50 ATPase activities and homologous recombination.”

      (2) Line 83, Introduction: Titia De Lange proposed an alternative/complementary model for Shieldin and REV7 to support fill-in by DNA polymerases including Pol alpha. This should be discussed.

      We thank the reviewer for pointing out that we have not discussed the work from Titia De Lange’s research group. We have now added new sentences to the Introduction to describe the alternative model involving Polα-primase fill-in synthesis (p3.2.7).

      (3) Line 131: The paragraph title needs to change. 2-hybrid assays cannot establish direct interaction especially when analyzing yeast proteins by yeast 2-hybrid. I agree that direct interaction is established by other means later.

      Per the Reviewer’s suggestion, we have deleted the word “directly” from the title of the paragraph.

      (4) Figure 1 D-F: The purity of the Rev7-GFP fusion is shown in Figure S1, and the purity of the Rad50, Mre11, and Xrs2 subunits as assessed by PAGE should be shown as well.

      Following this suggestion, we have included images of Coomassie blue-stained SDS-polyacrylamide gels (Figure 1-figure supplement 1), which show the purity and size of GFP tagged Rev7, Rad50, Mre11, Xrs2, Rev1, Sae2 and Dmc1 proteins.

      (5) Please check the Kd values. In the graph in D, the differences between Rad50, Mre11, and Xrs2 look much larger than the values in F suggest.

      This is a fair point and we thank the reviewer for highlighting it. The differences between the binding profiles of Rad50, Mre11, and Xrs2 with Rev7 shown in the previous version of the manuscript were not obvious because the binding curves were cluttered. Therefore, the binding profile of each interacting pair of proteins was plotted separately to highlight the differences (Figure 1-figure supplement 3). Further, we rigorously analysed the dataset to ascertain the binding affinities and found that the Kd values obtained were in good agreement with the values shown in Figure 1D.

      (6) Figure 1S3: Please label the bands.

      In the revised manuscript, the protein bands in Figure1-figure (previously Figure 1S3) are identified with their names.

      (7) Line 195: Change Figure 1 to Figure 1S4.

      We have introduced the correction in the revised manuscript.

      (8) Line 202: The minimal interaction domain of 42 aa is only described in the next paragraph. The description anticipates a result about the 42 aa fragment that has not been shown to this point. Please reorder results or descriptions to make this coherent.

      We have implemented the change, as per the Reviewer’s suggestion.

      (9) Figure 2: The two-hybrid analysis in Figures 1 and 2 also identifies Rev7 self-interaction, which is not discussed. This serves as another control against the artifact of the truncation proteins and should be discussed.

      We have now discussed the significance of Rev7 self-interaction in the Y2H experiments wherever relevant in the text.

      (10) Is the 42 aa fragment sufficient to elicit a two-hybrid signal?

      We thank the reviewer for this insightful comment. To test this premise, we expressed the terminal 42 amino acid sequence of Rev7 using bait pGBKT7 vector. The results revealed that the 42 residue fragment of ScRev7 alone is sufficient for a two-hybrid interaction with the MRX subunits (Figure 2-figure supplement 1).

      (11) Line 289: Why are the EMSA conditions described as physiological? As per Material and Methods, the reaction mixtures contain 20 mM Tris-HCl (pH 7.5), 0.1 mM DTT, 0.2 mg/ml BSA, and 5% glycerol, which is far from physiological.

      As suggested by all three reviewers, the data showing the interaction of Rev7 and its truncation derivative Rev7-C1 with G4 DNA has been deleted in the revised version of the manuscript.

      (12) Figure 4C: The figure needs to increase in size. The plotting symbols are not all visible, and it is undefined what the black squares represent.

      Following the reviewer's suggestion, Figure 4C has been omitted in the revised version of the manuscript.

      (13) Figure 5: The MRX nuclease assays were conducted in the presence of Manganese. Has the more physiological divalent cation magnesium been tested?

      This has been addressed in response to the query of Reviewer 1 (Public Review). As noted above, Mre11 exhibits DNase activities only in the presence of Mn²⁺.

      (14) In Figure 5D, lane 2: What is the concentration of Rev7?

      We appreciate the reviewer for catching this. The concentration of ScRev7 used for the reaction shown in Figure 5D, lane 2 was 2 μM, as specified in the Figure legend.

      (15) Figure 6 legend: Lane 1620 "same as in lane ". Is there a "1" missing?

      We thank the reviewer for pointing out the typographical error, which has been corrected in the revised manuscript.

      (16) Figure 9: Rev7-C1 lacks the 42 aa peptide that is postulated to mediate anti-resection but shows normal HR here. This seems unexpected based on the premise that the 42 aa fragment supports end-joining. Rev7 seems to suppress HR independent of the function of the 42 aa peptide.

      This has been addressed in response to the query posed by Reviewer 1 in the Public Review. We do see that the Rev7-C1 lacking the 42 aa peptide suppresses HR, but the suppression was only partial as compared with the wild type. This is consistent with biochemical assays suggesting that Rev7-C1 exerts partial inhibition on the Mre11 nuclease (Figure 5) and Rad50 ATPase (Figure 6) activities. Further, the AF2 models indicate that, in addition to the C-terminal 42-aa region, other regions of Rev7 also interact with the Mre11 and Rad50 subunits (Figure 2—figure supplement 2), consistent with biochemical and genetic data.

      (17) Line 478: The conclusion that "these findings are consistent with the idea that REV7 completely abolishes DSB-induced HR in S. cerevisiae." is overly broad as the assay

      We agree with the reviewer's assessment. Accordingly, we have rephrased the sentence to soften the claim.

      Line 483ff: Based on the comments on Figure 9, the introductory sentences of the discussion do not seem to be supported by the data, as Rev7 appears to regulate HR independent of the 42 aa peptide.

      Please refer to the response to comment #16 above.

      (18) Line 536: Similarly to above 17, the conclusion about the effect of the 42 aa peptide on HR appears unwarranted.

      We have revised the statement to moderate the previously exaggerated claims.

      (19) In all figures, please list in the legend, which exact strains have been used referring to Table S5.

      We have now included mentions of the strains in the figure legend wherever applicable.

      (20) Line 351: linear.

      It is corrected in the revised manuscript.

      Reviewer 2 (Recommendations For The Authors):

      (1) It is very strange and unusual that Rev7 independently binds to all three subunits of the MRX complex, raising a question of how specific these interactions are. At least, it should be a negative control in their Y2H assay and protein-protein interaction assay in vitro that Rev7 does not bind to some other proteins. For example, Sae2 and Rev7 interactions can be tested.

      The reviewer is right that it is important to validate the specificity of Y2H interactions as well as in vitro enzyme assays. These findings are shown in Figure 6 and Figure 5-figure supplement 1. As suggested by the Reviewer, we included Sae2 in Y2H and MST assays, and Dmc1 and Sae2 in the in vitro enzyme assays. Our results clearly showed that Sae2 does not interact with MRX subunits in Y2H assays (Figure 1A-C), and that Rev7 does not inhibit Sae2’s nuclease or Dmc1’s ATPase activities in vitro (Figure 6 and Figure 5-figure supplement 1).

      (2) It is surprising that in the Discussion the authors speculate that Rev7 might recruit Mus81 nuclease for cleavage, completely ignoring their own publication on the cleavage of G4 by MRX.

      We agree with the reviewer, and we have added a discussion of MRX (mentioned above by the reviewer) in the revised version.

      (3) How does the AlphaFold-Multimer modeling predict the interaction between Rev7 and MRX as a complex? Are the same regions of MRX accessible for the interaction with Rev7 in this case? Similarly, how are the activities of the MRX complex and phosphorylated Sae2 (see P. Cejka's work) affected by Rev7?

      Thank you for pointing this out. In this study, we investigated the interaction between Rev7 and Mre11, and between Rev7 and Rad50 subunits using the AF2 algorithm. However, the three-dimensional structure of the S. cerevisiae MRX-Rev7 complex could not be constructed due to the size limits imposed by the AF2 algorithm. Therefore, we are unable to comment on whether the same regions of MRX subunits in the complex are accessible for the interaction with Rev7. That said, the AF2 algorithm has recently been used for structural modelling of the S. cerevisiae Mre11 (1–533)-Rad50 (1–260 + 1,057–1,312) complex (Nicolas et al., Mol. Cell 84, 2223, 2024). As such, there are no AF2 structural models that cover the whole length of Mre11-Rad50 proteins.

      Regarding the second point raised by the Reviewer, our results suggest that Rev7 does interact with Sae2 in Y2H assays. However, whether phosphorylated Sae2 could potentially affect the interaction between MRX subunits and Rev7 warrants further studies.

      Minor points:

      (1) Figure 1. The labeling of the strains in A and B is genes and in C is proteins.

      The reviewer is correct. We have now corrected the error in Figures 1 and 2.

      (2) Abstract. Carefully check English grammar.

      We thank the Reviewer for spotting this, which has been corrected in the revised manuscript.

      (3) Line 322 "Further, it has been demonstrated that Mre11 cleaves non-B DNA structures such as DNA hairpins, cruciforms and intra- and inter-molecular G-quadruplex structures)." It has not been shown that Mre11 cuts cruciform structures.

      We thank the referee for spotting this error. Mre11 does not cleave cruciform DNA structures. This error is corrected in the revised manuscript.

      (4) Page 14. Lines 452-455. What does "selective and non-selective media" mean? Is it without and with HU treatment?

      Thanks very much for the comment. In our manuscript, selective medium is composed of SC/-Leu with HU and non-selective medium is without HU. We have clarified this point in the revised version.

      (5) Page 15. Line 472 "To assess whether increased frequency of HR is due to the instability of G-quadruplex DNA in rev7Δ cells, we examined the length of G4 DNA inserts in the plasmids carrying sequences during HR assay". It is not clear what "during HR assay" means. Did you examine the presence of G4 in Ura+ recombinants? If not, this analysis is not meaningful.

      The reviewer is correct. We measured the presence of G4 DNA insert in Ura+ recombinants. The text has been appropriately edited to reflect these necessary modifications.

      (6) What is the nature of the ura3-1 allele? Can it revert to URA3 in rev7 mutants?

      The ura3-1 allele (glycine-to-glutamate substitution) reverts to Ura3+ at a low rate of ~2.5 × 10⁻⁹ in both orientations (Johnson et al., Mol. Cell 59, 163, 2015).

      (7) From the way that the recombination process is depicted it seems that the authors believe that the plasmid should integrate into the chromosome. In reality, in most cases it should be a gene conversion where the G4 sequence (if it indeed induces DSBs) should be replaced by the wild-type segment from ura3-1; integration is not required since it is a 2-micron plasmid.

      We apologize for not having made this clearer. The recombination assay with targeting plasmids containing G4 DNA forming sequences was performed as previously described (Paeschke et al., Cell 145, 678, 2011). In this assay, Ura+ recombinants arise from integration of the targeting plasmid bearing the ura3G4 allele (with a G4 DNA forming insert) into the genome at the ura3-1 locus. As shown in Author response image 1B, this is confirmed by PCR amplification of the insert in the genomic DNA of wild type and rev7Δ cells.

      Reviewer 3 (Recommendations For The Authors):

      (1) All Y2H experiments were performed with REV7 fusion to pGBKT7 and MRX to pGADT7. It will be helpful to test if pGAD-Rev7 also interacts with pGBK-Mre11 or Rad50 by Y2H.

      Following the reviewers' suggestions, we performed Y2H experiments in wild-type PJ69-4a cells co-transformed with the pGBKT7 vector expressing MRX subunits and the pGADT7 vector expressing Rev7. The results indicated that Rev7 interacts with Mre11, Rad50 or Xrs2 subunits, indicating that interactions are vector-independent.

      Author response image 1.

      Yeast two hybrid analysis suggests interaction between Rev7 and MRX subunits. PJ69-4A cells were co-transformed with bait vector expressing Rev7 or the Mre11, Rad50 or Xrs2 subunits and prey vector expressing Rev7 protein. Equal numbers of cells were spotted onto –Trp –Leu and –Trp –Leu –His dropout plates containing 3-AT and images were obtained following 48 h of incubation at 30°C. The data are representative of three independent experiments.

      (2) G4 studies are under-developed and do not add much or even negatively to the manuscript. The author might consider revising the manuscript to improve their integration with better rationales or logic. Alternatively, the authors should consider removing the G4 part for another paper.

      This concern was also raised by Reviewers 1 and 2. Following the suggestions of all reviewers, figures and text related to G4 DNA studies have been deleted in the revised manuscript.

    2. eLife Assessment

      This manuscript reports important data providing evidence that a 42 amino acid region of Rev7 is necessary and sufficient for interaction with the Rad50-Mre11-Xrs2 complex in budding yeast. The authors conclude that Rev7 inhibits the Rad50 ATPase and the Mre11 nuclease with the exception of ssDNA exonuclease activity. The convincing data largely support the conclusions, although the effect of Rev7 on homologous recombination is less well documented and the observed effect on resection is moderate. Specifically, the result that the Rev7 C-terminal truncation lacking the 42 amino acid region still suppresses homologous recombination is unexpected and unexplained.

    3. Reviewer #1 (Public review):

      Summary:

      The mammalian Shieldin complex consisting of REV7 (aka MAD2L2, MAD2B) and SHLD1-3 affects pathway usage in DSB repair favoring non-homologous end-joining (NHEJ) at the expense of homologous recombination (HR) by blocking resection and/or priming fill-in DNA synthesis to maintain or generate near blunt ends suitable for NHEJ. While the budding yeast Saccharomyces cerevisiae does not have homologs to SHLD1-3, it does have Rev7, which was identified to function in conjunction with Rev3 in the translesion DNA polymerase zeta. Testing the hypothesis that Rev7 also affects DSB resection in budding yeast, the work identified a direct interaction between Rev7 and the Rad50-Mre11-Xrs2 complex by two-hybrid and direct protein interaction experiments. Deletion analysis identified that the 42 amino acid C-terminal region was necessary and sufficient for the 2-hybrid interaction. Direct biochemical analysis of the 42 aa peptide was not possible. Rev7 deficient cells were found to be sensitive to HU only in synergy with G4 tetraplex forming DNA. Importantly, the 42 aa peptide alone suppressed this phenotype. Biochemical analysis with full-length Rev7 and a C-terminal truncation lacking the 42 aa region shows G4-specific DNA binding that is abolished in the C-terminal truncation and with a substrate containing mutations to prevent G4 formation. Rev7 lacks nuclease activity but inhibits the dsDNA exonuclease activity of Mre11. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition suggesting the involvement of additional binding sites besides the 42 aa region. Also, the Mre11 ssDNA endonuclease activity is inhibited by Rev7 but not the degradation of linear ssDNA. Rev7 does not affect ATP binding by Rad50 but inhibits in a concentration-dependent manner the Rad50 ATPase activity. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition but significantly less than the full-length protein.
Using an established plasmid-based NHEJ assay, the authors provide strong evidence that Rev7 affects NHEJ, showing a four-fold reduction in this assay. The mutations in the other Pol zeta subunits, Rev3 and Rev1, show a significantly smaller effect (~25% reduction). A strain expressing only the Rev7 C-terminal 42 aa peptide showed no NHEJ defect, while the truncation protein lacking this region exhibited a smaller defect than the deletion of REV7. The conclusion that Rev7 supports NHEJ mainly through the 42 aa region was validated using a chromosomal NHEJ assay. The effect on HR was assessed using a plasmid:chromosome system containing G4 forming DNA. The rev7 deletion strain showed an increase in HR in this system in the presence and absence of HU. Cells expressing the 42 aa peptide were indistinguishable from wild type as were cells expressing the Rev7 truncation lacking the 42 aa region. The authors conclude that Rev7 suppresses HR, but the context appears to be system-specific and the conclusion that Rev7 abolished HR repair of DSBs is unwarranted and overly broad.

      Strength:

      This is a well-written manuscript with well-executed experiments which suggest that Rev7 inhibits MRX-mediated resection to favor NHEJ during DSB repair. This finding is novel and provides insight into the potential mechanism of how the human Shieldin complex might antagonize resection.

      Weaknesses:

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation. The data largely support the conclusions, although the effect of Rev7 on HR is less well documented, as only a highly specialized assay is used that does not warrant the broad conclusion drawn. Specifically, the result that the Rev7 C-terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained.

      In this revision the authors addressed most of my concerns by text revisions and addition of new data.

      The new two hybrid data showing that the 42 amino acid segment interacts with MRN are valuable. However, it may not be clear to which subunit the 42 aa segment binds, as the chromosomally encoded subunits are present in the yeast 2H system. Or were the 2H experiments conducted in an MRN deletion background? This could be acknowledged.

      The material and methods section was updated to indicate use of 5 mM MnCl2 and 5 mM MgCl2 in the exonuclease assay but not the endonuclease assay. Please check if this is correct. Why the difference between both assays? There is a concern that the absence of ATP and Mg affects the endonuclease assay.

      The addition of Dmc1 as a specificity control for the ATPase inhibition is nice and shows a specific effect. The use of Sae2 associated nuclease activity as a specificity control for the nuclease inhibition is problematic. There has been considerable debate about the Sae2 associated nuclease activity, which seems to have been solved by the Cejka lab showing that Sae2 is a cofactor of MRN without intrinsic nuclease activity (e.g. https://pubmed.ncbi.nlm.nih.gov/25231868/). Or do the authors want to suggest that Sae2 has intrinsic nuclease activity? The control may still be useful mentioning that the nuclease is associated but not intrinsic and citing the relevant papers.

    4. Reviewer #2 (Public review):

      In this study, Badugu et al investigate the Rev7 roles in regulating the Mre11-Rad50-Xrs2 complex and in metabolism of G4 structures. The authors also try to make a conclusion that REV7 can regulate the DSB repair choice between homologous recombination and non-homologous end joining. The major observations of this study are:

      (1) Rev7 interacts with the individual components of the MRX complex in a two-hybrid assay and in a protein-protein interaction assay (microscale thermophoresis) in vitro.
      (2) Modeling using AlphaFold-Multimer also indicated that Rev7 can interact with Mre11 and Rad50.
      (3) Using a two-hybrid assay, a 42 amino acid C-terminal domain in Rev7 responsible for the interaction with MRX was identified.
      (4) Rev7 inhibits Mre11 nuclease and Rad50 ATPase activities in vitro.
      (5) Rev7 promotes NHEJ in plasmid cutting/religation assay.
      (6) Rev7 inhibits recombination between chromosomal ura3-1 allele and plasmid ura3 allele containing G4 structure.
      (7) Using an assay developed in V. Zakian's lab, it was found that rev7 mutants grow poorly when both G4 is present in the genome and yeast are treated with HU.
      (8) In vitro, purified Rev7 binds to G4-containing substrates.

      In general, a lot of experiments have been conducted, but the major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified.

      (1) Two stories that do not overlap (regulation of MRX by Rev7 and Rev7 role in G4 metabolism) are brought under one umbrella in this work. There is no connection unless the authors demonstrate that Rev7 inhibits the cleavage of G4 structures by the MRX complex.

      (2) The authors cannot conclude based on the recombination assay between G4-containing 2-micron plasmid and chromosomal ura3-1 that Rev7 "completely abolishes DSB-induced HR". First of all, there is no evidence that DSBs are formed at G4. Why is there no induction of recombination when cells are treated with HU? Second, as the authors showed, Rev7 binds to G4, therefore it is not clear if the observed effects are the result of Rev7 interaction with G4 or impact on HR. The established HO-based assays where the speed of resection can be monitored (e.g., Mimitou and Symington, 2010) have to be used to justify the conclusion that Rev7 inhibits MRX nuclease activity in vivo.

      Comments on the revised version:

      I am satisfied with the revision. Specifically, i) the elimination of the G4 part and ii) the implementation of the HO-endonuclease resection assay described in Mimitou and Symington, 2010 significantly improved the clarity of the work and strengthened the conclusion about the Rev7 interference with DNA resection.

    5. Reviewer #3 (Public review):

      Summary:

      REV7 facilitates the recruitment of the Shieldin complex and thereby inhibits end resection and controls DSB repair choice in metazoan cells. Puzzlingly, Shieldin is absent in many organisms, and it is unknown if and how Rev7 regulates DSB repair in these cells. The authors surmised that yeast Rev7 physically interacts with Mre11/Rad50/Xrs2 (MRX), the short-range resection nuclease complex, and tested this premise using yeast two hybrid (Y2H) and microscale thermophoresis (MST). The results convincingly showed that the individual subunits of MRX interact robustly with Rev7. AlphaFold Multimer modelling followed by Y2H confirmed that the carboxy terminal 42 amino acids are essential for interaction with MR and G4 DNA binding by REV7. The mutant rev7 lacking the binding interface (Rev7-C1) to MR shows moderate inhibition of the nuclease and the ATPase activity of Mre11/Rad50 in biochemical assays. Deletion of REV7 also causes a mild reduction in NHEJ using both plasmid and chromosome-based assays and increases mitotic recombination between chromosomal ura3-1 and the plasmid ura3 allele interrupted by G4. The revision also showed that rev7 deleted cells exhibit a mild hyper-resection phenotype at 0.7 and 3 kb from the DSB using qPCR assays. The authors concluded that Rev7 facilitates NHEJ and antagonises HR even in budding yeast, but it achieves this by blocking Mre11 nuclease and Rad50 ATPase.

      Weaknesses:

      There are several strengths to the studies, and a broad range of well-established assays was used to reach the conclusions. Nevertheless, there are notable discrepancies in the mutant phenotypes that were used to test the functionality of the Rev7-MRX interaction on the repair outcomes, raising concerns about the validity of the proposed model. The manuscript also needs a few additional functional assays to reach the accurate conclusions as proposed. The revision responded to several comments raised by the reviewers, but the responses are inadequate to address the key concerns and did not offer sufficient and compelling experimental support for the main premise that Rev7-Mre11/Rad50/Xrs2 interactions regulate MRX activities in cells and thereby modulate DSB repair choice in budding yeast.

      (1) The AlphaFold model predicts that the Mre11-Rev7 and Rad50-Rev7 binding interfaces overlap and Rev7 might bind only to Mre11 or Rad50 at a time. Interestingly, however, Rev7 appears dimerized (Fig. 1). Since the MR complex also forms with 2M and 2R in the complex, it should still be possible that REV7 can interact with both M and R in the MR complex. The authors should perform MST using the MR complex instead of individual MR components. The authors should also analyze if Rev7-C1 is indeed deficient in interaction with MR individually and with the complex using the MST assay.

      (2) The nuclease and the ATPase assays require additional controls. Does Rev7 inhibit the other nuclease or ATPase non-specifically? Are these outcomes due to the non-specific or promiscuous activity of Rev7? In Fig. 6, the effect of REV7 on the ATP binding of Rad50 could be hard to assess because the maximum Rad50 level (1 uM) was used in the experiments. The author should use the suboptimal level of Rad50 to check if REV7 still does not influence ATP binding by Rad50.

      (3) The moderate deficiency in NHEJ using plasmid based assay in REV7 deleted cells can be attributed to aberrant cell cycle or mating type in rev7 deleted cells. The authors should demonstrate that rev7 deleted cells retain largely normal cell cycle pattern and the mating type phenotypes. The author should also analyze the breakpoints in plasmid based NHEJ assays in all mutants especially from rev7 and rev7-C1 cells.

      (4) It is puzzling why the authors did not analyze end resection defects in rev7 deleted cells after a DSB. The author should employ the widely used resection assay after a HO break in rev3, rev7 and mre11 rev7 cells as described previously.

      (5) Is it possible that Rev7 also contributes to NHEJ as part of the TLS polymerase complex? Although NHEJ largely depends on Pol4, the authors should not rule out the possibility that the observed NHEJ defect in rev7 cells is due at least partially to its well-known TLS defect and not entirely to its role in MRX activity regulation as the authors proposed. In fact, rev3 or rev1 cells are partially defective in NHEJ (Fig. 7). Rev7-C1 is less deficient in NHEJ than REV7 deletion. These results predict that rev7-C1 rev3 could be more deficient than rev3 or rev7-C1, and such results might indicate that Rev7 contributes to NHEJ in two ways: one by interacting with (and modulating) MRX and the other as part of the Rev3-Rev7 complex. Additionally, the authors should examine if Rev7-C1 might be deficient in TLS. In this regard, does rev7-C1 reduce TLS and TLS dependent mutagenesis? Is it dominant? The authors should also check if Rev3/Rev1 complexes are stable in Rev7 deleted or rev7-C1 cells by immunoblot assays.

      (6) Due to the G4 DNA and G4 binding activity of REV7, it is not clear which class of events the authors are measuring in the plasmid-chromosome recombination assay in Fig. 9. Do they measure G4 instability, the integrity of recombination, or both in rev7-deleted cells? Instead, the effect of rev7 deletion or rev7-C1 on recombination should be measured directly by more standard mitotic recombination assays like mating type switch or his3 repeat recombination. The revision did not address these concerns, which still makes the interpretation of the provided recombination results difficult.

    1. eLife Assessment

      The authors established a useful syndetome differentiation protocol from human induced pluripotent stem cells, guided by single-cell transcriptomic analysis. Their findings could significantly impact the field, particularly for patients needing tendon cell therapy. However, the evidence presented is currently incomplete, as the authors did not yet test the applicability of their protocol across multiple human induced pluripotent stem cell lines.

    2. Reviewer #1 (Public review):

      Papalamprou et al. established a methodology to differentiate iPSCs to the syndetome stage and validated it by marker gene expression and scRNA-seq analysis. They further found that inhibition of WNT signaling enhanced the homogeneity of the cell population after identifying a group of branching-off cells that overexpressed WNT. Their results will be helpful in developing cell therapy systems for tendon injuries. However, several issues should be addressed to improve the manuscript:

      IPA analysis was performed after scRNA-seq. Although it is knowledge-based software with convenient graphic utilities, it is questionable whether an unbiased genome-level analysis was performed. Therefore, it is not convincing if WNT is the only and best signal for the branching-off marker. Perhaps independent approaches, such as GO, pathway, or module analyses, should be performed to validate the findings.

      According to the method section, two iPSC lines were used for the study. However, throughout the manuscript, it is not clearly described which line was used for which experiment. Did they show similar efficiency in differentiation and in responses to WNTi? It is also worrisome if using only two lines is the norm in the stem cell field. Please provide a rationale for using only two lines, which will restrict the observation of individual-specific differential responses throughout the study.

      How similar are syndetome cells with or without WNTi? It would be interesting to check if there are major DEGs that differentiate these two groups of cells.

      Please discuss the improvement of the current study compared to previous ones (e.g., PMID 36203346, 35083031, 35372337).

    3. Reviewer #2 (Public review):

      Summary:

      Dr. Sheyn and colleagues report the step-wise induction of syndetome-like cells from human induced pluripotent stem cells (iPSCs), following a previously published protocol which they adjusted. The progression of the cells through each stage, i.e. presomitic mesoderm (PSM), somitic mesoderm (SM), sclerotome (SCL), and syndetome (SYN), is characterized using FACS, RT-qPCR and immunofluorescence staining (IF). The authors also performed single-cell RNA sequencing (scRNAseq) analysis of their step-wise induced cells and identified signaling pathways which are potentially involved in and possibly necessary for syndetome induction. They then optimized their protocol by simultaneous inhibition of BMP and Wnt signaling pathways, which led to an increase in syndetome induction while inhibiting off-target differentiation into neural lineages.

      Strengths:

      The authors conducted scRNAseq analysis of each step of their protocol from iPSCs to syndetome-like cells and employed pathway analysis to uncover further insights into somitic mesoderm (SM) and syndetome (SYN) differentiation. They found that BMP inhibition, in conjunction with the inhibition of WNT signaling, plays a role in driving syndetome differentiation. Analyzing their scRNAseq results, they could improve the syndetome induction efficiency of their protocol from 47.6% to 67%-78% while off-target differentiation into neural lineages could be reduced.

      Weaknesses:

      The authors demonstrated the efficiency of syndetome induction solely by scRNA-seq data analysis before and after pathway inhibition, without using e.g. FACS analysis or immunofluorescence (IF)-staining based assessment. A functional assessment and validation of the induced cells is also completely missing.

    4. Reviewer #3 (Public review):

      Papalamprou et al sought to fine-tune existing tenogenic differentiation protocols to develop a robust multi-step differentiation protocol to induce tendon cells from human GMP-ready iPSCs. In so doing, they found that while existing protocols are capable of driving cells towards a syndetome-like fate, the resultant cultures contain highly heterogeneous cell populations with sub-optimal cell survival. Through single-cell transcriptomic analysis they identify WNT signaling as a potential driver of an off-target neural population and show that inhibition of WNT signaling at the later two stages of differentiation can be used to promote higher efficiency of generation of syndetome-like cells.

      This paper includes a useful paradigm for identifying transcriptional modulators of cell fate during differentiation and a clear example where transcriptional data can be used to guide the chemical modulation of a differentiation protocol to improve cell output. The paper's conclusions are mostly well supported by the data, but the image analysis and discussion need to be improved to strengthen the impact.

      The data outlining the differences between the differentiation outcomes of the two tested iPSC lines are intriguing, but the authors fail to comment on potential differences between the two iPSC lines that could result in drastically different cell outputs from the same differentiation protocol. This is a critically important point, as the majority of the SCX+ cells generated from the 007i cells using their WNTi protocol were found in the FC subpopulation that failed to form from the 83i line under the same protocol. From the analysis of only these two cell lines in vitro, it is difficult to assess whether this WNTi protocol can be broadly used across multiple cell lines to generate tenogenic cells. The authors failed to update the text of the manuscript to reflect the potential differences in the two cell lines and the general applicability of their protocol, but rather just include the description of the proposed explanation in the response to reviewer comments. These critical differences in the response to their protocol and their implications for the applications of this proof-of-concept study should be included in the main text.

      The authors make claims about changes in protein expression but fail to quantify either fluorescence intensity or percent cell expression from their immunofluorescence analyses to substantiate these claims. The authors state in their response to reviewers that immunofluorescence is qualitative but continue to make quantitative statements such as upregulated or downregulated in both the text and legend describing these images. The authors should either perform the quantification of the IFs, use Western blots for protein quantification of their cell cultures, use Flow Cytometry to count cell numbers, or remove these quantitative words from the description of the images. The image quality and staining specificity continue to be a limitation of this study. These claims are not fully supported by the data as presented as it is unclear whether there is increased expression of tendon markers at the protein level or more cells surviving the protocol.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1: IPA analysis was performed after scRNA-seq. Although it is knowledge-based software with convenient graphic utilities, it is questionable whether an unbiased genome-level analysis was performed. Therefore, it is not convincing if WNT is the only and best signal for the branching-off marker. Perhaps independent approaches, such as GO, pathway, or module analyses, should be performed to validate the finding.

      Thanks for your comment. We agree with the reviewer that IPA is a knowledge-based and a hypothesis-driven method. Our hypothesis was that WNT/BMP pathways, among others, are heavily involved in the development of mesenchymal tissues in general and differentiation of tendons specifically. Therefore, we have looked at differentially expressed genes between clusters from a broad array of pathways featured in IPA that could point us towards molecular function that could make a difference. We further corroborated this hypothesis by using WNT inhibitors in subsequent experiments. To address this point, we have supplemented the discussion section with the following remark:

      “This study is not without limitations. The IPA network analysis is a knowledge-based and hypothesis-driven platform. We have specifically targeted pathways known to be involved in syndetome differentiation. However, WNT signaling stood out with very specific affinity to the off-target populations and we have verified our findings with experiments proving this hypothesis.”

      Per the reviewer’s suggestion, we also performed a non-biased GO analysis (Supp. Fig. 6). Multiple pathways were detected in the three clusters of interest (Supp. Fig. 6A-C), including integrin-related and TGFβ-related pathways. However, in these three clusters of interest, WNT signaling was also detected as a prominent pathway. Therefore, we could conclude that it plays a pivotal role in the differentiation process. This hypothesis was later corroborated with WNT inhibitor experiments.

      Comment 2: According to the method section, two iPSC lines were used for the study. However, throughout the manuscript, it is not clearly described which line was used for which experiment. Did they show similar efficiency in differentiation and in responses to WNTi? It is also worrisome if using only two lines is the norm in the stem cell field. Please provide a rationale for using only two lines, which will restrict the observation of individual-specific differential responses throughout the study.

      Thanks for your comment. This proof-of-concept study is the first investigation that compares data of an in vitro tenogenic induction protocol that has been tested in more than one human iPSC line. We agree that line-specific phenomena are difficult to interpret and reproduce. Therefore, it is critical to provide data supporting that the findings can be reproduced in more than one line. Some early studies used one line as proof of concept; however, we now recognize the need to show that the protocol works in at least one additional line.

      Here we used the GMP-ready iPSC line CS0007iCTR-n5 for all optimization experiments. This newer low passage feeder-free line was generated from PBMCs and was designated as GMP-ready in the manuscript because it has been derived and cultured using cGMP xeno-free components (mTESR plus medium and rhLaminin-521 matrix substrate instead of Matrigel). We then wanted to confirm the application of the optimized protocol using the reference control line CS83iCTR-22n1, which has already been more widely used by our group [1-5] and others [6]. This line has been derived from fibroblasts and has been grown and expanded using Matrigel™ and mTESR1, followed by mTESR plus media.

      The question of number of lines needed is stage-dependent. In our opinion at the proof-of-concept level, two lines, one of which has been generated in GMP-like conditions is sufficient. Confirmation with multiple lines becomes more pertinent as we move towards scale-up/manufacturing, where considerations regarding robustness and consistency are raised. However, at this stage, it is crucial to understand the developmental processes that are involved in cell differentiation to ensure a more robust protocol can be modified and adapted later. In future studies, as we move towards clinical translation, it is warranted that the approach presented in this work will be further optimized and subsequently evaluated using at least 3 different cell lines that have been generated from various sources.

      Comment 3: How similar are syndetome cells with or without WNTi? It would be interesting to check if there are major DEGs that differentiate these two groups of cells.

      Thanks for your comment. Single cell RNAseq analysis revealed that treatment with WNTi upregulated tenogenic markers. In SYNWNTi, the expression levels of stage-specific markers COL1A1, COL3A1, SCX, MKX, DCN, BGN, FN1, and TNMD were higher compared to the untreated SYN group, as shown in Figure 5C. Density plots depicted an increase in the number of cells expressing COL1A1, COL3A1, SCX and TNMD in SYNWNTi compared to the SYN group, as illustrated in Figure 5D. Trajectory analysis of the WNTi-treated group revealed the absence of bifurcations observed in the untreated group (Fig. 5E). Therefore, it can be concluded that syndetome cells with and without WNTi are different.

      Comment 4: Please discuss the improvement of the current study compared to previous ones (e.g., PMID 36203346, our study; 35083031, Tsutsumi; 35372337, Yoshimoto).

      Thanks for your comment. In Papalamprou et al (2023) [3], we differentiated iPSCs to mesenchymal stromal-like cells (iMSCs), which were then cultured in a 2D dynamic bioreactor for 7 days. In that study, we examined the impact of simultaneous overexpression of the tendon transcription factor Scleraxis (SCX) using a lentiviral vector and mechanical stimulation on the process of tenogenic differentiation. Following 7 days of uniaxial cyclic loading, we observed notable modifications in the morphology and cytoskeleton organization of iPSC-derived MSCs (iMSCs) overexpressing SCX. Additionally, there was an increase in extracellular matrix (ECM) deposition and alignment, along with upregulation of early and late tendon markers. This proof-of-concept study showed that iPSC-derived MSCs could be a viable cell candidate for cell therapy applications and that mechanical stimulation is contributing to the differentiation of iMSCs towards the tenogenic lineage.

      Similarly, Tsutsumi et al [7] overexpressed the tendon transcription factor Mohawk (MKX) stably in iPSC-derived MSCs using lentiviral vectors. These cells were then used to seed collagen hydrogels which were mechanically stimulated in a cyclic stretch 3D culture bioreactor for 15 days to create artificial tendon-like tissues, which the authors termed “bio-tendons”. Bio-tendons were then decellularized to remove cellular remnants from the xenogeneic human iPSC-derived cells and were subsequently transplanted in an in vivo Achilles tendon rupture mouse model. The authors reported improved histological and biomechanical properties in the Mkx-bio-tendon mice vs. the GFP-bio-tendon controls, providing another proof-of-concept study in favor of the utilization of iPSC-derived MSCs for tendon cell therapies, while also addressing the immunogenicity of cells of allogeneic/xenogeneic origin. Therefore, the above two studies used tendon transcription factor overexpression and mechanical loading either in 2D or 3D to differentiate MSCs towards the tendon/ligament lineage.

      Yoshimoto et al [8] optimized a stepwise iPSC to tenocyte induction protocol using a SCX-GFP transgenic mouse iPSC line, by monitoring GFP expression over time. The group performed scRNA-seq to characterize the induction of mesodermal progenitors towards the tenogenic lineage and to shed light on their developmental trajectory. That study unveiled that Retinoic Acid (RA) signaling activation enhanced chondrogenic differentiation, which was in contrast to the study of Kaji et al (2021), which also used a SCX-GFP mouse iPSC line. Kaji et al inhibited TGF and BMP signaling during the process of mesodermal induction and reported that RA signaling eliminated SCX induction entirely and promoted a switch to neural fate. Yoshimoto et al suggested that variations in mesodermal cell identity could be due to the different methods used for mesodermal differentiation. In contrast to the Kaji et al study, Yoshimoto et al opted to stimulate WNT and block the Hedgehog pathway during mesoderm induction. Loh et al (2016) identified the branchpoint from the primitive streak to either the paraxial mesoderm (PSM) or the lateral plate mesoderm (LPM) as the result of two mutually exclusive signaling conditions. Specifically, they reported that induction of PSM was achieved through BMP suppression and WNT stimulation, while the specification of lateral mesoderm was accomplished by BMP stimulation and WNT suppression, all with concurrent TGFβ suppression/FGF stimulation. Lastly, a similar approach towards PSM induction from primitive streak (TGF off/BMP off/WNT on/FGF on) has been used by many subsequent studies, including Matsuda et al (2020) [9], Wu et al (2021) [10] and Nakajima et al (2021) [11]. The diversity of the above-mentioned approaches points to the plasticity of mesodermal progenitors and the need for additional studies to better understand mesodermal specification and subsequent induction towards sclerotome and syndetome.

      In the current study we optimized a stepwise differentiation protocol using xeno-free cGMP-ready media and two different cell lines, one of which was cGMP-ready. We used scRNA-seq to characterize the differentiation, which led us to identify off-target cells that were closer to a neural phenotype. We performed pathway analyses and hypothesized that WNT signaling activity might have contributed to the emergence of the off-target cells. To test this, we used a WNT inhibitor (the PORCN inhibitor WNT-C59) to block WNT activity at the SCL stage and at the SYN stage. We found that blockade of WNT signaling at the end of the SM stage and during SCL and SYN induction resulted in a more homogeneous population, while eliminating the neural-like cell cluster. This is the first study that utilized scRNA-seq to shed light on the developmental trajectory of stepwise iPSC to tendon differentiation of human iPSCs and provided a proof-of-concept for the generation of a more homogeneous syndetome population. Further studies are needed to fine-tune both the process and the final product, as well as to elucidate the functionality of iPSC-derived syndetome cells in vitro and in vivo.

      Reviewer 2:

      General concerns: The authors demonstrated the efficiency of syndetome induction solely by scRNA-seq data analysis before and after pathway inhibition, without using e.g. FACS analysis or immunofluorescence (IF)-staining based assessment. A functional assessment and validation of the induced cells is also completely missing.

      We appreciate and agree with the reviewer’s critique regarding further analyses of differentiated iPSC-derived syndetome-like cells, including functional assessment of the differentiated cells. Immunofluorescence was used at all timepoints of induction for phenotype confirmation (Fig. 2,4). Flow cytometry for DLL1 was utilized to benchmark efficient differentiation to PSM (Loh et al [12]; Nakajima et al [11]). Specifically, DLL1 expression was assessed with flow cytometry after 4 days of induction, and was used to optimize the parameter of initial iPSC aggregate seeding density, which has been previously found to be crucial for in vitro differentiation protocols (Loh et al [12]). Unfortunately, this parameter is usually not reported although it could be critical to establish protocol replication between different lines.

      The function of tendon progenitors is usually reported as response to mechanical cues and the ability to regenerate tendon injuries. In future studies we intend to assess the functionality of the generated syndetome and tendon progenitors and their response to in vitro biomechanical stimulation, as previously reported for iMSCSCX+ cells [3, 13], and in vivo in a critical tendon defect, similarly to what has been previously reported [2].

      Comment 1: Notably, in Figure 1D, certain PSM markers (TBXT, MSGN1, WNT3A) show higher expression on day 3. If the authors initiate SM induction on day 3 instead of day 4, could this potentially enhance the efficiency of syndetome-like cell induction?

      Thanks for your comment. In the current work, we initially optimized differentiation to PSM via expression of DLL1, whose gene expression peaked at d4. We found that this was influenced by the initial iPSC aggregate seeding density. We wanted to generate a homogeneous DLL1+ population which we assessed via gene expression, flow cytometry, IF and scRNA-seq (Fig. 1D, 2C, 3C and Suppl. Fig.1). Given the fact that different lines might display a diverse developmental timeline, we also confirmed reproducibility of the protocol with a second cell line. We appreciate the reviewer’s suggestion to investigate additional protocol iterations, such as the proposed one at the PSM stage, as we move towards a better understanding of key developmental events during in vitro induction.

      Comment 2:  In the third paragraph of the result section the authors note, "Interestingly, SCX, a prominent tenogenic transcription factor, was significantly downregulated at the SCL stage compared to iPSC, but upregulated during the differentiation from SCL to SYN." Despite this increase, the expression level of SCX in SYN remains lower than that in iPSCs in Fig.1G and Fig.3C. Can the authors provide an explanation for this? Can the authors provide IF data using iPSCs and compare it with in vitro-induced SYN cells? Can the authors provide e.g. additional scRNA-seq data which could support this statement?

      Thank you for your comment. In Fig. 1G, SCX expression in SYN was upregulated compared to SCL; however, it was shown to be similar to iPSCs. This suggests a baseline stochastic expression of SCX possibly stemming from spontaneous differentiation of iPSCs in culture (Fig. 3C). Previous research has shown that tenogenic marker gene expression tends to reduce during postnatal tendon maturation (Yin et al., 2016b [14]; Grinstein et al., 2019 [15]). Yoshimoto et al (2022) utilized a transgenic mouse iPSC-SCX-GFP line to track SCX expression. It was shown that SCX expression peaked after 7d of tenogenic induction and was then decreased at day 14, which marked the end of tenogenic induction. The authors postulated that this pattern of gene expression could either indicate further maturation of tenocytes at subsequent time points, or that the number of non-tenogenic cells increased from T7 to T14.

      In the present work, we showed SCX gene expression upregulation in SYN compared to SCL, as well as significant upregulation of TNMD, EGR1, COL1A1 and COL3A1 (Fig. 1G). Supp. Fig. 8 has been added to show feature plots of SCX and TNMD expression from SCL, SYN and SYNWNTi. The significant upregulation of later markers of tenogenic differentiation suggests that the 21 days of tenogenic induction might have matured the cells. Since gene expression analysis only conveys a snapshot of the transcriptional profile of a cell population, it is likely that we might have missed the peak of SCX upregulation (Supp. Fig. 5). Following treatment with the WNT inhibitor, the SYNWNTi group displayed increased SCX expression (% cells expressing SCX) compared to SYN, which might also be due to a more homogeneous population of syndetome-like cells following treatment with WNTi. In the SYNWNTi group, TNMD was shown to be expressed in the SYN cluster, whereas SCX was mostly found in the cluster that was labelled as fibrocartilage (FC) cluster based on the expression of COL2A1/SOX9/FN1/BGN/COL1A1 markers. Due to the fact that SCX+/SOX9+ progenitor cells are able to give rise to both tendon and cartilage (Sugimoto 2013) [16], it could be postulated that this cluster contains tendon progenitors. Interestingly, the FC cluster was not observed in the second iPSC line that we tested, which resulted in a more homogeneous induction to syndetome (78.5% vs. 66.9% SYN cells, Supp. Table 1 & Supp. Fig. 3). This slight discrepancy between the two lines and more specifically the presence of the FC cluster only in the 007i line, warrants further investigation. Taken together, these data indicate that the tenogenic induction duration could likely be shortened. Further work to assess the time course of SCX expression over the entire tenogenic induction could be used to further optimize the in vitro induction. For instance, an edited human iPSCSCX-GFP+ line could be generated and used to track SCX expression during the entire induction.

      Comment 3: In the fourth paragraph of the result section the authors state, "SM markers (MEOX1, PAX3) and SCL markers (PAX1, PAX9, NKX3.2, SOX9) were upregulated in a stepwise manner." However, the data for MEOX1 and NKX3.2 seems to be missing from Figure 3B-C. The authors should provide this data and/or additional support for their claim.

      Thanks for your comment. Feature plots for MEOX1 and NKX3.2 have been added to the Supplemental information (Supp. Fig. 9).

      Comment 4: In Figures 2B and 2E, the background of the red channel seems extremely high. Are there better images available, particularly for MEOX1? Given the expected high expression of MEOX1 in SM cells, the authors should observe a strong signal in the nucleus of the stained somitic mesoderm-like cells, but that is not the case in the shown figure. The authors should provide separate channel images instead of merged ones for clarity. The antibody which the authors used might not be specific. Can the authors provide images using an antibody which has been shown to work previously e.g. antibody by ATLAS (Cat#: HPA045214)?

      As requested by the reviewer, we have provided separate channels for those images in the Supplement (Supp. Fig. 7). The images show relatively high expression of these markers in SM cells.

      Comment 5: In Fig. 2C and Supplementary Fig. 1, the authors present data from immunofluorescence (IF) staining and FACS analysis using a DLL1 antibody. While FACS analysis indicates an efficiency of 96.2% for DLL1+ cells, this was not clearly observed in their IF data. How can the authors explain this discrepancy? Could the authors quantify their IF data and compare it with the corresponding FACS data?

      Thanks for your comment. We performed flow cytometric analysis of DLL1 expression to optimize cell seeding density using the 007i line. In the present study, we used IF only in a qualitative manner, that is to confirm protein expression of selected markers. It could be noted that the use of poly-lysine coated coverslips, which are needed for IF, might have slightly altered the density of the cells on the coverslip vs. the plate. Lastly, it cannot be ruled out that the different substrate could have influenced their phenotype differentially through matrix interactions and signaling. On the other hand, flow cytometry by nature is a quantitative and single cell approach, whereas IF staining is qualitative. Therefore, for the purpose of this proof-of-concept work, we tend to trust the quantitative data from the flow cytometry results more than semi-quantitative confirmation achieved through IF staining using coverslips. 

      Comment 6: In Fig. 2G, PAX9 is expected to be expressed in the nucleus, but the shown IF staining does not appear to be localized to the nucleus. Could the authors provide improved or alternative images to clarify this? The authors should use antibodies shown to work with high specificity as already reported by other groups.

      Thanks for your comment. Indeed, the staining seems to be mostly cytoplasmic. We have used antibodies that were previously reported [3] and repeated the staining; however, the same results were replicated. We can speculate that this transcription factor has an additional role in the iPSC-derived cells and might be traveling to the cytoplasm. Unfortunately, we have no evidence for this phenomenon.

      Comment 7: Why did the authors choose to display day 10 data for SYN induction in Fig. 4A? Could they provide information about the endpoint of their culture at day 21?

      Thank you for your comment. In Fig. 1G we provided gene expression analyses results for several selected early and later tendon markers for the endpoint of our culture, that is day 21. Following scRNA-seq at each stage of the differentiation (iPSC at d0, PSM at d4, SM at d8, SCL at d11 and the endpoint day 32 for SYN), we performed DEG analysis using the IPA platform. We identified activation of genes associated with the WNT signaling pathway in the off-target clusters. We hypothesized that WNT pathway inhibition might block the formation of unwanted fates and induce a more homogeneous differentiation outcome. We thus tested a WNT inhibitor and compared the inhibitor-treated group with a non-treated group. We then assessed selected neural markers during the course of the inhibitor application. In Fig. 4A we presented gene expression of key selected markers at day 21 using qPCR, which was approximately in the middle of the syndetome induction. Since we observed that the inhibitor downregulated the selected neural markers, we then applied the inhibitor until the endpoint of the initial induction and proceeded to analyze the results using scRNA-seq (Fig. 5). Lastly, it should be acknowledged that this was a proof-of-concept study, and additional optimizations are needed regarding the application of the inhibitor (timing, duration, concentration, etc).

      Comment 8: In Supplementary Fig. 5, the authors depicted the expression level of SCX, a SYN marker, which peaked at day 14 and then decreased. By day 21, it reached a level comparable to that of iPSCs. Given this observation, could the authors provide a characterization of the cells at day 21 during SYN induction using IF? What was the rationale behind selecting 21 days for SYN induction? The authors also need to show 'n numbers'; how many times were the experiments repeated independently (independent experiments)?

      Thanks for your comment. During the optimization process, we initially used RT-qPCR to track gene expression of selected tenogenic markers using the 007i line. We found that after 21 days of tenogenic induction there was upregulation of the few established tendon markers, that is COL1A1, COL3A1, EGR1 and quite importantly, the more definitive later tendon marker, TNMD. Thus, we decided to proceed with this protocol prior to testing other compounds including the WNT inhibitor WNT-C59. However, as has been discussed in the manuscript, this extended tenogenic induction resulted in cell attrition without the application of the WNT inhibitor. This phenomenon was ameliorated following WNT inhibition. Thus, it could be postulated that the protocol could be further optimized by shortening tenogenic induction to less than 21 days.

      The experiments that were conducted to optimize the differentiation process were repeated independently at least n=3 times using qPCR and IF using two lines, that is the 007i and the 83i line as described in the manuscript. The scRNAseq analysis represents a population of cells from in vitro differentiation that originated from the same donor line, therefore it was performed on n=1 sample at each stage. However, the effects of inhibitor application (sample SYNWNTi) were also confirmed using a second cell line (83i), thus a total of n=2 independent samples were analyzed.  

      Comment 9: Overall the shown immunofluorescence (IF) data does not appear convincing. Could the authors please provide clearer images, including separate channel images, a bright field image, and magnified views of each staining?

      Thanks for your comment. Separate channel images were added to the supplemental data (Supp. Fig. 7). We agree with the reviewer regarding the limitations of IF staining, especially with the added confounding factor of using poly-lysine coated coverslips. We would like to point out that in the current work IF staining is not the main finding or the primary outcome measure; it is used only to further support the differentiation by providing a qualitative assessment of protein presence and localization. We describe in this paper our thesis regarding the limitations of IF and the need for more high-throughput unbiased approaches to quantification when using IF staining. For instance, spatial transcriptomics combined with mass cytometry or flow cytometry could be used for a more unbiased approach. Thus, in the present manuscript we based our conclusion on the quantitative gene expression, single cell sequencing and flow cytometry.

      Comment 10: As stated by the authors in the manuscript, another research group performed FACS analysis to assess the efficiency of syndetome induction using SCX antibody, and/or quantification of immunofluorescence (IF) with SCX, MKX, COL1A1, or COL2A1 antibodies. Could the authors conduct a comparative analysis of syndetome induction efficiency both before and after protocol optimization, utilizing FACS analysis in conjunction with an SCX reporter line or antibody staining, e.g. quantifying induction efficiency via immunofluorescence (IF) staining with syndetome-specific marker genes?

      Thank you for your comment. As discussed in a previous comment, we agree with the reviewer that the generation of a human iPSC-SCX-GFP line would shed light on SCX expression over the entire course of induction. In the current work we used IF as qualitative confirmation of specific marker expression and we showed the presence of SCX, MKX, COL1 and COL3 in SYNWNTi as well as the absence of neuronal markers. As we also pointed out in the present manuscript, IF can only be considered a semi-quantitative assessment, burdened with several technical limitations as well as operator bias and lower sensitivity and accuracy compared to flow cytometry or scRNA-seq, unless performed in a more unbiased manner. To further clarify this point, firstly, using poly-lysine coated coverslips for IF staining results in a different substrate environment compared to the Geltrex-coated plates that were used for the induction. Additionally, we noticed that cells grew overconfluent at the edges of the coverslips. This is an important point since, as we have observed in this work, seeding density is critical for the reproducibility of the protocol. It could further be postulated that a different cell substrate stiffness might also have an effect on this process. In our opinion, in this context IF should rather be used qualitatively and a combination of flow cytometry with scRNAseq should be utilized to draw quantitative conclusions such as induction efficiencies of a certain cell type. Since we also observed inconsistencies with the SCX antibodies we tested, the generation of edited human iPSC lines (such as SCX-GFP, MKX-GFP and TNMD-GFP) would be the preferred approach to further explore the efficiency of differentiation.

      Comment 11: To enhance the paper's significance, the authors should conduct functional validation experiments and proper assessment of their induced syndetome-like cells. They could perform e.g. xeno-transplantation experiments with syndetome cells into SCID-mice or injury models. They could also assess whether the in vitro induced cells could be applied for in vitro tendon/ligament formation.

      Thanks for your comment. For the purpose of this proof-of-concept in vitro study, our primary goal was to initially evaluate a stepwise tenogenic induction protocol using GMP-ready cell lines and chemically defined media. Then, we wanted to utilize the analytical power of scRNA-seq in order to characterize and optimize the protocol, thus focusing on one developmental stage that is not well understood, that of syndetome specification from sclerotome, and hypothesized that by fine-tuning the WNT pathway we would be able to generate a more homogeneous syndetome cell population. We fully agree with the reviewer that the warranted next steps should be to conduct several functional validation experiments, such as in vitro 2D/3D tendon/ligament formation and in vivo transplantation in allogeneic or xenogeneic injury models.

      Comment 12: The authors should also compare their scRNA-seq data with actual human embryo data sets, something which could be done given the recent increase in available human embryo scRNA-seq data sets.

      This is a great idea and intriguing study. Unfortunately, not all data sets are available at the moment and specifically embryonic and MSK scRNA-seq data is very scarce, although growing. We have no access to data sets from human tendon development, and thus will have to leave this comparison for future studies.

      Reviewer 3:

      Comment 1: The data outlining the differences between the differentiation outcome of the two tested iPSCs is intriguing, but the authors fail to comment on potential differences between the two iPSC lines that could result in drastically different cell outputs from the same differentiation protocol. This is a critically important point, as the majority of the SCX+ cells generated from the 007i cells using their WNTi protocol were found in the FC subpopulation that failed to form from the 83i line under the same protocol. From the analysis of only these 2 cell lines in vitro, it is difficult to assess whether this WNTi protocol can be broadly used to generate tenogenic cells.

      Thanks for your comment. This proof-of-concept study is the first investigation that compares data from an in vitro tenogenic induction protocol that has been tested in more than one cell line. Using unsupervised clustering we identified 11 clusters, which were classified into 6 cell subpopulations. The only observed difference between the two lines was a small subset that was labeled as fibrocartilage (FC), which displayed expression of both tenogenic and chondrogenic markers. This subpopulation was observed in the 007i line but not in the 83i line at the end of the SYN induction. Importantly, DEG analysis also showed that it was enriched for SCX. It has been shown that SCX+/SOX9+ progenitors are a distinct multipotent cell group, responsible for the development of SCX−/SOX9+ chondrocytes and SCX+/SOX9− tenocytes/ligamentocytes (Sugimoto 2013) (16). As noted in a previous comment (Comment 2 from Reviewer 1), we might have missed SCX upregulation during the 21-day syndetome induction. This can be further supported by the Fig. 5E trajectory analysis, which shows that this subpopulation (FC) precedes the SYN cell subpopulation. The fact that this subpopulation was present in one line but not the other might indicate that the 83i line resulted in a more mature tendon population. Therefore, we would rather posit that in the case of the 83i line, it might not be that the FC subpopulation failed to form, but rather that it was missed in our scRNAseq endpoint analysis, which showed that a more homogeneous SYN population was formed (8.7 % in 007i vs. 0.26 % in 83i, Supp. Table 1 & Supp. Fig. 3B). Future studies are warranted to characterize the SYN induction timeline as it pertains to SCX expression followed up by maturation from tenogenic progenitor to tenocytes.

      Comment 2: The authors make claims to changes in protein expression but fail to quantify either fluorescence intensity or percent cell expression from their immunofluorescence analyses to substantiate these claims. These claims are not fully supported by the data as presented as it is unclear whether there is increased expression of tendon markers at the protein level or more cells surviving the protocol. Additionally, in images where 3 channels are merged, it would be helpful to show individual channels where genes are shown in similar spectra (ie. Fig 2I SCX/MKX). Furthermore, the current layout and labelling scheme of Figure 4 makes it very difficult to compare conditions between SYN and SYNWNTi protocols.

      Thanks for your comment. Protein expression at each stage was verified with immunofluorescence cytochemistry whereby cells were cultured onto poly-lysine coated coverslips, which were then fixed, stained and imaged (Fig. 2). However, prior to WNT inhibitor application, we noticed gradual cell attrition in the cultures at the end of differentiation (Fig. 1B, 2I). The images show qualitative differences with and without the WNT inhibitor. This could be attributed to the heterogeneity of the cell population at SCL stage, which was confirmed by scRNA-seq (Fig. 3A). As it has been discussed previously (Reviewer 2 comments 5 & 9), in the current paper we didn’t provide any IF quantitative analysis because of the qualitative nature of the staining technique. In future work another high-resolution imaging modality will be considered like single cell proteomics and flow cytometry or mass cytometry in order to perform a more unbiased quantitative single cell analysis across different stages and samples. Furthermore, we have added single channel images in the supplemental information.

      Comment 3: Individual data points should also be presented for all qPCR experiments (ie. Fig 4A). Biological replicate information is missing from several experiments, particularly the immunofluorescence data, and it is unclear whether the qPCR data was generated from technical or biological replicates.

      Thanks for your comment. We have added additional information regarding replicates in each figure legend. We have also changed Fig. 4A.

      (1) Glaeser JD, Bao X, Kaneda G, et al. iPSC-neural crest derived cells embedded in 3D printable bio-ink promote cranial bone defect repair. Sci Rep. Nov 4 2022;12(1):18701. https://www.ncbi.nlm.nih.gov/pubmed/36333414

      (2) Kaneda G, Chan JL, Castaneda CM, et al. iPSC-derived tenocytes seeded on microgrooved 3D printed scaffolds for Achilles tendon regeneration. J Orthop Res. Oct 2023;41(10):2205-2220. https://www.ncbi.nlm.nih.gov/pubmed/36961351

      (3) Papalamprou A, Yu V, Chen A, et al. Directing iPSC differentiation into iTenocytes using combined scleraxis overexpression and cyclic loading. J Orthop Res. Jun 2023;41(6):1148-1161. https://www.ncbi.nlm.nih.gov/pubmed/36203346

      (4) Sheyn D, Ben-David S, Tawackoli W, et al. Human iPSCs can be differentiated into notochordal cells that reduce intervertebral disc degeneration in a porcine model. Theranostics. 2019;9(25):7506-7524. https://www.ncbi.nlm.nih.gov/pubmed/31695783

      (5) Später T, Kaneda G, Chavez M, et al. Retention of Human iPSC-Derived or Primary Cells Following Xenotransplantation into Rat Immune-Privileged Sites. Bioengineering. 2023;10(9):1049. https://www.mdpi.com/2306-5354/10/9/1049

      (6) Sareen D, O'Rourke JG, Meera P, et al. Targeting RNA foci in iPSC-derived motor neurons from ALS patients with a C9ORF72 repeat expansion. Sci Transl Med. Oct 23 2013;5(208):208ra149. https://www.ncbi.nlm.nih.gov/pubmed/24154603

      (7) Tsutsumi H, Kurimoto R, Nakamichi R, et al. Generation of a tendon-like tissue from human iPS cells. J Tissue Eng. Jan-Dec 2022;13:20417314221074018. https://www.ncbi.nlm.nih.gov/pubmed/35083031

      (8) Yoshimoto Y, Uezumi A, Ikemoto-Uezumi M, et al. Tenogenic Induction From Induced Pluripotent Stem Cells Unveils the Trajectory Towards Tenocyte Differentiation. Front Cell Dev Biol. 2022;10:780038. https://www.ncbi.nlm.nih.gov/pubmed/35372337

      (9) Matsuda M, Yamanaka Y, Uemura M, et al. Recapitulating the human segmentation clock with pluripotent stem cells. Nature. Apr 2020;580(7801):124-129. https://www.ncbi.nlm.nih.gov/pubmed/32238941

      (10) Wu CL, Dicks A, Steward N, et al. Single cell transcriptomic analysis of human pluripotent stem cell chondrogenesis. Nat Commun. Jan 13 2021;12(1):362. https://www.ncbi.nlm.nih.gov/pubmed/33441552

      (11) Nakajima T, Nakahata A, Yamada N, et al. Grafting of iPS cell-derived tenocytes promotes motor function recovery after Achilles tendon rupture. Nat Commun. Aug 18 2021;12(1):5012. https://www.ncbi.nlm.nih.gov/pubmed/34408142

      (12) Loh KM, Chen A, Koh PW, et al. Mapping the Pairwise Choices Leading from Pluripotency to Human Bone, Heart, and Other Mesoderm Cell Types. Cell. Jul 14 2016;166(2):451-467. https://www.ncbi.nlm.nih.gov/pubmed/27419872

      (13) Yu V, Papalamprou A, Sheyn D. Generation of Induced Pluripotent Stem Cell-Derived iTenocytes via Combined Scleraxis Overexpression and 2D Uniaxial Tension. JoVE. 2024/03/01 2024(205):e65837. https://app.jove.com/65837

      (14) Yin Z, Hu JJ, Yang L, et al. Single-cell analysis reveals a nestin(+) tendon stem/progenitor cell population with strong tenogenic potentiality. Sci Adv. Nov 2016;2(11):e1600874. https://www.ncbi.nlm.nih.gov/pubmed/28138519

      (15) Grinstein M, Dingwall HL, O'Connor LD, Zou K, Capellini TD, Galloway JL. A distinct transition from cell growth to physiological homeostasis in the tendon. Elife. Sep 19 2019;8. https://www.ncbi.nlm.nih.gov/pubmed/31535975

      (16) Sugimoto Y, Takimoto A, Akiyama H, et al. Scx+/Sox9+ progenitors contribute to the establishment of the junction between cartilage and tendon/ligament. Development. Jun 2013;140(11):2280-2288. https://www.ncbi.nlm.nih.gov/pubmed/23615282

    1. eLife Assessment

      The study describes a link between beta-amyloid monomers, regulation of microglial activity and assembly of neocortex during development. It brings valuable findings that have theoretical and practical implications in the field of neuronal migration, neuronal ectopia and type II lissencephaly. Unfortunately, the evidence is incomplete and the manuscript would benefit from additional experiments to clarify the relationship between Ric8a and APP and bolster the findings.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aim to elucidate the mechanisms that regulate the immune response under physiological conditions during cortical development. To achieve this goal, the authors used a wide range of mutant mice to analyse the consequences of immune activation for the formation of cortical ectopia in mice.

      Strengths:

      The authors demonstrated that Abeta monomers are anti-inflammatory and inhibit microglial activation. This is a novel result that demonstrates the physiological role of APP in cortical development.

      The current manuscript has been slightly improved by additional experiments and editing of the text (many of the suggestions of the reviewers have not been included). However, the evidence supporting the conclusions of the study is still very weak and inconsistent.

      Remaining weaknesses:

      -There is no evidence that microglia express Emx1. The paper they referred to (Zhang et al., 2014) was performed in adult mice, so it is not comparable. Moreover, many other papers report that Emx1 is not expressed in microglia. Line 175: a change in cytokine expression is not strong evidence that Emx1 is expressed in microglia. Fig. S8: It is not clear whether the staining was performed on neuronal primary culture or on cortical sections. It is also unclear why there is a partial reduction of Ric8a mRNA levels in Emx1-Ric8a cKO and not a complete deletion.

      -NestinCre and Emx1Cre mouse models target the same types of cells in the developing cortex (cortical progenitors, glutamatergic neurons and astrocytes), but with a one-day difference in expression (Emx1 E9.5 and Nestin E10.5). In fact, previous studies using the same approach (Nestin-Ric8a cKO) found ectopias in the cortex, which is more in line with the results of Emx1-Ric8a cKO shown in the current study. There is no evidence to assume that ric8a deficiency in neural cell lineages is not responsible for basement membrane degradation and ectopia formation in ric8a mutants.

      -Additional experiments should be performed to demonstrate that ectopia formation in Emx1-ric8a cKO mutant mice is due to an increase in immune stimulation and not a cell-autonomous effect. Using double cx3cr1-cre and nestin-cre ric8a mutant mice is not an argument to say that elevated immune activation of ric8a deficient microglia during cortical development is responsible for ectopia formation (line 2012-2013)

      -The similarities between Ric8a cKO and APP cKO mice are not enough evidence to claim that APP and Ric8a are involved in the same anti-inflammatory pathway in microglia.

      -Gel zymography is not the same as Western blot. For the quantification of the relative amount of protein, authors should use western blot and not immunofluorescence intensity as shown in Fig. 5g, h. For western blot, you also load the same amount of protein but you have to normalize your samples with a control protein.

      -The graph of BrdU cell distribution in the mutant mice (Fig. S1 F) shows that there are more BrdU cells in bins 5-7 and less in bin 9, indicating an impaired migration of upper cortical neurons in the mutant mice. The authors claimed there are no differences in migration in the result section but the figure showed significant differences. Panels E, F in Fig S2 show the density of Cux1 and Ctip2 cells per area indicating no changes in the generation of upper and lower cortical neurons, but no information about the migration as authors claimed (lines 117-118). (what is the field for Ctip2 counting?). These experiments cannot rule out the possibility of cell-autonomous effect of Ric8a deletion in glutamatergic neurons or radial glial cells.

    3. Reviewer #2 (Public review):

      Kwon et al. used several conditional KO mice for the deletion of ric8a or app in different cell types. Some of them exhibited pial basement membrane breaches leading to neuronal ectopia in the neocortex.

      I am glad to see that the authors performed some of the requested controls.

      However, a huge problem with this manuscript which has been highlighted in the reviewer's comments but not corrected by the authors, is the claim that "A novel monomeric amyloid beta-activated signaling pathway regulates brain development". They do not have any proof that Abeta is the activating signal in vivo. Whatever they showed in vitro should be confirmed in vivo to make such a strong claim. The authors even recognized it in their responses to reviewers: "we currently do not have evidence that in the developing cortex Abeta monomers play a role in inhibiting microglia". Therefore, their title is misleading, not supported by the data, and must be changed to reflect accurately the results. Maybe something like "Involvement of microglia in the formation of cortical ectopia".

      The abstract is also misleading and must be changed. The abstract is mostly about Abeta, pretending that this is the key part of their findings while they only provide a few in vitro experiments but nothing in vivo.

      This is such a bad way to summarize their data. Most of their in vivo data is about Ric8a, then a smaller in vivo part about APP and nothing about Abeta in vivo. But the title "novel monomeric amyloid beta-activated signaling pathway regulates brain development via inhibition of microglia" only mentions Abeta. And 90% of the abstract focuses on Abeta.

      The first half of the introduction is about Abeta. Why would they focus their paper on Abeta when they basically have only one figure with in vitro data? This is so deceptive.

      It seems that these authors do not fully understand the importance of having their claims supported by solid data.

      (1) The authors did not show in vivo data supporting that Abeta monomers are the key players here.

      (2) The authors did not show in vivo data supporting the cytokine secretion data provided in vitro in a model system. They claim that it is not technically feasible to extract the extracellular (secreted) fractions of cytokines from an embryonic brain without causing cell lysis and the release of the intracellular pool. But how about RT-qPCR? After all, they showed that the pathway affects the transcription of several cytokines in microglia in vitro.

      (3) The authors did not provide a control experiment to show that the insult induced by LPS injection does not induce the phenotype in the ric8a-foxg1-cre mice.

      (4) They did not agree to verify the monomer state of their Abeta monomer preparation, even after addition to the culture medium. Abeta has a strong tendency to polymerize. However, because the authors added the requested result with Abeta polymers, which gave a different outcome, it is OK with me if they don't do it.

      (5) The app-cx3cr1-cre +LPS animals show ectopia in only subsets of mutants and in most cases only in one of the hemispheres. Experiments examining potential changes in MMP9 are therefore difficult and were not done.

      I don't mind the inability to perform all the suggestions from the reviewers but it is then necessary to tone down or remove the claims that are not supported by the data.

      This kind of issue appears several times later in the text too:

      (1) At the end of the introduction "we found that APP and Ric8a form a pathway in microglia that is specifically activated by the monomeric form of Abeta and that this pathway normally inhibits the transcriptional and post-transcriptional expression of immune cytokines by microglia". Data from Abeta and cytokines are only in vitro, so it has to be specified.

      (2) Line 282: "Thus, these results indicate that monomeric Abeta possesses a previously unreported anti-inflammatory activity against microglia that strongly inhibits microglial inflammatory activation". Specify in vitro!

      (3) Line 322: "We have shown that heightened microglial activation due to mutation in the Abeta monomer-activated APP/Ric8a pathway results in basement membrane degradation and ectopia during cortical development." This is an overstatement. They did not show that Abeta monomers activate the pathway in vivo.

      (4) Line 332: "Thus, these results indicate that excessive inflammatory activation of microglia is responsible for ectopia formation in ric8a mutants." This is incorrect. Inhibition of Akt or stat3 does much more than just being pro-inflammatory. This could directly affect migration. The data only show that Akt and/or Stat3 might be involved.

      (5) Line 355: "these results indicate this Abeta monomer-regulated anti-inflammatory pathway normally promotes cortical development through suppressing microglial activation and MMP induction.". Another overstatement. There is no proof that Abeta is involved in vivo.

      (6) Line 362: "In this article, we have identified a novel microglial anti-inflammatory pathway activated by monomeric Abeta that inhibits microglial cytokine expression and plays essential roles in the normal development of the cerebral cortex". Another overstatement. There is no proof that Abeta is involved in vivo.

      (7) Line 365: "this pathway is mediated by APP and the heterotrimeric G protein GEF and molecular chaperone Ric8a in microglia and its activation leads to..." They should mention that its activation was in vitro.

      (8) Line 387: "In this study, we have shown that immune over-activation of microglia deficient in a monomeric Ab-regulated pathway results in excessive cortical matrix proteinase activation, leading basement membrane degradation and neuronal ectopia." Another overstatement. There is no support to claim that Abeta is involved in vivo. The immune overactivation was not shown in vivo but only in vitro in a model system that does not even correctly reflect what is happening in vivo, due to chronic immune stimulation during in vitro culture.

      (9) Line 396: "we have also shown that the anti-inflammatory regulation of microglia in corticogenesis depends on a pathway composed of APP and the heterotrimeric G protein regulator Ric8a." Overstatement. They only showed the anti-inflammatory regulation in vitro and not during corticogenesis.

      It is just a matter of rewriting the title, abstract and text in an honest way, in order to make sure that every claim is supported by the data and in some cases acknowledge the weakness of the provided data and describe the multiple interpretations that could be drawn from them.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors aim to elucidate the mechanisms that regulate the immune response under physiological conditions during cortical development. To achieve this goal, the authors used a wide range of mutant mice to analyse the consequences of immune activation for the formation of cortical ectopia in mice.

      Strengths:

      The authors demonstrated that Abeta monomers are anti-inflammatory and inhibit microglial activation. This is a novel result that demonstrates the physiological role of APP in cortical development.

      Weaknesses:

      -On the other hand, cortical ectopia has been already described in mouse models in which the amyloid signalling has been disrupted (Herms et al., 2004; Guenette et al., 2006), making the current study less novel.

      We agree these previous studies have implicated amyloid precursor protein in cortical ectopia. However, since these studies use whole-body knockouts, they have not implicated the functional roles of specific cell types.  Nor have they identified the specific mechanisms underlying the formation of this unique class of cortical ectopia. In contrast, our studies show that the disruption of a novel Abeta-regulated signaling pathway in microglia is the primary cause of ectopia formation in this class of ectopia mutants. This is the first time that microglia have been specifically implicated in the development of cortical ectopia. We further show that elevated MMP activity and resulting cortical basement membrane degradation is the underlying mechanism leading to ectopia formation.  This is also the first time that MMP activity and basement membrane degradation (instead of maintenance) have been implicated in cortical ectopia development. As such, our results have provided novel insights into the diverse mechanisms underlying cortical ectopia formation in developmental brain disorders.

      One of the molecules analysed is Ric8a, a GTPase activator involved in neuronal development. Authors used the conditional mutant mice Emx1-Ric8a to delete Ric8a from early progenitors and glutamatergic neurons in the pallium. Emx1-Ric8a mutant mice present cortical ectopias and authors attributed this malformation to the increase in inflammatory response due to Ric8a deletion in microglia. Several discordances do not fit this interpretation:

      - The role of Ric8a in cortical development and function has been already described in several papers, but none of them has been cited in the current manuscript (Kask et al., 2015, 2018; Ruisu et al., 2013; Tonissoo et al., 2006).

      We have included references to the published works on ric8a in cortical development in the revision.

      - Ectopia formation in the cortex has been already described in Nestin-Ric8a cKO mice (Kask et al., 2015). In the current manuscript, authors analyzed the same mutant mice (Nestin-Ric8a), but they did not detect any ectopia. Authors should discuss this discordance.

      The expression pattern of nestin-cre is known to vary depending on factors including transgene insertion site, genetic background, and sex. Early studies show, for example, that the nestin gene promoter drives cre expression in many non-neural tissues in another transgenic line in the FVB/N genetic background (Dubois et al., Genesis. 2006 Aug;44(8):355-60. doi: 10.1002/dvg.20226). The specific nestin-cre line used in Kask et al. 2015 has also been shown to be active in brain microglia and to lead to increased microglial pro-inflammatory activity upon breeding to a conditional allele of a cholesterol transporter gene (Karasinska et al., Neurobiol Dis. 2013 Jun:54:445-55; Karasinska et al., J Neurosci. 2009 Mar 18; 29(11): 3579–3589; Takampri et al., Brain Res. 2009 May 13:1270:10-8). These factors may in part underlie the apparent discrepancy. We have now incorporated this discussion into the revision.

      - Authors claim that microglia express Emx1, and therefore, Ric8a is deleted in microglia cells. However, the arguments for this assumption are very weak and the evidence suggests that this is not the case. This is an important point considering that authors want to emphasise the role of Ric8a in microglia activation, and therefore, additional experiments should demonstrate that Ric8a is deleted in microglia in Emx1-Ric8a mutant mice.

      We have observed altered mRNA expression of several genes in purified microglia cultured from the emx1-cre mutants (Supplemental Fig. 8), which indicates that ric8a is deleted from microglia and suggests a role of microglial ric8a deficiency in ectopia formation.  This interpretation is further strengthened by the observation that deletion of ric8a from microglia using a microglia-specific cx3cr1-cre results in similar ectopia (Fig. 2). We also have other data supporting this interpretation, including data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a gene expression in microglia cells isolated from emx1-cre mutants. These data have now been incorporated into the text and in revised Supplemental Fig. 8 (new panels c-c” & d).

      Reviewer #2 (Public Review):

      Kwon et al. used several conditional KO mice for the deletion of ric8a or app in different cell types. Some of them exhibited pial basement membrane breaches leading to neuronal ectopia in the neocortex.

      They first investigated ric8a, a Guanine Nucleotide Exchange Factor for Heterotrimeric G Proteins. They observed the above-mentioned phenotype when ric8a is deleted from microglia and neural cells (ric8a-emx1-cre or dual deletion with cre combination cx3cr1 (in microglia) and nestin (in neural cells)) but not in microglia alone or neural cells alone (whether it is in CR cells (ric8a-Wnt3a-cre), post-mitotic neurons (nex-cre or dlx5/6-cre), or in progenitors and their progeny (nestin-cre or foxg1-cre)). They also show that ric8a KO mutant microglia cells stimulated in vitro by LPS exhibit an increased TNFa, IL6 and IL1b secretion compared to controls (Fig 2). They therefore injected LPS in vivo and observed the neuronal ectopia phenotype in the ric8a-cx3cr1-cre (microglial deletion) cortices at P0 (Fig 2). They suggest that ric8a KO in neuronal cells mimics immune stimulation (but we have no clue how ric8a KO in neural cells would induce immune stimulation).

      We agree we do not currently know the precise mechanisms by which mutant microglia are activated in the mutant brain.  However, this does not affect the conclusion that deficiency in the Abeta monomer-regulated APP/Ric8a pathway in microglia is the primary cause of cortical ectopia in these mutants, since we have shown that genetic disruption of this pathway in microglia alone by targeting different pathway components, using cell type specific cre, in several different approaches, all results in similar cortical ectopia phenotypes.  Regarding the source of the immunogens, there are several possibilities which we plan to investigate in future studies. For example, the clearance of apoptotic cells and associated cellular debris is an important physiological process and deficits in this process have been linked to inflammatory diseases throughout life (Doran et al., Nat Rev Immunol. 2020 Apr;20(4):254-267; Boada-Romero et al., Nat Rev Mol Cell Biol. 2020 Jul;21(7):398-414.).  In the embryonic cortex, studies have shown that large numbers of cell death take place starting as early as E12 (Blaschke et al., Development. 1996 Apr;122(4):1165-74; Blaschke et al., J Comp Neurol. 1998 Jun 22;396(1):39-50).  Studies have also shown that radial glia and neuronal progenitors play critical roles in the clearance of apoptotic cells and associated cellular debris in the brain (Lu et al., Nat Cell Biol. 2011 Jul 31;13(9):1076-83; Ginisty et al., Stem Cells. 2015 Feb;33(2):515-25; Amaya et al., J Comp Neurol. 2015 Feb 1;523(2):183-96). Moreover, Ric8a-dependent heterotrimeric G proteins have been found to specifically promote the phagocytic activity of both professional and non-professional phagocytic cells (Billings et al., Sci Signal. 2016 Feb 2;9(413):ra14; Preissler et al., Glia. 2015 Feb;63(2):206-15; Pan et al. Dev Cell. 2016 Feb 22;36(4):428-39; Flak et al. J Clin Invest. 2020 Jan 2;130(1):359-373; Zhang et al., Nat Commun. 2023 Sep 14;14(1):5706).  
Thus, it is probable that the failure of mutant radial glia to promptly clear apoptotic cells and debris may play a role in triggering mutant microglial activation in ric8a-emx1-cre mutants. We have now included these possibilities in the text of the revised manuscript. The precise mechanisms remain to be determined in future studies, but this does not affect the conclusion of the current study.

      The authors then turned their attention on APP. They observed neuronal ectopia into the marginal zone when APP is deleted in microglia (app-cxcr3-cre) + intraperitoneal LPS injection (they did not show it, but we have to assume there would not be a phenotype without the injection of LPS) (Fig 3). (The phenotype is similar but not identical to ric8a-cx3cr1-cre + LPS. They suggest that the reason is because they had to inject 3 times less LPS due to enhanced immune sensitivity in this genetic background but it is only a hypothesis). After in vitro stimulation by LPS, app mutant microglia show a reduced secretion of TNFa and IL6 but not IL1b (this is the opposite to ric8a-cx3cr1-cre microglia cells) while peritoneal macrophages in culture show increased secretion of TNFa, IL1, IL6 and IL23 (fig 3 and Suppl. Fig 9).

      We have data showing that app-cxcr3-cre mutants without LPS injection do not show ectopia, which has now been included in the revised supplemental Fig. 9 (new panels c-d).  The reason we employ LPS injection is, in the first place, that we do not see a phenotype without the injection. We agree, and have also stated in the text, that the phenotype of the app mutants is not as severe as that of the ric8a mutant.  Besides the low LPS dosage used, we also suggest that other app family members may compensate, since the ectopia in the app family gene mutants reported previously was only observed in app/aplp1/2 triple knockouts, not even in any of the double knockouts (Herms et al., 2004). We have further clarified this point in the text. These possibilities are also not mutually exclusive. Nonetheless, the results clearly show that microglia-specific app mutation causes cortical ectopia upon embryonic immune stimulation. They have thus implicated a specific role of microglial APP in cortical ectopia formation.

      The different response of ric8a and app mutant microglia to LPS results from in vitro culturing of microglia. We have shown that, when acutely isolated macrophages are used, these mutants show changes in the same direction (both increased cytokine secretion) (Fig. 4).  This demonstrates that, without culturing, app mutant microglial lineage cells indeed behave in the same way as ric8a mutant cells.

      The microglia used for analysis in in vitro assays in this study have all been cultured for two weeks before assay. They have thus been under chronic stimulation, exposed to dead cells and debris in the culture dish throughout this period.  Previous studies have shown that, depending on the degree of perturbation to the inflammation-regulating pathways, such exposures can differentially affect microglial cytokine expression, sometimes in the opposite direction from that expected.  For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression (as is expected), trem2-/- (null) microglia under the same conditions not only do not show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177).  In several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  Indeed, the APP cytoplasmic domain is known to also bind to and signal through several other proteins including FE65, Mena, and TIP60 (Cao & Sudhof, Science 2001. 293:115-120).  It is likely that in microglia Ric8a-dependent heterotrimeric G proteins may also mediate only a subset of the signaling downstream of APP.  As such, app knockout in microglia may have more severe effects on microglial anti-inflammatory regulation than ric8a knockout.  
As a result, upon chronic immune activation, app knockout may lead to a microglial phenotype similar to the trem2 null mutation phenotype as discussed above, while ric8a knockout leads to a phenotype similar to the trem2+/- phenotype. This may explain the subdued TNF and IL6 secretion by cultured app (but not ric8a) mutant microglia.

      Amyloid beta (Ab) being one of the molecules binding to APP, the authors showed that Ab40 monomers (they did not test Ab40 oligomers) partially inhibit cytokines (TNFa, IL6, IL1b, MCP-1, IL23a, IL10) secretion in vitro by microglia stimulated by LPS but does not affect secretion by microglia from app-cx3cr1-cre (tested for TNFa, IL6, IL1b, IL23a, IL10) (Fig 4, Suppl fig 10) (but still does it in aplp2-cx3cr1-cre) and does not affect secretion by ric8a-cx3cr1-cre microglia (tested for TNFa and IL6 but still suppress IL1b) (Therefore here is another difference between app and ric8a KO microglia).

      We have tested the effects of Abeta40 oligomers, which induce rather than suppress microglial cytokine secretion, and have included the data (new panel j in supplemental Fig. 10).  As mentioned above, in several systems, Ric8a-dependent heterotrimeric G proteins have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  We assume that this is likely also true in microglia and that Ric8a-dependent heterotrimeric G proteins may mediate a subset and only a subset of the signaling downstream of APP.  This may explain the difference in the effects of app and ric8a knockout mutation in abolishing the anti-inflammatory effects of Abeta monomers on IL-1b vs TNF/IL-6.  This difference also suggests that TNF/IL-6 and IL-1b secretion must be regulated by different mechanisms in microglia. Indeed, it is well established in immunology that the secretion of IL1b, but not of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found it suppressed neuronal ectopia (Fig 5, Suppl fig 11). It is not clear whether it suppresses immune stimulation from neuronal cells or immune reaction from microglia cells.

      We agree that at present the pharmacological approaches we have taken are not able to distinguish these possibilities.  However, no matter which is the case, our results still implicate a role of excessive microglial activation in the formation of cortical ectopia and support the conclusion of the study.  Thus, while worthy of further investigation, this question does not impact the conclusion of the current study. Furthermore, as mentioned, we plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to more specifically address this question.

      Finally, the authors examined the activities of MMP2 and MMP9 in the developing cortex using gelatin gel zymography. The activity and protein levels of MMP9 but not MMP2 in the ric8a-emx1-cre cortex were claimed to be significantly increased (Fig 5, Suppl fig 12). Unfortunately, they did not show it in the app-cx3cr1-cre +LPS mouse. They make a connection between ric8a deletion and MMP9 but unfortunately do not make the connection between app deletion and MMP9 (which is at the center of the pathway claimed to be important here). Then they injected BB94, a broad-spectrum inhibitor of MMPs or an inhibitor specific for MMP9 and 13. They both significantly suppress the number and the size of the ectopia in ric8a mutants (Fig5).

      For all the gelatin gel zymography analyses, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are all directly comparable. From the quantification, our results clearly show that MMP9 activity levels are increased in the mutants (we have now included whole gel images and quantification in a new supplemental Figure 13).  The similar levels of MMP2 in all lanes also provide an internal control further supporting the observation of a specific change in MMP9.  For this analysis, we focus on the ric8a-emx1-cre mutants since the app-cx3cr1-cre +LPS animals show ectopia in only a subset of mutants and in most cases only in one of the hemispheres.  Experiments examining potential changes in MMP9 are therefore unlikely to yield meaningful results.  On the other hand, we have clearly shown that the administration of different classes of MMP inhibitors significantly suppresses ectopia in ric8a-emx1-cre mutants. This has strongly implicated a functional contribution of MMPs.

      After reading the manuscript, I still do not know how ric8a in neural cells is involved in the immune inhibition. Is it through the control of Ab monomers? In addition, the authors did not show in vivo data supporting that Ab monomers are the key players here. As the authors said, this is not the only APP interactor. Finally, I still do not know how ric8a is linked to APP in microglia in the model.

      As detailed above, there are several possibilities including potential deficits in the clearance of apoptotic cells and associated debris that may trigger microglial activation in ric8a-emx1-cre mutants. We will investigate these possibilities in future studies.  We have now incorporated these possibilities in the revised text.  As for the role of Abeta monomers, we have indicated that we currently do not have evidence that in the developing cortex Abeta monomers play a role in inhibiting microglia.  We have also indicated in the manuscript that our conclusion is that a microglial signaling pathway that is activated by Abeta monomers in vitro regulates normal brain development in vivo, not that Abeta monomers themselves regulate brain development.  Regarding the link between Ric8a and APP, the reviewer has missed several major lines of supporting evidence. For example, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-10).  This inhibition is abolished when either the app or ric8a gene is deleted from microglia.  This clearly indicates that app and ric8a act in the same genetic pathway (the pathway activated by Abeta monomers) in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokines in microglia.  This inhibition is also abolished when either the app or ric8a gene is deleted from microglia.  This reinforces the conclusion that app and ric8a act in the same pathway in microglia.  Furthermore, cell type specific deletion of app or ric8a from microglia in vivo also results in similar phenotypes of cortical ectopia. Together, these results strongly support the conclusion that app and ric8a act in the same pathway that is activated by Abeta monomers in vitro in microglia. 
This conclusion is also consistent with published findings that Ric8a-dependent heterotrimeric G proteins bind to APP and mediate subsets of APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).

      While several of the findings presented in this manuscript are of potential interest, there are a number of shortcomings. Here are some suggestions that could improve the manuscript and help substantiate the conclusions:

      (1) As the title suggests it, the focus is on Ab and APP functions in microglia. However, the analysis is more focused on ric8a. The connection between ric8a and APP in this study is not investigated, besides the fact that their deletion induces somewhat similar but not identical phenotypes. Showing a similar phenotype is not enough to conclude that they are working on the same pathway. The authors should find a way to make that connection between ric8a and app in the cells investigated here.

      As discussed above, the reviewer misses several major lines of evidence showing that APP and Ric8a act in the same pathway in microglia.  Besides the similarity of the ectopia phenotypes, for example, we have shown that Abeta monomers activate a pathway in microglia that inhibits the secretion of several proinflammatory cytokines including TNF, IL-6, IL-10, and IL-23 (Figure 4 and Supplemental Figures 8-11).  These inhibitory effects are abolished when either the app or ric8a gene is deleted from microglia.  This clearly indicates that app and ric8a act in the same genetic pathway, a pathway that is activated by Abeta monomers in vitro, in microglia. We also show that this Abeta monomer-activated pathway inhibits the transcription of several cytokine genes in microglia.  These effects are again abolished when either the app or ric8a gene is deleted from microglia.  This further reinforces the conclusion that app and ric8a act in the same pathway in microglia.  Moreover, we also show that the same results are true in macrophages.  Thus, these results strongly support the conclusion that app and ric8a act in the same genetic pathway in microglia. This conclusion is also consistent with published findings that Ric8a-dependent heterotrimeric G proteins biochemically bind to APP and mediate subsets of APP signaling across different species (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  

      (2) This would help to show the appearance of breaches in the pial basement membrane leading to neuronal ectopia; to investigate laminin debris, cell identity, Wnt pathway for app-cxcr3-cre + LPS injection as you did for ric8a-emx1-cre.

      We have now provided further data on pial basement membrane breaches in the app-cxcr3-cre + LPS animals (new panels e-f” in supplemental Fig 9).  We have not observed any changes in cell identity or Wnt pathway activity in ric8a-emx1-cre mutants.  It is thus of limited value to examine potential changes in these areas in the app-cxcr3-cre + LPS animals.   

      (3) As a control, this would help to show that app-cxcr3-cre without the LPS injection does not display the phenotype.

      We have the data on app-cx3cr1-cre mutants without LPS injection, which show no ectopia.  We have now included the data in the revised supplemental Fig. 9 (new panels c-d).

      (4) This would help to show the activity and protein levels of MMP9 and MMP2 and perform the rescue experiments with the inhibitors in the app-cx3cr1-cre cortex +LPS.

      As discussed above, we focus analysis on the ric8a-emx1-cre mutants since app-cx3cr1-cre +LPS animals show ectopia in only a subset of mutants and in most cases only in one of the hemispheres.  Determining potential changes in MMP9 levels and effects of MMP inhibitors is therefore not likely to yield meaningful data.  On the other hand, we have shown that MMP9 levels are increased and that administration of different classes of MMP inhibitors eliminates cortical ectopia in ric8a-emx1-cre mutants.  We have also shown a similar break in the basement membrane in app-cx3cr1-cre +LPS animals (new panels e-f” in supplemental Fig 9). These results together strongly implicate a role played by MMPs.

      (5) Is MMP9 secreted by microglia cells or neural cells?

      Our in situ hybridization data show MMP9 is most highly expressed in a sparse microglia-like cell population in the embryonic cortex, suggesting that microglia may be a major source of MMP9. We have incorporated these data in a new supplemental Fig. 12 (panel a). The precise identity of these cells, however, requires further validation.

      (6) The in vitro evidence indicates that one of the multiple APP interactors, ie Ab40 monomers, is less effective in suppressing the expression of some cytokines by microglia cells mutants for ric8a (TNFa and IL6 but still suppress IL1b) or APP (TNFa, IL6, IL1b, IL23a, IL10) when compared to WT. But there are other interactors for APP. In order to support the claim, it seems crucial to have in vivo data to show that Ab40 monomers are the molecules involved in preventing the breach in the pial basement membrane.

      As addressed in detail above, we have indicated that our conclusion is that a microglial signaling pathway that is activated by Abeta monomers in vitro regulates normal brain development in vivo, not that Abeta monomers themselves regulate brain development in vivo.  We currently do not have evidence that the Abeta monomers play a role in inhibiting microglia during cortical development.  There are candidate ligands for the pathway in the developing cortex, the functional study of which, however, is a major undertaking beyond the scope of the current study.

      (7) In order to claim that this is specific to Ab40 monomers and not oligomers, it is necessary to show that the Ab40 oligomers do not have the same effect in vitro and in vivo. Also, an assay should be done to show that your Ab preparations are pure monomers or oligomers.

      We have tested the effects of Abeta40 oligomers, which induce instead of suppressing microglial cytokine secretion, and have included these data in revision in a new panel j in supplemental Fig. 10. The protocols we use in preparing the monomers and oligomers are standard protocols employed in the field of Alzheimer’s disease research. They have been repeatedly optimized and validated over the past decades.  

      (8) Most of the cytokine secretion assays used microglia cells in culture. Two results draw my attention. Ric8a deletion increases TNFa and IL6 secretion after LPS stimulation in vitro on microglia cells while app deletion decreases their secretion. Then later, papers show that the decrease in IL1b induced by Ab on microglia cells is prevented by APP deletion but not ric8a deletion. Those two pieces of data suggest that ric8a and APP might not be in the same pathway. In addition, the phenotype from app-cxcr3-cre + LPS injection and ric8a-cxcr3-cre + LPS injection are not exactly the same. It could be due to the level of LPS as the author suggests or it might not be. More experiments are needed to prove they are in the same pathway.

      As discussed above, the reviewer misses several major lines of evidence, which strongly support the conclusion that APP and Ric8a act in the same pathway activated by Abeta monomers in microglia (see detailed discussion in point 1 above).  The differential response of TNFa/IL-6 of app and ric8a mutant microglia likely results from chronic immune stimulation during in vitro culturing, which is known to alter microglial cytokine response (see detailed discussion in point 9 below). We have demonstrated that this is indeed the case by showing that, without culturing, acutely isolated app and ric8a mutant macrophages both display elevated TNFa/IL-6 secretion (Figure 4). 

      Regarding the different regulation of TNF/IL-6 vs IL-1b by APP and Ric8a, as discussed above, in several systems, Ric8a-dependent heterotrimeric G proteins (which are degraded in ric8a mutant cortices, see new supplemental Fig. 9) have been shown to act downstream of APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9).  This is likely also the case in microglia, and Ric8a-dependent heterotrimeric G proteins may mediate only a subset of the anti-inflammatory signaling activated by APP.  As such, app mutation may abolish all the inhibitory effects of Abeta monomers (both those on TNF/IL-6 and those on IL-1b), but ric8a mutation may abolish only a subset (only those on TNF/IL-6 but not those on IL-1b).  This also suggests that the secretion of TNF/IL-6 and IL-1b must be regulated by different mechanisms in microglia.  Indeed, it is well established in immunology that the secretion of IL1b, but not that of TNF or IL6, is regulated by inflammasome-dependent mechanisms (see, for example, Broz & Dixit. Nat Rev Immunol. 2016 Jul;16(7):407-20. doi: 10.1038/nri.2016.58).

      (9) How do the authors reconcile the reduced TNFa and IL6 secretion upon stimulation of app mutant microglia with the model where app is attenuating immune response in vivo? Line 213 says that microglia exhibit attenuated immune response following chronic stimulation but I don't know if 3 hours of LPS in vitro is a chronic stimulation.

      The reviewer has misunderstood.  The microglia used in this study have all been cultured in vitro for approximately two weeks before assay. They have thus been under chronic stimulation, exposed to dead cells and debris in the culture dish.  Depending on the degree of perturbation to the inflammation-regulating pathways, such exposures are known to change microglial cytokine expression, sometimes in the opposite direction from that expected.  For example, under chronic immune stimulation, while the trem2+/- microglia, which are heterozygous mutant for the anti-inflammatory Trem2, show elevated pro-inflammatory cytokine expression, trem2-/- (null) microglia under the same conditions not only do not show increases but, for some pro-inflammatory cytokines, actually show decreases in expression (Sayed et al., Proc Natl Acad Sci U S A. 2018 Oct 2;115(40):10172-10177).  As mentioned, in several systems, Ric8a-dependent heterotrimeric G proteins have also been shown to bind to APP and mediate one of the branches of the signaling activated by APP (Milosch et al., Cell Death Dis. 2014 Aug 28;5(8):e1391; Fogel et al., Cell Rep. 2014 Jun 12;7(5):1560-1576; Ramaker et al., J Neurosci. 2013 Jun 12;33(24):10165-81; Nishimoto et al., Nature. 1993 Mar 4;362(6415):75-9). Thus, it is likely that in microglia, Ric8a-dependent heterotrimeric G proteins also mediate only a subset of the anti-inflammatory signaling activated by APP.  As such, app knockout in microglia may have more severe effects than ric8a knockout on microglial immune activation, resembling the relationship between trem2 null vs heterozygous mutation discussed above. It is therefore predicted that chronic immune stimulation such as in vitro culturing will result in attenuated pro-inflammatory cytokine expression in app mutant microglia but elevated cytokine expression in ric8a mutant microglia. 
This may explain why TNF and IL6 secretion by cultured app mutant microglia is subdued, but acutely isolated app mutant macrophages instead show increased cytokine secretion. The latter may be more representative of the response of app mutant microglia in the absence of chronic stimulation.

      (10) Line 119: In their model, the authors suggest that there is a breach in pial basement membrane but that the phenotype is different from the retraction of the radial fibers due to reduced adhesion. So, could the author discuss to what substrate the radial fibers are attached to, in their model where the pial surface is destroyed?

      Radial glial endfeet normally bind to the basement membrane via cell surface receptors including the integrin and the dystroglycan protein complexes. We observe free radial glial endfeet at the breach sites, apparently without attachment to any basement membrane.  However, we cannot exclude the possibility that there may be residual, broken-off basement membrane components bound to the endfeet that are not detected by the methodology employed. 

      (11) The authors should show that the increased cytokine secretion observed in vitro is also happening in vivo in ric8a-emx1-cre compared to WT mice and compared to ric8a-nestin-cre mice. Or when app is deleted in microglia (app-cxcr3-cre) + LPS injection compared to WT mice +LPS.

      Unfortunately, this is not technically feasible since it is not possible to extract the extracellular (secreted) fractions of cytokines from an embryonic brain without causing cell lysis and the release of the intracellular pool.  This, however, does not affect our conclusion that the Abeta monomer-regulated microglia pathway plays a key role in regulating normal brain development since its genetic disruption, by different approaches, clearly results in brain malformation.

      (12) The authors injected inhibitors of Akt or Stat3 in the ric8a-emx1-cre cortex and found that it suppressed neuronal ectopia (Fig 5, Suppl fig 11). Does it suppress immune stimulation from neuronal cells or immune reaction from microglia cells?

      As discussed above, we agree that at present the pharmacological approaches we have taken are not able to distinguish these two possibilities.  However, whichever is true, it does not affect our conclusion.  Also, we plan to determine the mechanisms of how ric8a mutation in neural cells induces immune activation in future studies. These results will likely enable us to adopt specific approaches to address this question.

      (13) Fig 5 and Supplementary fig 12: Please show a tubulin loading control in Fig 5i as you did in suppl fig 12 d (gel zymography). Please provide a gel zymography showing side by side Control, mutant and mutant +DM/S3I treatment. The same request for the MMP9 staining. Please provide statistics for control vs mutant for suppl fig 12c and d.

      We have now included whole gel zymography images with four control and four mutant individual samples as well as quantification in a new supplemental Fig.13 (panels b-c). This clearly shows increases in MMP9, while the MMP2 levels appear similar between controls and mutants. For all of the experiments of gelatin gel zymography, we quantify protein concentrations in the cortical lysates using the Bio-Rad Bradford assay kit and load the same amounts of proteins per lane. The results across lanes are thus all comparable.  The MMP9 staining images for the controls and mutants have also all been taken with the same parameters on the microscope and can be directly compared.  The statistics have now been provided as suggested.

      (14) Please provide the name and the source of the MMP9/13 inhibitor used in this study.

      This inhibitor is MMP-9/MMP-13 inhibitor I (CAS 204140-01-2), from Santa Cruz Biotechnology. This information has been included in revision.

      (15) The results show that deletion of ric8a in microglia and neural cells induced pia membrane breaches but no phenotype is apparent in ric8a deletion in microglia or neural cells alone. Then, the results showed that intraperitoneal injection of LPS induced the phenotype in ric8a-cxcr3-cre mutants. It would be beneficial as a control supporting the model to show that the insult induced by LPS injection does not induce the phenotype in the ric8a-foxg1-cre mice.

      We agree it may potentially be useful to show that LPS injection does not induce ectopia in ric8a-foxg1-cre mice.  Unfortunately, since the ric8a-foxg1-cre mutation shows no phenotype, we are no longer in possession of this line.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      - The information in the abstract and the introduction is only related to app. So, it is very abrupt how authors start the manuscript studying the role of Ric8a, with no information at all about this protein and why the authors want to investigate this role in microglial activation. Later in the manuscript, the authors tried to link Ric8a with app to study the role of app in the inflammatory response and ectopia formation. This link is quite weak as well.

      In the last paragraph of the Introduction, we explain the use of the ric8a mutant and how it leads to discovery of the Abeta monomer-regulated pathway. We have now improved the writing in revision to make these points, especially the link between APP and Ric8a-regulated G proteins, clearer.  In the Results section, we have also improved the writing on the potential link of Ric8a to APP by highlighting, among others, the fact that ric8a and app pathway mutants are among a unique group of a few mouse mutants (ric8a, app/aplp1/2, and apbb1/2) that show cortical ectopia exclusively in the lateral cortex, while all other cortical ectopia mutants also show severe ectopia at the cortical midline.  This suggests that similar mechanisms may underlie the ectopia formation in this small group of mutants.

      -In order to validate the mouse model, double immunofluorescence or immunofluorescence+in situ hybridization should be performed to show that microglia express ric8a and that is eliminated in the Emx1-Ric8a mutant mice.

      As mentioned above, we have additional lines of evidence showing that ric8a is deleted from microglia in emx1-cre mutants. This includes data showing induction of the expression of a cre reporter in brain microglia by emx1-cre and loss of ric8a mRNA expression in microglia cells isolated from emx1-cre mutants.  These data have now been included in revised supplemental Fig. 8.

      -In Supplemental Fig. 6, the authors claimed that cell proliferation is normal in Ric8a mutant mice without doing any quantification. They also quantified the angle of mitotic division of progenitors in the ventricular zone, but there are no images for the spindle orientation quantification, and no description of how they did it. In addition, this data is contrary to what has already been published in conditional Ric8a mutant mice (Kask et al., 2015). The Vimentin staining should be improved.

      We have provided quantification of cell proliferation (phospho-histone 3 staining at the ventricular surface) in revised supplemental Fig. 6g, which shows no significant differences in the number of positive cells. We have also provided details on the definition of the angle of cleavage plane orientation in revised supplemental Fig. 6h and in the Methods section.  We are not sure why the results are different from the other study. We were indeed anticipating deficits in mitotic spindle orientation and spent major efforts in the analysis of this potential deficit.  However, based on the data, we could not draw the conclusion.     

      -Analysis of the MMP9 expression should be done by western blot and not by immunofluorescence. In fact, the MMP9 expression shown in Figure 5g,h, does not correspond with RNA expression shown in gene expression atlas like genepaint or the allen atlas, doubting the specificity of the antibody. The expression of Mmp9 is quite low or absent in the cortex at E13.5-E14.5, making this protein very unlikely to be responsible for laminin degradation during development.

      We have performed gelatin gel zymography on MMP2/9, which shows increased MMP9 activity levels in the mutant cortex. This is similar to Western blot analysis (all lanes are loaded with the same amounts of cortical lysates).  We have now included whole gel zymography images with four control and four mutant individual samples as well as quantification in a new supplemental Fig.13 (panels b-c).  The immunofluorescence staining of MMP9, a different type of analysis, was designed as a complementary approach, the results of which also support the interpretation of increases in MMP9 protein.  Regarding MMP9 RNA expression, please also note that MMP9 is secreted, and the protein expression pattern is expected to be different from that of RNA. We have performed wholemount in situ using dissected E13.5 mouse forebrains.  Our data (in new supplemental Fig.13a) show that MMP9 mRNA is strongly expressed in a sparse population of cells, many of which appear to align along blood vessels. We suspect these are microglial lineage cells populating the embryonic cortex at this stage (see, for example, Squarzoni et al., Cell Rep. 2014 Sep 11;8(5):1271-9. doi: 10.1016/j.celrep.2014.07.042.).  Our control in situ using a Tnc5 probe also shows that the MMP9 signal is not a result of nonspecific probe binding.  Since the MMP9 expressing cells are very sparse even in the wholemount specimens while most database RNA in situ expression data are obtained using thin sections, we suspect this may be why the signal has been missed in the databases.  As for functional contributions, we agree that we cannot rule out roles played by other MMPs.  However, based on the ectopia suppression data, our results clearly indicate a critical contribution by MMP9/13.

      For MMP9 activity, authors should show the whole membrane with a minimum of three control and three mutant individual samples and with the quantification.

      - The graphs should be improved, including individual values and titles of the Y axes.

      We have included whole membrane zymography images with four control and four mutant individual samples as well as quantification in a new supplemental Fig.13b-c.  The graphs have also been improved as suggested.

    1. Author response:

      The following is the authors’ response to the current reviews.

      We are grateful to the reviewers for their positive assessment of the revised version of the article.

      Please find below our answers to the last, minor comments of the reviewers.

      We thank the reviewer for this important comment. In our live imaging experiments, we actually tracked the dorsal and ventral borders of the omp:yfp positive clusters in control and sly mutant embryos. These measurements showed that the omp:yfp positive clusters are more elongated along the DV axis in mutants as compared with control siblings, as seen on fixed samples (data not shown), suggesting that this difference in tissue shape is not due to fixation.

      Reviewer #4 (Public review):

      Summary:

      In this elegant study XX and colleagues use a combination of fixed tissue analyses and live imaging to characterise the role of Laminin in olfactory placode development and neuronal pathfinding in the zebrafish embryo. They describe Laminin dynamics in the developing olfactory placode and adjacent brain structures and identify potential roles for Laminin in facilitating neuronal pathfinding from the olfactory placode to the brain. To test whether Laminin is required for olfactory placode neuronal pathfinding they analyse olfactory system development in a well-established laminin-gamma-1 mutant, in which the laminin-rich basement membrane is disrupted. They show that while the OP still coalesces in the absence of Laminin, Laminin is required to contain OP cells during forebrain flexure during development and maintain separation of the OP and adjacent brain region. They further demonstrate that Laminin is required for growth of OP neurons from the OP-brain interface towards the olfactory bulb. The authors also present data describing that while the Laminin mutant has partial defects in neural crest cell migration towards the developing OP, these NCC defects are unlikely to be the cause of the neuronal pathfinding defects upon loss of Laminin. Altogether the study is extremely well carried out, with careful analysis of high-quality data. Their findings are likely to be of interest to those working on olfactory system development, or with an interest in extracellular matrix in organ morphogenesis, cell migration, and axonal pathfinding.

      Strengths:

      The authors describe for the first time Laminin dynamics during the early development of the olfactory placode and olfactory axon extension. They use an appropriate model to perturb the system (lamc1 zebrafish mutant), and demonstrate novel requirements for Laminin in pathfinding of OP neurons towards the olfactory bulb.

      The study utilises careful and impressive live imaging to draw most of its conclusions, really drawing upon the strengths of the zebrafish model to investigate the role of laminin in OP pathfinding. This imaging is combined with deep learning methodology to characterise and describe phenotypes in their Laminin-perturbed models, along with detailed quantifications of cell behaviours, together providing a relatively complete picture of the impact of loss of Laminin on OP development.

      Weaknesses:

      Some of the statistical tests are performed on experiments where n=2 for each condition (for example the measurements in Figure S2) - in places the data is non-significant, but clear trends are observed, and one wonders whether some experiments are under-powered.

      We initially planned the electron microscopy experiments in order to analyse 3 embryos per genotype per stage. However, because of technical issues we could not perform the measurements in all the cases, explaining why we have n = 2 in some of the graphs. The trends were quite clear, so we chose to keep these data in the article. We believe they nicely complement the immunostaining data assessing basement membrane integrity in control and mutant embryos.


      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors describe the dynamic distribution of laminin in the olfactory system and forebrain. Using immunohistochemistry and transgenic lines, they found that the olfactory system and adjacent brain tissues are enveloped by BMs from the earliest stages of olfactory system assembly. They also found that laminin deposits follow the axonal trajectory of axons. They performed a functional analysis of the sly mutant to analyse the function of laminin γ1 in the development of the zebrafish olfactory system. Their study revealed that laminin enables the shape and position of placodes to be maintained late in the face of major morphogenetic movements in the brain, and its absence promotes the local entry of sensory axons into the brain and their navigation towards the olfactory bulb. 

      Strengths: 

      - They showed that in the sly mutants, no BM staining of laminin and Nidogen could be detected around the OP and the brain. The authors then elegantly used electron microscopy to analyse the ultrastructure of the border between the OP and the brain in control and sly mutant conditions. 

      - To analyse the role of laminin γ1-dependent BMs in OP coalescence, the authors used the cluster size of Tg(neurog1:GFP)+ OP cells at 22 hpf as a marker. They found that the mediolateral dimension increased specifically in the mutants. However, proliferation did not seem to be affected, although apoptosis appeared to increase slightly at a later stage. This increase could therefore be due to a dispersal of cells in the OP. To test this hypothesis, the authors then analysed the cell trajectories and extracted 3D mean square displacements (MSD), a measure of the volume explored by a cell in a given period of time. Their conclusion indicates that although brain cell movements are increased in the absence of BM during coalescence phases, overall OP cell movements occur within normal parameters and allow OPs to condense into compact neuronal clusters in sly mutants. The authors also analysed the dimensions of the clusters composed of OMP+ neurons. Their results show an increase in cluster size along the dorso-ventral axis. These results were to be expected since, compared with BM, early neurog1+ neurons should compact along the medio-lateral axis, and those that are OMP+ essentially along the dorso-ventral axis. In addition to the DV elongation of OP tissue, the authors show the existence of isolated and ectopic (misplaced) YFP+ cells in sly mutants. 

      - To understand the origin of these phenotypes, the authors analysed the dynamic behaviour of brain cells and OPs during forebrain flexion. The authors then quantitatively measured brain versus OPs in the sly mutant and found that the OP-brain boundary was poorly defined in the sly mutant compared with the control. Once again, the methods (cell tracks, brain size, and proliferation/apoptosis, and the shape of the brain/OP boundary) are elegant but the results were expected. 

      - They then analysed the dynamic behaviour of the axon using live imaging. Thus, olfactory axon migration is drastically impaired in sly mutants, demonstrating that Laminin γ1-dependent BMs are essential for the growth and navigation of axons from the OP to the olfactory bulb. 

      - The authors therefore performed a quantitative analysis of the loss of function of Laminin γ1. They propose that the BM of the OP prevents its deformation in response to mechanical forces generated by morphogenetic movements of the neighbouring brain. 

      Weaknesses: 

      - The authors did not analyse neurog1 + axonal migration at the level of the single cell and instead made a global analysis. An analysis at the cell level would strengthen their hypotheses.  

      - Rescue experiments by locally inducing Laminin expression would have strengthened the paper. 

      - The paper lacks clarity between the two neuronal populations described (early EONs and late OSNs).  

      - The authors quantitatively measured brain versus OPs in the sly mutant and found that the OP-brain boundary was poorly defined in the sly mutant compared with the control. Once again, the methods (cell tracks, brain size, proliferation/apoptosis, and the shape of the brain/OP boundary) are elegant but the results were expected. 

      - A missing point in the paper is the effect of Laminin γ1 on the migration of cranial NCCs that interact with OP cells. The authors could have analysed the dynamic distribution of neural crest cells in the sly mutant. 

      We thank the reviewer for the overall positive assessment of our work, and we carefully responded to all her/his insightful comments below. Live imaging experiments to (1) visualise exit and entry point formation with only a few axons labelled, (2) characterise the behaviour of single neurog1:GFP-positive neurons/axons during OP coalescence and to (3) analyse the migration of cranial NCC are now included in the revised manuscript to address the reviewer’s questions, and reinforce our initial conclusions.

      Reviewer #2 (Public Review): 

      Summary: 

      This manuscript addresses the role of the extracellular matrix in olfactory development. Despite the importance of these extracellular structures, the specific roles and activities of matrix molecules are still poorly understood. Here, the authors combine live imaging and genetics to examine the role of laminin gamma 1 in multiple steps of olfactory development. The work comprises a descriptive but carefully executed, quantitative assessment of the olfactory phenotypes resulting from loss of laminin gamma. Overall, this is a constructive advance in our understanding of extracellular matrix contributions to olfactory development, with a well-written Discussion with relevance to many other systems. 

      Strengths: 

      The strengths of the manuscript are in the approaches: the authors have combined live imaging, careful quantitative analyses, and molecular genetics. The work presented takes advantage of many zebrafish tools including mutants and transgenics to directly visualize the laminin extracellular matrix in living embryos during the developmental process. 

      Weaknesses: 

      The weaknesses are primarily in the presentation of some of the imaging data. In certain cases, it was not straightforward to evaluate the authors' interpretations and conclusions based on the single confocal sections included in the manuscript. For example, it was difficult to assess the authors' interpretation of when and how laminin openings arise around the olfactory placode and brain during olfactory axon guidance. 

      We thank the reviewer for the overall positive assessment of our work, and we carefully responded to all her/his insightful comments below. To address these comments, live imaging data to visualise exit and entry point formation with a sparse labelling of axons, and z-stacks showing how exit and entry points are organised in 3D, have been added to the revised manuscript.

      Reviewer #3 (Public Review): 

      This is a beautifully presented paper combining live imaging and analysis of mutant phenotypes to elucidate the role of laminin γ1-dependent basement membranes in the development of the zebrafish olfactory placode. The work is clearly illustrated and carefully quantified throughout. There are some very interesting observations based on the analysis of wild-type, laminin γ1, and foxd3 mutant embryos. The authors demonstrate the importance of a Laminin γ1-dependent basement membrane in olfactory placode morphogenesis, and in establishing and maintaining both boundaries and neuronal connections between the brain and the olfactory system. There are some very interesting observations, including the identification of different mechanisms for axons to cross basement membranes, either by taking advantage of incompletely formed membranes at early stages, or by actively perforating the membrane at later ones. 

      This is a valuable and important study but remains quite descriptive. In some cases, hypotheses for mechanisms are stated but are not tested further. For example, the authors propose that olfactory axons must actively disrupt a basement membrane to enter the brain and suggest alternative putative mechanisms for this, but these are not tested experimentally. In addition, the authors propose that the basement membrane of the olfactory placode acts to resist mechanical forces generated by the morphogenetic movement of the developing brain, and thus to prevent passive deformation of the placode, but this is not tested anywhere, for example by preventing or altering the brain movements in the laminin γ1 mutant. 

      We thank the reviewer for the overall positive assessment of our work and for suggesting interesting experiments to attempt in the future, and we carefully responded to all her/his constructive comments below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In general, it would be easier to draw conclusions and compare data if the authors used similar stages throughout the article. 

      Throughout the article we tried to focus on a series of stages that cover both the coalescence of the OP (up to 24 hpf) and later stages of olfactory system development spanning the brain flexure process (28, 32, 36 hpf). However, for technical reasons it was not always possible to stick to these precise stages in some of our experiments. Also, in Fig. 1E-J, we selected images from the movies that illustrate specific cell or axonal behaviours, and thus the corresponding stages could not exactly match the stage series used in Fig. 1A-D and elsewhere in the article. Nevertheless, this stage heterogeneity does not affect our main conclusions.

      It would be useful to schematise the olfactory placode and the brain in an insert to clearly visualise the system in each figure. 

      We hope that the schematic which was initially presented in Fig. 1K already helps the reader to understand how the system is organised. Although we have not added more schematic views to represent the system in each figure (we think this would make the figures overcrowded), we have added additional legends to point to the OP and the brain in the pictures in order to clarify the localisation of each tissue.

      In the Summary, the authors refer to the integrity of the basement membrane. I don't think there is any attempt to affect basement membrane integrity in the article. It would be important to do so to look at the effect on CNS-PNS separation and axonal elongation. 

      In the Summary, we use the term « integrity of the basement membrane » to mention that we have analysed this integrity in the sly mutant. Given the results of our immunostainings against three main components of the basement membrane (Laminin, Collagen IV and Nidogen), as well as our EM observations, we see the sly mutant as a condition in which the integrity of the basement membrane is strongly affected.

      Rescue experiments by locally inducing Laminin expression would have strengthened the paper. 

      We have attempted to rescue the sly mutant phenotypes by introducing the mutation in the transgenic TgBAC(lamC1:lamC1-sfGFP) background, in which Laminin γ1 tagged with sfGFP is expressed under the control of its own regulatory sequences (Yamaguchi et al., 2022). To do so, we crossed sly+/-;Tg(omp:yfp) fish with sly+/-; Tg(lamC1:LamC1-sfGFP) fish. Surprisingly, while a rescue of the global embryo morphology was observed, no clear rescue of the olfactory system defects could be detected at 36 hpf. This could be due to the fact that the expression level of LamC1-sfGFP obtained with one copy of the transgene is not sufficient to rescue the olfactory system phenotypes, or that the sfGFP tag specifically affects the function of the Laminin γ1 chain during the development of the olfactory system, making it unable to rescue the defects. Given the results of our first attempts, we decided not to continue in this direction.

      (1) Developing OP & brain are surrounded by laminin-containing BM (already described by Torres-Paz & Whitlock in 2014). 

      "we first noticed the appearance of a continuous Laminin-rich BM surrounding the brain from 14-18 hpf, while around the OP, only discrete Laminin spots were detected at this stage (Fig. 1A, A'). " 

      Around 8ss for Torres-Paz & Whitlock (before 14 hpf). Can you modify the text, or show an 8ss stage embryo? As far as I know, the authors do not show images at 14hpf. Please correct this sentence or show a 14 hpf picture. 

      The reviewer is right, we do not show any 14 hpf stage in the images and thus have removed this stage in the text and replaced it by 17 hpf.

      In Figure 1A, the labelling of laminin 111 does not appear to be homogeneous along the brain. Is this true? 

      At this stage the brain’s BM revealed by the Laminin immunostaining appears fairly continuous (while the OP’s one is clearly dotty and less defined), but indeed very tiny/local interruptions of the signal can be seen along the structure, as detected by the reviewer. We thus modified the text to mention these tiny interruptions.

      How is the Laminin antibody used by the authors specific to laminin 111?  

      We thank the reviewer for raising this important point. The immunogen used to produce this rabbit polyclonal antibody is the Laminin protein isolated from the basement membrane of a mouse Engelbreth Holm-Swarm sarcoma (EHS). It is thus likely to recognise several Laminin isoforms and not only Laminin 111. We thus replaced Laminin 111 by Laminin when mentioning this antibody in the text and Figures.

      Please schematise in Figure 1K the stages you have tested and shown here in the article i.e. stages 18 - 22 - 28 -36 hpf using immunohistochemistry and 17-26-27-29-33 and 38 hpf using transgenics for laminin 111 and LamC1 respectively.  

      As suggested by the reviewer, we changed the stages in the schematics for stages we have presented in Figure 1 (analysed either with immunostaining or in live imaging experiments). We chose to represent 17 - 22 - 26 - 33 hpf (and thus adapted some of the schematics for them to match these stages).  

      Please specify in the Figure 1 legend for panels A to D whether this is a 3D projection or a z-section.

      We indicated in the Figure 1 legend that all these images are single z-sections (as well as for panels E-J).

      Furthermore, the schematisation in Fig. 1K does not reflect what the authors show: at 22 hpf laminin 111 labelling appears to be present only near the brain, and no labelling lateral to the olfactory placode and anteriorly and posteriorly. Thus, the schematisation in Figure 1K needs to be modified to reflect what the authors show.

      We agree with the reviewer that the Laminin staining at this stage is observed around the medial region of the OP, but not more laterally. We modified the schematic view accordingly in Figure 1K. Anterior and posterior sides of the OP are not represented in this schematic because we chose to represent a frontal view rather than a dorsal view.

      The authors suggest that" the laminin-rich BM of OP assembles between 18 and 22 hpf, during the late phase of OP coalescence". However, their data indicate that this BM assembles around 28hpf (Figure 1C). Can they clarify this point?

      What we meant with this sentence is that we clearly see two distinct BMs from 22 hpf. However, as noticed by the reviewer, the OP’s BM is only present around the medial/basal regions of the OP and does not surround the whole OP tissue at this stage. We modified the text to clarify this point (in particular by mentioning that the OP’s BM starts to assemble between 18 and 22 hpf), and replaced the image shown in Figure 1B, B’ with a more representative picture (the previous z-section was taken in very dorsal regions of the OP).

      It would be useful to disrupt these cells that have a cytoplasmic expression of Laminin-sfGFP, to analyse their contribution to BM and OP coalescence.

      Indeed, it will be interesting in the future to test specifically the role of the cells expressing cytoplasmic Laminin-sfGFP around and within the OP, as proposed by the reviewer. Laser ablation of these cells could be attempted, but due to their very superficial localisation, close to the skin, we believe these ablations (with the protocol/set-up we currently use in the lab) would impair the skin integrity, preventing us from drawing conclusions. We consider that the optimisation of this experiment is out of the scope of the present work.

      Tg(-2.0ompb:gapYFP)rw032 marks ciliated olfactory sensory neurons (OSNs) (Sato et al., 2005). The authors should mention this. 

      Please see our detailed response to the next point below.

      Points to be clarified: 

      -Tg(-2.0ompb:gapYFP)rw032 marks ciliated olfactory sensory neurons (OSNs) (Sato et al., 2005). The authors should mention this here. Moreover, the authors refer to "OP neurons" throughout the article. In the development of the olfactory organ, two types of neurons have been described in the literature: early EONs (12hpf-26hpf) and later OSNs. Each could have a specific role in the establishment and maintenance of the BM described by the authors. The authors need to clarify this point as, in Figure 1 for example, they use a marker for Tg(neurog1:GFP) EONs and a marker for ciliated OSNs without distinction. The distinction between EONs and OSNs comes a little late in the text and should be placed higher up. 

      As mentioned by the reviewer, according to the initial view of neurogenesis in the OP, OP neurons are born in two waves. A transient population of unipolar, dendrite-less pioneer neurons would differentiate first, in the ventro-medial region of the OP and elongate their axons dorsally out of the placode, along the brain wall. These pioneer axons would then be used as a scaffold by later born OSNs located in the dorso-lateral rosette to outgrow their axons towards the olfactory bulb (Whitlock and Westerfield, 1998). 

      Another study further characterised OP neurogenesis and showed that the first neurons to differentiate in the OP (the early olfactory neurons or EONs) express the Tg(neurog1:GFP) transgene (Madelaine et al., 2011). As mentioned by the authors in the discussion of this article, neurog1:GFP+ neurons appear much more numerous than the previously described pioneer neurons, and may thus include pioneers but also other neuronal subtypes.

      We would like here to share additional, unpublished observations from our lab that further suggest that the situation is more complex than the pioneer/OSN and EON/OSN nomenclatures. First, in many of our live imaging experiments, we can clearly visualise some neurog1:GFP+ unipolar neurons, initially located in a medial position in the OP, which intercalate and contribute to the dorsolateral rosette (where OSNs are proposed to be located) at the end of OP coalescence, from 22-24 hpf. Second, in fixed tissues, we observed that most neurog1:GFP+ neurons located in the rosette at 32 hpf co-express the Tg(omp:meRFP) transgene (Sato et al., 2005). These observations suggest that at least a subpopulation of neurog1:GFP+ neurons could incorporate in the dorsolateral rosette and become ciliated OSNs during development. We can share these results with the reviewer upon request. Further studies are thus needed to clarify and describe the neuronal subpopulations and lineage relationships in the OP, but this detailed investigation is out of the scope and focus of the present study. 

      An additional complication comes from the fact that, as shown and acknowledged by the authors in Miyasaka et al., 2005, the Tg(omp:meYFP) line (6 kb promoter) labels ciliated OSNs in the rosette but also some unipolar, ventral neurons (around 10 neurons at 1 dpf, Miyasaka et al. 2005, Figure 3A, white arrowheads). This was also observed using the 2 kb promoter Tg(omp:meYFP) line (see for instance Miyasaka et al., 2007) and in our study, we can indeed detect these ventro-medial neurons labelled in the Tg(omp:meYFP) line (2 kb promoter), see for instance Figure 1C’, D’ or Movie 6. It is unclear whether these unipolar omp:meYFP-positive cells are pioneer neurons or EONs expressing the omp:meYFP transgene, or OSN progenitors that would be located basally/ventrally in the OP at these stages.

      For all these reasons, we decided to present in the text the current view of neurogenesis in the OP but instead of attributing a definitive identity to the neurons we visualise with the transgenic lines, we prefer to mention them in the manuscript (and in the rest of the response to the reviewers) as neurons expressing neurog1:GFP or omp:meYFP transgenes (or cells/axons/neurons expressing RFP in the Tg(cldnb:Gal4; UAS:RFP) background).

      What we also changed in the text to be more clear on this point:

      - we moved higher up in the text, as suggested by reviewer 1, the description of the current model of neurogenesis in the OP,

      - we mentioned that neurog1:GFP+ neurons are more numerous than the initially described pioneer neurons, as discussed in Madelaine et al., 2011,

      - we wrote more clearly that the Tg(omp:meYFP) line labels ciliated OSNs but also a subset of unipolar, ventral neurons (Miyasaka et al., 2005), and pointed to these ventral neurons in Figure 1C’, D’,

      - in the initial presentation of the current view of OP neurogenesis we renamed neurog1:GFP+ neurons as EONs to be consistent with Madelaine et al., 2011.

      - To visualise pioneer axons, the authors should use an EON marker such as neurog1 because, to my knowledge, OMP only marks OSN axons and not pioneer axons.  

      To visualise neurog1:GFP+ axons during OP coalescence, we performed live imaging upon injection of the neurog1:GFP plasmid (Blader et al., 2003) in the Tg(cldnb:Gal4; UAS:RFP) background (n = 4 mutants and n = 4 controls from 2 independent experiments). We observed some GFP+ placodal neurons exhibiting retrograde axon extension in both controls and sly mutants. In such experiments it is very difficult to quantify and compare the number of neurons/axons showing specific behaviours between different experimental conditions/genetic backgrounds. Indeed, due to the cytoplasmic localisation of GFP, the axons can only be seen in neurons expressing high levels of GFP, and due to the injection the number of such neurons varies a lot between embryos, even in a given condition. Nevertheless, our qualitative observations reinforce the idea that the basement membrane is not absolutely required for mediolateral movements and retrograde axon extension of neurog1:GFP+ neurons in the OP. We added examples of images extracted from these new live imaging experiments in the revised Fig. S5A, B.

      - The authors should analyse the presence of laminin in the OP and forebrain in conjunction with neural crest cell dynamics (using a Sox10 transgenic line for example) to refine their entry and exit point hypotheses. 

      As described in the answer to the next point, we performed new experiments in which we visualised NCC migration in the Tg(neurog1:GFP) background, which allowed us to analyse the localisation of NCC at the forebrain/OP boundary, in ventral and dorsal positions, both in sly mutant embryos and control siblings.

      - A dynamic analysis of the distribution of neural crest cells in the sly mutant over time and during OP coalescence would be important. 

      The dynamics of zebrafish cranial NCC migration in the vicinity of the OP has been previously analysed using sox10 reporter lines (Harden et al., 2012, Torres-Paz and Whitlock, 2014, Bryan et al., 2020). To address the point raised by the reviewer, we performed live imaging from 16 to 32 hpf on sly mutants and control siblings carrying the Tg(neurog1:GFP) and Tg(UAS:RFP) transgenes and injected with a sox10(7.2):KalTA4 plasmid (Almeida et al., 2015). This allows the mosaic labelling of cells that express or have expressed sox10 during their development which, in the head region at these stages, represents mostly NCC and their derivatives. 3 independent experiments were carried out (n = 4 mutant embryos in which 8 placodes could be analysed; n = 6 control siblings in which 10 placodes could be analysed). A new movie (Movie 9) has been added to the revised article to show representative examples of control and mutant embryos.

      From these new data, we could make the following observations:

      - As expected from previous studies (Harden et al., 2012, Torres-Paz and Whitlock, 2014, Bryan et al., 2020), in control embryos a lot of NCC had already migrated to reach the vicinity of the OP when the movies begin at 16 hpf, and were then seen invading mainly the interface between the eye and the OP (10/10 placodes). Surprisingly, in sly mutants, a lot of motile NCC had also reached the OP region at 16 hpf in all the analysed placodes (8/8), and populated the eye/OP interface in 7/8 placodes (10/10 in controls). Counting NCC or tracking individual NCC during the whole duration of the movies was unfortunately too difficult to achieve in these movies, because of the low level of mosaicism (a high number of cells were labelled) and of the high speed of NCC movements (as compared with the 10 min delta t we chose for the movies). 

      - in some of the control placodes we could detect a few NCC that populated the forebrain/OP interface, either ventrally, close to the exit point of the axons (4/10 placodes), or more dorsally (8/10 placodes). By contrast, in sly mutants, NCC were observed in the dorsal region of the brain/OP boundary in only 2/8 placodes, and in the ventral brain/OP frontier in only 2/8 placodes as well. Interestingly, in these 2 last samples, NCC that had initially populated the ventral region of the brain/OP interface were then expelled from the boundary at later stages.

      We reported these observations in a new Table that is presented in revised Fig. S6B. In addition, instances of NCC migrating at the eye/OP or forebrain/OP interfaces are indicated with arrowheads on Movie 9. Previous Figure S6 was split into two parts presenting NCC defects in sly mutants (revised Figure S6) and in foxd3 mutants (revised Figure S7).

      Altogether, these new data suggest that the first postero-anterior phase of NCC migration towards the OP, as well as their migration in between the eye and OP tissues, is not fully perturbed in sly mutants. The subset of NCC that populate the OP/forebrain seem to be more specifically affected, as these NCC show defects in their migration to the interface or the maintenance of their position at the interface. Since the crestin marker labels mostly NCC at the OP/forebrain interface at 32 hpf (revised Fig. S6A), this could explain why the crestin ISH signal is almost lost in sly mutants at this stage.

      (2) Laminin distribution suggests a role in olfactory axon development 

      "Laminin 111 immunostaining revealed local disruptions in the membrane enveloping the OP and brain, precisely where YFP+ axons exit the OP (exit point) and enter the brain (entry point) (Fig. 1C-D')." Can the authors quantify this situation? It would be important to analyse this behaviour on the scale of a neuron and thus axonal migration to strengthen the hypotheses. 

      As suggested by the reviewer, to better visualise individual axons at the exit and entry point, we used mosaic red labelling of OP axons. To achieve this sparse labelling, we took advantage of the mosaic expression of a red fluorescent membrane protein observed in the Tg(cldnb:Gal4; UAS:lyn-TagRFP) background. The unpublished Tg(UAS:lyn-TagRFP) line was kindly provided by Marion Rosello and Shahad Albadri from the lab of Filippo Del Bene. We crossed the Tg(cldnb:Gal4; UAS:lyn-TagRFP) line with the TgBAC(lamC1:lamC1-sfGFP) reporter and performed live imaging on 2 embryos/4 placodes, in a frontal view. A new movie (Movie 3 in the revised article) shows examples of exit and entry point formation in this context. This allowed us to visualise the formation of the exit and entry points in more samples (6 embryos and 12 placodes in total when we pool the two strategies for labelling OP axons) and through the visualisation of a small number of axons, and reinforce our initial conclusions.

      (3) The integrity of BMs around the brain and the OP is affected in the sly mutant 

      Why do the authors analyse the distribution of collagen IV and Nidogen and not proteoglycans and heparan sulphate? 

      We attempted to label more ECM components such as proteoglycans and heparan sulfate, but whole-mount immunostainings did not work in our hands.

      A dynamic analysis of the distribution of neural crest cells in the sly mutant over time and during OP coalescence would be important. 

      See our detailed response to this point above.  

      (4) Role of Laminin γ1-dependent BMs in OP coalescence 

      The authors use the size of the Tg(neurog1:GFP)+ OP cell cluster at 22 hpf as a marker.  The authors should count the number of cells in the OP at the indicated time using a nuclear dye to check that in the sly mutant the number of cells is the same over time. Two time points as analysed in Figure S2 may not be sufficient to quantify proliferation which at these stages should be almost zero according to Whitlock & Westerfield and Madelaine et al.

      Counting the neurog1:GFP+ cell numbers in our existing data was unfortunately impossible, due to the poor quality of the DAPI staining. We are nevertheless confident that the number of cells within neurog1:GFP+ clusters is fairly similar between controls and sly mutants at 22 hpf, since the OP dimensions are the same for AP and DV dimensions, and only slightly different for the ML dimension. In addition, we analysed proliferation and apoptosis within the neurog1:GFP+ cluster at 16 and 21 hpf and observed no difference between controls and mutants.

      (5) Role of Laminin γ1-dependent BMs during the forebrain flexure 

      In Figure 4F at 32hpf, the presence of 77% ectopic OMP+ cells medially should result in an increase in dimensions along the M-L? This is not the case in the article. The authors should clarify this point. 

      As we explained in the Material and Methods, ectopic fluorescent cells (cells that are physically separated from the main cluster) were not taken into account for the measurement of the OP dimensions. This is now also mentioned in the legends of the Figures (4 and S3) showing the quantifications of OP dimensions.

      Cell distribution also seems to be affected within the OMP+ cluster at 36hpf, with fewer cells laterally and more medially. The authors should analyse the distribution of OMP+ cells in the clusters in sly mutants and controls to understand whether the modification corresponds to the absence of BM function.

      On the pictures shown in Figure 4F,G, we agree that omp:meYFP+ cells appear to be more medially distributed in the mutant, however this is not the case in other sections or samples, and is rather specific to the z-section chosen for the Figure. We found that the ML dimension is unchanged in mutants as compared with controls, except for the 28 hpf stage where it is smaller, but this appears to be a transient phenomenon, since no change is detected at earlier or later stages (Figure 4A-D and Figure S3A-L). The difference we observe at 28 hpf is now mentioned in the revised manuscript.

      The conclusions of Figures 4 and S3 would rather be that laminin allows OMP+ cells to be oriented along the medio-lateral axis whereas it would control their position along the dorsoventral axis. The authors should modify the text. It would be useful to map the distribution of OMP+ cells along the dorsoventral and mediolateral axes. The same applies to Neurog1+ cells. An analysis of skin cell movements, for example, would be useful to determine whether the effects are specific.  

      We are confident that the measurements of OP dimensions in AP, DV and ML are sufficient to describe the OP shape defects observed in the sly mutants. Analysing cell distribution along the 3 axes as well as skin cell movements will be interesting to perform in the future but we consider these quantifications as being out of the scope of the present work.

      (6) Laminin γ1-dependent BMs are required to define a robust boundary between the OP and the brain 

      The authors must weigh this conclusion "Laminin γ1-dependent BMs serve to establish a straight boundary between the brain and OP, preventing local mixing and late convergence of the two OPs towards each other during flexion movement." Indeed, they don't really show any local mixing between the brain and OP cells. They would need to quantify in their images (Figure 5A-A' and Figure S4 A-A') the percentage of cells co-labelled by HuC and Tg(cldnb:GFP). 

      We agree with the reviewer and thus replaced « reveal » by « suggest » in the conclusion of this section. 

      (7) Role of Laminin γ1-dependent BMs in olfactory axon development 

      An analysis of the retrograde extension movement in the axons of OMP+ ectopic neurons in the sly1 mutant condition would be useful to validate that the loss of laminin function does not play a role in this event. 

      Indeed, even though we can visualise instances of retrograde extension occurring normally in sly mutants, we cannot rule out that this process is affected in a subset of OP neurons, for instance in ectopic cells, which often show no axon or a misoriented axon. We added a sentence to mention this in the revised manuscript.

      Minor comments and typos: 

      Please check and mention the D-V/L-M or A-P/L-M orientation of the images in all figures. 

      This has been checked.

      Legend Figure 1: "distalmost" is missing a space "distal most". 

      We checked and this word can be written without a space.

      Figure 1 panel C: check the orientation (I am not sure that Dorsal is up). 

      We double-checked and confirm that dorsal is up in this panel.

      Movie 1 Legend: "aroung" the OP should be "around" the OP.

      Thanks to the reviewer for noticing the typo, we corrected it.

      Reviewer #2 (Recommendations For The Authors):

      The comments below are relatively minor and mostly raise questions regarding images and their presentation in the manuscript. 

      • Figure 1, visualization of exit and entry points: It is a bit difficult to visualize the axon exit and entry points in these images, and in particular, to understand how the exit and entry points in C and D correspond to what is seen in F, F', H, and H'. There appears to be one resolvable break in the staining in C and D, whereas there are two distinct breaks in F-H'. Are these single optical sections? Is it possible to visualize these via 3-dimensional rendering? 

      All the images presented in Figure 1 are single z-sections, which is now indicated in the Figure legend. As noticed by the reviewer, Laminin immunostainings on fixed embryos at 28 and 36 hpf suggested that the exit and entry points are facing each other, as shown in Figure 1C-D’. However, in our live imaging experiments we always observed that the exit point is slightly more ventral than the entry point (of about 10 to 20 µm). This discrepancy could be due to the fixation that precedes the immunostaining procedure, which could modify slightly the size and shape of cells/tissues. We added a sentence on this point in the text. In addition, we added new movies of the LamC1-sfGFP reporter with sparse red axonal labelling (Movie 3, see response to reviewer 1), as well as z-stacks presenting the organisation of exit and entry points in 3D (Movie 4), which should help to better illustrate the mechanisms of exit and entry point formation.

      • Movie 2, p. 6, "small interruptions of the BM were already present near the axon tips, along the ventro-medial wall of the OP." This is a bit difficult to assess since the movie seems to show at least one other small interruption in the BM in addition to the exit point, in particular, one slightly dorsal to the exit point. Was this seen in other samples, or in different optical sections? 

      Indeed the exit and entry points often appear as regions with several, small BM interruptions, rather than single holes in the BM. We now show in revised Movie 4 the two z-stacks (the merge and the single channel for green fluorescence) corresponding to the last time points of the movies showing exit and entry point formation in Movie 2, where several BM interruptions can be seen for both the exit and entry points. We had already mentioned this observation in the legend of Movie 2, and we added a sentence on this point in the main text of the revised manuscript. This is also represented for both exit and entry points in the new schematics in revised Fig. 1K and its legend. 

      • Movie 2, p. 6, "The opening of the entry point through the brain BM was concomitant with the arrival of the RFP+ axons, suggesting that the axons degrade or displace BM components to enter the brain." Similar to the questions regarding the exit point, it was a bit difficult to evaluate this statement. There appears to be a broader region of BM discontinuity more dorsal to the arrowhead in Movie 2. A single-channel movie of just the laminin fluorescence might help to convey the extent of the discontinuity. As with above, was this seen in other samples, or in different optical sections?  

      See our response to the previous comment.

      • Figure 1H, I, "the distal tip of the RFP+ axons migrated in close proximity with the brain's BM." This is again a bit difficult to see, and quite different than what is seen in Figure 4A, in which the axons do not seem close to the BM in this section. Is it possible to visualize this via 3-dimensional rendering? 

      In fixed embryos or in live imaging experiments, we observed that, once they have entered the brain, the distal tips (the growth cones) of the axons are located close to the BM of the brain. However, this is not the case for the axon shafts which, as development proceeds, are located further away from the BM. This can clearly be seen at 36 hpf in Figure 1D’ and Figure 4A, as spotted by the reviewer. We modified the text to clarify this point.

      • Figure 2J, J', p. 7, the gap between the OP and brain cells of sly mutants "was most often devoid of electron-dense material." It is difficult to see this loss of electron-dense material in 2J'. The thickness of the space is quantified well and is clearly smaller, but the change in electron-dense material is more difficult to see.  

      We looked at Figure 2 again and it seems clear to us that there is electron-dense material between the plasma membranes in controls, which is practically not seen (rare spots) in the mutants. We added a sentence mentioning that we rarely see electron-dense spots in sly mutants.

      • Figure 5E-F': There are concerns about evaluating the shape of a tissue based on nuclear position. Is there a way to co-stain for cell boundaries (maybe actin?), and then quantify distortion of the dlx+ cell population using the cell boundaries, rather than nuclear staining? 

      We agree with the reviewer that it is not ideal to evaluate the shape of the OP/brain boundary based on a nuclear staining. As explained in the text, we could not use the Tg(eltC:GFP) or Tg(cldnb:Gal4; UAS:RFP) reporter lines for this analysis, due to ectopic or mosaic expression. However we are confident that the segmentation of the Dlx3b immunostaining reflects the organisation of the cells at the OP/brain tissue boundary: in other data sets in which we performed Dlx3b staining with membrane labelling independently of the present study and in the wild type context, we clearly see that cell membranes are juxtaposed to the Dlx3b nuclear staining (in other words, the cytoplasm volume of OP cells is very small). 

      • Figure S5E: It would be helpful to see representative images for each of the categories (Proper axon bundle; Ventral projections; Medial projections) or a schematic to understand how the phenotypes were assessed. 

      To address this point we added a schematic view to illustrate the phenotypes assessed in each column of the table in revised Figure S5G.

      • Figure 6, p. 12, "Laminin gamma 1-dependent BMs are essential for growth and navigation of the axons...": What fraction of the tracked axons managed to exit the OP? Given the quantitative analyses in Figure 6, one might interpret this to mean that laminin gamma 1 is not essential for axon growth (speed and persistence are largely unchanged), but rather, primarily for navigation. 

      As noticed by the reviewer, the speed and persistence of axonal growth cones are largely unchanged in the sly mutants (except for the reduced persistence in the 200-400 min window, and an increased speed in the 800-1000 min window), showing that the growth cones are still motile. However, as shown by the tracks, they tend to wander around within the OP, close to the cell bodies, which results in the end in a perturbed growth of the axons. The navigation issues are rather revealed by the analysis of fixed Tg(omp:meYFP) embryos presented in the table of Figure S5G. We modified the text to separate more clearly the conclusions of the two types of experiments (fixed, transgenic embryos versus live, mosaically labelled embryos).

      Reviewer #3 (Recommendations For The Authors):

      Testing the hypotheses mentioned in the public review will be interesting experiments for a follow-up study, but are not essential revisions for this manuscript. 

      I have only a few minor suggestions for revisions: 

      P8 subheading 'Role of Laminin γ1-dependent BMs in OP coalescence' - since no major role was demonstrated here, this heading should be reworded.  

      We agree with the reviewer and replaced the previous title by « OP coalescence still occurs in the sly mutant ».

      P11, line 3 - the authors conclude that the forebrain is smaller 'due to' the inward convergence of the OPs. I do not think it is possible to assign causation to this when the mutant disrupts Laminin γ1 systemically - it is equally possible that the OPs move inward due to a failure of the brain to form in the normal shape. Thus, the wording should be changed here. (In the Discussion on p15, the authors mention the 'apparent distortion' of the brain, and say that it is 'possibly due' to the inward migration of the placodes', but again this could be toned down.) 

      We agree with the reviewer’s comment and changed the wording of our conclusions in the Results section.

      P11 and Fig. S5 - The table and text seem to be saying opposite things here. The text on p11 (3rd paragraph) indicates that the normal exit point is ventral and that this is disrupted in the mutant, with axons exiting dorsally. However, in the table, at each time point there is a higher % of axons exiting ventrally in the mutant. Please clarify. The table does not provide a % value for axons exiting dorsally - it might help to add a column to show this value. 

      We are grateful to the reviewer for pointing this out, and we apologize for the lack of clarity in the first version of the manuscript. We have modified the text and Figure S5 in order to clarify the different points raised by the reviewer in this comment. The Table in Fig. S5G does not represent the % of axons showing defects, but the % of embryos showing the phenotypes. In addition, an embryo is counted in the ventral or medial projection category if it shows at least one ventral or medial projection (even if it shows a proper bundle). This is now clearly indicated in the title of the columns in the table itself and in the legend. The embryos in which the axons exit dorsally in sly mutants are actually those counted in the left column of the Table (they exit dorsally and form a bundle), as shown by the new schematics added below the table. We also added this information in the title of the left column, and mention in the legend the pictures in which this dorsal exit can be observed in the article (Figures 4B and S3E’). Having more sly mutant embryos with axons exiting dorsally is thus compatible with more embryos showing at least one ventral projection.

      Fig. S6, shows the lack of neural crest cells between the olfactory placode and the brain in both laminin γ1 mutants (without a basement membrane) and foxd3 mutants (which retain the membrane). Comparison of the two mutants here is a neat experiment and the result is striking, demonstrating that it is the basement membrane, and not the neural crest, that is required for correct morphology of the olfactory placode. I think this figure should be presented as a main figure, rather than supplementary.  

      Our new live imaging characterisation of NCC migration in sly mutants and control siblings (Movie 9) revealed that at 32 hpf, in the vicinity of the OP, NCC (or their derivatives) are much more numerous than the subset of NCC showing crestin expression by in situ hybridisation (compare the end of our control movie at 32 hpf with the crestin ISH shown in Figure S6A, for instance).

      Thus, the extent of the NCC migration defects should be analysed in more detail in the foxd3 mutant in the future (using live imaging or other NCC markers), and for this reason we chose to keep this dataset in the supplementary Figures.

      One of the first topics covered in the Discussion section is the potential role of Collagen. I was surprised to see the description on P15 'the dramatic disorganization of the Collagen IV pattern observed by immunofluorescence in the sly mutant', as I hadn't picked this up from the Results section of the paper. I went back to the relevant figure (Fig. 2) and description on p7, which does not give the same impression: 'in sly mutants, Collagen IV immunoreactivity was not totally abolished'. This suggested to me that there was only minor (not dramatic) disorganisation of the Collagen IV. This needs clarification.  

      The linear, BM-like Collagen IV staining was lost in sly mutants, but not the fibrous staining which remained in the form of discrete patches surrounding the OP. We modified the text in the Results section as well as in the Figure 2 legend to clarify our observations made on embryos immunostained for Collagen IV.

      Typos etc 

      P5 - '(ii) above of the neuronal rosette' - delete the word 'of'. 

      P5 two lines below this - ensheathed. 

      P10 - '3 distinct AP levels' (delete s from distincts). 

      P10 - distortion (not distorsion).

      P12 - 'From 14 hpf, they' should read 'From 14 hpf, neural crest cells'. 

      P15, line 1 - 'is a consequence of' rather than 'is consecutive of'? 

      P22 'When the data were not normal,' should read 'When the data were not normally distributed,'. 

      We thank the reviewer for noticing these typos and have corrected them.

      General 

      Please number lines in future manuscripts for ease of reference. 

      This has been done.

    2. eLife Assessment

      This important study describes the function of Laminin γ1-dependent basement membranes in development of the olfactory placode, including morphogenesis of the placode, boundary formation, and olfactory axonal pathfinding. The study uses elegant live imaging approaches and extensive quantitative analyses, combined with detailed mutant analyses to provide a compelling description of the role of Laminin in olfactory placode development. In addition to the contributions this study makes to understanding olfactory placode development, it will also be of broader interest to individuals studying extracellular matrix regulation of tissue morphogenesis, and neural development including neuronal pathfinding.

    3. Reviewer #1 (Public review):

      The authors describe the dynamic distribution of laminin γ1 in the olfactory system and forebrain. Using immunohistochemistry and transgenic lines, they found that the olfactory system and adjacent brain tissues are enveloped by basement membranes (BMs) from the earliest stages of olfactory system assembly. They also found that laminin deposits follow the trajectory of axons. They performed a functional analysis of the sly mutant to analyse the function of laminin γ1 in the development of the zebrafish olfactory system. Their study revealed that laminin enables the shape and position of olfactory placodes to be maintained late in the face of major morphogenetic movements in the brain, and its absence promotes the local entry of sensory axons into the brain and their navigation towards the olfactory bulb.

      They showed that in the laminin γ1 mutants no BM staining of laminin could be detected around the OP and the brain. The authors then elegantly used electron microscopy to analyse the ultrastructure of the border between the OP and the brain. The authors performed a quantitative analysis of the loss of function of Laminin γ1 (sly mutants). Olfactory axon migration is drastically impaired in sly mutants, demonstrating that Laminin γ1-dependent BMs are essential for the growth and navigation of axons from the OP to the olfactory bulb. They propose that the BM of the OP prevents its deformation in response to mechanical forces generated by morphogenetic movements of the neighbouring brain. Although the results are expected, the experiments carried out and the results are robust and elegant.

    4. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses the role of extracellular matrix in olfactory development. Despite the importance of these extracellular structures, the specific roles and activities of matrix molecules are still poorly understood. Here, the authors combine live imaging and genetics to examine the role of the laminin gamma 1 in multiple steps of olfactory development. The work comprises a descriptive but carefully executed, quantitative assessment of the olfactory phenotypes resulting from loss of laminin gamma 1. Overall, this is a constructive advance in our understanding of extracellular matrix contributions to olfactory development, with a well-written Discussion with relevance to many other systems.

      Strengths:

      The strengths of the manuscript are in the approaches: the authors have combined live imaging, careful quantitative analyses, and molecular genetics. The work presented takes advantage of many zebrafish tools including mutants and transgenics to directly visualize the laminin extracellular matrix in living embryos during the developmental process.

      Weaknesses:

      Weaknesses raised in the first round of critique were addressed in the revision. A minor remaining caveat concerns the interpretation of differences in tissue size and shape in fixed samples (comparing mutants and controls); the fixation process can alter these properties and may do so differently between genotypes.

    5. Reviewer #4 (Public review):

      Summary:

      In this elegant study XX and colleagues use a combination of fixed tissue analyses and live imaging to characterise the role of Laminin in olfactory placode development and neuronal pathfinding in the zebrafish embryo. They describe Laminin dynamics in the developing olfactory placode and adjacent brain structures and identify potential roles for Laminin in facilitating neuronal pathfinding from the olfactory placode to the brain. To test whether Laminin is required for olfactory placode neuronal pathfinding they analyse olfactory system development in a well-established laminin-gamma-1 mutant, in which the laminin-rich basement membrane is disrupted. They show that while the OP still coalesces in the absence of Laminin, Laminin is required to contain OP cells during forebrain flexure during development and maintain separation of the OP and adjacent brain region. They further demonstrate that Laminin is required for growth of OP neurons from the OP-brain interface towards the olfactory bulb. The authors also present data describing that while the Laminin mutant has partial defects in neural crest cell migration towards the developing OP, these NCC defects are unlikely to be the cause of the neuronal pathfinding defects upon loss of Laminin. Altogether the study is extremely well carried out, with careful analysis of high-quality data. Their findings are likely to be of interest to those working on olfactory system development, or with an interest in extracellular matrix in organ morphogenesis, cell migration, and axonal pathfinding.

      Strengths:

      The authors describe for the first time Laminin dynamics during the early development of the olfactory placode and olfactory axon extension. They use an appropriate model to perturb the system (lamc1 zebrafish mutant), and demonstrate novel requirements for Laminin in pathfinding of OP neurons towards the olfactory bulb. The study utilises careful and impressive live imaging to draw most of its conclusions, really drawing upon the strengths of the zebrafish model to investigate the role of laminin in OP pathfinding. This imaging is combined with deep learning methodology to characterise and describe phenotypes in their Laminin-perturbed models, along with detailed quantifications of cell behaviours, together providing a relatively complete picture of the impact of loss of Laminin on OP development.

      Weaknesses:

      Some of the statistical tests are performed on experiments where n=2 for each condition (for example the measurements in Figure S2) - in places the data is non-significant, but clear trends are observed, and one wonders whether some experiments are under-powered.

    1. eLife Assessment

      This important study suggests that the composition of the extracellular matrix in a mouse model of liver fibrosis changes depending on the cause of liver fibrosis. The data could be used as a foundation for future antifibrotic therapies. The strength of evidence is solid with respect to the use of animal models and proteomic analysis. The study provides a helpful inventory of proteins up or down-regulated, but functional analyses are limited and translational data are lacking.

    2. Reviewer #1 (Public review):

      Summary:

      Jirouskova and colleagues in their study have carried out an in-depth proteomic characterization of the dynamics of the liver fibrotic response and the resulting resolution in two distinct models of liver injury: CCl4-induced model of hepatotoxicity and pericentral/bridging liver fibrosis and the DDC feeding model of obstructive cholestasis and periportal fibrosis. They focussed on both the insoluble extracellular matrix (ECM) components as well as the soluble secreted factors produced by hepatic stellate cells (HSCs) and/or portal fibroblasts (PFs). They identified compartment- and time-resolved proteomic signatures in the two models with disease-specific factors or matrisomes. Their study also identified phenotypic differences between the models such as that while the CCl4-induced model induced profound hepatotoxicity followed by resolution, the DDC model induced more lasting liver damage and proteomic changes that resembled advanced human liver fibrosis favouring hepatocarcinogenesis.

      Overall, this comprehensive and very well-conducted study is rigorous and well-planned. The conclusions are supported by compelling studies and analyses. One caveat is the lack of mechanistic experiments to prove causality, but this can be carried out in follow-up studies.

      Strengths:

      (1) A major strength of the study is that the experiments are rigorous and very well conducted. For instance, the authors utilized two models of liver fibrosis to study different aspects of the pathology - hepatotoxicity vs cholestasis. In addition, 4 time points for each model were investigated - 2 for fibrosis development and 2 for fibrosis resolution. They have taken 3 components for proteomic analyses - total lysates, insoluble ECM components as well as the soluble secreted factors. Thus, the authors provide a comprehensive overview of the fibrosis and resolution process in these models.

      (2) Another great strength of the study is that the methodology utilized was able to dissect unique pathways relevant to each model as well as common targets. For example, the authors identified known pathways such as mTOR signalling to be differentially regulated in the CCl4 vs DDC model. mTOR signalling was increased in the DDC model, which is associated with hyperproliferation. Showing that the approach taken is specific enough to distinguish between two similar (both induce fibrosis) but distinct mechanisms (hepatotoxicity vs cholestasis) is thus a strong point of the study.

      Weaknesses:

      (1) The authors themselves propose in their Introduction that the "ECM-associated changes are increasingly perceived as causative, rather than consequential"; however, they have not conducted mechanistic (gain of function/loss of function) studies either in vitro or in vivo from any of their identified targets to truly prove causality. This remains one of the limitations of this study. Thus, future studies should investigate this point in detail. For instance, it would have been intriguing to dissect if knocking out specific genes involved in one specific model or genes common to both would yield distinct phenotypic outcomes.

      (2) The majority of the conclusions are derived primarily from the proteomic analyses. Although well conducted, it would strengthen the study to corroborate some of the major findings by other means such as IHC/IF with the corresponding quantifications and not only representative images.

    3. Reviewer #2 (Public review):

      Summary:

      The authors suggest that ECM abundance and composition change depending on the aetiology of liver fibrosis. To understand this they have investigated the proteome in two models of animal fibrosis and resolution. They suggest their findings could provide a foundation for future anti-fibrotic therapies.

      Strengths:

      The animal models used are widely studied models of liver fibrosis from both parenchymal and biliary damage aspects. Both would allow analysis of resolution. The CCl4 model in particular fully reverts to a 'healthy' liver following cessation of the insult. I am less clear whether/how quickly the ductal plugs clear in DDC models and thus this may not provide the response they are looking for in terms of reversibility. I believe there have been several extensive studies using a transcriptomics approach in assessing genes and cells involved in the CCl4 model of resolution. There are even more multiomic studies of general fibrosis progression in many of the mouse models of fibrosis. However, the proteomic approach they have used is robust and they have made some attempts to integrate with cell-type specific signatures from previously published data.

      Although there is minimal data, hepatocyte elasticity is a very interesting part of their study. Additional data and focussed attention on the mechanisms underpinning this would be very insightful.

      Weaknesses:

      As it currently stands, the data, whilst extensive, is primarily focussed on the proteomic data which is fairly descriptive and I am not clear on the additional insight gained in their approach that is not already detailed from the extensive transcriptomic studies. The manuscript overall would benefit from some mechanistic functional insight to provide new additional modes of action relevant to fibrosis progression. Whilst there is some human data presented it is a minimal analysis without quantification that would imply relevance to disease state.

      Although studying disease progression in animals is fundamental to understanding the full physiological response of fibrotic disease, without more human insight it is difficult for any analysis to support their suggestion that the identified targets will be of use in treating human disease.

      Some of the terminology is incorrect while discussing these models of injury used and care should be taken. For example - both models are toxin-induced and I do not think these data have any support that the DDC model has a higher carcinogenic risk. An investigation into the tumour-induced risk would require significant additional models. These types of statements are incorrect and not supported by this study.

    1. eLife Assessment

      This study provides valuable insights into the evolutionary histories and cellular infection responses of two Salmonella Dublin genotypes. While the evidence is compelling, a more phylogenetically diverse bacterial collection would enhance the findings. This research is relevant to scientists studying Salmonella and gastroenteritis-related pathogens.

    2. Reviewer #1 (Public review):

      The manuscript consists of two separate but interlinked investigations: genomic epidemiology and virulence assessment of Salmonella Dublin. ST10 dominates the epidemiological landscape of S. Dublin, while ST74 was uncommonly isolated. Detailed genomic epidemiology of ST10 unfolded the evolutionary history of this common genotype, highlighting clonal expansions linked to each distinct geography. Notably, North American ST10 was associated with more antimicrobial resistance compared to others. The authors also performed long-read sequencing on a subset of isolates (ST10 and ST74) and uncovered a novel recombinant virulence plasmid in ST10 (IncX1/IncFII/IncN). Separately, the authors performed cell invasion and cytotoxicity assays on the two S. Dublin genotypes, showing differential responses between the two STs. ST74 replicates better intracellularly in macrophages compared to ST10, but both STs induced comparable cytotoxicity levels. Comparative genomic analyses between the two genotypes showed certain genetic content unique to each genotype, but no further analyses were conducted to investigate which genetic factors were likely associated with the observed differences. The study provides a comprehensive and novel understanding of the evolution and adaptation of two S. Dublin genotypes, which can inform public health measures.

      The methodology included in both approaches was sound and written in sufficient detail, and data analysis was performed with rigour. Source data were fully presented and accessible to readers. Certain aspects of the manuscript could be clarified and extended to improve the manuscript.

      (1) For epidemiology purposes, it is not clear which human diseases were associated with the genomes included in this manuscript. This is important since S. Dublin can cause invasive bloodstream infections in humans. While such information may be unavailable for public sequences, this should be detailed for the 53 isolates sequenced for this study, especially for isolates selected to perform experiments in vitro.

      (2) The major AMR plasmid described in S. Dublin was the IncC plasmid associated with clonal expansion in North America. While this plasmid is not found in the Australian isolates sequenced in this study, the reviewer finds that it is still important to include its characterization, since it carries blaCMY-2 and was stably inherited in ST10 clade 5. If the plasmid structure is already published, the authors should include the accession number in the Main Results.

      (3) The reviewer is concerned that the multiple missing annotations in (a) plasmid structures in Supplementary Figures 5 & 6, and (b) genetic content unique to ST10 and ST74 were due to insufficient annotation by Prokka. I would recommend the authors use another annotation tool, such as Bakta (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8743544/) for plasmid annotation, and reconstruction of the pangenome described in Supplementary Figure 10. Since the recombinant virulence plasmid in ST10 is a novel one, I would recommend putting Supplementary Figure 5 as a main figure, with better annotations to show the virulence region, plasmid maintenance/replication, and possible conjugation cluster.

      (4) The authors are lauded for the use of multiple strains of ST10 and ST74 in the in vitro experiment. While results for ST74 were more consistent, readouts from ST10 were more heterogeneous (Figures 5, 6). This is interesting as the tested ST10 were mostly clade 1, so ST10 was, as expected, of lower genetic diversity compared to tested ST74 (partly shown in Figure 1D). Could the authors confirm this by constructing an SNP table separately for tested ST10 and ST74? Additionally, the tested ST10 did not represent the phylogenetic diversity of the global epidemiology, and this limitation should be reflected in the Discussion.

      (5) The comparative genomics between ST10 and ST74 can be further improved to allow more interpretation of the experiments. Why were only SPI-1, 2, 6, and 19 included in the search for virulome, how about other SPIs? ST74 lacks SPI-19 and has truncated SPI-6, so what would explain the larger genome size of ST74? Have the authors screened for other SPIs using more well-annotated databases or references (S. Typhi CT18 or S. Typhimurium ST313)? The mismatching between in silico prediction of invasiveness and phenotypes also warrants a brief discussion, perhaps linked to bigger ST74 genome size (as intracellular lifestyle is usually linked with genome degradation).

      (6) On the epidemiology scale, ST10 is more successful, perhaps due to its ongoing adaptation to replication inside GI epithelial cells, favouring shedding. ST74 may tend to cause more invasive disease and less transmission via fecal shedding. The presence of T6SS in ST10 also can benefit its competition with other gut commensals, overcoming gut colonization resistance. The reviewer thinks that these details should be more clearly rephrased in the Discussion, as the results highly suggested different adaptations of two genotypes of the same serovar, leading to different epidemiological success.

    3. Reviewer #2 (Public review):

      This is a comprehensive analysis of Salmonella Dublin genomes that offers insights into the global spread of this pathogen and region-specific traits that are important to understanding its evolution. The phenotyping of isolates of ST10 and ST74 also offers insights into the variability that can be seen in S. Dublin, which is also seen in other Salmonella serovars, and reminds the field that it is important to look beyond lab-adapted strains to truly understand these pathogens. This is a valuable contribution to the field. The only limitation, which the authors also acknowledge, is the bias towards S. Dublin genomes from high-income settings. However, there is no selection bias; this is simply a consequence of publicly available sequences.

    1. eLife Assessment

      Following up on their previous work, the authors investigated whether cell-to-cell transmission of HIV-1 activates the CARD8 inflammasome in macrophages. This is important given that inflammasome activation in myeloid cells triggers proinflammatory cytokine release. The data are solid and support the idea that CARD8 is activated by the viral protease and promotes inflammation. However, time-course analyses in primary T cells and macrophages and further information on the specific inflammasome involved would further increase the significance of the study.

    2. Joint Public Review:

      Following up on their previous work, the authors investigated whether cell-to-cell transmission of HIV-1 activates the CARD8 inflammasome in macrophages, an important question given that inflammasome activation in myeloid cells triggers proinflammatory cytokine release. The data support the idea that CARD8 is activated by the viral protease and promotes inflammation. However, time-course analyses in primary T cells and macrophages and further information on the specific inflammasome involved would further increase the significance of the study.

      Strengths:

      The manuscript is well-written and the data are of good quality. The evidence that CARD8 senses the HIV-1 protease in the context of cell-to-cell transmission is important since cell-to-cell transmission is thought to play a key role in viral spread in vivo, and inflammation is a major driver of disease progression. Clean knockout experiments in primary macrophages are a notable strength and the results clearly support the role of CARD8 in protease-dependent sensing of viral spread and the induction of IL-1β release and cell death. The finding that HIV-1 strains that are resistant to protease inhibitors differ in CARD8 activation and IL-1β production is interesting and underscores the potential clinical relevance of these results.

      Weaknesses:

      One weakness is that the authors used T cell lines, which might not faithfully reflect the efficiency of HIV-1 production and cell-cell transfer by primary T cells. To assess whether CARD8 is also activated by protease from incoming viral particles, earlier time points should be analyzed. Finally, while the authors exclude a role of NLRP3 in IL-1β release and the death of macrophages, it would be interesting to know whether the effect is still Gasdermin D dependent.

    3. Author response:

      Thank you for the positive and constructive feedback on our manuscript. We appreciate you highlighting the importance of our work advancing our understanding of the molecular etiology of acquired immunodeficiency syndrome (AIDS). To extend and further substantiate the observation that the CARD8 inflammasome is activated in response to viral protease during HIV-1 cell-to-cell transmission, we are in the process of completing additional experiments that are responsive to reviewer feedback, including:

      • Primary CD4+ T cell to monocyte-derived macrophage (MDM) transmission:  We have now repeated the cell-to-cell experiments with HIV-1 transfer from primary CD4+ T cells to primary monocyte-derived macrophages, and our findings are consistent with CARD8-dependent IL-1β release from HIV-1-infected macrophages in this more physiologic context. We are in the process of repeating these experiments with additional donors and will add these results to the revised manuscript.

      • Heterogeneity amongst blood donors: We have now repeated the cell-to-cell transfer and CARD8 knockout in MDMs with additional donors. While we continue to observe heterogeneity amongst donors, the key observation that CARD8 is required for inflammasome responses to HIV-1 infection is consistent. We note that some donors, including the one individual reported in the first submission, have markedly diminished CARD8 activity (to both HIV-1 and VbP).

      • Time course experiments: We did conduct a time course experiment when initially establishing these assays. We have now repeated these experiments with additional timepoints and in the presence or absence of the RT inhibitor nevirapine. The results of these experiments will be included in the revised manuscript.

      • The role of Gasdermin D: We are mostly interested in the release of IL-1β from the infected macrophages due to its potential contribution to myeloid-driven inflammation in PLWH. To date, there is no evidence that any pore-forming protein other than GSDMD can initiate IL-1β release (and pyroptosis) downstream of CARD8. Nonetheless, we will attempt this experiment with the Gasdermin D inhibitor, disulfiram.

      We believe these and other experiments will further support the importance of the CARD8 inflammasome in myeloid-driven inflammation in PLWH and look forward to submitting the revision.

    1. eLife Assessment

      This valuable study investigates prey capture by archer fish, showing that even though the visuomotor behavior unfolds very rapidly (within 40-70 ms), it is not hardwired; it can adapt to different simulated physics and different prey shapes. Although there was agreement that the model system, experimental design, and main hypothesis are certainly interesting, opinions were divided on whether the evidence supporting the central claims is incomplete. A more rigorous definition and assessment of "reflex speed", more detailed evidence of stimulus control, and a more detailed analysis of individual subjects could potentially increase confidence in the main conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The authors test whether the archerfish can modulate the fast response to a falling target. By manipulating the trajectory of the target, they claim that the fish can modulate the fast response. While it is clear from the result that the fish can modulate the fast response, the experimental support for the argument that the fish can do it for a reflex-like behavior is inadequate.

      Strengths:

      Overall, the question that the authors raised in the manuscript is interesting.

      Weaknesses:

      (1) The argument that the fish can modulate reflex-like behavior relies on the claim that the archerfish makes the decision in 40 ms. There is little support for the 40 ms reaction time. The reaction time for the same behavior in Schlegel 2008, is 60-70 ms, and in Tsvilling 2012 about 75 ms, if we take the half height of the maximum as the estimated reaction time in both cases. If we take the peak (or average) of the distribution as an estimation of reaction time, the reaction time is even longer. This number is critical for the analysis the authors perform since if the reaction time is longer, maybe this is not a reflex as claimed. In addition, mentioning the 40 ms in the abstract is overselling the result. The title is also not supported by the results.

      (2) A critical technical issue of the stimulus delivery is not clear. The frame rate is 120 FPS and the target horizontal speed can be up to 1.775 m/s. This produces a target jumping on the screen 15 mm in each frame. This is not a continuous motion. Thus, the similarity between the natural system where the target experiences ballistic trajectory and the experiment here is not clear. Ideally, another type of stimulus delivery system is needed for a project of this kind that requires fast-moving targets (e.g. Reiser, J. Neurosci.Meth. 2008). In addition, the screen is rectangular and not circular, so in some directions, the target vanishes earlier than others. It must produce a bias in the fish response but there is no analysis of this type.

      (3) The results here rely on the ability to measure the error of response in the case of a virtual experiment. It is not clear how this is done since the virtual target does not fall. How do the authors validate that the fish indeed perceives the virtual target as the falling target? Since the deflection is at a later stage of the virtual trajectory, it is not clear what is the actual physics that governs the world of the experiment. Overall, the experimental setup is not well designed.
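      The per-frame displacement quoted in point (2) follows directly from dividing the target speed by the frame rate. A minimal sketch of that arithmetic, assuming the quoted 1.775 m/s is the on-screen speed (the function name is illustrative and not taken from the manuscript):

```python
def displacement_per_frame_mm(speed_m_per_s: float, frame_rate_hz: float) -> float:
    """Distance a target moves between two consecutive frames, in millimetres."""
    return speed_m_per_s / frame_rate_hz * 1000.0


# Values quoted in the review: up to 1.775 m/s at 120 frames per second.
step_mm = displacement_per_frame_mm(1.775, 120.0)
print(f"{step_mm:.1f} mm per frame")
```

      With these inputs the step is just under 15 mm, consistent with the figure given in the review.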

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript studies prey capture by archer fish, which observe the initial values of motion of aerial prey they made fall by spitting on them, and then rapidly turn to reach the ballistic landing point on the water surface. The question raised by the article is whether this incredibly fast decision-making process is hardwired and thus unmodifiable or can be adjusted by experience to follow a new rule, namely that the landing point is deflected by a certain amount from the expected ballistic landing point. The results show that the fish learn the new rule and use it afterward in a variety of novel situations that include height, side, and speed of the prey, and which preserve the speed of the fish's decision. Moreover, a remarkable finding presented in this work is the fact that fish that have learned to use the new rule can relearn to use the ballistic landing point for an object based on its shape (a triangle) while keeping simultaneously the 'deflected rule' for an object differing in shape (a disc); in other words, fish can master simultaneously two decision-making rules based on the different shape of objects.

      Strengths:

      The manuscript relies on a sophisticated and clever experimental design that allows changing the apparent landing point of a virtual prey using a virtual reality system. Several robust controls are provided to demonstrate the reliability and usefulness of the experimental setup.

      Overall, I very much like the idea conveyed by the authors that even stimuli triggering apparently hardwired responses can be relearned in order to be associated with a different response, thus showing the impressive flexibility of circuits that are sometimes considered mediating pure reflexive responses. This is the case - as an additional example - of the main component of the Nasonov pheromone of bees (geraniol), which triggers immediate reflexive attraction and appetitive responses, and which can, nevertheless, be learned by bees in association with an electric shock so that bees end up exhibiting avoidance and the aversive response of sting extension to this odorant (1), which is a fully unnatural situation, and which shows that associative aversive learning is strong enough to override preprogrammed responding, thus reflecting an impressive behavioral flexibility.

      Weaknesses:

      As a general remark, there is some information that I missed and that is mandatory in the analysis of behavioral changes.

      Firstly, the variability in the performances displayed. The authors mentioned that the results reported come from 6 fish (which is a low sample size). How were the individual performances in terms of consistency? Were all fish equally good in adjusting/learning the new rule? How did errors vary according to individual identity? It seems to me that this kind of information should be available as the authors reported that individual fish could be recognized and tracked (see lines 620-635) and is essential for appreciating the flexibility of the system under study.

      Secondly, the speed of the learning process is not properly explained. Admittedly, fish learn in an impressive way the new rule and even two rules simultaneously; yet, how long did they need to achieve this? In the article, Figure 2 mentions that at least 6 training stages (each defined as a block of 60 evaluated turn decisions, which actually shows that the standard term 'Training Block' would be more appropriate) were required for the fish to learn the 'deflected rule'. While this means 360 trials (turning starts), I was left with the question of how long this process lasted. How many hours, days, and weeks were needed for the fish to learn? And as mentioned above, were all fish equally fast in learning? I would appreciate explaining this very important point because learning dynamics is relevant to understanding the flexibility of the system.

      Reference:

      (1) Roussel, E., Padie, S. & Giurfa, M. Aversive learning overcomes appetitive innate responding in honeybees. Anim Cogn 15, 135-141, doi:10.1007/s10071-011-0426-1 (2012).

    4. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      The authors test whether the archerfish can modulate the fast response to a falling target.

      We have not tested whether archerfish can 'modulate the fast response'. We quantitatively test specific hypotheses on the rules used by the fish. For this the accuracy of the decisions is analyzed with respect to specific points that can be calculated precisely in each experiment. The ill-defined term 'modulate' in no way captures what is done here. This assessment might explain the question, raised by the reviewer, of 'what is the difference of this study and Reinel, 2016' (i.e. Reinel and Schuster, 2016). In that study, all objects were strictly falling ballistically, and latency and accuracy of the turn decisions were determined when the initial motion was not only horizontal but had an additional vertical component of speed. The question of that study was whether the need to account for an additional variable (vertical speed) in the decision would affect its latency or accuracy. The study showed that archerfish then also rapidly turn to the later impact point. It also showed that accuracy and latency (defined in that study exactly as in the present study) were not changed by the added degree of freedom. This is a completely different question and by its very nature does not leave the realm of ballistics.

      By manipulating the trajectory of the target, they claim that the fish can modulate the fast response.

      While it is clear from the result that the fish can modulate the fast response, the experimental support for the argument that the fish can do it for a reflex-like behavior is inadequate. 

      This is disturbing: The manuscript is full of data that directly report response latency (a parameter that's critical in all experiments) and there are even graphical displays of the distribution of latency (Figs. 2, 5). How fast the responses are can already be seen in the first video. Most importantly, the nature of the 40 ms limit was discovered and reported by our group in 2008 (Schlegel and Schuster, 2008, Fig. 4). For easy reference, we attach Schlegel and Schuster, 2008 with the relevant passages marked in yellow. But later studies also using high speed video (i.e. typically 500 fps) and simultaneously evaluating accuracy and kinematics (in the same ways as used here!) to address various questions repeatedly report and even graphically represent minimum latencies of 40 ms, e.g. Krupczynski and Schuster, 2013 (e.g. Fig. 2); Reinel and Schuster, 2014; Reinel and Schuster, 2016; Reinel and Schuster, 2018a, b (e.g. see Fig. 7 in the first part) and report how latency is increased as urgency is decreased (if the fish are too close or time of falling is increased), as temperature is decreased or as viewing conditions and their homogeneity across the tank change. Moreover, even a field study is available (Rischawy, Blum and Schuster, 2015) that shows why the speed is needed. This is because of massive competition, with at least some of the competitor fish also being able to turn to the later impact point. So, speed is an absolute necessity if competitors are around. Interestingly, when the fish are isolated, latency goes up and eventually the fish will no longer respond with C-starts (Schlegel and Schuster, 2008).

      Another aspect: considering the introduction it would not even have mattered if not 40 ms but instead 150 ms were the time needed for an accurate start (which is not the case). That would still be faster than an Olympic sprinter responds to a gun shot. Moreover, please also note that we carefully talk of reflex-speed, not of a reflex-behavior (which is as easy to verify as any other of the false statements made).

      Strengths: 

      Overall, the question that the authors raised in the manuscript is interesting. 

      Given the statement of no difference between the present study and Reinel and Schuster, 2016, it is not clear what this assessment refers to.

      Weaknesses: 

      (1) The argument that the fish can modulate reflex-like behavior relies on the claim that the archerfish makes the decision in 40 ms. There is little support for the 40 ms reaction time.

      The 'little support' is a paper in Science in which this important aspect is directly analyzed (Fig. 4 of that paper) and that has been praised by folks like Yadin Dudai (e.g. in Faculty 1000). The support is also data on latency as presented in the present paper. Furthermore, additional publications are available on the reaction time (see above).

      The reaction time for the same behavior in Schlegel 2008, is 60-70 ms, and in Tsvilling 2012 about 75 ms, if we take the half height of the maximum as the estimated reaction time in both cases. If we take the peak (or average) of the distribution as an estimation of reaction time, the reaction time is even longer. This number is critical for the analysis the authors perform since if the reaction time is longer, maybe this is not a reflex as claimed.

      See above.

      In addition, mentioning the 40 ms in the abstract is overselling the result.

      See above.

      Just for completeness: Considering a very interesting point raised by reviewer 2, we add a panel to further emphasize the exciting point that accuracy and latency are unrelated in the start decisions. That point was already made in Fig. 4 of the paper in Science but can be directly addressed.

      The title is also not supported by the results. 

      No: the title is clearly supported by the results that are reported in the paper.

      (2) A critical technical issue of the stimulus delivery is not clear.

      The stimulus delivery is described in detail. Most importantly we emphasize (even mentioning frame rate) that all VR setups require experimental confirmation that they work for the species and for the behavior at hand. Ideally, they should elicit the same behavior (in all aspects) as a real stimulus does that the VR approach intends to mimic. Whether VR works in a given animal and for the behavior at hand in that animal cannot be known or postulated a priori. It must be shown in direct critical experiments. Such experiments and the need to perform them are described in detail in Figure 2 and in the text that is associated with that figure.

      The frame rate is 120 FPS and the target horizontal speed can be up to 1.775 m/s. This produces a target jumping on the screen 15 mm in each frame. This is not a continuous motion. Thus, the similarity between the natural system where the target experiences ballistic trajectory and the experiment here is not clear. Ideally, another type of stimulus delivery system is needed for a project of this kind that requires fast-moving targets (e.g. Reiser, J. Neurosci.Meth. 2008).

      See above. It is quite funny that one of the authors of the present study had been involved in developing a setup with a complete panorama of 6000 LEDs (Strauss, Schuster and Götz, 1997; and appropriately cited in Reiser) that has been the basis for Reiser. This panorama was also used to successfully implement VR in freely walking Drosophila (Schuster et al., Curr. Biol., 2002). However, an LED based approach was abandoned because of insufficient spatial resolution (that, in archerfish, is very different from that of Drosophila).

      But the crucial point really is this: Just looking at Figure 2 shows that our approach could not have worked better in any way - it provided the input needed to cause turn decisions that are in all aspects just as those with real objects. Achieving this was not at all trivial and required enormous effort and many failed attempts. But it allows addressing our questions for the first time after 20 years of studying these interesting decisions.

      In addition, the screen is rectangular and not circular, so in some directions, the target vanishes earlier than others. It must produce a bias in the fish response but there is no analysis of this type. 

      Why 'must' it produce a bias? Is it not conceivable that you can only use a circular part of the screen? Briefly, and as could have been checked by quickly looking into the methods section, this is what we did. But still, why would it have mattered in our strictly randomized design? It could have mattered only in a completely silly way of running the experiments in which exclusively long trajectories are shown in one condition and exclusively short ones in another.

      (3) The results here rely on the ability to measure the error of response in the case of a virtual experiment. It is not clear how this is done since the virtual target does not fall.

      Well, of course it does not fall!!! That is the whole point that enables the study, and this is explained in connection with the glass plate experiment of Fig. 1 and quite some text is devoted to say that this is the starting point for the present analysis. The ballistic impact point is calculated (just as explained in our very first paper on the start decisions, Rossel, Corlija and Schuster, 2002) from the initial speed and height of the target, using simple high-school physics and the justification for that is also in that paper. This has been done already more than 20 years ago. How else could that paper have arrived at the conclusion that the fish turned to the virtual impact point even though nothing is falling? We also describe this for the readers of the present study, illustrate how accuracy is determined in Figures, in all videos and in an additional Supplementary Figure. Consulting the paper reveals that orientation of the fish is determined immediately at the end of stage 2 of its C-start and the error directly reports how close continuing in that direction would lead the fish to the (real or virtual) impact point. This measure has also been used since the first paper in 2002 in our lab and it is very useful because it provides an invariant measure that allows pooling all the different conditions (orientation and position of responding fish as well as direction, speed and height of target).
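      The calculation described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the target is taken to start with purely horizontal velocity, air resistance is neglected, g is set to 9.81 m/s², and the error is taken as the perpendicular distance between the impact point and the straight line along the fish's heading at the end of the C-start. All function and parameter names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def ballistic_landing_point(x0, y0, height_m, vx, vy):
    """Landing point on the water surface for a target released at (x0, y0),
    height_m above the surface, with horizontal velocity (vx, vy) in m/s.
    Assumes no initial vertical speed and no air resistance."""
    fall_time = math.sqrt(2.0 * height_m / G)  # from h = g * t^2 / 2
    return x0 + vx * fall_time, y0 + vy * fall_time


def aim_error(fish_xy, heading_rad, point_xy):
    """Perpendicular distance between point_xy and the line through fish_xy
    along heading_rad. Zero means continuing straight would pass through
    the point (hypothetical formalisation of the error described above)."""
    dx = point_xy[0] - fish_xy[0]
    dy = point_xy[1] - fish_xy[1]
    # Magnitude of the 2D cross product of the unit heading vector with (dx, dy).
    return abs(math.cos(heading_rad) * dy - math.sin(heading_rad) * dx)


# Example: target 0.5 m above the surface, moving at 1 m/s along x.
landing = ballistic_landing_point(0.0, 0.0, 0.5, 1.0, 0.0)
# A fish at (0.3, -0.2) heading straight along +y.
error_m = aim_error((0.3, -0.2), math.pi / 2, landing)
```

      Because the error is expressed relative to a computed point, the same function can be evaluated against any other point of interest, such as a deflected landing point or a landmark.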

      How do the authors validate that the fish indeed perceives the virtual target as the falling target?

      See above. The fish produce C-starts (whose kinematics are analyzed and reported in Figures), whose latency is measured (from onset of target motion to onset of C-start) and whose accuracy in aligning them to the calculated virtual impact point is measured (see above). Additionally, the errors are also analyzed to other points of interest, for instance landmarks, the ballistic landing point in the re-trained fish or points calculated on the basis of specific hypotheses in the generalization experiments.

      Since the deflection is at a later stage of the virtual trajectory, it is not clear what is the actual physics that governs the world of the experiment.

      As explained in the text, what we need is to substitute the ballistic connection with another fixed relation between initial target motion and the landing point. This other relation needs to produce a large error in the aims when they remain based on the ballistic virtual landing point. It is directly shown in the key experiments that the fish need not see the deflection but can respond appropriately to the initial motion after training (Figs. 3, 5 and corresponding paragraphs in the text as well as additional movies). Please also note that after training the decision is based on the initial movement. This is shown in the interspersed experiments in which nothing other than the initial (pre-deflection) movement was shown.

      Overall, the experimental setup is not well designed. 

      It is obviously designed well enough to mimic the natural situation in every aspect needed (see Fig. 2) and well enough to answer the questions we have asked.

      Reviewer #2 (Public review): 

      Summary: 

      This manuscript studies prey capture by archer fish, which observe the initial values of motion of aerial prey they made fall by spitting on them, and then rapidly turn to reach the ballistic landing point on the water surface. The question raised by the article is whether this incredibly fast decision-making process is hardwired and thus unmodifiable or can be adjusted by experience to follow a new rule, namely that the landing point is deflected by a certain amount from the expected ballistic landing point. The results show that the fish learn the new rule and use it afterward in a variety of novel situations that include height, side, and speed of the prey, and which preserve the speed of the fish's decision. Moreover, a remarkable finding presented in this work is the fact that fish that have learned to use the new rule can relearn to use the ballistic landing point for an object based on its shape (a triangle) while simultaneously keeping the 'deflected rule' for an object differing in shape (a disc); in other words, fish can simultaneously master two decision-making rules based on the different shapes of objects.

      Strengths: 

      The manuscript relies on a sophisticated and clever experimental design that allows changing the apparent landing point of a virtual prey using a virtual reality system. Several robust controls are provided to demonstrate the reliability and usefulness of the experimental setup. 

      Overall, I very much like the idea conveyed by the authors that even stimuli triggering apparently hardwired responses can be relearned in order to be associated with a different response, thus showing the impressive flexibility of circuits that are sometimes considered mediating pure reflexive responses.

      Thank you so much for this precise assessment of what we have shown!

      This is the case - as an additional example - of the main component of the Nasonov pheromone of bees (geraniol), which triggers immediate reflexive attraction and appetitive responses, and which can, nevertheless, be learned by bees in association with an electric shock so that bees end up exhibiting avoidance and the aversive response of sting extension to this odorant (1), which is a fully unnatural situation, and which shows that associative aversive learning is strong enough to override preprogrammed responding, thus reflecting an impressive behavioral flexibility.

      That's very interesting, thanks.

      Weaknesses: 

      As a general remark, there is some information that I missed and that is mandatory in the analysis of behavioral changes. 

      Firstly, the variability in the performances displayed. The authors mentioned that the results reported come from 6 fish (which is a low sample size). How were the individual performances in terms of consistency? Were all fish equally good in adjusting/learning the new rule? How did errors vary according to individual identity? It seems to me that this kind of information should be available as the authors reported that individual fish could be recognized and tracked (see lines 620-635) and is essential for appreciating the flexibility of the system under study. 

      Secondly, the speed of the learning process is not properly explained. Admittedly, fish learn in an impressive way the new rule and even two rules simultaneously; yet, how long did they need to achieve this? In the article, Figure 2 mentions that at least 6 training stages (each defined as a block of 60 evaluated turn decisions, which actually shows that the standard term 'Training Block' would be more appropriate) were required for the fish to learn the 'deflected rule'. While this means 360 trials (turning starts), I was left with the question of how long this process lasted. How many hours, days, and weeks were needed for the fish to learn? And as mentioned above, were all fish equally fast in learning? I would appreciate explaining this very important point because learning dynamics is relevant to understanding the flexibility of the system. 

      First, it is very important to keep the question in mind that we wanted to clarify: Does the system have the potential to re-tune the decisions to other non-ballistic relations between the input variables and the output? This would have been established if one fish was found capable of doing that. However, we do have sufficient evidence to say that all six fish learned the new law and that at least one (actually four) individual was capable of simultaneously handling the two laws. We will explain this much better (hopefully) in our revised version. We also have to stress that not all archerfish might actually be able to do this and that not all archerfish might learn in the same way, at the same speed, or using the same strategies. These questions are extremely interesting and we therefore definitely will include all evidence that we have. If some individuals are better than others in quickly adjusting, then even observational learning could become a part of the story. However, we needed to make and document the first steps. Understanding these is essential and apparently is difficult enough.

      Reference: 

      (1) Roussel, E., Padie, S. & Giurfa, M. Aversive learning overcomes appetitive innate responding in honeybees. Anim Cogn 15, 135-141, doi:10.1007/s10071-011-0426-1 (2012). 

      Thanks for this reference!

    1. eLife Assessment

      This study provides evidence that cerebellar projections to the thalamus are required for learning and execution of motor skills in the accelerating rotarod task. This important study adds to a growing body of literature on the interactions between the cerebellum, motor cortex, and basal ganglia during motor learning. The data presentation is generally sound, especially the main observations, with some limitations in describing the statistical methods and a lack of support for two segregated cerebello-thalamic pathways, which is incomplete in supporting the overall claim.

    2. Reviewer #1 (Public review):

      This is an interesting manuscript tackling the issue of whether subcircuits of the cerebellum are differentially involved in processes of motor performance, learning, or learning consolidation. The authors focus on cerebellar outputs to the ventrolateral thalamus (VL) and to the centrolateral thalamus (CL), since these thalamic nuclei project to the motor cortex and striatum respectively, and thus might be expected to participate in diverse components of motor control and learning. In mice challenged with an accelerating rotarod, the investigators reduce cerebellar output either broadly, or in projection-specific populations, with CNO targeting DREADD-expressing neurons. They first establish that there are not major control deficits with the treatment regime, finding no differences in basic locomotor behavior, grid test, and fixed-speed rotarod. This is interpreted to allow them to differentiate control from learning, and their inter-relationships. These manipulations are coupled with chronic electrophysiological recordings targeted to the cerebellar nuclei (CN) to control for the efficacy of the CNO manipulation. I found the manuscript intriguing, offering much food for thought, and am confident that it will influence further work on motor learning consolidation. The issue of motor consolidation supported by the cerebellum is timely and interesting, and the claims are novel. There are some limitations to the data presentation and claims, highlighted below, which, if amended, would improve the manuscript.

      (1) Statistical analyses: There is too little information provided about how the Deming regressions, mean points, slopes, and intercepts were compared across conditions. This is important since in the heart of the study when the effects of inactivating CL- vs VL- projecting neurons are being compared to control performance, these statistical methods become paramount. Details of these comparisons and their assumptions should be added to the Methods section. As it stands I barely see information about these tests, and only in the figure legends. I would also like the authors to describe whether there is a criterion for significance in a given correlation to be then compared to another. If I have a weak correlation for a regression model that is non-significant, I would not want to 'compare' that regression to another one since it is already a weak model. The authors should comment on the inclusion criteria for using statistics on regression models.

      (2) The introduction makes the claim that the cerebellar feedback to the forebrain and cortex are functionally segregated. I interpreted this to mean that the cerebellar output neurons are known to project to either VL or CL exclusively (i.e. they do not collateralize). I was unaware of this knowledge and could find no support for the claim in the references provided (Proville 2014; Hintzen 2018; Bostan 2013). Either I am confused as to the authors' meaning or the claim is inaccurate. This point is broader however than some confusion about citation. The study assumes that the CN-CL population and CN-VL population are distinct cells, but to my knowledge, this has not been established. It is difficult to make sense of the data if they are entirely the same populations, unless projection topography differs, but in any event, it is critical to clarify this point: are these different cell types from the nuclei?; how has that been rigorously established?; is there overlap? No overlap? Etc. Results should be interpreted in light of the level of this knowledge of the anatomy in the mouse or rat.

      (3) It is commendable that the authors perform electrophysiology to validate DREADD/CNO. So many investigators don't bother and I really appreciate these data. Would the authors please show the 'wash' in Figure 1a, so that we can see the recovery of the spiking hash after CNO is cleared from the system? This would provide confidence that the signal is not disappearing for reasons of electrode instability or tissue damage/ other.

      (4) I don't think that the "Learning" and "Maintenance" terminology is very helpful and in fact may sow confusion. I would recommend that the authors use a day range " Days 1-3 vs 4-7" or similar, to refer to these epochs. The terminology chosen begs for careful validation, definitions, etc, and seems like it is unlikely uniform across all animals, thus it seems more appropriate to just report it straight, defining the epochs by day. Such original terminology could still be used in the Discussion, with appropriate caveats.

      (5) Minor, but, on the top of page 14 in the Results, the text states, "Suggesting the presence of a 'critical period' in the consolidation of the task". I think this is a non-standard use of 'critical period' and should be removed. If kept, the authors must define what they mean specifically and provide sufficient additional analyses to support the idea. As it stands, the point will sow confusion.

    3. Reviewer #2 (Public review):

      Summary:

      This study examines the contribution of cerebello-thalamic pathways to motor skill learning and consolidation in an accelerating rotarod task. The authors use chemogenetic silencing to manipulate the activity of cerebellar nuclei neurons projecting to two thalamic subregions that target the motor cortex and striatum. By silencing these pathways during different phases of task acquisition (during the task vs after the task), the authors report valuable findings of the involvement of these cerebellar pathways in learning and consolidation.

      Strengths:

      The experiments are well-executed. The authors perform multiple controls and careful analysis to solidly rule out any gross motor deficits caused by their cerebellar nuclei manipulation. The finding that cerebellar projections to the thalamus are required for learning and execution of the accelerating rotarod task adds to a growing body of literature on the interactions between the cerebellum, motor cortex, and basal ganglia during motor learning. The finding that silencing the cerebellar nuclei after a task impairs the consolidation of the learned skill is interesting.

      Weaknesses:

      While the controls for a lack of gross motor deficit are solid, the data seem to show some motor execution deficit when cerebellar nuclei are silenced during task performance. This deficit could potentially impact learning when cerebellar nuclei are silenced during task acquisition. Separately, I find the support for two separate cerebello-thalamic pathways incomplete. The data presented do not clearly show the two pathways are anatomically parallel. The difference in behavioral deficits caused by manipulating these pathways also appears subtle.

    4. Reviewer #3 (Public review):

      Summary:

      Varani et al present important findings regarding the role of distinct cerebellothalamic connections in motor learning and performance. Their key findings are that:

      (1) cerebellothalamic connections are important for learning motor skills

      (2) cerebellar efferents specifically to the central lateral (CL) thalamus are important for short-term learning

      (3) cerebellar efferents specifically to the ventral anterior lateral (VAL) complex are important for offline consolidation of learned skills, and

      (4) that once a skill is acquired, cerebellothalamic connections become important for online task performance.

      The authors went to great lengths to separate effects on motor performance from learning, for the most part successfully. While one could argue about some of the specifics, there is little doubt that the CN-CL and CN-VAL pathways play distinct roles in motor learning and performance. An important next step will be to dissect the downstream mechanisms by which these cerebellothalamic pathways mediate motor learning and adaptation.

      Strengths:

      (1) The dissociation between online learning through CN-CL and offline consolidation through CN-VAL is convincing.

      (2) The ability to tease learning apart from performance using their titrated chemogenetic approach is impressive. In particular, their use of multiple motor assays to demonstrate preserved motor function and balance is an important control.

      (3) The evidence supporting the main claims is convincing, with multiple replications of the findings and appropriate controls.

      Weaknesses:

      (1) Despite the care the authors took to demonstrate that their chemogenetic approach does not impair online performance, there is a trend towards impaired rotarod performance at higher speeds in Supplementary Figure 4f, suggesting that there could be subtle changes in motor performance below the level of detection of their assays.

      (2) There is likely some overlap between CN neurons projecting to VAL and CL, somewhat limiting the specificity of their conclusions.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      This is an interesting manuscript tackling the issue of whether subcircuits of the cerebellum are differentially involved in processes of motor performance, learning, or learning consolidation. The authors focus on cerebellar outputs to the ventrolateral thalamus (VL) and to the centrolateral thalamus (CL), since these thalamic nuclei project to the motor cortex and striatum respectively, and thus might be expected to participate in diverse components of motor control and learning. In mice challenged with an accelerating rotarod, the investigators reduce cerebellar output either broadly, or in projection-specific populations, with CNO targeting DREADD-expressing neurons. They first establish that there are not major control deficits with the treatment regime, finding no differences in basic locomotor behavior, grid test, and fixed-speed rotarod. This is interpreted to allow them to differentiate control from learning, and their inter-relationships. These manipulations are coupled with chronic electrophysiological recordings targeted to the cerebellar nuclei (CN) to control for the efficacy of the CNO manipulation. I found the manuscript intriguing, offering much food for thought, and am confident that it will influence further work on motor learning consolidation. The issue of motor consolidation supported by the cerebellum is timely and interesting, and the claims are novel. There are some limitations to the data presentation and claims, highlighted below, which, if amended, would improve the manuscript.

      We thank the reviewer for the positive comments and insightful critiques.

      (1.1) Statistical analyses: There is too little information provided about how the Deming regressions, mean points, slopes, and intercepts were compared across conditions. This is important since in the heart of the study when the effects of inactivating CL- vs VL- projecting neurons are being compared to control performance, these statistical methods become paramount. Details of these comparisons and their assumptions should be added to the Methods section. As it stands I barely see information about these tests, and only in the figure legends. I would also like the authors to describe whether there is a criterion for significance in a given correlation to be then compared to another. If I have a weak correlation for a regression model that is non-significant, I would not want to 'compare' that regression to another one since it is already a weak model. The authors should comment on the inclusion criteria for using statistics on regression models.

      Currently the Methods indeed explain that groups are compared by testing differences of distributions of residuals of treatment and control groups around the Deming regression of the control groups: “To test if treatments altered the relationship between initial performance vs learning or daily vs overnight learning, we compared the distribution of signed distance to the control Deming regression line between groups.” But this shall indeed be explained in more detail.

      The performance on a given day depends on a cumulative process, so that the average measure of performance is not fully informative about what is learned or what is changed by a treatment (this is further explained in the text p9-10). The challenge is to deal with the multivariate relationships where initial performance, daily learning, and consolidated learning are interdependent. While in control groups these quantities show linear relationships, this is far less the case in treatment groups; this may indeed be due to the variability of the effect of the treatment (efficacy of viral injections), which adds to the intrinsic variability in the absence of treatment.

      To see whether treatments shift these relationships, we examine to what extent the treatment points in bivariate comparisons (initial performance × daily learning, daily learning × consolidated learning) are evenly distributed around the control group regression line. We take a significant difference in the distribution of residuals between the control and treatment groups as an indication that the process represented in the comparison is disrupted by the treatment: e.g. if the residuals of the treatment group are lower (or higher) than those of the control group in the initial performance × daily learning comparison, it indicates that learning is slower (or faster). If the residuals of the treatment group are lower than those of the control group in the daily learning × consolidated learning comparison, it indicates that consolidation is lower. This shall be clarified in a revised version.
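
      The residual comparison described above can be illustrated with a short sketch. It is a generic textbook Deming fit, not the authors' analysis code: the variance ratio `delta` (set to 1 here, i.e. orthogonal regression) and the function names are assumptions, and the statistical test used to compare the two distributions of signed distances is not specified in this response, so the sketch stops at computing the distances.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Fit y = intercept + slope * x by Deming regression.

    `delta` is the assumed ratio of the error variances (y over x);
    delta = 1 gives orthogonal regression. Requires a non-zero
    covariance between x and y.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    intercept = my - slope * mx
    return slope, intercept


def signed_distances(x, y, slope, intercept):
    """Signed perpendicular distance of each point to the fitted line.

    Negative values lie below the line, positive values above it.
    """
    norm = math.sqrt(1.0 + slope ** 2)
    return [(yi - (intercept + slope * xi)) / norm for xi, yi in zip(x, y)]
```

      In use, the line would be fitted on the control group only, and `signed_distances` would then be evaluated for both control and treatment points against that same line before comparing the two sets of distances.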

      (1.2a) The introduction makes the claim that the cerebellar feedback to the forebrain and cortex are functionally segregated. I interpreted this to mean that the cerebellar output neurons are known to project to either VL or CL exclusively (i.e. they do not collateralize). I was unaware of this knowledge and could find no support for the claim in the references provided (Proville 2014; Hintzen 2018; Bostan 2013). Either I am confused as to the authors' meaning or the claim is inaccurate. This point is broader however than some confusion about citation.

      The references are not cited in the context of collaterals: “They [basal ganglia and cerebellum] send projections back to the cortex via anatomically and functionally segregated channels, which are relayed by predominantly non-overlapping thalamic regions (Bostan, Dum et al. 2013, Proville, Spolidoro et al. 2014, Hintzen, Pelzer et al. 2018).” Indeed, the thalamic compartments targeted by the basal ganglia and cerebellum are distinct, and in Proville et al. 2014 we showed some functional segregation of the cerebello-cortical projections (whisker vs orofacial ascending projections). We do not claim that there is a full segregation of the two pathways; there is indeed some known degree of collateralization (see below).

      (1.2b) The study assumes that the CN-CL population and CN-VL population are distinct cells, but to my knowledge, this has not been established. It is difficult to make sense of the data if they are entirely the same populations, unless projection topography differs, but in any event, it is critical to clarify this point: are these different cell types from the nuclei?; how has that been rigorously established?; is there overlap? No overlap? Etc. Results should be interpreted in light of the level of this knowledge of the anatomy in the mouse or rat.

      The study does not assume that CL-projecting and VAL-projecting neurons are entirely separate populations (an overlap is in fact known), but states that inhibition of neurons following retrograde infections from the CL and VAL does not produce identical results.

      There is indeed a paragraph devoted to the discussion of this point (middle paragraph p20): “Interestingly, both Dentate and Interposed nuclei contain some neurons with collaterals in both VAL and CL thalamic structures (Aumann and Horne 1996, Sakayori, Kato et al. 2019), suggesting that the effect on learning could be mediated by a combined action on the learning process in the striatum (via the CL thalamus) and in the cortex (via the VAL thalamus). However, consistent with (Sakayori, Kato et al. 2019), we found that the manipulations of cerebellar neurons retrogradely targeted either from the CL or from the VAL produced different effects in the task. This indicates that either the distinct functional roles of VAL-projecting of CL-projecting neurons reported in our study is carried by a subset of pathway-specific neurons without collaterals, or that our retrograde infections in VAL and CL preferentially targeted different cerebello-thalamic populations even if these populations had axon terminals in both thalamic regions.” In other words, we know from the literature that there is a degree of collateralization (CN neurons projecting to both VAL and CL, see refs cited above), but, as the reviewer says, it does not seem logically possible that the exact same population would have different effects, which are very distinct during the first learning days. The only possible explanation is that the CN-CL and CN-VAL retrograde infections recruit somewhat different populations of neurons. This could be due to differences in the density of collaterals in CL and VAL for neurons with collaterals in both regions, or to the presence of CL-projecting neurons without collaterals in VAL and of VAL-projecting neurons without collaterals in CL, in addition to the (established) population of neurons with collaterals in both regions. The lesional approach of CN-thalamus neurons in Sakayori et al. 2019 also yielded separate effects for CL and VL injections, consistent with the differential recruitment of CN populations by retrograde infections.

      This should be improved in a revised version of the manuscript.

      (1.3) It is commendable that the authors perform electrophysiology to validate DREADD/CNO. So many investigators don't bother and I really appreciate these data. Would the authors please show the 'wash' in Figure 1a, so that we can see the recovery of the spiking hash after CNO is cleared from the system? This would provide confidence that the signal is not disappearing for reasons of electrode instability or tissue damage/ other.

      We do not have the wash data on the same day, but there is no significant change in the baseline firing rate across recording days.

      (1.4) I don't think that the "Learning" and "Maintenance" terminology is very helpful and in fact may sow confusion. I would recommend that the authors use a day range " Days 1-3 vs 4-7" or similar, to refer to these epochs. The terminology chosen begs for careful validation, definitions, etc, and seems like it is unlikely uniform across all animals, thus it seems more appropriate to just report it straight, defining the epochs by day. Such original terminology could still be used in the Discussion, with appropriate caveats.

      This shall be indeed corrected in a revised version.

      (1.5) Minor, but, on the top of page 14 in the Results, the text states, "Suggesting the presence of a 'critical period' in the consolidation of the task". I think this is a non-standard use of 'critical period' and should be removed. If kept, the authors must define what they mean specifically and provide sufficient additional analyses to support the idea. As it stands, the point will sow confusion.

      This shall be indeed corrected in a revised version.

      Reviewer #2 (Public review):

      Summary:

      This study examines the contribution of cerebello-thalamic pathways to motor skill learning and consolidation in an accelerating rotarod task. The authors use chemogenetic silencing to manipulate the activity of cerebellar nuclei neurons projecting to two thalamic subregions that target the motor cortex and striatum. By silencing these pathways during different phases of task acquisition (during the task vs after the task), the authors report valuable findings of the involvement of these cerebellar pathways in learning and consolidation.

      Strengths:

      The experiments are well-executed. The authors perform multiple controls and careful analysis to solidly rule out any gross motor deficits caused by their cerebellar nuclei manipulation. The finding that cerebellar projections to the thalamus are required for learning and execution of the accelerating rotarod task adds to a growing body of literature on the interactions between the cerebellum, motor cortex, and basal ganglia during motor learning. The finding that silencing the cerebellar nuclei after a task impairs the consolidation of the learned skill is interesting.

      We thank the reviewer for the positive comments and insightful critiques below.

      Weaknesses:

      (2.1) While the controls for a lack of gross motor deficit are solid, the data seem to show some motor execution deficit when cerebellar nuclei are silenced during task performance. This deficit could potentially impact learning when cerebellar nuclei are silenced during task acquisition.

      One of our key controls is the test of the treatment on the fixed-speed rotarod, which provides the closest conditions to the ones found in the accelerating rotarod (the main difference between the protocols being the slow steady acceleration of rod rotation [+0.12 rpm per s] in the accelerating version).

      In the CN experiments, we found clear deficits in learning and consolidation while there was no effect on the fixed-speed rotarod (performances of the DREADD-CNO group are even slightly better than those of some control groups), consistent with a separation of the effects on learning/consolidation from those on locomotion on a rotarod. However, small but measurable deficits are found at the highest speed in the fixed-speed rotarod in the CN-VAL group. There was no significant effect in the CN-CL group, although the CN-CL group actually shows lower performances from the second day of learning; we believe this supports our claim that the CN-CL inhibition impacted the learning process more than the motor coordination. In contrast, the CN-VAL group only showed significantly lower performance on day 4 of the accelerating rotarod, consistent with intact learning abilities. Of note, under CNO, CN-VAL mice could stay for more than a minute and a half at 20 rpm, while on average they fell from the accelerating rotarod as soon as the rotarod reached the speed of ~19 rpm (130 s).
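
      The correspondence between latency to fall and rod speed used in this argument follows from the constant acceleration of +0.12 rpm per s quoted above. A minimal sketch is given below; the starting speed of 4 rpm is an assumption (it is not stated in this response), as is the function name.

```python
ACCEL_RPM_PER_S = 0.12   # acceleration quoted in the response
START_RPM = 4.0          # assumed starting speed; not stated in the response


def speed_at_fall(latency_s, start_rpm=START_RPM, accel=ACCEL_RPM_PER_S):
    """Rod speed (rpm) reached on the accelerating rotarod at the moment
    the animal falls, for a linear speed ramp."""
    return start_rpm + accel * latency_s


# Latencies of 100 s and 130 s, rounded to one decimal place
print(round(speed_at_fall(100.0), 1))
print(round(speed_at_fall(130.0), 1))
```

      Under the assumed 4 rpm start, latencies of 100 s and 130 s correspond to about 16 and 19.6 rpm, broadly in line with the approximate speeds of ~15 and ~19 rpm quoted in this response; a slightly lower starting speed would match them more closely.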

      The text currently states “The inhibition of CN-VAL neurons during the task also yielded lower levels of performance in the Maintenance stage, [NB: days 5-7] suggesting that these neurons contribute also to learning and retrieval of motor skills, although the mild defect in fixed speed rotarod could indicate the presence of a locomotor deficit, only visible at high speed.” Following the reviewer's comment, we shall revise this sentence in the revised version of the MS to say that we cannot fully disambiguate the execution / learning-retrieval effect at high speed for these mice.

      (2.2a) Separately, I find the support for two separate cerebello-thalamic pathways incomplete. The data presented do not clearly show the two pathways are anatomically parallel.

      As explained above (point 1.2a), it is already known that these pathways overlap to some degree (discussion p 20), yet their targeting differentially affects the behavior, consistent with separate contributions. A similar finding was observed for a lesional (irreversible) approach in Sakayori et al. 2019.

      (2.2b) The difference in behavioral deficits caused by manipulating these pathways also appears subtle.

      While we agree that after 3-4 days of learning the difference in performance between the groups becomes elusive, we respectfully disagree that these differences are negligible in the early stages: the impact of inhibition on "learning rate" (i.e. amount of learning for a given daily initial performance) and on consolidation (i.e. overnight retention of the daily gain of performance) exhibits different profiles for the two groups (Fig. 3h vs 3k).

      Reviewer #3 (Public review):

      Summary:

      Varani et al present important findings regarding the role of distinct cerebellothalamic connections in motor learning and performance. Their key findings are that:

      (1) cerebellothalamic connections are important for learning motor skills

      (2) cerebellar efferents specifically to the central lateral (CL) thalamus are important for short-term learning

      (3) cerebellar efferents specifically to the ventral anterior lateral (VAL) complex are important for offline consolidation of learned skills, and

      (4) that once a skill is acquired, cerebellothalamic connections become important for online task performance.

      The authors went to great lengths to separate effects on motor performance from learning, for the most part successfully. While one could argue about some of the specifics, there is little doubt that the CN-CL and CN-VAL pathways play distinct roles in motor learning and performance. An important next step will be to dissect the downstream mechanisms by which these cerebellothalamic pathways mediate motor learning and adaptation.

      Strengths:

      (1) The dissociation between online learning through CN-CL and offline consolidation through CN-VAL is convincing.

      (2) The ability to tease learning apart from performance using their titrated chemogenetic approach is impressive. In particular, their use of multiple motor assays to demonstrate preserved motor function and balance is an important control.

      (3) The evidence supporting the main claims is convincing, with multiple replications of the findings and appropriate controls.

      We thank the reviewer for the positive comments and the insightful critiques below.

      Weaknesses:

      (3.1) Despite the care the authors took to demonstrate that their chemogenetic approach does not impair online performance, there is a trend towards impaired rotarod performance at higher speeds in Supplementary Figure 4f, suggesting that there could be subtle changes in motor performance below the level of detection of their assays.

      This is also discussed in point 2.1 above. In our view, the fixed-speed rotarod is a control very close to the accelerating rotarod condition, with very similar requirements between the two tasks (yet unfortunately it is rarely tested in accelerating rotarod studies). We do not exclude the presence of motor deficits, but the main argument is that these do not suffice to explain the differences observed in the accelerating rotarod. No detectable deficit was found in the CN group, while very clear deficits in learning/consolidation were observed. A mild deficit is only significant in the CN-VAL group, while the deficit is not significant in the fixed-speed rotarod for the CN-CL group, which shows the strongest deficit in the accelerating rotarod during the first days: e.g. on day 2, the CN-CL group is already below the control group, with latencies to fall of ~100 s (corresponding to an immediate fall at ~15 rpm), while the fixed-speed rotarod performances at 15 rpm of the control and CNO-treated groups show an ability to stay more than 1 min at this speed. The text will be improved to clarify this point.

      (3.2) There is likely some overlap between CN neurons projecting to VAL and CL, somewhat limiting the specificity of their conclusions.

      There is indeed published evidence for some degree of anatomical overlap, but also for some differential contribution of CN-VAL and CN-CL to the task. The answer to this point is developed in points 1.2a and 2.2a above. Although this point was raised in the discussion (p20), the text will be improved in a revised version of the manuscript to clarify our statement.

    1. eLife Assessment

      This important study advances our understanding of the way neurons in the auditory cortex of mice respond to unpredictable sounds. Through the use of state-of-the-art recording methods, compelling evidence is provided that responses to local and global violations in sound sequences are prediction errors and not simply the consequence of stimulus-specific adaptation. Although the cell-type-specific results are intriguing, further work is needed to establish their reliability.

    2. Reviewer #1 (Public review):

      Summary:

      The authors successfully detected distinct mechanisms signalling prediction violations in the auditory cortex of mice. For this purpose, an auditory pure-tone local-global paradigm was presented to awake and anaesthetised mice. In awake rodents, the authors also evaluated interneuron cell types involved in responses to the interruption of the regularity imposed by local-global sequences. By performing two-photon calcium imaging and single-unit electrophysiology, the authors disentangled three phenomena underlying responses to violations of the distinct local-global regularity levels: Stimulus-specific adaptation, surprise and surprise adaptation. Both stimulus-specific adaptation and surprise- or deviant-evoked responses are observable under anaesthesia. Altogether, this work advances our understanding of distinct predictive processes computing prediction violations upon the complexity of the regularity imposed by the auditory sequence.

      Strengths:

      It is an elegant study, beautifully executed.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    3. Reviewer #2 (Public review):

      Summary:

      Oddball responses are increases in sensory responses when a stimulus is encountered in an unexpected location in a sequence of predictable stimuli. There are two computational interpretations for these responses: stimulus-specific adaptation and prediction errors. In recent years, evidence has accumulated that a significant part of these sequence violation responses cannot be explained simply by stimulus-specific adaptation. The current work elegantly adds to this evidence by using a sequence paradigm based on two levels of sequence violations: "Local" sequence violations of repetitions of identical stimuli, and "global" sequence violations of stimulus sequence patterns. The authors demonstrate that both local and global sequence violation responses are found in L2/3 neurons of the mouse auditory cortex. Using sequences with different inter-stimulus intervals, they further demonstrate that these sequence violation responses cannot be explained by stimulus-specific adaptation.

      Strengths:

      The work is based on a very clever use of a sequence violation paradigm (local-global paradigm) and provides convincing evidence for the interpretation that there are at least two types of sequence violation responses and that these cannot be explained by stimulus-specific adaptation. Most of the conclusions are based on a large dataset and are compelling.

      Weaknesses:

      The final part of the paper focuses on the responses of VIP and PV-positive interneurons. The responses of VIP interneurons appear somewhat variable and difficult to interpret (e.g. VIP neurons exhibit omission responses in the A block, but not the B block). The conclusions based on these data appear less solid.

    4. Reviewer #3 (Public review):

      Summary:

      In their manuscript entitled "Parallel mechanisms signal a hierarchy of sequence structure violations in the auditory cortex", Jamali et al. provide evidence for cellular-level mechanisms in the auditory cortex of mice for the encoding of predictive information on different temporal and contextual scales. The study design separates more clearly than previous studies between the effects of local and global deviants and separates their respective effects on the neuronal responses clearly through the use of various contextual conditions and short and long time scales. Further, it identifies a contribution from a small set of VIP interneurons to the detection of omitted sounds, and shows the influence of isoflurane anesthesia on the neural responses.

      Strengths:

      (1) The study provides a rather encompassing set of experimental techniques to study the cellular level responses, using two complementary recording techniques in the same animal and similar cortical location.

      (2) Comparisons between awake and anesthetized states are conducted in the same animals, which allows for a rather direct comparison of populations under different conditions, thus reducing sampling variability.

      (3) The set of paradigms is well developed and specifically chosen to provide appropriate and meaningful controls/comparisons, which were missing from previous studies.

      (4) The addition of cell-type specific recordings is valuable and in particular in combination with the contrast of awake and anesthetized animals provides novel insights into the cellular level representation of deviant signals, such as surprise, prediction error, and general adaptation.

      (5) The analysis and presentation of the data are clear and quite complete, yet remain succinct and perform insightful contrasts.

      (6) The study will have an impact on multiple levels, as it introduces important variations in the paradigm and analytical contrasts that both human and animal researchers can pick up and improve their studies. The cell-type-specific results are particularly intriguing, although these would likely require replication before being completely reliable. Further, the study provides a substantial and diverse dataset that others can explore.

      Weaknesses:

      (1) The responses from cells recorded via Neuropixel and 2p differ qualitatively, as noted by the authors, with NP-recorded cells showing much more inhibited/reduced responses between acoustic stimulations. The authors briefly qualify these differences as potentially indicating a sampling issue; however, this matter deserves more detailed consideration in my opinion. Specifically, the authors could try to compare the different depths at which these neurons were sampled or relate the locations in the cortex to each other (as the Neuropixel recordings were collected in the same animals, a subset of the 2p recordings could be compared to the Neuropixel recordings).

      (2) The current study did not monitor the attentional state of the mouse in relation to the stimulus by either including a behavioral component or pupil monitoring, which could influence the neural responses to deviant stimuli and omissions.

      (3) Given the complexity and variety of the paradigms, conditions, and analyzed cell-types, the manuscript could profit from a more visual summary figure that provides an easy-to-access overview of what was found.

    5. Author response:

      Reviewer #1 (Public review):

      Summary:

      The authors successfully detected distinct mechanisms signalling prediction violations in the auditory cortex of mice. For this purpose, an auditory pure-tone local-global paradigm was presented to awake and anaesthetised mice. In awake rodents, the authors also evaluated interneuron cell types involved in responses to the interruption of the regularity imposed by local-global sequences. By performing two-photon calcium imaging and single-unit electrophysiology, the authors disentangled three phenomena underlying responses to violations of the distinct local-global regularity levels: Stimulus-specific adaptation, surprise and surprise adaptation. Both stimulus-specific adaptation and surprise- or deviant-evoked responses are observable under anaesthesia. Altogether, this work advances our understanding of distinct predictive processes computing prediction violations upon the complexity of the regularity imposed by the auditory sequence.

      Strengths:

      It is an elegant study, beautifully executed.

      Weaknesses:

      No weaknesses were identified by this reviewer.

      Reviewer #2 (Public review):

      Summary:

      Oddball responses are increases in sensory responses when a stimulus is encountered in an unexpected location in a sequence of predictable stimuli. There are two computational interpretations for these responses: stimulus-specific adaptation and prediction errors. In recent years, evidence has accumulated that a significant part of these sequence violation responses cannot be explained simply by stimulus-specific adaptation. The current work elegantly adds to this evidence by using a sequence paradigm based on two levels of sequence violations: "Local" sequence violations of repetitions of identical stimuli, and "global" sequence violations of stimulus sequence patterns. The authors demonstrate that both local and global sequence violation responses are found in L2/3 neurons of the mouse auditory cortex. Using sequences with different inter-stimulus intervals, they further demonstrate that these sequence violation responses cannot be explained by stimulus-specific adaptation.

      Strengths:

      The work is based on a very clever use of a sequence violation paradigm (local-global paradigm) and provides convincing evidence for the interpretation that there are at least two types of sequence violation responses and that these cannot be explained by stimulus-specific adaptation. Most of the conclusions are based on a large dataset and are compelling.

      Weaknesses:

      The final part of the paper focuses on the responses of VIP and PV-positive interneurons. The responses of VIP interneurons appear somewhat variable and difficult to interpret (e.g. VIP neurons exhibit omission responses in the A block, but not the B block). The conclusions based on these data appear less solid.

      We agree with the referee that the response modulations observed in VIP and PV-positive interneurons are weak and variable. This is indicated in the manuscript. Larger-scale recordings are probably necessary to fully ascertain the presence and distribution of omission responses.

      Reviewer #3 (Public review):

      Summary:

      In their manuscript entitled "Parallel mechanisms signal a hierarchy of sequence structure violations in the auditory cortex", Jamali et al. provide evidence for cellular-level mechanisms in the auditory cortex of mice for the encoding of predictive information on different temporal and contextual scales. The study design separates more clearly than previous studies between the effects of local and global deviants and separates their respective effects on the neuronal responses clearly through the use of various contextual conditions and short and long time scales. Further, it identifies a contribution from a small set of VIP interneurons to the detection of omitted sounds, and shows the influence of isoflurane anesthesia on the neural responses.

      Strengths:

      (1) The study provides a rather encompassing set of experimental techniques to study the cellular level responses, using two complementary recording techniques in the same animal and similar cortical location.

      (2) Comparisons between awake and anesthetized states are conducted in the same animals, which allows for a rather direct comparison of populations under different conditions, thus reducing sampling variability.

      (3) The set of paradigms is well developed and specifically chosen to provide appropriate and meaningful controls/comparisons, which were missing from previous studies.

      (4) The addition of cell-type specific recordings is valuable and in particular in combination with the contrast of awake and anesthetized animals provides novel insights into the cellular level representation of deviant signals, such as surprise, prediction error, and general adaptation.

      (5) The analysis and presentation of the data are clear and quite complete, yet remain succinct and perform insightful contrasts.

      (6) The study will have an impact on multiple levels, as it introduces important variations in the paradigm and analytical contrasts that both human and animal researchers can pick up and improve their studies. The cell-type-specific results are particularly intriguing, although these would likely require replication before being completely reliable. Further, the study provides a substantial and diverse dataset that others can explore.

      Weaknesses:

      (1) The responses from cells recorded via Neuropixel and 2p differ qualitatively, as noted by the authors, with NP-recorded cells showing much more inhibited/reduced responses between acoustic stimulations. The authors briefly qualify these differences as potentially indicating a sampling issue; however, this matter deserves more detailed consideration in my opinion. Specifically, the authors could try to compare the different depths at which these neurons were sampled or relate the locations in the cortex to each other (as the Neuropixel recordings were collected in the same animals, a subset of the 2p recordings could be compared to the Neuropixel recordings).

      We agree with the referee that the sampling issue, which we propose as a possible explanation for the large difference between our Neuropixel and 2P imaging recordings, must be investigated more thoroughly. This is, however, largely outside of the scope of this study. We have reported the depth and location of Neuropixel recordings in Figure S2. The depth is similar for both techniques, covering mostly layers 2, 3 and 4. The location spans mostly the primary auditory cortex for two-photon imaging and Neuropixel recordings. One Neuropixel recording is located in the ventral secondary auditory cortex. We could not find any evidence that the response to global violations in Neuropixel data stems specifically from this particular recording.

      (2) The current study did not monitor the attentional state of the mouse in relation to the stimulus by either including a behavioral component or pupil monitoring, which could influence the neural responses to deviant stimuli and omissions.

      As reported by Bekinschtein et al. 2009, the attentional state influences responses to global violations in human subjects. It is extremely difficult to precisely compare attentional states in mice and human subjects. We have performed recordings in mice that had to attend to sound to detect a white noise sound in between the sequences to obtain a reward. This did not lead to an increased global violation response. However, as the sequences themselves did not predict reward in this context, attention may have been diverted away from them. Therefore, this result is inconclusive and not worth including in our manuscript. If the sequence predicts rewards, there is a potential confound between violation responses and reward expectations or motor preparation signals. Pupil monitoring could be an alternative, which we did not investigate.

      (3) Given the complexity and variety of the paradigms, conditions, and analyzed cell-types, the manuscript could profit from a more visual summary figure that provides an easy-to-access overview of what was found.

      This is an excellent suggestion, although given the complexity and diversity of our observations it may be hard to fit everything in one understandable figure.

    1. eLife Assessment

      This important study partially fills the gap in the knowledge of olfaction at the level of the Anterior Olfactory Nucleus (AON) and Piriform Cortex with functional magnetic resonance imaging, electrophysiology, and modeling. The methods used are convincing. Some of the findings confirm ongoing hypotheses, such as the behavioral importance of AON for odor source discrimination. Other results shed light on the dynamics of the connection between the olfactory system and the rest of the brain.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript combined rat fMRI, optogenetics, and electrophysiology to examine the large-scale functional network of the olfactory system as well as its alteration in an aged rat model.

      Strengths:

      Overall methodology is very solid and the results provided an interesting perspective on large-scale functional network perturbation of the olfactory system.

      Weaknesses:

      The biological relevance and validation of the current results can be improved.

      (1) In Figure 1.1, at the top of the figure, CHR2 could be replaced by CHR2-mCherry, as only mCherry is fluorescent. Also, it is somewhat surprising that in the AON and Pir regions (where only axon fibers should be labelled in red), most fluorescence appeared dot-like and looked more like cell bodies than typical fibers. The authors may want to double-check this.

      (2) The authors primarily presented 1Hz stimulation results. What is the most biologically relevant frequency (e.g., perhaps firing frequency under natural odor stimulation) among all frequencies that were used?

      (3) In Figure 2, the statistical thresholding is confusing: in the figure legend, it was stated that "t > 3.1 corresponding to P < 0.001" but later "further corrected for multiple comparisons with threshold-free cluster enhancement with family-wise error rate (TFCE-FWE) at P < 0.05"? Regardless of the statistical thresholding, such BOLD activation seemed to be widespread (almost whole-brain activation). Does such activation remain specific to the optogenetic stimulation, or something more general (e.g., arousal level change)? Furthermore, how those results (I assume they are group-level results) were obtained was not described very clearly. Is it just a simple average of individual-level results, or (more conventionally) second-level analysis?

      (4) In Figure 2, why use AUC to quantify the activation, not the more conventional beta value in the GLM analysis?

      (5) For Figure 2D, the way that it was quantified can be better described as "relative" activation within one condition, and I don't know how to interpret the comparison among the relative fractions of activated regions. Perhaps a comparison using percentage change (i.e., beta values) is more straightforward.

      (6) For Figure 3, it may be more convenient for readers to include the results of 1st activation for direct comparison. The current layout makes it difficult to make direct, visual comparisons among all 3 activations. Again I think using beta values (instead of AUC) may be more conventional.

      (7) Can the DCM results (at least part of it) be verified using the current electrophysiological data? For example, the long-range inhibitory effective connectivity of AON is rather intriguing. If that can be verified using ephys. data, it would be really great. In the current form, the DCM and ephys. results seem to be totally unrelated.

      (8) In Figure 6, it would be great if the adaptation of BOLD and ephys. signals can be correlated at the brain region level. The current figure only demonstrated there is adaptation in ephys. signal, but did not show if such adaptation is related to the BOLD adaptation.

    3. Reviewer #2 (Public review):

      Summary:

      Ma and colleagues presented a study on the characterization of brain-wide spatio-temporal impact of olfactory cortical outputs. They take advantage of multi-modal techniques on rats: fMRI, optogenetics, and electrophysiology. In addition, they used cutting-edge analytical techniques and modeling to support and interpret their data. The main findings of the study are:

      (1) The neurons in the Olfactory Bulb (OB) predominantly activate primary olfactory network regions, while stimulation of OB afferents in Anterior Olfactory Nucleus (AON) and Piriform Cortex (Pir) primarily orthodromically activates hippocampal/striatal and limbic networks, respectively.

      (2) Non-specified adaptation or habituation mechanisms may play a significant role in modulating olfactory outputs over subsequent fMRI sessions.

      (3) Artificially induced aging in rats induces profound modification in the functional interaction between olfactory cortices and multiple brain regions.

      The results on AON are of particular interest because of the lack of functional information on this region, despite its recognized importance in shaping OB output and behavior (odor localization tasks).

      Strengths:

      The manuscript is very accurate. The figures are well-crafted and clear, and they provide much information with the most appropriate plots and graphics. The amount and quality of the data are remarkable, and the experimental size adequately addresses the scientific questions. I particularly appreciated the details in the description of the methods regarding the missing data and the size of the different animal groups. The supplementary data complete the main figures and provide information at a single animal level.

      Weaknesses:

      (1) One of the main reasons the Piriform Cx is understudied in rodents is its proximity to air, which creates artifacts in fMRI images. This issue becomes more critical at ultra-high magnetic fields, but I would expect it also at 7T. One main achievement of this study is, indeed, the acquisition of fMRI data from Piriform, and this point should be highlighted by showing raw functional data from a rat. The best would be if an fMRI data sample for a rat, no matter which stimulation, is shared on a public repository, like Zenodo or similar. I am curious to check the quality of the BOLD data from such an 'enormous' field of view, particularly in the OB, with a single-shot sequence. Also, the visual inspection of raw data is essential to appreciate how many 0.5 x 0.5 x 1 mm voxels fit into the AON and the other small brain structures analyzed, like the amygdala, etc. Was the amygdala entirely visible in BOLD, or did the air in the ear canal make an artifact partially shadowing it?

      (2) Surprisingly, the only information missing in the methods is the post-surgery period and the time between two consecutive fMRI sessions. How much time was accorded to rats to recover from the surgeries, and what was the time interval between two scans? This information is crucial for interpreting the decrease in most BOLD responses in subsequent recordings. The supposed adaptation should fit into the known time frames for odor adaptation. Usually, fast adaptation does not last for days (and it should be measured within a single experiment: is this the case?), while for long-lasting adaptation the stimulus (odor or opto) should be maintained constantly ON. This does not seem to be the case in this study. The hypothesis, alternative to adaptation, of a less efficient light activation, for example, due to gliosis around the fiber tips, should be discarded with more evidence than the preservation of OB > Pir responses or acknowledged in the manuscript.

      (3) The D-galactose experiments were conducted only after administering the aging molecule, with no baseline/reference data on the same animals. Then, comparisons were made with healthy rats, but the two groups differ not only in D-galactose administration but also in age (10 vs 18 weeks). A control group of 18-week-old rats with no D-galactose treatment would better isolate the D-galactose effect and avoid any potential bias from group comparisons of rats at different ages. Do you confirm that D-galactose was injected into each rat 56 times/day in a row, or am I mistaken?

      Overall, if my concerns are addressed, this is outstanding work, and I congratulate the authors.

    4. Author response:

      We appreciate the insightful comments and suggestions, which will significantly improve our work. We will revise the manuscript to address the reviewers’ concerns. Here, we list some of the key aspects of those concerns and our preliminary plans to address them.

      Both reviewers pointed out that we did not sufficiently justify the chosen optogenetic stimulation frequencies. We acknowledge and concur with their assessment, and will discuss it more extensively from a biological perspective, e.g., the neural firing rates in the olfactory bulb (OB), anterior olfactory nucleus (AON), and piriform cortex (Pir) under natural odor stimulation and the respiration rhythm. Reviewer #1 suggested using beta values (β) rather than the area under the BOLD signal profile (AUC) to quantify the fMRI activations, as they are more conventional for general linear model (GLM) analysis. We are aware of β and have used it to quantify the amplitude of fMRI activations in our previous rodent fMRI studies (1-3). However, in this study, we chose to utilize AUC as it offers a more comprehensive measure of BOLD signal change over time, including shape, duration, and magnitude, thereby capturing the bulk of neural activities and their dynamics throughout the stimulation period. β primarily represents the peak amplitude of BOLD responses (i.e., the % BOLD signal change) (4) and can be constrained by the assumptions and limitations of the GLM analysis, such as the shape of the hemodynamic response function (HRF). AUC provides greater flexibility in capturing different aspects of neural responses across various brain regions, such as transient peaks and sustained responses.
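      The distinction drawn here between a GLM beta value and the AUC can be illustrated with a minimal sketch. It assumes a single-gamma HRF, a TR of 1 s, a 100-volume run and 20 s / 40 s block durations; all of these values and all function names are hypothetical and are not taken from the study. Two simulated responses with the same peak amplitude are fitted against the same 20 s model regressor, and their AUCs are computed over the whole run.

```python
import math

TR = 1.0        # seconds per volume (hypothetical)
N_VOLS = 100    # volumes per run (hypothetical)
ONSET = 10.0    # stimulation onset in seconds (hypothetical)


def gamma_hrf(t, shape=6.0, scale=1.0):
    """Single-gamma HRF; an illustrative choice, not the study's HRF."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


def block_response(duration):
    """Boxcar of the given duration convolved with the HRF, scaled to unit peak."""
    box = [1.0 if ONSET <= i * TR < ONSET + duration else 0.0 for i in range(N_VOLS)]
    hrf = [gamma_hrf(i * TR) for i in range(N_VOLS)]
    conv = [sum(box[j] * hrf[i - j] for j in range(i + 1)) for i in range(N_VOLS)]
    peak = max(conv)
    return [c / peak for c in conv]


def fit_beta(signal, regressor):
    """Ordinary least squares slope for signal = beta * regressor + intercept."""
    n = len(signal)
    mean_x = sum(regressor) / n
    mean_y = sum(signal) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, signal))
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    return sxy / sxx


def auc(signal):
    """Trapezoidal area under the signal over the whole run (signal units x seconds)."""
    return sum((a + b) / 2.0 * TR for a, b in zip(signal, signal[1:]))


model = block_response(20.0)       # GLM regressor assumes a 20 s response
matched = block_response(20.0)     # simulated response that follows the model
prolonged = block_response(40.0)   # same peak, but outlasts the model

beta_matched = fit_beta(matched, model)
beta_prolonged = fit_beta(prolonged, model)
auc_matched = auc(matched)
auc_prolonged = auc(prolonged)

print(f"beta: matched={beta_matched:.2f}, prolonged={beta_prolonged:.2f}")
print(f"AUC:  matched={auc_matched:.1f}, prolonged={auc_prolonged:.1f}")
```

      In this toy setting the prolonged response has roughly twice the AUC of the matched one although both share the same peak, whereas the fitted beta depends on how well the assumed regressor matches the response shape. This is only a sketch of the general argument, not a reproduction of the analysis pipeline used in the manuscript.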

      As mentioned by reviewer #1, correlating the adaptation of BOLD and electrophysiology signals at the brain region level would strengthen our findings. We will pursue additional analysis to address this in our forthcoming responses. Reviewer #2 would like us to clarify the image and signal quality of our echo planar imaging (EPI)-based fMRI data, especially in the regions close to the air-tissue interface such as the OB, Pir, entorhinal cortex and amygdala, and the methodology for some of the experimental protocols implemented in our study. We will show the raw EPI fMRI images from a representative animal and revise the results, discussion, and methods sections of the manuscript to address reviewer #2's concerns.

      In our forthcoming detailed responses to the reviewers' comments and recommendations, we will revise the text, figures, and captions accordingly to address and clarify the questions brought up by both reviewers.

      References

      (1) Gao, P.P., Zhang, J.W., Chan, R.W., Leong, A.T.L. & Wu, E.X. BOLD fMRI study of ultrahigh frequency encoding in the inferior colliculus. Neuroimage 114, 427-437 (2015).

      (2) Leong, A.T.L., Wong, E.C., Wang, X. & Wu, E.X. Hippocampus Modulates Vocalizations Responses at Early Auditory Centers. Neuroimage 270, 119943 (2023).

      (3) Gao, P.P., Zhang, J.W., Fan, S.J., Sanes, D.H. & Wu, E.X. Auditory midbrain processing is differentially modulated by auditory and visual cortices: An auditory fMRI study. Neuroimage 123, 22-32 (2015).

      (4) Goddard, E. & Mullen, K.T. fMRI representational similarity analysis reveals graded preferences for chromatic and achromatic stimulus contrast across human visual cortex. Neuroimage 215, 116780 (2020).

    1. eLife Assessment

      This important collection of over 800 new cell type-specific driver lines will be an invaluable resource for researchers studying associative learning in Drosophila. Thoroughly characterized and well documented, this collection will permit researchers to selectively target neurons that deliver information to, or receive it from, the memory center of the fly brain called the Mushroom Body. Given the wealth of new drivers and the genetic access they provide to over 300 cell types, this compelling work will be of interest not only to researchers studying the mechanisms of associative learning but more generally to those dissecting sensorimotor circuits in the fly nervous system.

    2. Reviewer #1 (Public Review):

      Summary:

      The emergence of Drosophila EM connectomes has revealed numerous neurons within the associative learning circuit. However, these neurons are inaccessible for functional assessment or genetic manipulation in the absence of cell-type-specific drivers. Addressing this knowledge gap, Shuai et al. have screened over 4000 split-GAL4 drivers and correlated them with identified neuron types from the "Hemibrain" EM connectome by matching light microscopy images to neuronal shapes defined by EM. They successfully generated over 800 split-GAL4 drivers and 22 split-LexA drivers covering a substantial number of neuron types across layers of the mushroom body associative learning circuit. They provide new labeling tools for olfactory and non-olfactory sensory inputs to the mushroom body; interneurons connected with dopaminergic neurons and/or mushroom body output neurons; potential reinforcement sensory neurons; and expanded coverage of intrinsic mushroom body neurons. Furthermore, the authors have optimized the GR64f-GAL4 driver into a sugar sensory neuron-specific split-GAL4 driver and functionally validated it as providing a robust optogenetic substitute for sugar reward. Additionally, a driver for putative nociceptive ascending neurons, potentially serving as optogenetic negative reinforcement, is characterized by optogenetic avoidance behavior. The authors also use their very large dataset of neuronal anatomies, covering many example neurons from many brains, to identify neuron instances with atypical morphology. They find many examples of mushroom body neurons with altered neuronal numbers or mistargeting of dendrites or axons and estimate that 1-3% of neurons in each brain may have anatomic peculiarities or malformations. Significantly, the study systematically assesses the individualized existence of MBON08 for the first time. This neuron is a variant shape that sometimes occurs instead of one of two copies of MBON09, and this variation is more common than that in other neuronal classes: 75% of hemispheres have two MBON09's, and 25% have one MBON09 and one MBON08. These newly developed drivers not only expand the repertoire for genetic manipulation of mushroom body-related neurons but also empower researchers to investigate the functions of circuit motifs identified from the connectomes. The authors generously make these flies available to the public. In the foreseeable future, the tools generated in this study will allow important advances in the understanding of learning and memory in Drosophila.

      Strengths:

      (1) After decades of dedicated research on the mushroom body, a consensus has been established that the release of dopamine from DANs modulates the weights of connections between KCs and MBONs. This process updates the association between sensory information and behavioral responses. However, understanding how the unconditioned stimulus is conveyed from sensory neurons to DANs, and the interactions of MBON outputs with innate responses to sensory context remains less clear due to the developmental and anatomic diversity of MBONs and DANs. Additionally, the recurrent connections between MBONs and DANs are reported to be critical for learning. The characterization of split-GAL4 drivers for 30 major interneurons connected with DANs and/or MBONs in this study will significantly contribute to our understanding of recurrent connections in mushroom body function.

      (2) Optogenetic substitutes for real unconditioned stimuli (such as sugar taste or electric shock) are sometimes easier to implement in behavioral assays due to the spatial and temporal specificity with which optogenetic activation can be induced. GR64f-GAL4 has been widely used in the field to activate sugar sensory neurons and mimic sugar reward. However, the authors demonstrate that GR64f-GAL4 drives expression in other neurons not necessary for sugar reward, and the potential activation of these neurons could introduce confounds into training, impairing training efficiency. To address this issue, the authors have elaborated on a series of intersectional drivers with GR64f-GAL4 to dissect subsets of labeled neurons. This approach successfully identified a more specific sugar sensory neuron driver, SS87269, which consistently exhibited optimal training performance and triggered ethologically relevant local searching behaviors. This newly characterized line could serve as an optimized optogenetic tool for sugar reward in future studies.

      (3) MBON08 was first reported by Aso et al. 2014, exhibiting dendritic arborization into both ipsilateral and contralateral γ3 compartments. However, this neuron could not be identified in the previously published Drosophila brain connectomes. In the present study, the existence of MBON08 is confirmed, occurring in one hemisphere of 35% of imaged flies. In brains where MBON08 is present, its dendrite arborization disjointly shares contralateral γ3 compartments with MBON09. This remarkable phenotype potentially serves as a valuable resource for understanding the stochasticity of neurodevelopment and the molecular mechanisms underlying mushroom body lobe compartment formation.

      Comments on revised version:

      I only suggested minor changes, and these have been resolved.

    3. Reviewer #2 (Public Review):

      Summary:

      The article by Shuai et al. describes a comprehensive collection of over 800 split-GAL4 and split-LexA drivers, covering approximately 300 cell types in Drosophila, aimed at advancing the understanding of associative learning. The mushroom body (MB) in the insect brain is central to associative learning, with Kenyon cells (KCs) as primary intrinsic neurons and dopaminergic neurons (DANs) and MB output neurons (MBONs) forming compartmental zones for memory storage and behavior modulation. This study focuses on characterizing sensory input as well as direct upstream connections to the MB both anatomically and, to some extent, behaviorally. Genetic access to specific, sparsely expressed cell types is crucial for investigating the impact of single cells on computational and functional aspects within the circuitry. As such, this new and extensive collection significantly extends the range of targeted cell types related to the MB and will be an outstanding resource to elucidate MB-related processes in the future.

      Strengths:

      The work by Shuai et al. provides novel and essential resources to study MB-related processes and beyond. The resulting tools are publicly available and, together with the linked information, will be foundational for many future studies. The importance and impact of this tool development approach, along with previous ones, for the field cannot be overstated. One of many interesting aspects arises from the anatomical analysis of cell types that are less stereotypical across flies. These discoveries might open new avenues for future investigations into how such asymmetry and individuality arise from development and other factors, and how it impacts the computations performed by the circuitry that contains these elements.

      Comments on revised version:

      From my side they have addressed the few issues I had sufficiently.

    4. Reviewer #3 (Public Review):

      Summary:

      Previous research on the Drosophila mushroom body (MB) has made this structure the best-understood example of an associative memory center in the animal kingdom. This is in no small part due to the generation of cell-type specific driver lines that have allowed consistent and reproducible genetic access to many of the MB's component neurons. The manuscript by Shuai et al. now vastly extends the number of driver lines available to researchers interested in studying learning and memory circuits in the fly. It is an 800-plus collection of new cell-type specific drivers targeting neurons that either provide input (direct or indirect) to MB neurons or that receive output from them. Many of the new drivers target neurons in sensory pathways that convey conditioned and unconditioned stimuli to the MB. Most drivers are exquisitely selective, and researchers will benefit from the fact that whenever possible, the authors have identified the targeted cell types within the Drosophila connectome. Driver expression patterns are beautifully documented and are publicly available through the Janelia Research Campus's Flylight database where full imaging results can be accessed. Overall, the manuscript significantly augments the number of cell type-specific driver lines available to the Drosophila research community for investigating the cellular mechanisms underlying learning and memory in the fly. Many of the lines will also be useful in dissecting the function of the neural circuits that mediate sensorimotor circuits.

      Strengths:

      The manuscript represents a huge amount of careful work and leverages numerous important developments from the last several years. These include the thousands of recently generated split-Gal4 lines at Janelia and the computational tools for pairing them to make exquisitely specific targeting reagents. In addition, the manuscript takes full advantage of the recently released Drosophila connectomes. Driver expression patterns are beautifully illustrated side-by-side with corresponding skeletonized neurons reconstructed by EM. A comprehensive table of the new lines, their split-Gal4 components, their neuronal targets, and other valuable information will make this collection eminently useful to end-users. In addition to the anatomical characterization, the manuscript also illustrates the functional utility of the new lines in optogenetic experiments. In one example, the authors identify a specific subset of sugar reward neurons that robustly promotes associative learning.

      Comments on revised version:

      Overall, I thought the authors addressed my comments well with the possible exception of what is actually new here. This was the most important thing that I thought should be included in the revision. Although the authors rewrote the paragraph describing the lines presented in the paper, I still can't tell exactly which ones haven't been previously published. Their revised paragraph says that 40 lines have been "previously used," but Supplemental Table 1 shows references for over 200 of the lines, which sounds more reasonable based on papers that have come out.

      Also, in the revised paragraph they state that "All transgenic lines newly generated in this study are listed in Supplementary File 2" but that table lists only the 36 LexA hemidriver lines! Confusingly, this comment cites the same 8 references as are cited for the 40 lines that they say were previously published. I am thus only more confused about how many previously uncharacterized lines are presented in this paper.

      Further clarification would be helpful. On the one hand, I think this paper is a very nice summary of a ton of work and brings it all under one umbrella in a way that will be useful for many in the field. In that sense, the manuscript is worth publishing simply as a useful resource even if all the lines were previously published. On the other hand, it would be useful for readers to know which lines were previously characterized in other publications and which ones were not. This information may or may not be in Supplementary Tables 1 and 2 (but I can't tell).

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The emergence of Drosophila EM connectomes has revealed numerous neurons within the associative learning circuit. However, these neurons are inaccessible for functional assessment or genetic manipulation in the absence of cell-type-specific drivers. Addressing this knowledge gap, Shuai et al. have screened over 4000 split-GAL4 drivers and correlated them with identified neuron types from the "Hemibrain" EM connectome by matching light microscopy images to neuronal shapes defined by EM. They successfully generated over 800 split-GAL4 drivers and 22 split-LexA drivers covering a substantial number of neuron types across layers of the mushroom body associative learning circuit. They provide new labeling tools for olfactory and non-olfactory sensory inputs to the mushroom body; interneurons connected with dopaminergic neurons and/or mushroom body output neurons; potential reinforcement sensory neurons; and expanded coverage of intrinsic mushroom body neurons. Furthermore, the authors have optimized the GR64f-GAL4 driver into a sugar sensory neuron-specific split-GAL4 driver and functionally validated it as providing a robust optogenetic substitute for sugar reward. Additionally, a driver for putative nociceptive ascending neurons, potentially serving as optogenetic negative reinforcement, is characterized by optogenetic avoidance behavior. The authors also use their very large dataset of neuronal anatomies, covering many example neurons from many brains, to identify neuron instances with atypical morphology. They find many examples of mushroom body neurons with altered neuronal numbers or mistargeting of dendrites or axons and estimate that 1-3% of neurons in each brain may have anatomic peculiarities or malformations. Significantly, the study systematically assesses the individualized existence of MBON08 for the first time. 
This neuron is a variant shape that sometimes occurs instead of one of two copies of MBON09, and this variation is more common than that in other neuronal classes: 75% of hemispheres have two MBON09's, and 25% have one MBON09 and one MBON08. These newly developed drivers not only expand the repertoire for genetic manipulation of mushroom body-related neurons but also empower researchers to investigate the functions of circuit motifs identified from the connectomes. The authors generously make these flies available to the public. In the foreseeable future, the tools generated in this study will allow important advances in the understanding of learning and memory in Drosophila.

      Strengths:

      (1) After decades of dedicated research on the mushroom body, a consensus has been established that the release of dopamine from DANs modulates the weights of connections between KCs and MBONs. This process updates the association between sensory information and behavioral responses. However, understanding how the unconditioned stimulus is conveyed from sensory neurons to DANs, and the interactions of MBON outputs with innate responses to sensory context remains less clear due to the developmental and anatomic diversity of MBONs and DANs. Additionally, the recurrent connections between MBONs and DANs are reported to be critical for learning. The characterization of split-GAL4 drivers for 30 major interneurons connected with DANs and/or MBONs in this study will significantly contribute to our understanding of recurrent connections in mushroom body function.

      (2) Optogenetic substitutes for real unconditioned stimuli (such as sugar taste or electric shock) are sometimes easier to implement in behavioral assays due to the spatial and temporal specificity with which optogenetic activation can be induced. GR64f-GAL4 has been widely used in the field to activate sugar sensory neurons and mimic sugar reward. However, the authors demonstrate that GR64f-GAL4 drives expression in other neurons not necessary for sugar reward, and the potential activation of these neurons could introduce confounds into training, impairing training efficiency. To address this issue, the authors have elaborated on a series of intersectional drivers with GR64f-GAL4 to dissect subsets of labeled neurons. This approach successfully identified a more specific sugar sensory neuron driver, SS87269, which consistently exhibited optimal training performance and triggered ethologically relevant local searching behaviors. This newly characterized line could serve as an optimized optogenetic tool for sugar reward in future studies.

      (3) MBON08 was first reported by Aso et al. 2014, exhibiting dendritic arborization into both ipsilateral and contralateral γ3 compartments. However, this neuron could not be identified in the previously published Drosophila brain connectomes. In the present study, the existence of MBON08 is confirmed, occurring in one hemisphere of 35% of imaged flies. In brains where MBON08 is present, its dendrite arborization disjointly shares contralateral γ3 compartments with MBON09. This remarkable phenotype potentially serves as a valuable resource for understanding the stochasticity of neurodevelopment and the molecular mechanisms underlying mushroom body lobe compartment formation.

      Weaknesses:

      There are some minor weaknesses in the paper that can be clarified:

      (1) In Figure 8, the authors trained flies with a 20s, weak optogenetic conditioning first, followed by a 60s, strong optogenetic conditioning. The rationale for using this training paradigm is not explicitly provided.

      These experiments were designed to test if flies could maintain consistent performance with repetitive and intense LED activation, which is essential for experiments involving long training protocols or coactivation of other neurons inside a brain.

      In Figure 8E, if data for training with GR64f-GAL4 using the same paradigm is available, it would be beneficial for readers to compare the learning performance using newly generated split-GAL4 lines with the original GR64f-GAL4, which has been used in many previous research studies. It is noteworthy that in previously published work, repeating training test sessions typically leads to an increase in learning performance in discrimination assays. However, this augmentation is not observed in any of the split-GAL4 lines presented in Figure 8E. The authors may need to discuss possible reasons for this.

      As the reviewer pointed out, many previous studies, including ours, used the original Gr64f-GAL4 in olfactory conditioning. Figure 1H of Yamada et al., 2023 (https://doi.org/10.7554/eLife.79042) showed such a result, where first- and second-order olfactory conditioning were assayed. Indeed, the first-order conditioning scores gradually increased over repeated training. In that experiment, we used low red LED intensity for the optogenetic activation. In Figure 8E of the present paper, the first memory test came after 3x pairing of 20s odor with five 1s red LED pulses, without intermediate tests. Therefore, flies were already sufficiently trained to show a plateau memory level in “Test1”. In the revision of another recent report (Figure 1C-F of Aso et al., 2023; https://doi.org/10.7554/eLife.85756), we included the learning curve data of our best Gr64f-split-GAL4, SS87269. Under less saturating training conditions, SS87269 did show learning augmentation over repeated training.

      (2) In line 327, the authors state that in all samples, the β'1 compartment is arborized by MBON09. However, in Figure 11J, the probability of having at least one β'1 compartment not arborized is inferred to be 2%. The authors should address and clarify this conflict in the text to avoid misunderstanding.

      The chance of visualizing MBON08 in MCFO images was 21/209 in total (Figure 11I). If we assume that each of the four cells adopts the MBON08 developmental fate with this probability, we can calculate the probability of each possible MBON08/09 cell type composition. From this calculation, we inferred that approximately 2% of flies would lack innervation of the β'1 compartment in at least one hemisphere. However, we did not observe a lack of β'1 arborizations in any of 169 sample flies. If these MBONs independently developed into MBON08 at 21/209 odds, the chance of never observing two MBON08s in either hemisphere across all 169 samples would be 3.29%. Therefore, some developmental mechanism may prevent the emergence of two MBON08s in the same hemisphere.

      In the revised manuscript, we displayed the estimated probability for each case separately and annotated the actual observations on the right side.
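
      The independence estimate in the response to point (2) can be reproduced with a short calculation. The sketch below is illustrative only: it assumes two MBON08/09 cells per hemisphere that each adopt the MBON08 fate independently with probability 21/209, and the variable names are ours. Under these assumptions the per-fly probability comes out at about 2%. The quoted 3.29% appears to match the result obtained when that per-fly value is rounded to 2% before raising it to the 169th power; the unrounded value is slightly lower.

```python
from fractions import Fraction

# Per-cell chance of adopting the MBON08 fate, from the 21/209 MCFO
# observations quoted in the response (Figure 11I).
p_mbon08 = Fraction(21, 209)

# Assumption: two MBON08/09 cells per hemisphere, each choosing its fate
# independently. A hemisphere lacks beta'1 innervation if both become MBON08.
p_hemi_double = p_mbon08 ** 2

# Probability that at least one of the two hemispheres of a fly is affected.
p_fly_any_double = 1 - (1 - p_hemi_double) ** 2

# Probability of seeing no affected fly among 169 independent samples.
n_flies = 169
p_never_exact = (1 - p_fly_any_double) ** n_flies

# Same calculation with the per-fly probability rounded to 2%.
p_never_rounded = 0.98 ** n_flies

print(f"per-fly probability:        {float(p_fly_any_double):.2%}")
print(f"never in 169 (unrounded):   {float(p_never_exact):.2%}")
print(f"never in 169 (2% rounded):  {p_never_rounded:.2%}")
```

      Either way, the probability of the observed outcome under independence is only a few percent, which is the basis for the suggestion that a developmental mechanism limits each hemisphere to at most one MBON08.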

      (3) In general, are the samples presented male or female? This sample metadata will be shown when the images are deposited in FlyLight, but it would be useful in the context of this manuscript to describe in the methods whether animals are all one sex or mixed sex, and in some example images (e.g. mAL3A) to note whether the sample is male or female.

      The samples presented in this study are of mixed sex, except for Figure 11I, where genders are specified. We provided metadata for the presented images in Supplemental File 7, and we added a paragraph in the Methods section:

      “Most samples were collected from females, though typically at least one male fly was examined for each driver line. While we noticed that certain lines, such as SS48900, exhibited distinct expression patterns in females and males, we did not particularly focus on sexual dimorphism, which is analyzed elsewhere (Meissner et al. 2024). Therefore, unless stated otherwise, the presented samples are of mixed gender.

      Detailed metadata, including gender information and the reporter used, can be found in Supplementary File 7.”

      Reviewer #2 (Public Review):

      Summary:

      The article by Shuai et al. describes a comprehensive collection of over 800 split-GAL4 and split-LexA drivers, covering approximately 300 cell types in Drosophila, aimed at advancing the understanding of associative learning. The mushroom body (MB) in the insect brain is central to associative learning, with Kenyon cells (KCs) as primary intrinsic neurons and dopaminergic neurons (DANs) and MB output neurons (MBONs) forming compartmental zones for memory storage and behavior modulation. This study focuses on characterizing sensory input as well as direct upstream connections to the MB both anatomically and, to some extent, behaviorally. Genetic access to specific, sparsely expressed cell types is crucial for investigating the impact of single cells on computational and functional aspects within the circuitry. As such, this new and extensive collection significantly extends the range of targeted cell types related to the MB and will be an outstanding resource to elucidate MB-related processes in the future.

      Strengths:

      The work by Shuai et al. provides novel and essential resources to study MB-related processes and beyond. The resulting tools are publicly available and, together with the linked information, will be foundational for many future studies. The importance and impact of this tool development approach, along with previous ones, for the field cannot be overstated. One of many interesting aspects arises from the anatomical analysis of cell types that are less stereotypical across flies. These discoveries might open new avenues for future investigations into how such asymmetry and individuality arise from development and other factors, and how it impacts the computations performed by the circuitry that contains these elements.

      Weaknesses:

      Providing such an array of tools leaves little to complain about. However, despite the comprehensive genetic access to diverse sensory pathways and MB-connected cell types, the manuscript could be improved by discussing its limitations. For example, the projection neurons from the visual system seem to be underrepresented in the tools produced (or almost absent). A discussion of these omissions could help prevent misunderstandings.

      We internally distributed efforts to produce split-GAL4 lines at Janelia Research Campus. The recent preprint (Nern et al., 2024; doi: https://doi.org/10.1101/2024.04.16.589741) described the full collection of split-GAL4 driver lines in the optic lobe including the visual projection neurons to the mushroom body. We cited this preprint in the revised manuscript by adding a short paragraph of discussion.

      “Although less abundant than the olfactory input, the MB also receives visual information from the visual projection neurons (VPNs) that originate in the medulla and lobula and are targeted to the accessory calyx (Vogt et al. 2016; Li et al. 2020). A recent preprint described the full collection of split-GAL4 driver lines in the optic lobe, which includes the VPNs to the MB (Nern et al. 2024).”

      Additionally, more details on the screening process, particularly the selection of candidate split halves and stable split-GAL4 lines, would provide valuable insights into the methodology and the collection's completeness.

      The details of our split-GAL4 design and screening procedures were described in previous studies (Aso et al., 2014; Dolan et al., 2019). The data and tools available for designing split-GAL4 lines changed over time, and we took different approaches accordingly. Many of the split-GAL4 lines presented in this study were designed and screened in parallel with the lines for MBONs and DANs in 2010-2014, when MCFO images of GAL4 drivers and the EM connectome were not yet available. With knowledge of where MBONs and DANs project, I (Y.A.) manually examined and annotated thousands of confocal stacks (Jenett et al., 2012; https://doi.org/10.1016/j.celrep.2012.09.011) to find candidate cell types that may contact them.

      Later I used more advanced computational tools (Otsuna et al., 2018; doi: https://doi.org/10.1101/318006) and MCFO images aligned to the standard brain volume (Meissner et al., 2023; DOI: 10.7554/eLife.80660). Now, if one needs to generate further split-GAL4 lines for cell types identified in EM connectome data, the NeuronBridge website (https://neuronbridge.janelia.org/) can be very helpful in providing a list of GAL4 drivers that may label the neuron of interest.

      Reviewer #3 (Public Review):

      Summary:

      Previous research on the Drosophila mushroom body (MB) has made this structure the best-understood example of an associative memory center in the animal kingdom. This is in no small part due to the generation of cell-type specific driver lines that have allowed consistent and reproducible genetic access to many of the MB's component neurons. The manuscript by Shuai et al. now vastly extends the number of driver lines available to researchers interested in studying learning and memory circuits in the fly. It is an 800-plus collection of new cell-type specific drivers targeting neurons that either provide input (direct or indirect) to MB neurons or that receive output from them. Many of the new drivers target neurons in sensory pathways that convey conditioned and unconditioned stimuli to the MB. Most drivers are exquisitely selective, and researchers will benefit from the fact that whenever possible, the authors have identified the targeted cell types within the Drosophila connectome. Driver expression patterns are beautifully documented and are publicly available through the Janelia Research Campus's Flylight database where full imaging results can be accessed. Overall, the manuscript significantly augments the number of cell type-specific driver lines available to the Drosophila research community for investigating the cellular mechanisms underlying learning and memory in the fly. Many of the lines will also be useful in dissecting the function of the neural circuits that mediate sensorimotor circuits.

      Strengths:

      The manuscript represents a huge amount of careful work and leverages numerous important developments from the last several years. These include the thousands of recently generated split-Gal4 lines at Janelia and the computational tools for pairing them to make exquisitely specific targeting reagents. In addition, the manuscript takes full advantage of the recently released Drosophila connectomes. Driver expression patterns are beautifully illustrated side-by-side with corresponding skeletonized neurons reconstructed by EM. A comprehensive table of the new lines, their split-Gal4 components, their neuronal targets, and other valuable information will make this collection eminently useful to end-users. In addition to the anatomical characterization, the manuscript also illustrates the functional utility of the new lines in optogenetic experiments. In one example, the authors identify a specific subset of sugar reward neurons that robustly promotes associative learning.

      Weaknesses:

      While the manuscript succeeds in making a mass of descriptive detail quite accessible to the reader, the way the collection is initially described - and the new lines categorized - in the text is sometimes confusing. Most of the details can be found elsewhere, but it would be useful to know how many of the lines are being presented for the first time and have not been previously introduced in other publications/contexts.

      We revised the text as below.

      “Among the 828 lines, a subset of 355 lines, collectively labeling at least 319 different cell types, exhibit highly specific and non-redundant expression patterns and are likely to be particularly valuable for behavioral experiments. Detailed information, including genotype, expression specificity, matched EM cell type(s), and recommended driver for each cell type, can be found in Supplementary File 1. A small subset of 40 lines from this collection have been previously used in studies (Aso et al., 2023; Dolan et al., 2019; Gao et al., 2019; Scaplen et al., 2021; Schretter et al., 2020; Takagi et al., 2017; Xie et al., 2021; Yamada et al., 2023). All transgenic lines newly generated in this study are listed in Supplementary File 2 (Aso et al., 2023; Dolan et al., 2019; Gao et al., 2019; Scaplen et al., 2021; Schretter et al., 2020; Takagi et al., 2017; Xie et al., 2021; Yamada et al., 2023).”

      And where can the lines be found at Flylight? Are they listed as one collection or as many?

      They are listed as one collection - “Aso 2021” release. It is named “2021” because we released the images and started sharing lines in December of 2021 without a descriptive paper. We added a sentence in the Methods section.

      “All splitGAL4 lines can be found at flylight database under “Aso 2021” release, and fly strains can be requested from Janelia or the Bloomington stock center.”

      Also, the authors say that some of the lines were included in the collection despite not necessarily targeting the intended type of neuron (presumably one that is involved in learning and memory). What percentage of the collection falls into this category?

      We do not have a good enough record of the split-GAL4 screening to calculate how often unintended cell types were intersected, but it was rather rare. Those unintended cell types can still be part of circuits for associative learning (e.g. olfactory projection neurons) or can be totally unrelated cell types. For instance, among a new collection of split-LexA lines using the Gr43a-LexADBD hemidriver (Figure 7-figure supplement 2), one line specifically intersected T1 neurons in the optic lobe even though the AD line was selected to intersect sugar sensory neurons. We suspect that this is due to ectopic expression of Gr43a-LexADBD. Nonetheless, we included it in the paper because a cell-type-specific split-LexA driver for T1 will be useful irrespective of whether the Gr43a gene is expressed in T1.

      And what about the lines that the authors say they included in the collection despite a lack of specificity? How many lines does this represent?

      In short, there are about 100 lines in the collection that lack the specificity needed for behavioral experiments.

      We ranked the specificity of the split-GAL4 drivers in Supplementary File 1. Rank 2 lines are ideal, Rank 1 lines are less ideal but acceptable, and Rank 0 lines are not suitable for activation screening in behavioral experiments. Out of the 828 split-GAL4 lines reported here, there are 413, 305 and 103 lines in the Rank 2, Rank 1 and Rank 0 categories, respectively. Seven lines are not ranked for specificity because only flipout expression data are available.
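
      As a quick arithmetic check on the counts above, the following sketch tallies the four categories and reports each as a share of the collection. The dictionary labels are ours and only the four counts come from the response.

```python
# Specificity counts quoted in the response (labels are illustrative).
rank_counts = {"Rank 2": 413, "Rank 1": 305, "Rank 0": 103, "unranked": 7}

# The four categories should account for every line in the collection.
total = sum(rank_counts.values())

# Share of the collection in each category.
shares = {name: count / total for name, count in rank_counts.items()}

for name, count in rank_counts.items():
    print(f"{name}: {count} lines ({shares[name]:.1%})")
print(f"total: {total} lines")
```

      The categories sum to the 828 reported lines, and the Rank 0 share corresponds to the "about 100 lines" given in the short answer.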

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      As mentioned elsewhere and in addition to the minor points below, it is advisable for the authors to elaborate on the details of the screening process. Furthermore, a discussion about the circuits not targeted by their research, such as the visual projection neurons, would be beneficial.

      See the response above to Reviewer #2’s public review.

      Line 32-33: The citations are very fly-centric. The authors might want to consider reviews on the MB of other insect species regarding learning and memory.

      We additionally cited Rybak and Menzel 2017’s book chapter on the honey bee mushroom body.

      Line 43-44: Citations should be added, e.g. Séjourné et al. (2011), Pai et al. (2013), Plaçais et al. (2013).

      Citation added

      Line 50-52: Citation Hulse et al. (2021) should be added.

      Citation added

      Line 162: In this part, it might be valuable for the reader to understand which of these PNs are actually connecting with KCs.

      A full list of cell types within the MB was provided in Supplementary File 4 of the revised manuscript. See also response to Reviewer 3, Lines 150-1.

      Line 179: Citation Burke et al. (2012) should be mentioned.

      Citation added

      Line 181: Thermogenic might be thermogenetic.

      Corrected

      Line 189: Add citations to Otto et al. (2020) and Felsenberg et al. (2018).

      Citations added

      Line 208ff: The authors should consider discussing why they did not use other GR and IR promoters. For example, Gr5a is prominent in sugar-sensing, while Ir76b could be a reinforcement signal related to yeast food (Steck et al., 2018; Ganguly et al., 2017; see also Corfas et al., 2019 for local search).

      We focused on the Gr64f promoter because of its relatively broad expression and the successful use of Gr64f-GAL4 in fictive reward experiments. We added the split-LexA lines with Gr43a and Gr66a promoters (Figure 7-figure supplement 2). Other gustatory sensory neurons also have the potential to be reinforcement signals, but we did not have the bandwidth to cover them all.

      Line 319: Consider citing Linneweber et al. (2020) for a neurodevelopmental account of such individuality.

      We added a sentence and cited this reference.

      “On the other hand, the neurodevelopmental origin of neuronal morphology appeared to have functional significance on behavioral individuality (Linneweber et al. 2020).”

      Line 352: Citation add Hulse et al. (2021).

      Citations added

      Line 356ff: The utility and value of Split-LexA may not be apparent to non-expert readers. Moreover, how were LexADBDs chosen for creating these lines?

      We have added an introductory sentence at the beginning of the paragraph and explained that these split-LexA lines were a conversion of split-GAL4 lines that were published in 2014 and frequently used in studying the mushroom body circuit.

      “Split-GAL4 lines enable cell-type-specific manipulation, but some experiments require independent manipulation of two cell types. Split-GAL4 lines can be converted into split-LexA lines by replacing the GAL4 DNA binding domain with that of LexA (Ting et al., 2011). To broaden the utility of the split-GAL4 lines that have been frequently used since the publication in 2014 (Aso et al., 2014a), we have generated over 20 LexADBD lines to test the conversions of split-GAL4 to split-LexA. The majority (22 out of 34) of the resulting split-LexA lines exhibited very similar expression patterns to their corresponding original split-GAL4 lines (Figure 12).”

      Line 374: Italicize Drosophila melanogaster.

      Revised as suggested.

      Reviewer #3 (Recommendations For The Authors):

      Major Comments:

      As mentioned in the Public Review, the drivers are nicely classified in the various subsections of the manuscript, but the statements in the text summarizing how many lines there are in specific categories are often confusing. For example, line 129 refers to "drivers encompassing 111 cell types that connect with the DANs and MBONs", but Figure 1E indicates that 46 new cell types downstream of MBONs and upstream of DANs have been generated. This seems like a discrepancy.

      The 46 cell types in Figure 1E consider only the CRE/SMP/SIP/SLP area, where MBON downstreams and DAN upstreams are highly enriched, while the 111 cell types include all. To avoid confusion, we removed the “MBON downstream and DAN upstream” counting in Figure 1E in the revised manuscript.

      Also, at line 75 the MBON lines previously generated by Rubin and Aso (2023) are referred to as though they are separate from the 828 described "In this report." Supplementary file 1 suggests, however, that they are included as part of this report.

      Twenty-five lines generated in Rubin and Aso (2023) were initially included in Supplementary file 1 for the convenience of users, but they were not counted towards the 828 new lines described in this report. To avoid confusion, we removed these 25 lines in the revised manuscript. Now all lines listed in Supplementary file 1 were generated in this study (“Aso 2021” release), and if a line has been used in earlier studies or introduced in other contexts, for example the accompanying omnibus preprint (Meissner 2024, doi: 10.1101/2024.01.09.574419), the citations are listed in the reference column.

      More generally, in lines 94-102 "828 useful lines based on their specificity, intensity and non-redundancy" are referred to, but they are subsequently subdivided into categories of lines with lower specificity (i.e. with off-target expression) and lines that did not target intended cell types (presumably ones unlikely to be involved in learning and memory). It would be useful to know how many lines (at least roughly) fall into these subcategories.

      See the response above to Reviewer #3’s public review.

      Finally, Figures 3B & C indicate cell types connected to DANs and MBONs and the number for which Split-Gal4 lines are available. The text (lines 136-7) states that the new collection covers 30 of these major cell types (Figure 3C)," but Figure 3C clearly has more than 30 dots showing the drivers available. Presumably existing and new driver lines are being pooled, but this should either be explained or the two should be distinguished.

      “(Figure 3C)” was replaced with “(Supplementary File 3)” in the revised manuscript to correct the reference. Figure 3B & C are plots of all MB interneurons, not just the major cell types.

      Minor Comments:

      Although the paper is generally well written there are minor grammatical errors throughout (e.g. dropped articles, odd constructions, etc.) that somewhat detract from an otherwise smooth and enjoyable reading experience. A quick editing pass by a native speaker (i.e. any of several of the authors) could clean up these and numerous other small mistakes. A few examples: line 138 "presented" should be present; line 204: "contain off-targeted expressions" should be "have off-target expression;" line 219: "usage to substitute reward" is awkward at best and could be something like "use in generating fictive rewards"; line 326 "arborize[s]"; l. 331 "Based on the likelihood" should be something like "based on these observations"'; line 349 "[is] likely to appear"; l. 352 "extensive connection[s]"; line 353 "has [a] strong influence;" l. 963 "Projections" should be singular; etc.

      All the mentioned examples have been corrected, and we have asked a native speaker to edit the revised manuscript.

      Lines 81-3: Is the lookup table referred to Suppl. File 1? A reference is desirable.

      Yes, the lookup table refers to “Supplementary File 1”, and a reference was added.

      Lines 111-2: what is a "non-redundant set of...cell types?" Cell types that are represented by a single cell (or bilateral pair)? Or does this sentence mean that of the 828 lines, 355 are specific to a single cell type, and in total 319 cell types are targeted? The statement is confusing.

      We revised the text as below.

      “Figure 1E provides an overview of the categories of covered cell types. Among the 828 lines, a subset of 355 lines, collectively labeling at least 319 different cell types, exhibit highly specific and non-redundant expression patterns are likely to be particularly valuable for behavioral experiments. Detailed information, including genotype, expression specificity, matched EM cell type(s), and recommended driver for each cell type, can be found in Supplementary File 1. A small subset of 40 lines from this collection have been previously used in studies (Aso et al., 2023; Dolan et al., 2019; Gao et al., 2019; Scaplen et al., 2021; Schretter et al., 2020; Takagi et al., 2017; Xie et al., 2021; Yamada et al., 2023). All transgenic lines newly generated in this study are listed in Supplementary File 2 (Aso et al., 2023; Dolan et al., 2019; Gao et al., 2019; Scaplen et al., 2021; Schretter et al., 2020; Takagi et al., 2017; Xie et al., 2021; Yamada et al., 2023).”

      Line 148: "MB major interneurons" is a confusing descriptor for postsynaptic partners of MBONs.

      We added a sentence to clarify the definition of the “MB major interneurons”.

      “In the hemibrain EM connectome, there are about 400 interneuron cell types that have over 100 total synaptic inputs from MBONs and/or synaptic outputs to DANs. Our newly developed collection of split-GAL4 drivers covers 30 types of these ‘major interneurons’ of the MB (Supplementary File 3).”

      Lines 150-1: Not sure what is meant by "have innervations within the MB." Sounds like cells are presynaptic to KCs, DANS, and MBONs, but Figure 3 Figure Supplement 1 indicates they include neurons that both provide and receive innervation to/from MB neurons. Please clarify.

      For clarification, in the revised manuscript we have included a full list of cell types within the MB in Supplementary File 4. Included are all neurons with >=50 pre-synaptic connections or with >=250 post-synaptic connections in the MB ROI in the hemibrain (excluding the accessory calyx). The cell types include KCs, MBONs, DANs, PNs, and a few other cell types. The coverage ratio was updated based on this list.

      Also, in line 152, what does it mean that they "may have been overlooked previously?" this seems unnecessarily ambiguous. Were they overlooked or weren't they?

      Changed the text to “These lines offer valuable tools to study cell types that previously are not genetically accessible. Notably, SS85572 enables the functional study of LHMB1, which forms a rare direct pathway from the calyx and the lateral horn (LH) to the MB lobes (Bates et al., 2020).”

      Line 158 refers to PN cells within the MB, which are not mentioned in any place else as MB components.

      What are these PNs and how do they differ from MBONs?

      See responses to Lines 150-1 for clarification of cell types within the MB.

      Line 188: not clear what is meant by "more continual learning tasks".

      We rephrased it as “more complex learning tasks” to avoid jargon.

      Line 235: Not clear why "extended training with high LED intensity" wouldn't promote the formation of robust memories. Is this for some reason unexpected based on previous experiments? Please explain.

      See the response to weakness #1 raised by the same reviewer.

      Lines 317-9: It would be useful to state here that MBON08 and MBON09 are the two neurons labeled by MB083C.

      Revised as suggested.

      Line 368: Presumably the "lookup table" referred to is Supplementary File 1, but a reference here would be useful.

      Yes, it is Supplementary File 1, and a reference was added.

      Comments on Figures:

      Figure 1C The "Dopamine Neurons" label position doesn't align with the Punishment and Reward labels, which is a bit confusing.

      They are intentionally not aligned, because dopamine neurons are not reward/punishment per se. We intend the schematic to show that punishment and reward are conveyed to the MB through the dopamine neuron layer, just as the output from the MB output neuron layer is used to guide further integration and actions. To keep the labels “Dopamine neurons” and “MB Output Neurons” in symmetrical positions, we decided to keep the original figure unchanged. We thank the reviewer for the kind suggestion.

      Figure 1F and Figure 1 - Figure Supplement 1: the light gray labels presumably indicate the (EM-identified) neuron labeled by each line, but this should be explicitly stated in the figure legends. It would also be useful in the legends to direct the reader to the key (Supplementary File 1) for decoding neuronal identities.

      Revised as suggested.

      Figure 2: For clarity, I'd recommend titling this figure "LM-EM Match of the CRE011-specific driver SS45245". This reduces the confusion of mixing and matching the driver and cell-type names. Also, it would be helpful to indicate (e.g. with labels above the figure parts) that A & B represent the MCFO characterization step and C & D represent the LM-EM matching step of the pipeline.

      Revised as suggested.

      Figure 6: For clarity, it would be useful to separately label the PN and sensory neuron groups. Also, for the sensory neurons at the bottom, what is the distinction between the cell names in gray and black font?

      Figure 6 was updated to separate the non-olfactory PN and sensory neuron groups. The gray was intended for olfactory receptor neuron cell types that are additionally labeled in the driver lines. To avoid confusion, the gray cell types were removed in the revised figure, and a clarification sentence was added to the legend.

      “Other than thermo-/hygro-sensory receptor neurons (TRNs and HRNs), SS00560 and MB408B also label olfactory receptor neurons (ORNs): ORN_VL2p and ORN_VC5 for SS00560, ORN_VL1 and ORN_VC5 for MB408B.”

      Figure 7A: It's unclear why the creation of 6 Gr64f-LexADBD lines is reported. Aren't all these lines the same? If not, an explanation would be useful.

      These six Gr64f-LexADBD lines differ in their insertion sites and in the presence or absence of the p10 translational enhancer. An explanation was added to the legend. The enhanced expression level with p10 can help compensate for the general tendency of split-LexA to be weaker than split-GAL4. Different insertion sites will be useful for avoiding transvection with split-GAL4s, which are mostly in attP40 and attP2.

      Figure 8F: It would help to include in the legend a brief description of each parameter being measured-essentially defining the y-axis label on the graphs as in Figure Supplement 2. Also, how is the probability of return calculated and what behavioral parameter does the change of curvature refer to?

      We added a brief description of the behavioral parameters to the legend of Figure 8F.

      “Return behavior was assessed within a 15-second time window. The probability of return (P return) is the percentage of flies that made an excursion (>10 mm) and then returned to within 3 mm of their initial position. Curvature is the ratio of angular velocity to walking speed.”
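
      The definitions quoted above can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' analysis code: the function names, the use of all supplied flies as the denominator, the inclusive 3 mm threshold, and the handling of zero walking speed are hypothetical choices, and each trajectory is assumed to be already cropped to the 15-second window.

```python
import math


def returned(xy, excursion_mm=10.0, return_mm=3.0):
    """True if the track moves > excursion_mm from its start and later
    comes back to within return_mm of the start.

    xy: list of (x, y) positions in mm; the first point is the initial position.
    Thresholds follow the quoted legend; inclusive "within 3 mm" is an assumption.
    """
    x0, y0 = xy[0]
    left = False
    for x, y in xy[1:]:
        d = math.hypot(x - x0, y - y0)
        if not left:
            left = d > excursion_mm      # excursion detected
        elif d <= return_mm:
            return True                  # came back after the excursion
    return False


def p_return(trajectories):
    """Percentage of flies scored as returning.

    Assumption: the denominator is every fly supplied, one trajectory per fly.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    return 100.0 * sum(returned(t) for t in trajectories) / len(trajectories)


def curvature(angular_velocity, speed):
    """Ratio of angular velocity to walking speed.

    Assumption: a stationary fly (speed == 0) yields NaN rather than an error.
    """
    if speed == 0:
        return math.nan
    return angular_velocity / speed
```

      With angular velocity in rad/s and walking speed in mm/s, the curvature returned here has units of rad/mm.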

      Figure 9E: What are the parenthetical labels for lines SS49267, SS49300, and SS35008?

      They are EM bodyIDs. The figure legend was revised.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study compiles a wide range of results on the connectivity, stimulus selectivity, and potential role of the claustrum in sensory behavior. While most of the connectivity results confirm earlier studies, this valuable work provides incomplete evidence that the claustrum responds to multimodal stimuli and that local connectivity is reduced across cells that have similar long-range connectivity. The conclusions drawn from the behavioral results are weakened by the animals' poor performance on the designed task. This study has the potential to be of interest to neuroscientists.

      We thank the editor and the reviewers for their feedback on our work, which we have incorporated to help improve interpretation of our findings as outlined in the response below. While we agree with the editor that further work is necessary to provide a comprehensive understanding of claustrum circuitry and activity, this is true of most scientific endeavors and therefore we feel that describing this work as “incomplete” unfairly mischaracterizes the intent of the experiments performed which provide fundamental insights into this poorly understood brain region. Additionally, as identified in the main text, methods section, and our responses to the comments below, we disagree that the behavioral results are “weakened” by the performance of the animals. Our goal was to assess what information animals learned and used in an ambiguous sensory/reward environment, not to shape them toward a particular behavior and interpret the results solely based on their accuracy in performing the task.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper by Shelton et al. investigates some of the anatomical and physiological properties of the mouse claustrum. First, they characterize the intrinsic properties of claustrum excitatory and inhibitory neurons and determine how these different claustrum neurons receive input from different cortical regions. Next, they perform in vitro patch clamp recordings to determine the extent of intraclaustrum connectivity between excitatory neurons. Following these experiments, in vivo axon imaging was performed to determine how claustrum-retrosplenial cortex neurons are modulated by different combinations of auditory, visual, and somatosensory input. Finally, the authors perform claustrum lesions to determine if claustrum neurons are required for performance on a multisensory discrimination task.

      Strengths:

      An important potential contribution the authors provide is the demonstration of intra-claustrum excitation. In addition, this paper provides the first experimental data where two cortical inputs are independently stimulated in the same experiment (using 2 different opsins). Overall, the in vitro patch clamp experiments and anatomical data provide confirmation that claustrum neurons receive convergent inputs from areas of the frontal cortex. These experiments were conducted with rigor and are of high quality.

      We thank the reviewer for their positive appraisal of our work.

      Weaknesses:

      The title of the paper states that claustrum neurons integrate information from different cortical sources. However, the authors did not actually test or measure integration in the manuscript. They do show physiological convergence of inputs on claustrum neurons in the slice work. Testing integration through simultaneous activation of inputs was not performed. The convergence of cortical input has been recently shown by several other papers (Chia et al), and the current paper largely supports these previous conclusions. The in vivo work did test for integration because simultaneous sensory stimulations were performed. However, integration was not measured at the single cell (axon) level because it was unclear how activity in a single claustrum ROI changes in response to (for example) visual, tactile, and visual-tactile stimulations. Reading the discussion, I also see the authors speculate that the sensory responses in the claustrum could arise from attentional or salience-related inputs from an upstream source such as the PFC. In this case, claustrum cells would not integrate anything (but instead respond to PFC inputs).

      We thank the reviewer for raising this point. In response, we have provided a definition of “integration” in the manuscript text (lines 112-114, 353-354):

      “...single-cell responsiveness to more than one input pathway, e.g. being capable of combining and therefore integrating these inputs.”

      The reviewer’s point about testing simultaneous input to the claustrum is well made but not possible with the dual-color optogenetic stimulation paradigm used in our study as noted in the Results and Discussion sections (see also Klapoetke et al., 2014, Hooks et al., 2015). The novelty of our paper comes from testing these connections in single CLA neurons, something not shown in other studies to-date (Chia et al., 2020; Qadir et al., 2022), which average connectivity over many neurons.

      Finally, we disagree with the reviewer regarding whether integration was tested at the single-axon level and provide data and supplementary figures to this effect (Fig. 6, Supp. Fig. S14, lines 468-511). Although the possibility remains that sensory-related information may arise in the prefrontal cortex, as we note, there is still a large collection of studies (including this one) that document and describe direct sensory inputs to the claustrum (Olson & Graybiel, 1980; Sherk & LeVay, 1981; Smith & Alloway, 2010; Goll et al., 2015; Atlan et al., 2017; etc.). We have updated the wording of these sections to note that both direct and indirect sensory input integration is possible.

      The different experiments in different figures often do not inform each other. For example, the authors show in Figure 3 that claustrum-RSP cells (CTB cells) do not receive input from the auditory cortex. But then, in Figure 6 auditory stimuli are used. Not surprisingly, claustrum ROIs respond very little to auditory stimuli (the weakest of all sensory modalities). Then, in Figure 7 the authors use auditory stimuli in the multisensory task. It seems that these experiments were done independently and were not used to inform each other.

      The intention behind the current manuscript was to provide a deep characterisation of claustrum to inform future research into this enigmatic structure. In this case, we sought to test pathways in vivo that were identified as being weak or absent in vitro to confirm and specifically rule out their influence on computations performed by claustrum. We agree with the reviewer’s assessment that it is not surprising that claustrum ROIs respond weakly to auditory stimuli. Not testing these connections in vivo because of their apparent sparsity in vitro would have represented a critical gap in our knowledge of claustrum responses during passive sensory stimulation.

      One novel aspect of the manuscript is the focus on intraclaustrum connectivity between excitatory cells (Figure 2). The authors used wide-field optogenetics to investigate connectivity. However, the use of paired patch-clamp recordings remains the ground truth technique for determining the rate of connectivity between cell types, and paired recordings were not performed here. It is difficult to understand and gain appreciation for intraclaustrum connectivity when only wide-field optogenetics is used.

      We thank the reviewer for acknowledging the novelty of these experiments. We further acknowledge that paired patch-clamp recordings are the gold standard for assessing synaptic connectivity. Typically, such experiments are performed in vitro, a necessity given the ventral location of claustrum precluding in vivo patching. In vitro slice preparations by their very nature sever connections and lead to an underestimate of connectivity, as noted in our Discussion. Kim et al. (2016) have done this experiment in coronal slices with the understanding that excitatory-excitatory connectivity would be local (<200 μm) and therefore preserved. We used a variety of approaches that enabled us to explore connectivity along the longitudinal axis of the brain (the rostro-caudal, i.e. “long”, axis of the claustrum), providing fresh insight into the circuitry embedded within this structure that would be challenging to examine using dual recordings. Further, our optogenetic method (CRACM, Petreanu et al., 2007) has been used successfully across a variety of brain structures to examine excitatory connectivity while circumventing artifacts arising from the slice axis.

      In Figure 2, CLA-rsp cells express Chrimson, and the authors removed cells from the analysis with short latency responses (which reflect opsin expression). But wouldn't this also remove cells that express opsin and receive monosynaptic inputs from other opsin-expressing cells, therefore underestimating the connectivity between these CLA-rsp neurons? I think this needs to be addressed.

      The total number of opsin-expressing CLA neurons in our dataset is 4/46 tested neurons. Assuming all of these neurons project to RSP, they would have accounted for 4/32 CLA-RSP neurons. Given the rate of monosynaptic connectivity observed in this study, these neurons would only contribute 2-3 additional connected neurons. Therefore, the exclusion of these neurons does not significantly impact the overall statistical accuracy of our connectivity findings.

      In Figure 5J the lack of difference in the EPSC-IPSC timing in the RSP is likely due to 1 outlier EPSC at 30 ms which is most likely reflecting polysynaptic communication. Therefore, I do not feel the argument being made here with differences in physiology is particularly striking.

      We thank the reviewer for their attention to detail about this analysis. We have performed additional statistics and found that leaving this neuron out does not affect the significance of the results (new p-value = 0.158, original p-value = 0.314, Mann-Whitney U test). We have removed this datapoint from the figure and our analysis.

      In the text describing Figure 5, the authors state "These experiments point to a complex interaction ....likely influenced by cell type of CLA projection and intraclaustral modules in which they participate". How does this slice experiment stimulating axons from one input relate to different CLA cell types or intra-claustrum circuits? I don't follow this argument.

      We have removed this speculation from the Results section.

      In Figure 6G and H, the blank condition yields a result similar to many of the sensory stimulus conditions. This blank condition (when no stimulus was presented) serves as a nice reference to compare the rest of the conditions. However, the remainder of the stimulation conditions were not adjusted relative to what would be expected by chance. For example, the response of each cell could be compared to a distribution of shuffled data, where time-series data are shuffled in time by randomly assigned intervals and a surrogate distribution of responses generated. This procedure is repeated 200-1000x to generate a distribution of shuffled responses. Then the original stimulus-triggered response (1s post) could be compared to shuffled data. Currently, the authors just compare pre/post-mean data using a Mann-Whitney test from the mean overall response, which could be biased by a small number of trials. Therefore, I think a more conservative and statistically rigorous approach is warranted here, before making the claim of a 20% response probability or 50% overall response rate.

      We appreciate the reviewer's thorough analysis and suggestion for a more conservative statistical approach. We acknowledge that responses on blank trials occur about 10% of the time, indicating that response probabilities around this level may not represent "real" responses. To address this, we will include the responses to the blank condition in the manuscript (lines 505-509). This will allow readers to make informed decisions based on the presented data.
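
      The shuffle control that the reviewer describes can be sketched as follows. This is a minimal illustration, not the analysis used in the manuscript: it assumes that shuffling "in time by randomly assigned intervals" means a circular shift of each ROI's trace, that the response is the mean signal over a fixed number of samples after each stimulus onset, and that a one-sided test is wanted. All function and parameter names are hypothetical.

```python
import random


def post_stim_mean(trace, onsets, n_post):
    """Mean of the trace over n_post samples following each stimulus onset."""
    vals = [trace[t + k] for t in onsets for k in range(n_post)]
    return sum(vals) / len(vals)


def shuffle_p_value(trace, onsets, n_post, n_shuffles=1000, seed=0):
    """One-sided p-value for a stimulus-triggered response.

    trace:  list of dF/F samples for a single ROI
    onsets: sample indices of stimulus onsets
    n_post: number of post-stimulus samples (e.g. 1 s worth)
    The trace is circularly shifted by a random offset n_shuffles times
    (the reviewer suggests 200-1000) to build a surrogate distribution.
    """
    n = len(trace)
    if n < 2 or any(t < 0 or t + n_post > n for t in onsets):
        raise ValueError("onset windows must lie inside the trace")
    rng = random.Random(seed)
    observed = post_stim_mean(trace, onsets, n_post)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        shift = rng.randrange(1, n)                 # never the identity shift
        shifted = trace[shift:] + trace[:shift]     # circular shift in time
        if post_stim_mean(shifted, onsets, n_post) >= observed:
            at_least_as_large += 1
    # add-one correction so the estimate is never exactly zero
    return (at_least_as_large + 1) / (n_shuffles + 1)
```

      An ROI would then be called responsive only if its p-value falls below a chosen threshold, and the fraction of ROIs passing on blank trials gives an empirical false-positive rate for comparison.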

      Regarding Figure 6, a more conventional way to show sensory responses is to display a heatmap of the z-scored responses across all ROIs, sorted by their post-stimulus response. This enables the reader to better visualize and understand the claims being made here, rather than relying on the overall mean which could be influenced by a few highly responsive ROIs.

      We apologize to the reviewer that our data in this figure was challenging to interpret. We have included an additional supplemental figure (Supp. Fig. S15) that displays the requested information.

      For Figure 6, it would also help to display some raw data showing responses at the single ROI level and the population level. If these sensory stimulations are modulating claustrum neurons, then this will be observable on the mean population vector (averaged df/f across all ROIs as a function of time) within a given experiment and would add support to the conclusions being made.

      We appreciate the reviewer’s desire to see more raw data – we would have included this in the figure given more space. However, the average df/f across all ROIs is shown as a time series with 95% confidence intervals in Fig. 6D.

      As noted by the authors, there is substantial evidence in the literature showing that motor activity arises in mice during these types of sensory stimulation experiments. It is foreseeable that at least some of the responses measured here arise from motor activity. It would be important to identify to what extent this is the case.

      While we acknowledge that some responses may arise from motor-related activity, addressing this comprehensively is beyond the scope of this paper. Given the extensive number of trials and recorded axonal segments, we believe that motor-related activity is unlikely to significantly impact the average response across all trials. Future studies focusing specifically on motor activity during sensory stimulation experiments would be needed to elucidate this aspect in detail.

      All claims in the results for Figure 6 such as "the proportion of responsive axons tended to be highest when stimuli were combined" should be supported by statistics.

      We have provided additional statistics in this section (lines 490-511) to address the reviewer’s comment.

      In Figure 7, the authors state that mice learned the structure of the task. How is this the case, when the number of misses is 5-6x greater than the number of hits on audiovisual trials (S Figure 19). I don't get the impression that mice perform this task correctly. As shown in Figure 7I, the hit rate is exceptionally low on the audiovisual port in controls. I just can't see how control and lesion mice can have the same hit rate and false alarm rate yet have different d'. Indeed, I might be missing something in the analysis. However, given that both groups of mice are not performing the task as designed, I fail to see how the authors' claim regarding multisensory integration by the claustrum is supported. Even if there is some difference in the d' measure, what does that matter when the hits are the least likely trial outcome here for both groups.

      We thank the reviewer for their comments and hope the following addresses their confusion about the performance of animals during our multimodal conditioning task.

      Firstly, as pointed out by the reviewer, the hit-rate (HR) is lower than false-alarm-rate (FR) but crucially only when assessed explicitly within-condition (e.g. just auditory or just visual stimulation). Given the multimodal nature of the assay, HR and FR could also be evaluated across different trials, unimodal and multimodal, for both auditory and visual stimuli. Doing so resulted in a net positive d', as observed by the reviewer. From this perspective, and as documented in the Methods (Multimodal Conditioning and Reversal Learning) and Supplemental Figures, mice do indeed learn the conditioning task and perform at above-chance levels.

      Secondly, as raised in the Discussion, an important caveat of this assay was that it was unnecessary for mice to learn the task structure explicitly but, rather, that they respond to environmental cues in a reward-seeking manner that indicated perception of a stimulus. "Performance" as it is quantified here demonstrates a perceptual difference between conditions that is observed through behavioral choice and timing, not necessarily the degree to which the mice have an understanding of the task per se.
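
      For readers unfamiliar with the sensitivity index discussed in this exchange, the standard signal-detection definition is d' = z(hit rate) - z(false-alarm rate). The sketch below is a generic illustration, not the authors' code: the log-linear correction (adding 0.5 to each count) is an assumed convention, the manuscript's exact pooling of trials is not reproduced, and the example counts in the usage note are invented for illustration only.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(HR) - z(FAR).

    Assumption: a log-linear correction keeps both rates strictly between
    0 and 1 so that the inverse normal CDF is always defined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

      With hypothetical counts such as `d_prime(15, 85, 5, 95)`, misses far outnumber hits, yet d' is positive because the false-alarm rate is lower still. The sign of d' depends on which trials supply the false-alarm rate, which is the point at issue between the reviewer and the authors.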

      In the discussion, it is stated that "While axons responded inconsistently to individual stimulus presentations, their responsivity remained consistent between stimuli and through time on average...". I do not understand this part of the sentence. Does this mean axons are consistently inconsistent?

      The reviewer’s interpretation is correct – although recorded axons tended to have a preferred stimulus or combination of stimuli, they displayed variability in their responses (response probability), though little or no variability in their likelihood to respond over time (on average).

      In the discussion, the authors state their axon imaging results contrast with recent studies in mice. Why not actually do the same analysis that Ollerenshaw did, so this statement is supported by fact? As pointed out above, the criteria used to classify an axon as responsive to stimuli were very liberal in this current manuscript.

      While we appreciate this comment from the reviewer, we feel that it was not necessary to perform similar analyses to those of Ollerenshaw et al in order to appreciate that methodological differences between these studies would have confounded any comparisons made, as we note in the Discussion.

      I find the discussion wildly speculative and broad. For example, "the integrative properties of the CLA could act as a substrate for transforming the information content of its inputs (e.g. reducing trial-to-trial variability of responses to conjunctive stimuli...)". How would a claustrum neuron responding with a 10% reliability to a stimulus (or set of stimuli) provide any role in reducing trial-to-trial variability of sensory activity in the cortex?

      We thank the reviewer for their feedback. We acknowledge the reviewer's concern regarding the speculative nature of our discussion. To address the specific point raised, while a neuron with a 10% reliability might appear limited in reducing trial-to-trial variability in sensory activity, it's possible that such neurons are responsive to a combination of stimuli or conditions not fully controlled or recorded in our current setup. For instance, variables like the animal’s attentional or motivational states could influence the responsiveness of claustrum neurons, thus integrating these inputs could theoretically modulate cortical processing. We have refined this section to clarify these points (now lines 810-813).

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Shelton et al. explore the organization of the Claustrum. To do so, they focus on a specific claustrum population, the one projecting to the retrosplenial cortex (CLA-RSP neurons). Using an elegant technical approach, they first described electrophysiological properties of claustrum neurons, including the CLA-RSP ones. Further, they showed that CLA-RSP neurons (1) directly excite other CLA neurons, in a 'projection-specific' pattern, i.e. CLA-RSP neurons mainly excite claustrum neurons not projecting to the RSP and (2) receive excitatory inputs from multiple cortical territories (mainly frontal ones). To confirm the 'integrative' property of claustrum networks, they then imaged claustrum axons in the cortex during single- or multi-sensory stimulations. Finally, they investigated the effect of CLA-RSP lesion on performance in a sensory detection task.

      Strengths:

      Overall, this is a really good study, using state-of-the-art technical approaches to probe the local/global organization of the Claustrum. The in-vitro part is impressive, and the results are compelling.

      We thank the reviewer for their positive appraisal of our work.

      Weaknesses:

      One noteworthy concern arises from the terminology used throughout the study. The authors claimed that the claustrum is an integrative structure. Yet, integration has a specific meaning, i.e. the production of a specific response by a single neuron (or network) in response to a specific combination of several input signals. In this study, the authors showed compelling results in favor of convergence rather than integration. On a lighter note, the in-vivo data are less convincing, and do not entirely support the claim of "integration" made by the authors.

      We thank the reviewer for their clarity on this issue. We absolutely agree that without clear definition in the study, interpretation of our data could be misconstrued for one of several possible meanings. We have updated our Introduction, Results, and Discussion text to reflect the definition of ‘integration’ we used in the interpretation of our work and hope this clarifies our intent to the reader.

      Reviewer #3 (Public Review):

      The claustrum is one of the most enigmatic regions of the cerebral cortex, with a potential role in consciousness and integrating multisensory information. Despite extensive connections with almost all cortical areas, its functions and mechanisms are not well understood. In an attempt to unravel these complexities, Shelton et al. employed advanced circuit mapping technologies to examine specific neurons within the claustrum. They focused on how these neurons integrate incoming information and manage the output. Their findings suggest that claustrum neurons selectively communicate based on cortical projection targets and that their responsiveness to cortical inputs varies by cell type.

      Imaging studies demonstrated that claustrum axons respond to both single and multiple sensory stimuli. Extended inhibition of the claustrum significantly reduced animals' responsiveness to multisensory stimuli, highlighting its critical role as an integrative hub in the cortex.

      However, the study's conclusions at times rely on assumptions that may undermine their validity. For instance, the comparison between RSC-projecting and non-RSC-projecting neurons is problematic due to potential false negatives in the cell labeling process, which might not capture the entire neuron population projecting to a brain area. This issue casts doubt on the findings related to neuron interconnectivity and projections, suggesting that the results should be interpreted with caution. The study's approach to defining neuron types based on projection could benefit from a more critical evaluation or a broader methodological perspective.

      We thank the reviewer for their attention to the methods used in our study. We acknowledge that there is an inherent bias introduced by false-negatives as a result of incomplete labeling but contend that this is true of most modern tracing experiments in neuroscience, irrespective of the method used. Moreover, if false-negative biases are affecting our results, then they likely do so in the direction of supporting our findings – perfect knowledge of claustrum connectivity would likely enhance the effects seen by increasing the pool of neurons for which we find an effect. For example, our cortico-claustral connectivity findings in Figure 3 likely would have shown even larger effects should false-negative CLA-RSP neurons have been positively identified.

      Where appropriate we have provided estimates of variability and certainty in our experimental findings and do not claim any definitive knowledge of the true rate and scope of claustrum connectivity.

      Nevertheless, the study sets the stage for many promising future research directions. Future work could particularly focus on exploring the functional and molecular differences between E1 and E2 neurons and further assess the implications of the distinct responses of excitatory and inhibitory claustrum neurons for internal computations. Additionally, adopting a different behavioral paradigm that more directly tests the integration of sensory information for purposeful behavior could also prove valuable.

      We thank the reviewer for their outlook on the future directions of our work. These avenues for study, we believe, would be very fruitful in uncovering the cell-type-specific computations performed by claustrum neurons.

      Recommendations for the authors:

      Reviewing Editor (Recommendations for the Authors):

      The editor recommends addressing the issues raised by the reviewers about the statistical significance of sensory response with respect to blank stimuli, and solving the issue generated by the exclusion of monosynaptically connected neurons in the connectivity study, to raise the assessment strength of evidence from incomplete to solid. Moreover, as the reported result stands, the behavioral task does not seem to be learned by the animals as the animals are above chance for visual and auditory but largely below chance level for multisensory. It seems that the animals do not perform a multisensory task. The authors should clarify this.

      Reviewer #1 (Recommendations For The Authors):

      Several references were missing from the manuscript, where mouse CLA-retrosplenial or CLA-frontal neurons were investigated and would be highly relevant to both the discussion of claustrum function and the context of the methodologies used here (Wang et al., 2023 Nat Comm; Nair et al., 2023 PNAS; Marriott et al., 2024 Cell Reports; Faig et al., 2024 Current Biology).

      Reviewer #2 (Recommendations For The Authors):

      Let me be clear, this is an excellent study, using state-of-the-art technical approaches to probe the local/global organization of the Claustrum. However, the study is somehow disconnected, with a fantastic in-vitro part, and, in my opinion, a less convincing in-vivo one.

      As stated in the public review, I'm concerned about the use of the term "integration", as, in my opinion, the data presented in this study (which I repeat are of excellent level) do not support that claim.

      Below are my main points regarding the article:

      (1) My main comment relates to the use of the term 'integration'. It might be a semantic debate, but I think that this is an important one. In my opinion, neural integration is the "summing of several neural input signals by a single neuron to produce an output signal that is some function of those inputs". As the authors state in the discussion, they were not able to "assess the EPSP response magnitude to the conjunction of stimuli due to photosensitivity of ChrimsonR opsins to blue light". Therefore, the authors did not specifically prove integration, but rather input convergence. This does not mean that the results presented are not important or of excellent quality, but I encourage the authors to either tone down the part on integration or to give a clear definition of what they call integration.

      (2) The in vivo imaging data are somehow confusing. First, the authors image two claustral populations simultaneously (the CLA-RSP and the CLA-ACA axons). I may be missing the information, but there is no evidence that these cells overlap in the CLA (no data in the supplement and existing literature only support partial overlap). Second, in the results part, the authors claim that 96% of the sensory-responsive axons displayed multisensory response. This, combined with the 47% of axons responsive to at least one stimulus should lead to a global response of around 45% of the axons in multisensory trials. Yet, in Figures 6F-G, one can see that the response probability is actually low (closer to 20%). To be honest, I cannot really understand how to make sense of these results. At first, I thought that most of the multisensory responsive axons show no response during multisensory stimulus (but one in the unimodal stimulus). This hypothesis is however unlikely, as response AUC is biased toward positivity in Figure 6H. Overall, I'm not totally convinced by the imaging data, and I think that the authors should be more cautious about interpreting their results (as they are in the discussion part, but less in the results part).

      (3) The TetTox approach used in the study ablates all neurons expressing the CRE in the CLA. If the hypothesis proposed by the authors is true, then ablating one subpopulation should not affect the functioning of the whole CLA that much, as other neurons will likely "integrate" information coming from multiple cortices (Figures 3 and 4), and the local divergence (Figure 1) will then allow the broadcasting of this information back to multiple cortices. Do the authors think that such an approach deeply modified intra-claustral network connectivity? If this is not the case, shouldn't we expect less effect after lesioning a specific sub-population of CLA neurons?

      (4) The behavioral protocol is also confusing. If I understand correctly, the aim of the task was to probe the D-Prime factor, as all trials, whatever the response of the animal are rewarded. From the Figure 7I, one can see that the mice cannot properly answer to the audiovisual cues, clearly indicating that both groups show impaired response to this type of trial. The whole conclusion of the authors is therefore drawn from the D-Prime calculation. However, even if D-Prime should represent a measure of sensitivity (i.e. is unaffected by response bias), two assumptions need to be met: (1) the signal and noise distributions should be both normal, and (2) the signal and noise distributions should have the same standard deviation. However, these assumptions cannot be tested in the task used by the authors (one would need rating tasks). The authors might want to use nonparametric measures of sensitivity such as A' (see Pollack and Norman 1964).
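
      To make the contrast between the two sensitivity measures concrete, the following is a minimal sketch, not the authors' analysis. It assumes the standard equal-variance definition d' = z(H) - z(F) and the widely used closed-form expression for A'. The function names and the hit and false-alarm rates in the usage note are hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Parametric sensitivity z(H) - z(F); assumes equal-variance Gaussian
    signal and noise distributions. Both rates must lie strictly in (0, 1)."""
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)


def a_prime(hit_rate: float, fa_rate: float) -> float:
    """Nonparametric sensitivity A' (0.5 = chance, 1.0 = perfect)."""
    h, f = hit_rate, fa_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance branch (more false alarms than hits).
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))
```

      Both measures are functions of the hit rate and false-alarm rate only, so two groups with identical rates necessarily obtain identical values. With hypothetical rates such as `d_prime(0.2, 0.1)` and `a_prime(0.2, 0.1)`, the two measures can be compared directly for the same pair of rates.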

      Reviewer #3 (Recommendations For The Authors):

      While the study is comprehensive, some of its conclusions are based on assumptions that potentially weaken their validity. A significant issue arises in the comparison between neurons that project to the retrosplenial cortex (RSC) and those that do not. This differentiation is based on retrograde labeling from a single part of the RSC. However, CTB labeling, the technique used, does not capture 100% of the neurons projecting to a brain area. The study itself demonstrates this by showing that injecting the dye into three sections of the RSC results in three overlapping populations of neurons in the claustrum. Therefore, limiting the injection to just one of these areas inevitably leads to many false negatives: neurons that project to the RSC but are not marked by the CTB. This issue recurs in the analysis of neurons projecting to both the RSC and the prelimbic cortex (PL), where assumptions about interconnectivity are made without a thorough examination of overlap between these populations. The incomplete labeling complicates the interpretation of the data and makes it difficult to draw firm conclusions from it.

      Minor.

      There is a reference to Figure 1D where claustrum->cortical connections are described. This should be 5D.

      This is a correct reference pointing back to our single-cell characterizations of CLA morphoelectric types.

      End of Page 22. Implies should be imply.

      This has been resolved in the manuscript text.

    2. eLife Assessment

      This study compiles a wide range of results on the connectivity, stimulus selectivity, and potential role of the claustrum in sensory behavior. While most of the connectivity results confirm earlier studies, this valuable work provides incomplete evidence that the claustrum responds to multimodal stimuli and that local connectivity is reduced across cells that have similar long-range connectivity. The conclusions drawn from the behavioral results are weakened by the animals' poor performance on the designed task. This study has the potential to be of interest to neuroscientists.

    3. Reviewer #1 (Public review):

      Summary:

      The paper by Shelton et al. investigates some of the anatomical and physiological properties of the mouse claustrum. First, they characterize the intrinsic properties of claustrum excitatory and inhibitory neurons and determine how these different claustrum neurons receive input from different cortical regions. Next, they perform in vitro patch clamp recordings to determine the extent of intraclaustrum connectivity between excitatory neurons. Following these experiments, in vivo axon imaging was performed to determine how claustrum-retrosplenial cortex neurons are modulated by different combinations of auditory, visual, and somatosensory input. Finally, the authors perform claustrum lesions to determine if claustrum neurons are required for performance on a multisensory discrimination task.

      Strengths:

      An important potential contribution the authors provide is the demonstration of intra-claustrum excitation. In addition, this paper does provide the first experimental data where two cortical inputs are independently stimulated in the same experiment (using 2 different opsins). Overall, the in vitro patch clamp experiments and anatomical data provide confirmation that claustrum neurons receive convergent inputs from areas of frontal cortex. These experiments were conducted with rigor and are of high quality.

      Weaknesses:

      The title of the paper states that claustrum neurons integrate information from different cortical sources. However, the authors did not actually test or measure integration in the manuscript. They do show physiological convergence of inputs on claustrum neurons in the slice work. Testing integration through simultaneous activation of inputs was not performed. The convergence of cortical input has been recently shown by several other papers (Chia et al), and the current paper largely supports these previous conclusions. The in vivo work did test for integration, because simultaneous sensory stimulations were performed. However, integration was not measured at the single cell (axon) level because it was unclear how activity in a single claustrum ROI changes in response to (for example) visual, tactile, and visual-tactile stimulations. Reading the discussion, I also see the authors speculate that the sensory responses in the claustrum could arise from attentional or salience related inputs from an upstream source such as the PFC. In this case, claustrum cells would not integrate anything (but instead respond to PFC inputs).

      The different experiments in different figures often do not inform each other. For example, the authors show in Figure 3 that claustrum-RSP cells (CTB cells) do not receive input from the auditory cortex. But then, in Figure 6 auditory stimuli are used. Not surprisingly, claustrum ROIs respond very little to auditory stimuli (the weakest of all sensory modalities). Then, in Figure 7 the authors use auditory stimuli in the multisensory task. It seems that these experiments were done independently and were not used to inform each other.

      One novel aspect of the manuscript is the focus on intraclaustrum connectivity between excitatory cells (Figure 2). The authors used wide-field optogenetics to investigate connectivity. However, the use of paired patch clamp recordings remains the ground truth technique for determining the rate of connectivity between cell types, and paired recordings were not performed here. It is difficult to understand and gain appreciation for intraclaustrum connectivity when only wide-field optogenetics is used.

      In Figure 2, CLA-rsp cells express Chrimson, and the authors removed cells from the analysis with short latency responses (which reflect opsin expression). But wouldn't this also remove cells that express opsin and receive monosynaptic inputs from other opsin expressing cells, therefore underestimating the connectivity between these CLA-rsp neurons? I think this needs to be addressed.

      In Figure 5J the lack of difference in the EPSC-IPSC timing in the RSP is likely due to 1 outlier EPSC at 30ms which is most likely reflecting polysynaptic communication. Therefore, I do not feel the argument being made here with differences in physiology is particularly striking.

      In the text describing Figure 5, the authors state "These experiments point to a complex interaction ....likely influenced by cell type of CLA projection and intraclaustral modules in which they participate". How does this slice experiment stimulating axons from one input relate to different CLA cell types or intra-claustrum circuits? I don't follow this argument.

      In Figure 6G and H the blank condition yields a result similar to many of the sensory stimulus conditions. This blank condition (when no stimulus was presented) serves as a nice reference to compare the rest of the conditions. However, the remainder of the stimulation conditions were not adjusted relative to what would be expected by chance. For example, the response of each cell could be compared to a distribution of shuffled data, where time-series data are shuffled in time by randomly assigned intervals and a surrogate distribution of responses generated. This procedure is repeated 200-1000x to generate a distribution of shuffled responses. Then the original stimulus-triggered response (1 s post) could be compared to shuffled data. Currently, the authors just compare pre/post mean data using a Mann-Whitney test from the mean overall response, which could be biased by a small number of trials. Therefore, I think a more conservative and statistically rigorous approach is warranted here, before making the claim of a 20% response probability or 50% overall response rate.
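
      The shuffling procedure described above could be sketched roughly as follows. This is an illustrative interpretation, not the analysis used in the manuscript. It assumes that "shuffled in time by randomly assigned intervals" means a random circular shift of each ROI's trace. The function name, its arguments, and the default of 1000 shuffles are hypothetical choices.

```python
import random


def shuffle_test(trace, stim_frames, window, n_shuffles=1000, seed=0):
    """Compare a stimulus-triggered mean response with a circular-shift null.

    trace:       list of dF/F samples for one ROI.
    stim_frames: frame indices of stimulus onsets (must be non-empty).
    window:      number of post-stimulus frames to average (must be >= 1).
    Returns (observed_mean, one_sided_p_value).
    """
    rng = random.Random(seed)
    n = len(trace)

    def triggered_mean(shift):
        # Average all post-stimulus windows after circularly shifting the trace.
        total, count = 0.0, 0
        for s in stim_frames:
            for k in range(window):
                total += trace[(s + k + shift) % n]
                count += 1
        return total / count

    observed = triggered_mean(0)
    # Surrogate distribution: the same statistic under random non-zero shifts.
    null = [triggered_mean(rng.randrange(1, n)) for _ in range(n_shuffles)]
    # The +1 correction keeps the estimated p-value from being exactly zero.
    p_value = (1 + sum(v >= observed for v in null)) / (n_shuffles + 1)
    return observed, p_value
```

      An ROI would then be counted as responsive only if its p-value falls below a chosen threshold, and the same test applied to the blank condition gives an empirical false-positive rate to compare against.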

      Regarding Figure 6, a more conventional way to show sensory responses is to display a heatmap of the z-scored responses across all ROIs, sorted by their post-stimulus response. This enables the reader to better visualize and understand the claims being made here, rather than relying on the overall mean which could be influenced by a few highly responsive ROIs.

      For Figure 6 it would also help to display some raw data showing responses at the single ROI level and the population level. If these sensory stimulations are modulating claustrum neurons, then this will be observable on the mean population vector (averaged df/f across all ROIs as a function of time) within a given experiment and would add support to the conclusions being made.

      As noted by the authors, there is substantial evidence in the literature showing that motor activity arises in mice during these types of sensory stimulation experiments. It is foreseeable that at least some of the responses measured here arise from motor activity. It would be important to identify to what extent this is the case.

      All claims in the results for Figure 6 such as "the proportion of responsive axons tended to be highest when stimuli were combined" should be supported by statistics.

      For Figure 7, the authors state that mice learned the structure of the task. How is this the case when the number of misses is 5-6x greater than the number of hits on audiovisual trials (S Fig 19)? I don't get the impression that mice perform this task correctly. As shown in Figure 7I, the hit rate is exceptionally low on the audiovisual port in controls. I just can't see how control and lesion mice can have the same hit rate and false alarm rate yet have different d'. Indeed, I might be missing something in the analysis. However, given that both groups of mice are not performing the task as designed, I fail to see how the authors' claim regarding multisensory integration by the claustrum is supported. Even if there is some difference in the d' measure, what does that matter when hits are the least likely trial outcome here for both groups?

      In the discussion, it is stated that "While axons responded inconsistently to individual stimulus presentations, their responsivity remained consistent between stimuli and through time on average...". I do not understand this part of the sentence. Does this mean axons are consistently inconsistent?

      In the discussion, the authors state their axon imaging results contrast with recent studies in mice. Why not actually do the same analysis that Ollerenshaw did, so this statement is supported by fact? As pointed out above, the criteria used to classify an axon as responsive to stimuli were very liberal in this current manuscript.

      I find the discussion wildly speculative and broad. For example, "the integrative properties of the CLA could act as a substrate for transforming the information content of its inputs (e.g. reducing trial to trial variability of responses to conjunctive stimuli...)". How would a claustrum neuron responding with a 10% reliability to a stimulus (or set of stimuli) provide any role in reducing trial-to-trial variability of sensory activity in the cortex?

      Comments on the latest version: The authors have revised the manuscript, by adding 1 new supplementary figure, and some minor changes to the text. Overall, my comments regarding the manuscript were not sufficiently addressed. Here is one example:

      The authors don't seem to be taking the comments regarding the statistical significance of the sensory responses seriously. If there is a response in 10% of the axons in the blank condition, and an 11% response in the auditory stimulation, then it is more accurate to say that 1% of axons actually respond to auditory stimulation. "Leaving to reader to make their own decisions", as the authors suggest, while presenting readers with text such as "All modalities could evoke responses in at least some claustrum neurons" is misleading, because no attempt was made to correct for a chance level of detection that is clearly observed in the blank condition. Another interpretation of the authors' data would be that the combined auditory/visual/somatosensory stimuli drove responses in 21% (observed) - 10% (blank) = 11% of axons. Therefore, a conclusion that more accurately reflects the data would be that 89% of claustrum axons do not respond, even when the mouse received multisensory stimuli. I tried to get the authors to run some basic stats to more accurately test the true degree of responsiveness, but these changes did not appear in the manuscript.
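
      The arithmetic in this comment can be written out as a short sketch. It is a plain subtraction of the blank-condition rate, as in the comment above, not a statistical test. The percentages are those quoted in the comment; the function and variable names are hypothetical.

```python
def excess_over_blank(observed_pct: int, blank_pct: int) -> int:
    """Percentage points of responsive axons beyond the blank (no-stimulus) rate."""
    # Clamp at zero: a condition cannot have a negative share of true responders.
    return max(observed_pct - blank_pct, 0)


BLANK_PCT = 10  # axons classified as responsive with no stimulus presented

auditory_excess = excess_over_blank(11, BLANK_PCT)      # auditory stimulation
multisensory_excess = excess_over_blank(21, BLANK_PCT)  # combined stimuli
non_responsive_pct = 100 - multisensory_excess
```

      Working in whole percentage points avoids floating-point rounding in the subtraction.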

    4. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Shelton et al. explore the organization of the Claustrum. To do so, they focus on a specific claustrum population, the one projecting to the retrosplenial cortex (CLA-RSP neurons). Using an elegant technical approach, they first described electrophysiological properties of claustrum neurons, including the CLA-RSP ones. Further, they showed that CLA-RSP neurons 1) directly excite other CLA neurons, in a 'projection-specific' pattern, i.e. CLA-RSP neurons mainly excite claustrum neurons not projecting to the RSP and 2) receive excitatory inputs from multiple cortical territories (mainly frontal ones). In an effort to confirm the 'integrative' property of claustrum networks, they then imaged claustrum axons in the cortex during single- or multi-sensory stimulations. Finally, they investigated the effect of CLA-RSP lesion on performance in a sensory detection task.

      Strengths:

      Overall, this is a really good study, using state-of-the-art technical approaches to probe the local/global organization of the Claustrum. The in-vitro part is impressive, and the results are compelling.

      Weaknesses:

      One noteworthy concern arises from the terminology used throughout the study. The authors claimed that the claustrum is an integrative structure. Yet, integration has a specific meaning, i.e. the production of a specific response by a single neuron (or network) in response to a specific combination of several input signals. In this study, the authors showed compelling results in favor of convergence rather than integration. On a lighter note, the in-vivo data are less convincing, and do not entirely support the claim of "integration" made by the authors.

    5. Reviewer #3 (Public review):

      Public review:

      The claustrum is one of the most enigmatic regions of the cerebral cortex, with a potential role in consciousness and integrating multisensory information. Despite extensive connections with almost all cortical areas, its functions and mechanisms are not well understood. In an attempt to unravel these complexities, Shelton et al. employed advanced circuit mapping technologies to examine specific neurons within the claustrum. They focused on how these neurons integrate incoming information and manage the output. Their findings suggest that claustrum neurons selectively communicate based on cortical projection targets and that their responsiveness to cortical inputs varies by cell type.

      Imaging studies demonstrated that claustrum axons respond to both single and multiple sensory stimuli. Extended inhibition of the claustrum significantly reduced animals' responsiveness to multisensory stimuli, highlighting its critical role as an integrative hub in the cortex.

      However, the study's conclusions at times rely on assumptions that may undermine their validity. For instance, the comparison between RSC-projecting and non-RSC-projecting neurons is problematic due to potential false negatives in the cell labeling process, which might not capture the entire neuron population projecting to a brain area. This issue casts doubt on the findings related to neuron interconnectivity and projections, suggesting that the results should be interpreted with caution. The study's approach to defining neuron types based on projection could benefit from a more critical evaluation or a broader methodological perspective.

      Nevertheless, the study sets the stage for many promising future research directions. Future work could particularly focus on exploring the functional and molecular differences between E1 and E2 neurons and further assess the implications of the distinct responses of excitatory and inhibitory claustrum neurons for internal computations. Additionally, adopting a different behavioral paradigm that more directly tests the integration of sensory information for purposeful behavior could also prove valuable.

    1. eLife Assessment

      This valuable study investigates the relationship between neuronal dynamics in the thalamus and brain state modulation. The claims that a specific channel is a critical player in the regulation of brain-states and ethanol-resistance in mice are supported by convincing evidence. The work will be of interest to systems neuroscientists interested in brain dynamics and behavioural states.

    2. Reviewer #1 (Public review):

      Summary:

      This is an interesting and valuable study that uses multiple approaches to understand the role of bursting involving voltage-gated calcium channels within the mediodorsal thalamus in the sedative-hypnotic effects of alcohol. Given its unique functional roles and connectivity pattern, the finding that the mediodorsal thalamus has a fundamental role in regulating alcohol-induced transitions in consciousness state is both important for researchers investigating thalamocortical dynamics and more broadly interesting for understanding brain function. In addition, the authors' examination of the role of the voltage-gated calcium channel Cav3.1 provides considerable evidence that burst-firing mediated by this channel in the thalamus is functionally important for behavioral-state transitions. While many previous studies have suggested an analogous role for these channels in sleep-state regulation, the evidence for a role of this type of bursting in sedative-induced transitions is more limited, so the evidence presented is of considerable value to the field. By performing comparative experiments across multiple thalamic nuclei which have been implicated in controlling state-transitions, the authors also validate their claim and establish the unique role of the mediodorsal thalamus. Overall, this study provides substantial mechanistic insight into how the thalamus influences drug-induced transitions between different states of consciousness and opens avenues for future research into how thalamocortical interactions enable brain function.

      Strengths:

      This study employs multiple, complementary research approaches including behavioral assays, shRNA-based localized knockdown, single-unit recordings, and patterned optogenetic interventions to examine the role of activity in the mediodorsal thalamus in the sedative-hypnotic effects of alcohol. Experiments and analysis included in the manuscript generally appear well conceived and well executed. Sample sizes are sufficiently large and statistical analysis appears generally appropriate. The findings presented are novel and provide interesting insight into the role of the thalamus as well as voltage-gated calcium channels within this region in controlling behavioral state-transitions induced by alcohol. In particular, the observed effects of selective knockout along with recordings in total knockout of the voltage-gated calcium channel, Cav3.1, which has previously been implicated in bursting dynamics as well as state transitions, particularly in sleep, together suggest that the transition of thalamic neurons to a bursting pattern of firing from a more constant firing is important for transition to the sedated state produced by ethanol intoxication. While previous studies have similarly implicated Cav3.1 bursting in behavioral state-transitions, the direct optogenetic interventions and single-unit recordings provide valuable new insight. These findings may also have valuable implications for the relationship between sleep process disruption associated with ethanol dependence.

      Weaknesses:

      While the authors have made substantial improvements to the analysis and presented important additional results, some of the method descriptions in the supplementary material remain minimal. In addition, the manuscript text still has multiple writing and editing problems that should be addressed. Such issues appear throughout the manuscript, including the abstract and all other sections. While they do not reduce the value of the findings presented, they do make them more difficult to understand and so should be corrected.

    3. Reviewer #2 (Public review):

      This study explores the role of the mediodorsal thalamus (MD) and the T-type calcium channel Cav3.1 in ethanol-induced behavioral changes, focusing on transitions between sedation and shifts in brain-states. The authors utilize genetic knockdown, optogenetic manipulation, and electrophysiological recording techniques in mice to assess the contribution of MD Cav3.1 channels to ethanol's sedative effects. The central hypothesis is that Cav3.1-mediated burst firing in the MD is essential for regulating ethanol-induced sedation and arousal transitions.

      The authors' detailed responses to reviewers' comments significantly improved the manuscript, particularly regarding experimental specificity and methodological transparency. They addressed concerns about the specificity of MD knockdowns versus neighboring thalamic nuclei by adding quantifications, enhancing figure clarity, and providing lesion localization data. The revised figures, with added quantification panels, strengthened the claim that the manipulations specifically targeted the MD. Improvements in lesion validation figures and electrode placement explanations further clarified the accuracy of their methods.

      One major limitation, as highlighted by Reviewer 1, is the lack of direct evidence from inhibitory optogenetic studies to validate the role of Cav3.1 channels in modulating ethanol-induced transitions in the MD. While the authors acknowledged the challenges of such experiments, citing technical issues like the inability of Cav3.1 knockout to allow rebound burst firing, the absence of these controls limits definitive causal conclusions about the MD's role. Alternative experiments with varying ethanol doses and data on tonic versus burst firing were presented, but these do not fully compensate for the missing inhibitory optogenetics, leaving some uncertainty regarding the attribution of observed behavioral effects solely to Cav3.1-mediated burst activity in the MD.

      Another challenge is the complexity of distinguishing the specific contribution of the MD from that of other thalamic nuclei involved in regulating arousal and brain-states. Although additional quantification was provided to demonstrate MD specificity, control experiments targeting adjacent regions like the central lateral nucleus (CL) would have strengthened the manuscript. While the practical constraints are understandable, this limitation slightly weakens the argument regarding the MD's unique role in state transitions. The provided explanations about spatial targeting and electrophysiological methods were reasonable, but a broader set of thalamic controls would have offered a more comprehensive understanding.

      Overall, the authors successfully achieved their aims, providing strong evidence that Cav3.1-mediated burst firing in the MD is crucial for ethanol-induced sedation. The knockdown experiments showed a clear reduction in ethanol sensitivity, and the behavioral assays supported the conclusion that MD Cav3.1 activity plays a key role in regulating arousal states. The combined use of Cav3.1 knockdown and optogenetic stimulation effectively linked MD activity to ethanol-induced behavioral changes. The evidence presented establishes a clear mechanistic connection between neuronal activity and behavioral responses.

      The expanded discussion and clarifications in response to reviewer feedback enhanced the manuscript's coherence, and the revisions to the figures improved the transparency of the findings. Despite not implementing all the additional experiments suggested by Reviewer 1, the authors provided sufficient alternative evidence and a clear explanation of practical limitations, making their conclusions credible given the available data.

      This study significantly advances our understanding of thalamic involvement in behavioral state transitions, particularly ethanol-induced sedation. By clarifying the role of Cav3.1-mediated burst firing in the MD, the research provides new insights into how specific neuronal activity patterns influence global brain states and behavioral arousal, which has implications for understanding mechanisms underlying anesthesia, sedation, and sleep regulation. Moreover, the transparency in data sharing and detailed methodological revisions make this work a valuable resource for replication or adaptation in similar studies.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an interesting and valuable study that uses multiple approaches to understand the role of bursting involving voltage-gated calcium channels within the mediodorsal thalamus in the sedative-hypnotic effects of alcohol. Given its unique functional roles and connectivity pattern, the idea that the mediodorsal thalamus may have a fundamental role in regulating alcohol-induced transitions in consciousness state would be both important for researchers investigating thalamocortical dynamics and more broadly interesting for understanding brain function. In addition, the author's examination of the role of the voltage-gated calcium channel Cav3.1 provides some evidence that burst-firing mediated by this channel in the thalamus is functionally important for behavioral-state transitions. While many previous studies have suggested an analogous role for sleep-state regulation, the evidence for an analogous role of this type of bursting in sedative-induced transitions is more limited. Despite the importance of these results, however, there is some concern that the manipulations and recording approaches employed by the authors may affect other thalamic nuclei adjacent to the MD, such as the central lateral nucleus, which has also been implicated in controlling state transitions. The evidence for a specific role of the mediodorsal thalamus is therefore somewhat incomplete, and so additional validation is needed.

      Strengths:

      This study employs multiple, complementary research approaches including behavioral assays, sh-RNA-based localized knockdown, single-unit recordings, and patterned optogenetic interventions to examine the role of activity in the mediodorsal thalamus in the sedative-hypnotic effects of alcohol. Experiments and analyses included in the manuscript generally appear well conceived and are also generally well executed. Sample sizes are sufficiently large and statistical analysis appears generally appropriate though in some cases additional quantification would be helpful. The findings presented are novel and provide some interesting insight into the role of the thalamus as well as voltage-gated calcium channels within this region in controlling behavioral state transitions induced by alcohol. In particular, the observed effects of selective knockout along with recordings in total knockout of the voltage-gated calcium channel, Cav3.1, which has previously been implicated in bursting dynamics as well as state transitions, particularly in sleep, together suggest that the transition of thalamic neurons to a bursting pattern of firing from a more constant firing is important for transition to the sedated state produced by ethanol intoxication. While previous studies have similarly implicated Cav3.1 bursting in behavioral state transitions, the direct optogenetic interventions and single-unit recordings provide valuable new insight. These findings may also have interesting implications for the relationship between sleep process disruption associated with ethanol dependence, although the authors do not appear to examine this directly or extensively discuss these implications of their findings.

      Weaknesses:

      A key claim of the study is that the mediodorsal thalamus is specifically important for the sedative-hypnotic effect of ethanol and that a transition to a bursting pattern of firing in this circuit facilitates these effects due to a loss of a more constant tonic firing pattern. Despite the generally clear observed effects across the included experiments, however, the evidence presented does not fully support that the mediodorsal thalamus, in particular, is involved. This distinction is important because some previous studies have suggested that another thalamic nucleus which is very close to the mediodorsal thalamus, the central-lateral thalamus, plays a role in preventing sedative-induced transitions. Despite its proximity to the mediodorsal thalamus, the central-lateral thalamus has a substantially different pattern of connectivity, so distinguishing which region is impacted is important for understanding the findings in the manuscript. While sh-RNA knockdown appears to be largely centered in the mediodorsal thalamus in the example shown (Figure 2), this is rather minimal evidence and it is also not well explained (indeed, the relevant panels do not even appear to be referenced in the text of the manuscript), and the consistency of the knockdown targeting is not quantified. Additional evidence should be provided to validate this approach. Similarly, while an example is shown for the expression of ChR2 (Fig. 5), there seems to be some spread of expression outside of the mediodorsal thalamus even in this example, raising a concern about how regionally specific this effect is.

      The recordings targeting the mediodorsal thalamus could provide evidence of a direct association between changes in activity specifically in this part of the thalamus with the behavioral measures but there are currently some issues with making this link. One difficulty is that, although lesions are shown in Figure S5 to validate recording locations, this figure is relatively unclear and the examples appear to be taken from a different anterior/posterior location compared to the reference diagram. A larger image and improved visualization of the overall set of lesion locations that includes multiple anterior/posterior coronal sections would be helpful. Moreover, even for these example images, it is difficult to evaluate whether these are in the mediodorsal thalamus, particularly given the small size of the image shown. Ideally, an example image that is more obviously in the mediodorsal thalamus would also be included. Finally, an assessment of the relationship between the approximate locations of recorded neurons across the tetrode arrays and the behavioral measures would be very helpful in supporting the unique role of the mediodorsal thalamus. The lack of these direct links, in combination with the histological issues, reduces the insight that can be gained from this study.

      In addition to the key experimental issues mentioned above, there are often problems in the text of the manuscript with reasoning or at least explanation as well as numerous minor issues with editing. The most substantial such issue is the lack of clarity in discussing the mediodorsal thalamus and other adjacent thalamic nuclei, such as the central-lateral nucleus, in the authors' discussion of previous findings. Given that at least one of the manuscripts cited by the authors (Saalman, Front. Sys. Neuro. 2014) has directly claimed that central-lateral, rather than the mediodorsal, thalamus is important for arousal regulation related to a conscious state, this distinction should be addressed clearly in the discussion rather than papered over by grouping multiple thalamic nuclei as being medial. As part of this discussion, it would be important to consider additional relevant literature including Bastos et al., eLife, 2021 and Redinbaugh et al., Neuron, 2020 which are quite critical but currently do not appear to be cited. Considering additional literature relevant to the function of the mediodorsal thalamus would also be beneficial. While the methods employed generally seem sound, the description in the methods section is lacking in detail and is often difficult to follow. Analysis methods such as the burst index appear to only be given a brief explanation in the text and appear not to be mentioned in the methods section. Similarly, the staining method used in Figure 2 does not appear to be described in the methods section. The most substantial case is for the UMAP approach used in Figure 4-E which does not appear to be described in the methods or even described in the main text. The lack of detailed descriptions makes it difficult to evaluate the applicability and quality of the experimental and analytical approaches. Citations justifying the use of methods such as the approach to separate regular spiking and narrow spiking neuron subtypes are also needed.

      Beyond the problems with content and reasoning discussed above, there are also some relatively minor issues with the clarity of writing throughout the paper. For example, in the abstract the authors refer to "the ethanol resistance behavior in WT mice" but it is difficult to parse what they mean by this statement. Similarly, the next sentence "These results support that the maintenance..." while clearer, is not well phrased. Though individually minor, issues like this re-occur throughout the manuscript and sometimes make it difficult to follow, so the text should be revised to correct them. There are also some problems with labels such as the labels of A1/A2 in Figure 4, which appear to be incorrect. Also, S7 has no label on the B panels. Finally, some references are not included (only a label of [ref]).

      Reviewer #2 (Public Review):

      In the current study, Latchoumane and collaborators focus on the Cav3.1 calcium channels in the mediodorsal thalamic nucleus as critical players in the regulation of brain-states and ethanol resistance in mice. By combining behavioural, electrophysiological, and genetic techniques, they report three main findings. First, KO Cav3.1 mice exhibit resistance to ethanol-induced sedation and sustained tonic firing in thalamocortical units. Second, knocked-down Cav3.1 mice reproduce the same behaviour when the mediodorsal, but not the ventrobasal, thalamic nucleus is targeted. Third, either optogenetic or electric stimulation of the mediodorsal thalamus reduces ethanol-induced sedation in control animals.

      Overall, the study is well designed and performed, correctly controlled for confounds, and properly analysed. Nonetheless, it is important to address some aspects of the report. The results support the conclusions of the study. These results are likely to be relevant in the field of systems neuroscience, as they increase the molecular evidence showing how the thalamus regulates brain states.

      Reviewer #1 (Recommendations For The Authors):

      Aside from the additional quantification and clarification of the analysis discussed in the weakness section, in general, the experiments included in the manuscript seem reasonable. However, I would suggest one additional experiment as well as one control, both of which are relatively straightforward optogenetic experiments, that I feel would be helpful to further improve the study. First, as the authors note, the optogenetic interventions used do not directly address the relevance of the changes in bursting patterns observed in the knockout (KO), which are by far the most robust effect, with the changes in alcohol sensitivity. One approach that could help address this would be to use patterned suppression via inhibitory opsins (e.g. halorhodopsin) to "rescue" the periods of inhibition associated with bursting in the KO. Localizing this inhibition to the mediodorsal thalamus would also lend further credence to their claim that this nuclei is the relevant circuit for their observed effects. For the control, tonic activation of the ventrobasal nucleus, as the authors did for the mediodorsal nucleus, would be beneficial to rule out the possibility that the observed effect would occur with any thalamic nucleus. In addition to these experiments, I did not note the strategy for sharing data obtained through this study so this should be added.

      R1-1: A key claim of the study is that the mediodorsal thalamus is specifically important for the sedative-hypnotic effect of ethanol and that a transition to a bursting pattern of firing in this circuit facilitates these effects due to a loss of a more constant tonic firing pattern. Despite the generally clear observed effects across the included experiments, however, the evidence presented does not fully support that the mediodorsal thalamus, in particular, is involved. This distinction is important because some previous studies have suggested that another thalamic nucleus which is very close to the mediodorsal thalamus, the central-lateral thalamus, plays a role in preventing sedative-induced transitions. Despite its proximity to the mediodorsal thalamus, the central-lateral thalamus has a substantially different pattern of connectivity, so distinguishing which region is impacted is important for understanding the findings in the manuscript.

      R1-A1: The reviewer is right that CL has been pointed to as another candidate structure with causal influence on arousal and consciousness. We have focused our efforts on including only single units recorded from tetrodes located in the MD, identified using the lesion code we explain in the method section and in response to R1 question #3. We also produced a quantification of Cav3.1 knock-down that clearly demonstrates that the KD experiment was itself specific to MD, bilaterally, and that CL to CM were minimally impacted by the knock-down process (Fig. 2C and D). Moreover, the optogenetic (fiber incidence was 30 degrees, guaranteeing a central coverage rather than lateral; fiber optic NA = 0.22) and electric stimulation (bipolar twisted electrodes, 50 µA) experiments were also very selective and specific to the MD (Fig. S5). It remains clear that MD might not be the sole structure involved in the brain state control towards sedation and “anesthetic states”, and CL might be a significant contributor as well; however, we show that CL manipulations were rather irrelevant in our experiments (Fig. 2, S5, S9 and S11).

      R1-2: While sh-RNA knockdown appears to be largely centered in the mediodorsal thalamus in the example shown, (Figure 2) this is rather minimal evidence and it is also not well explained (indeed, the relevant panels do not even appear to be referenced in the text of the manuscript) and the consistency of the knockdown targeting is not quantified. Additional evidence should be provided to validate this approach.

      R1-A2: To address this important question, we have added a quantification panel to Fig. 2D. We quantified the intensity per area of Cav3.1 expression in subzones of 4 regions of interest: MD (left, right; 2 subzones each), Centro Medial (CM; 1 subzone in total), Centrolateral/Paraventricular nucleus (CL/PCN; left, right; 2 subzones each) and the submedial nucleus (SMT; left, right; used as a control for the intensity normalization; 1 subzone in total). This panel clearly illustrates that MD was knocked down bilaterally (p<0.001). Moreover, CM (p<0.05) and CL (p<0.01) were also partially and unilaterally knocked down. This analysis confirms that our KD had a high specificity to MD.

      We added the relevant figure caption and text:

      [Result section, Cav3.1 silencing in the MD, but not VB, increased ethanol resistance in mice, paragraph 3]

      “We then characterized the change in Cav3.1 expression following the shControl and shCav3.1 knockdown injections in three test regions MD (left and right), CM (centromedial nucleus) and CL (centrolateral nuclei, left and right side) and a negative control region SMT (submedial thalamic nuclei, left and right side). The average intensity was obtained from two coronal brain slices for each mouse used in the experiment (see Methods sections, Cav3.1 Intensity quantification). Our results show that the targeting of the knockdown was very specific to the bilateral MD (p<0.001; Fig. 2D). We also observed a knock-down of the CM (p<0.05) and a marginal unilateral knock-down of the CL (p<0.01). Notably, we tested the correlation between the level of knock-down in MD and the total time in LOM and observed a significant association (Fig. 2D inset; R = 0.599, p = 0.018). This result highlights that the Cav3.1 knock-down was specific to MD and with an intensity associated with ethanol-induced loss of motion.”

      R1-3: One difficulty is that, although lesions are shown in Figure S5 to validate recording locations, this figure is relatively unclear and the examples appear to be taken from a different anterior/posterior location compared to the reference diagram. A larger image and improved visualization of the overall set of lesion locations that includes multiple anterior/posterior coronal sections would be helpful. Moreover, even for these example images, it is difficult to evaluate whether these are in the mediodorsal thalamus, particularly given the small size of the image shown. Ideally, an example image that is more obviously in the mediodorsal thalamus would also be included. Finally, an assessment of the relationship between the approximate locations of recorded neurons across the tetrode arrays and the behavioral measures would be very helpful in supporting the unique role of the mediodorsal thalamus.

      R1-A3: Related to Fig. S5, we re-distributed the position of the recordings from the tetrode electrode burned positions over 3 representative coronal planes that best represent the implant positions. We also provided additional snapshots of tetrode location. To identify the positions of the four tetrodes in each animal, we encoded the positions with different electrical lesion strategies as follows: 1 lesion (tetrode 1), 2 lesions while we redrew the tetrode with a 100 µm interval (tetrode 2), 3 lesions with 200 µm intervals (tetrode 3), 4 lesions with 50 µm intervals (tetrode 4). Tetrodes that were found outside of the MD delimited region were discarded post analysis. A straight relationship between the closeness of the electrode is unfortunately not possible for tetrode recording. A straight silicon probe, which maintains the spatial spacing in recording, would have been a better approach in that case, but unfortunately it was not used in our study.

      R1-4: In addition to the key experimental issues mentioned above, there are often problems in the text of the manuscript with reasoning or at least explanation as well as numerous minor issues with editing. The most substantial such issue is the lack of clarity in discussing the mediodorsal thalamus and other adjacent thalamic nuclei, such as the central-lateral nucleus, in the authors' discussion of previous findings. Given that at least one of the manuscripts cited by the authors (Saalman, Front. Sys. Neuro. 2014) has directly claimed that central-lateral, rather than the mediodorsal, thalamus is important for arousal regulation related to a conscious state, this distinction should be addressed clearly in the discussion rather than papered over by grouping multiple thalamic nuclei as being medial. As part of this discussion, it would be important to consider additional relevant literature including Bastos et al., eLife, 2021 and Redinbaugh et al., Neuron, 2020 which are quite critical but currently do not appear to be cited. Considering additional literature relevant to the function of the mediodorsal thalamus would also be beneficial.

      R1-A4: We thank the reviewer for these comments and suggestions. We agree that the references mentioned by the reviewer are highly relevant and should be integrated in the manuscript. We have integrated the above-mentioned references and further developed the discussion of the role of MD relative to other thalamic nuclei (ILN and CL in particular). We believe that this better-referenced and clarified text greatly improves the manuscript.

      [introduction section, paragraph 3]

      “The centrolateral (CL) thalamic nucleus has been implicated in the modulation of arousal, behavior arrest 31, and improvement of level of consciousness during seizures 32. Notably, the direct electrical stimulation of the intralaminar nuclei (ILN) and, in particular CL, promoted hallmarks of arousal and awakening in primate under propofol and ketamine propofol anesthesia.”

      [Discussion section, paragraph 1]

      “In this work, we identified that the neural activity in MD plays a causal role in the maintenance of consciousness. Whole body Cav3.1 KO and MD-specific Cav3.1 KD mice showed resistance to loss of consciousness induced by hypnotic dose of ethanol. In WT mice, MD neurons demonstrated a reduced firing rate in natural (sleep) and ethanol-induced unconscious states compared to awake states. This neural activity reduction was impaired in KO mice. In particular, transition to an unconscious state was accompanied with a switch of firing mode from tonic firing to burst firing in WT mice whereas this modeshift disappeared in KO mice. Finally, optogenetic or electric stimulations of the MD after ethanol injection were sufficient to induce a resistance to loss of motion, supporting that the level of neural firing in the MD is critical to maintain conscious state and delay unconscious state. We showed that the expression of Cav3.1 t-type calcium channels in MD is a cellular modulator associated with this effect.”

      [Discussion section, MD is a modulator of consciousness, paragraph 2 and 3]

      “The MD is known to innervate limbic region, basal ganglia and medial prefrontal cortex 50 and increased activity in MD might modulate the stability of cortical UP states (e.g. awaken, aroused and attentive states) and synchronization 9,26. Thus, MD might be a major hub involved in cortical state control and brain state stabilization.

      Supporting the brain state stabilization theory and the ethanol resistance of Cav3.1 mutants, Choi et al.34 demonstrated that the loss of Cav3.1 T-type calcium channel reduced the bilateral coherence between PFC and MD under ketamine anesthesia and ethanol hypnosis, especially in the delta frequency bands. More importantly, under propofol anesthesia, Bastos et al.35 showed that intralaminar nucleus and MD stimulation lead to increased wake-up subscore and arousal, together with an increased in cortico-cortico and thalamo-cortical slow (delta) frequency power.

      In the present study, we observed that MD KD (Fig. 2A), but not VB KD (Fig. S3) of Cav3.1 increased and is associated (Fig. 2D) with ethanol resistance in mice. We found that MD neurons in Cav3.1 mutant mice exhibited tonic firing within range of wakefulness (Fig. 3 and 4), indicative of resistance to ethanol and wake-like brain state. In addition, we found a strong association between the normalized tonic firing in MD and the arousal through brain states (i.e. walk to wake to sleep states), supporting that MD tonic firing could be interpreted both as a thalamic readout and a modulator of the brain state 11 (Fig. 3). Finally, direct optogenetic and electric MD stimulation increased resistance to loss of consciousness in WT mice (Fig.5 and Fig. S10). To our knowledge, this is the first report demonstrating the causal involvement of mediodorsal thalamic nucleus in the modulation of wakefulness and the resistance to ethanol-induced loss of consciousness in mice.”

      R1-5: While the methods employed generally seem sound, the description in the methods section is lacking in detail and is often difficult to follow. Analysis methods such as the burst index appear to only be given a brief explanation in the text and appear not to be mentioned in the methods section.

      R1-A5: We have added a clear definition in the supplementary method following the original work used:

      [Supplementary Method section, Single Unit recording, sorting and analysis, last paragraph]

      “The bursting index was derived as described in (Royer et al. 2012). Namely, the burst index was estimated from the spike auto-correlogram (1-ms bin size) by subtracting the mean value between 40 and 50 ms (baseline) from the peak measured between 0 and 10 ms. Positive burst amplitudes were normalized to the peak and negative amplitudes were normalized to the baseline to obtain indexes ranging from −1 to 1.”

      We also edited its mention in the text for clarity:

      [Result section, Lack of Cav3.1 in MD neurons removes thalamic burst in NREM sleep, paragraph 2]

      “[…] and a clear reduction in total bursting represented as bursting index (Fig. 3-B; ratio of spikes count <10 ms and >50 ms based on auto-cross-correlogram).”
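
      The burst index definition quoted above (spike auto-correlogram with 1-ms bins, peak between 0 and 10 ms, baseline as the mean between 40 and 50 ms, normalization to the peak or to the baseline depending on sign) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `autocorrelogram` and `burst_index`, the use of positive lags only, and the treatment of each window as half-open ([0, 10) ms and [40, 50) ms) are assumptions made for illustration.

```python
import numpy as np


def autocorrelogram(spike_times_ms, max_lag_ms=50.0, bin_ms=1.0):
    """Count positive spike-to-spike lags in [0, max_lag_ms) using bin_ms-wide bins.

    Hypothetical helper: only positive lags are kept, which is enough for
    the burst index because the auto-correlogram is symmetric.
    """
    t = np.sort(np.asarray(spike_times_ms, dtype=float))
    n_bins = int(round(max_lag_ms / bin_ms))
    counts = np.zeros(n_bins)
    for i in range(len(t)):
        # First spike whose lag from spike i reaches max_lag_ms.
        j_end = np.searchsorted(t, t[i] + max_lag_ms, side="left")
        lags = t[i + 1:j_end] - t[i]
        idx = np.floor(lags / bin_ms).astype(int)
        idx = idx[(idx >= 0) & (idx < n_bins)]
        np.add.at(counts, idx, 1)
    return counts


def burst_index(acg, bin_ms=1.0):
    """Burst index in [-1, 1] from an auto-correlogram, per the quoted definition.

    Assumed window convention: peak over lags in [0, 10) ms,
    baseline = mean over lags in [40, 50) ms.
    """
    acg = np.asarray(acg, dtype=float)
    lags = np.arange(len(acg)) * bin_ms  # left edge of each bin
    peak = acg[(lags >= 0) & (lags < 10)].max()
    baseline = acg[(lags >= 40) & (lags < 50)].mean()
    amplitude = peak - baseline
    if amplitude > 0:
        return amplitude / peak       # positive amplitudes: normalize to the peak
    if amplitude < 0:
        return amplitude / baseline   # negative amplitudes: normalize to the baseline
    return 0.0


# Toy example: two spike doublets 3 ms apart, separated by 100 ms.
acg = autocorrelogram([0.0, 3.0, 100.0, 103.0])
print(burst_index(acg))
```

      With this convention, a unit whose short-lag peak far exceeds its 40 to 50 ms baseline approaches +1, and a unit with no short-interval spiking approaches −1.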

      R1-6: Similarly, the staining method used in Figure 2 does not appear to be described in the methods section.

      R1-A6: The staining method can be found in the supplementary method of the paper. [supplementary method, Immunohistochemistry]

      R1-7: The most substantial case is for the UMAP approach used in Figure 4-E which does not appear to be described in the methods or even described in the main text.

      R1-A7: Regarding the method, the UMAP approach is described in the supplementary method document [Uniform Manifold Approximation and Projection (UMAP)]. We believe that only a succinct description was needed here considering the extent of the analysis. Regarding the inserts in the main text, we agree that the main text was lacking the description of these results and we have amended the main text to reflect a clear description of this result and what it entails. The following paragraph was added:

      [Result section, Under ethanol, MD neurons lacking Cav3.1 show no burst and a wake state-like neural activity, second to last paragraph]

      “Finally, we asked whether the firing modes and properties (tonic firing rate, burst firing rate; see supplementary methods) of single MD neurons would form distinct qualitative representation of “brain stages” using a lowered dimensional UMAP representation (Uniform Manifold Approximation and Projection42 ). We observed that for awake and active (i.e. walk), the brain state representation formed two adjacent clusters that confounded both wild and mutant neurons (Fig. 4E, left panel). The REM and NREM states, the wild type neurons formed 2 additional interconnected clusters, whereas the mutant neurons tend to overlap with the clusters attributed to the “awake” brain state (Fig. 4E, second to left panel). Ethanol induced fLOM, similarly to REM and NREM clusters, was distinct from awake clusters in wild type mice and overlapped with the NREM clusters (Fig. 4E, third to left panel). Here also, mutant MD neurons showed overlap with the awake clusters rather than the “low consciousness” brain states. These results indicate that the firing mode and properties could define a brain state representation that shows distinctions in levels of consciousness. Moreover, the mutant showed a representation of “low consciousness” states overlapping with wild type “awake” states consistent with the hypothesis of resistance to loss of consciousness.”

      R1-8: Citations justifying the use of methods such as the approach to separate regular spiking and narrow spiking neuron subtypes are also needed.

      R1-A8: We have added two references related to the observation of the two subpopulations of spiking neurons [Schiff and Reyes, 2012; Destexhe, 2008].

      R1-9: Beyond the problems with content and reasoning discussed above, there are also some relatively minor issues with the clarity of writing throughout the paper (for example, in the abstract the authors refer to "the ethanol resistance behavior in WT mice" but it is difficult to parse what they mean by this statement.

      R1-A9: We addressed this issue by editing and revising the manuscript for clarity and flow.

      R1-10: Similarly, the next sentence "These results support the maintenance..." while clearer, is not well phrased. Though individually minor, issues like this re-occur throughout the manuscript and sometimes make it difficult to follow so the text should be revised to correct them.

      R1-A10: We thank the reviewer for highlighting this point. We have edited the overall text to improve clarity and flow.

      [abstract] 

      These results suggest that maintaining MD neural firing at a wakeful level is sufficient to induce resistance to ethanol-induced hypnosis in WT mice.

      R1-11: There are also some problems with labels such as the labels of A1/A2 in Figure 4, which appear to be incorrect.

      R1-A11: We noted this issue and have rectified the figure for clarity.

      R1-12: Also, S7 has no label on the B panels.

      R1-A12: We thank the reviewer for pointing out this omission. We have added the y-axis label to the panel for clarity.

      R1-13: Finally, some references are not included (only a label of [ref]).

      R1-A13: We have added the missing reference and thank the reviewer for pointing that out.

      Additional comments

      R1-14: Aside from the additional quantification and clarification of the analysis discussed in the weakness section, in general, the experiments included in the manuscript seem reasonable. However, I would suggest one additional experiment as well as one control, both of which are relatively straightforward optogenetic experiments, that I feel would be helpful to further improve the study. First, as the authors note, the optogenetic interventions used do not directly address the relevance of the changes in bursting patterns observed in the knockout (KO), which are by far the most robust effect, with the changes in alcohol sensitivity. One approach that could help address this would be to use patterned suppression via inhibitory opsins (e.g. halorhodopsin) to "rescue" the periods of inhibition associated with bursting in the KO.

      R1-A14: The reviewer proposes an interesting experiment, which we attempted to perform; however, it poses several technical challenges. First, the KO mice do not show burst firing, as they lack the Cav3.1 low-threshold calcium channel. Therefore, under ethanol, even if a rhythmic inhibition exists that would activate Cav3.1 channels and cause a rebound burst, the KO mice cannot generate it. An optogenetic inhibition would thus only accentuate the total inhibition and could potentially induce an overall decrease in MD firing, resulting in an increase in LOM features. Alternatively, we showed that in WT mice given a low ethanol dose (where LOM induction is harder), increased rhythmic inhibition does significantly increase LOM duration and marginally decreases latency to LOM (Fig. S12), indicating that increased inhibition could indeed support the hypothesis: “the stronger the decrease in MD firing, the faster and longer the LOM.” The only caveat of using WT mice here is that optogenetic inhibition might also include a rebound burst post-inhibition. Injecting bursts alone did not alter the response to ethanol (Fig. S10). These results point to the loss of firing in MD as a main factor for LOM, with a potential contribution of bursts that requires concurrent inhibition/loss of firing.

      We agree that inhibition in KO mice would further validate this hypothesis, controlling for the role of bursts. We regret that we are not in a position to perform additional experiments involving the KO mice.

      R1-15: For the control, tonic activation of the ventrobasal nucleus, as the authors did for the mediodorsal nucleus, would be beneficial to rule out the possibility that the observed effect would occur with any thalamic nucleus.

      R1-A15: We agree with the reviewer that we could have added an additional region control to the gain/loss of function experiments. We would go further and suggest that a better control nucleus would be a high order nucleus such as PO or an unrelated sensory relay nucleus such as LGN. VB, being a motor relay nucleus, could also mediate movement initiation, which would make the results hard to interpret. Since a complete control study of Cav3.1 KD in all thalamic nuclei is outside the scope of this study, we opted not to redo these experiments and to keep the focus of the manuscript on the manipulation of MD activity rather than on the various available thalamic nuclei. We also do not claim that MD is the sole center able to initiate a switch in the loss of consciousness, and a more in-depth study on that matter would clearly be needed.

      R1-16: In addition to these experiments, I did not note the strategy for sharing data obtained through this study so this should be added.

      R1-A16: We have uploaded data and code for most figures at the following repository and provided a clearer statement regarding data sharing. We thank the reviewer for pointing out this missing element.

      The link for the repository is the following:

      It contains:

      - Excel spreadsheet file of all behavior values, including the newly quantified Cav3.1 expression in MD/CL/SMT

      - Excel spreadsheet follow-up of all MD cells (single unit; tetrode) analyzed

      - Folders for all groups studied with representative figures showing EEG power over time and normalized activity (WT vs KO for 2, 3 and 4 g/kg; MDshKD vs shCTR, VBshKD vs shCTR; CHR2 NOSTIM vs STIM; ESTIM Groups and ARCH NOSTIM vs STIM)

      - A1G LORRvsLOM and OPEN FIELD Matlab data

      - Matlab and ImageJ Codes: single unit analysis, characterization, brain state characterization, sleep stages, LOM, open field analysis and statistical analysis.

      We have added the data sharing subsection in the acknowledgements:

      “Part of the analyzed data and codes are available on the open access platform, mendeley:

      Latchoumane, Charles-francois (2024), “Mediodorsal thalamic nucleus mediates resistance to ethanol through Cav3.1 T-type Ca2+ regulation of neural activity”, Mendeley Data, V1, doi: 10.17632/7fr427426m.1

      Additional data (large size recording and images) can be provided upon reasonable requests.”

      Reviewer #2 (Recommendations For The Authors):

      R2-1. Consciousness is a contentious subject. Even in humans, there is still intense research on the topic, not to mention animals, about which we still know very little. Moreover, consciousness is not quantified in this study, as there is no standard metric to do so. Accordingly, talking about 'modulation', 'transition', 'level', or 'reduction' of consciousness can be misleading. Hence, it is probably safer to strictly refer to brain-states and/or stages of the sleep-wake cycle in this study and reframe it entirely around these concepts.

      R2-A1. The reviewer raises an important point, and we appreciate it. We agree that the definition of consciousness is rather loose and arguably difficult to pinpoint. Here, we settle on a definition that relies on the loss of motion and loss of righting reflex. This definition is widely accepted as the “verified” state in which the absence of responsiveness (to continuous stimuli, inducing reflex or discomfort) is observed and is uninterrupted by jerks and spurious movements. Additional metrics would be EMG recording to quantify atonia and EEG recording to confirm the settling of a dominant slow-wave frequency (~4 Hz; ethanol-induced sedation at theta rhythm), as shown in Fig S1A. The driver of this 4 Hz frequency and its correlation has been investigated previously (e.g. Choi et al, PNAS, 2012), leading to the accepted link between LOM/LORR and loss of consciousness. Our data have the advantage of including single-neuron recordings, which show that LOM is a state in which the lowest firing activity is present (Fig S7AB), comparable to deep sleep state activity (Fig3D). The first LOM is the most important, as it reflects the deepest loss of consciousness before the ethanol starts to be metabolized and cleared, which would be consistent between animals.

      As a result, we have edited the manuscript to clarify all mentions related to brain states and states of unconsciousness.

      R2-2. It is not clear why the authors focus on the mediodorsal nucleus. This should be better explained in the introduction and developed in the discussion.

      R2-A2. This comment converges with those of Reviewer 1, and we have addressed this gap in the discussion as suggested. We addressed it in our response to that earlier comment and believe it is now clearer.

      R2-3. The discussion mentions that 'increased activity in MD might modulate the stability of cortical UP state and synchronization' (pg 21). This point should be either further developed and put into context, or removed. In its current state, it does not seem to contribute much to the discussion of results.

      R2-A3. We understand that the wording “UP state” might not be clear enough. We have modified this sentence as follows to clarify that an UP state could be a state in which the animal is awake, aroused or attentive:

      [Discussion section, MD is a modulator of consciousness, first paragraph]

      “The MD is known to innervate limbic region, basal ganglia and medial prefrontal cortex 50 and increased activity in MD might modulate the stability of cortical UP states (e.g. awaken, aroused and attentive states) and synchronization 9,26. Thus, MD might be a major hub involved in cortical state control and brain state stabilization.“

      R2-4. The discussion states that 'mutant mice did not exhibit a decreased arousal level (i.e. increased locomotor activity)' (pg 23). This is confusing as decreased arousal should be reflected in decreased locomotor activity.

      R2-A4. We understand that the formulation of this sentence may be confusing, and we have edited this portion of the text in the revised version of the manuscript. To clarify, mutant mice do not exhibit reduced or increased arousal (not quantified, just observational), but they do show phenotypic hyperlocomotion. This contrasts with a lower basal firing rate in the MD, which, in our interpretation, is not synonymous with lower arousal. We believe that the relative change in MD firing determines the change in arousal, and that the absolute firing rate is not indicative of arousal in itself, only in comparison.

      [Discussion section, The lower variability in MD Firing reflects Ethanol Resistance in Cav3.1 mutant mice, paragraph 2]

      “Mutant RS neurons in MD showed an overall lower excitability and variability of firing in various natural conscious and unconscious states compared to wild type mice. Remarkably, Cav3.1 mutant mice exhibited a clear increased locomotor activity and an increased resistance to ethanol. The general lower firing rate and the high “arousal” observed in mutant mice suggests that the relative change from state to state in tonic firing in MD, and not the absolute value of firing, might be a better correlate of change in brain state in the mice.”

      R2-5. The methods (pg 27) state that two genetic backgrounds (129/svjae and C57BL/6J ) were used in the study. Authors should show whether there were significant differences between those backgrounds in the key parameters assessed in the study (particularly resistance to ethanol sedation).

      R2-A5. As mentioned in the Methods section, we only used F1-background mice, which are the first-generation offspring produced by crossing the 129/svjae and C57BL/6J strains. To produce F1 KO mice, we kept the heterozygote mice in two strains. We unfortunately did not study the particular differences between the respective KOs of these two backgrounds; however, the pure C57BL/6J KO has been used in other studies by our group (Kim et al 2001; Na et al, 2008; Park et al., 2010). The F1 background allows us to work with mice that are less aggressive and can be handled with less inherent stress.

      R2-6. It would be convenient to produce a supplementary figure associated with Figure 1C to show the same data with averages per mouse. That is, 9 points for control and 9 points for KO mice. This also applies to all cases where data is not presented per mouse but pooled between animals.

      R2-A6. We have added a panel C in Figure S1 to show the scatter values for all the mice corresponding to Figure 1C. We have also generalized this presentation to all behavior graphics, showing all the animals in a scatter plot next to the boxplot. We believe that this presentation further increases the transparency of the manuscript. Accordingly, we have added scatter plots for all mice in Fig. 1, Fig. 2, Fig. 5, Fig. S2, Fig. S3, Fig. S10 and Fig. S12.

      R2-7. It would be informative to make a supplementary figure associated with Figure 1D to compare baseline raw activity levels (i.e., baseline walking recording) between control and KO mice. That is, do KO and control mice cover comparable distances and at similar speeds during baseline conditions? Figure 1D and Figure 4A suggest that the variability of locomotor activity is larger in KO mice. Hence, this parameter should be quantified and reported.

      R2-A7. We thank the reviewer for this comment. We have sought to answer this question in the manuscript in two ways:

      - We first measured the overall hyperlocomotion of the mice using the total distance travelled in the open field in our mice cohorts (FigS4C). We observed that the KO mutant showed hyperlocomotion, but MD or VB knock-down mice did not, which indicates that the hyperlocomotion component is not specific to the two thalamic nuclei studied.

      - Using the forced walking task, we require the animal to keep a steady pace of roughly 6 cm/s. This assay allows us to normalize the general walking behavior to a relatively fixed pace, making it comparable across all animals.

      The reviewer suggested reporting the mean and variance in walking of WT and KO during baseline (prior to the ethanol I.P. injection). We believe that the two points mentioned above are sufficient to describe the WT vs KO locomotion differences more quantitatively. Moreover, by construction, the normalized locomotion on the forced walking task will return similar baseline means; the standard deviation could potentially show differences, but these would remain inconclusive.

      R2-8. The legend in Figure 1 states that 'the loss of consciousness is evaluated using normalized moving index using either video analysis (differential pixel motion), on- head accelerometer-based motion, or neck electromyograms'. Authors should clarify whether these methods are equivalent and support it with data.

      R2-A8. We understand the reviewer's point and have made a few modifications to the method description so that it aligns better with what was done. For most mice, video analysis was used to obtain the moving index. When video recording was not available (2 mice), we used an accelerometer attached to the animal’s head stage, from which we derived a moving index similar to the video moving index. The neck electromyogram was instead used for animals implanted with tetrodes, to identify sleep stages based on local field potential frequency and muscle tone. We have clarified the Methods and Figure 1 accordingly to avoid this confusion. Since video and accelerometer were not recorded concurrently, we do not have the data to compute the correlation between the two measures; however, no noticeable deviation in loss of motion was observed between the two methods. We realize that this may be a weak argument, but our observations showed that video and accelerometers returned very similar timings for loss of motion (only a few comparative instances, insufficient for a statistical comparison).

      R2-9. How were spike bursts defined? The authors should try different criteria and verify the consistency of results.

      R2-A9: For in vivo single unit recording, we opted for a definition that has been validated by our work and that of others: a silence of at least 100 ms followed by a minimum of 3 spikes with:

      - First spike pairs interspike interval less than 4 ms

      - Remaining spike pairs interspike interval less than 20 ms

      We have also performed this analysis using a minimum of 2 spikes and silencing periods varied between 50 and 100 ms, without observing a significant deviation in the results. As shown in Figure S6B, with this approach the majority of bursts had <10 spikes per burst. Figure S6C shows a clear distribution of the first-spike ISI within 2-4 ms, as observed in previous works (Desai and Varela, 2021; Alitto et al, 2019); importantly, the distribution is not clearly capped at 4 ms, showing that the range for the first ISI might indeed be lower than 4 ms for thalamic bursts. Within-burst spike waveforms can become very variable, and the choice of a minimum of 3 rather than 2 spikes per burst stems from the aim of reducing false positive detection of ultra-short bursts, which remains controversial in single unit recording (Gray et al. 1995).
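
      As an illustration only (this is not the authors' Matlab code), the burst criterion described in this response can be sketched in Python as follows. The function name, its parameters, and the convention of measuring the first spike's preceding silence from a hypothetical recording start time are all assumptions made for this sketch; spike times are assumed to be sorted and given in milliseconds.

```python
def detect_bursts(spike_times_ms, pre_silence_ms=100.0, first_isi_ms=4.0,
                  later_isi_ms=20.0, min_spikes=3, recording_start_ms=0.0):
    """Return a list of bursts, each a list of spike times in ms.

    Criterion (as described in the response): a silence of at least
    `pre_silence_ms`, then at least `min_spikes` spikes whose first
    interspike interval is < `first_isi_ms` and whose remaining
    intervals are < `later_isi_ms`.
    Assumption: spike_times_ms is sorted in ascending order.
    """
    bursts = []
    n = len(spike_times_ms)
    i = 0
    while i < n:
        # Silence preceding the candidate first spike (assumption: for the
        # very first spike, silence is measured from recording_start_ms).
        previous = spike_times_ms[i - 1] if i > 0 else recording_start_ms
        silent_before = spike_times_ms[i] - previous >= pre_silence_ms
        short_first_isi = (
            i + 1 < n
            and spike_times_ms[i + 1] - spike_times_ms[i] < first_isi_ms
        )
        if silent_before and short_first_isi:
            j = i + 1
            # Extend the burst while the following intervals stay short.
            while j + 1 < n and spike_times_ms[j + 1] - spike_times_ms[j] < later_isi_ms:
                j += 1
            if j - i + 1 >= min_spikes:
                bursts.append(list(spike_times_ms[i:j + 1]))
            i = j + 1
        else:
            i += 1
    return bursts


# Hypothetical spike train (ms), invented for illustration.
example_spikes = [10, 150, 153, 160, 175, 400, 403, 600, 602, 610]
strict = detect_bursts(example_spikes)                 # minimum of 3 spikes
relaxed = detect_bursts(example_spikes, min_spikes=2)  # minimum of 2 spikes
```

      Varying `min_spikes` and `pre_silence_ms` in such a sketch mirrors the kind of sensitivity check described above (2 versus 3 spikes, 50 to 100 ms of silence).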

      Minor:

      R2-10: Figure 4A2 'Cav3.1(+/+)' should presumably be Cav3.1(-/-).

      R2-A10: The reviewer is correct, and we have corrected the figure label.

      R2-11: Figure S2C legend states 'Post-hoc group comparison was performed using.' The sentence seems to be incomplete.

      R2-A11: We have completed the sentence for clarity.

      R2-12: In the methods (pg 29) virus concentration is reported as '107 TU/ul', which probably refers to 10^7.

      R2-A12: We have corrected it by superscripting the power 7.

      R2-13: Verify Fig 1C1 and correct Y-axis overlap between title and units.

      R2-A13: We edited the figure for clarity, thank you.

      R2-14: On page 24 there is a '[ref]' that probably stands for (a missing) reference.

      R2-A14: The missing reference has been added.

    1. eLife Assessment

      This important study investigates how AD(H)D affects attention using neural and physiological measures in a Virtual Reality (VR) environment. Solid evidence is provided that individuals diagnosed with AD(H)D differ from control participants in both the encoding of the target sound and the encoding of acoustic interference. The VR paradigm here can potentially bridge lab experiments and real-life experiments. However, the reviewers identified a few potential technical issues that will need to be verified and discussed.

    2. Reviewer #1 (Public review):

      Summary:

      This is an interesting study on AD(H)D. The authors combine a variety of neural and physiological metrics to study attention in a VR classroom setting. The manuscript is well written and the results are interesting, ranging from an effect of group (AD(H)D vs. control) on metrics such as envelope tracking, to multivariate regression analyses considering alpha-power, gaze, TRF, ERPs, and behaviour simultaneously. I find the first part of the results clear and strong. The multivariate analyses in Tables 1 and 2 are good ideas, but I think they would benefit from additional clarification. Overall, I think that the methodological approach is useful in itself. The rest is interesting in that it informs us on which metrics are sensitive to group effects and correlated with each other. I think this might be one interesting way forward. Indeed, much more work is needed to clarify how these results change with different stimuli and tasks. So, I see this as an interesting first step into a more naturalistic measurement of speech attention.

      Strengths:

      I praise the authors for this interesting attempt to tackle a challenging topic with naturalistic experiments and metrics. I think the results broadly make sense and they contribute to a complex literature that is far from being linear and cohesive.

      Weaknesses:

      Nonetheless, I have a few comments that I hope will help the authors improve the manuscript. Some aspects should be clearer, some methodological steps were unclear (missing details on filters), and others were carried out in a way that doesn't convince me and might be problematic (e.g., re-filtering). I also suggested areas where the authors might find some improvements, such as deriving distinct markers for the overall envelope reconstruction and its change over time, which could solve some of the issues reported in the discussion (e.g., the lack of correlation with TRF metrics).

      I also have some concerns regarding reproducibility. Many details are imprecise or missing. And I did not find any comments on data and code sharing. A clarification would be appreciated on that point for sure.

      There are some minor issues, typically caused by some imprecisions in the write-up. There are a few issues that could change things though (e.g., re-filtering; the worrying regularisation optimisation choices), and there I'll have to see the authors' reply to determine whether those are major issues or not. Figures should also be improved (e.g., Figure 4B is missing the ticks).

    3. Reviewer #2 (Public review):

      Summary:

      While selective attention is a crucial ability of human beings, previous studies on selective attention have primarily been conducted in strictly controlled contexts, leaving a notable gap in understanding the complexity and dynamic nature of selective attention in a naturalistic context. This issue is particularly important for classroom learning in individuals with ADHD, as selecting the target and ignoring the distractions are pretty difficult for them but are the prerequisites of effective learning. The authors of this study have addressed this challenge using a well-motivated study. I believe the findings of this study will be a nice addition to the fields of both cognitive neuroscience and educational neuroscience.

      Strengths:

      To achieve the purpose of setting up a naturalistic context, the authors have based their study on a novel Virtual Reality platform. This is clever as it is usually difficult to perform such a study in a real classroom. Moreover, various techniques such as brain imaging, eye-tracking, and physiological measurement are combined to collect multi-level data. They found that, different from the controls, individuals with ADHD had higher neural responses to the irrelevant rather than the target sounds, and reduced speech tracking of the teacher. Additionally, the power of alpha-oscillations and frequency of gaze shifts away from the teacher are found to be associated with ADHD symptoms. These results provide new insights into the mechanism of selective attention among ADHD populations.

      Weaknesses:

      It is worth noting that nowadays there have been some studies trying to do so in the real classroom, and thus the authors should acknowledge the difference between the virtual and real classroom context and foresee the potential future changes.

      The approach of combining multi-level data has the advantage of obtaining reliable results, but also raises significant difficulty for the readers to understand the main results.

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions.

      As expected, individuals with ADHD showed anomalous patterns of neural responses, and eye-tracking patterns, compared to the controls. But there are also some similarities between groups such as the amount of time paying attention to teachers, etc. In general, their conclusions are supported.

      A discussion of the likely impact of the work on the field, and the utility of the methods and data to the community, would highlight the contributions of the work.

      The findings are an extension of previous efforts in understanding selective attention in the naturalistic context. The findings of this study are particularly helpful in inspiring teacher's practice and advancing the research of educational neuroscience. This study demonstrates, again, that it is important to understand the complexity of cognitive processes in the naturalistic context.

    4. Reviewer #3 (Public review):

      Summary:

      The authors conducted a well-designed experiment, incorporating VR classroom scenes and background sound events, with both control and ADHD participants. They employed multiple neurophysiological measures, such as EEG, eye movements, and skin conductance, to investigate the mechanistic underpinnings of paying attention in class and the disruptive effects of background noise.

      The results revealed that individuals with ADHD exhibited heightened sensory responses to irrelevant sounds and reduced tracking of the teacher's speech. Overall, this manuscript presented an ecologically valid paradigm for assessing neurophysiological responses in both control and ADHD groups. The analyses were comprehensive and clear, making the study potentially valuable for the application of detecting attentional deficits.

      Strengths:

      • The VR learning paradigm is well-designed and ecologically valid.

      • The neurophysiological metrics and analyses are comprehensive, and two physiological markers are identified capable of diagnosing ADHD.

      • This research provides a valuable dataset that could serve as a benchmark for future studies on attention deficits.

      Weaknesses:

      • Several results are null results, i.e., no significant differences were found between ADHD and control populations.

      • Although the paradigm is well-designed and ecologically valid, the specific contributions or insights from the results remain unclear.

      • Lack of information regarding code and data availability.

    5. Author response:

      We are glad that the reviewers found our work to be interesting and appreciate its contribution to enhancing ecological validity of attention research. We also agree that much more work is needed to solidify this approach, and that some of the results should be considered “exploratory” at this point, but appreciate the recognition of the novelty and scientific potential of the approach introduced here.

      We will address the reviewers’ specific comments in a revised version of the paper, and highlight the main points here:

      · We agree that the use of multiple different neurophysiological measures is both an advantage and a disadvantage, and that the abundance of results can make it difficult to tell a “simple” story. In our revision, we will make an effort to clarify what (in our opinion) are the most important results and provide readers with a more cohesive narrative.

      · Important additional discussion points raised by the reviewers, which will be discussed in a revised version are a) the similarities and differences between virtual and real classrooms; b) the utility of the methods and data to the community and c) the implication of these results for educational neuroscience and ADHD research.

      · In the revision, we will also clarify several methodological aspects of the data analysis, as per the reviewers’ requests.

      · After final publication, the data will be made available for other researchers to use.

    1. eLife Assessment

      This study describes a new analysis strategy to compare active neurons during behavioral tasks across the brain. This is significant because analysing how different brain areas are active during behavior requires better methods. The evidence provided in support of the method is solid. Although useful now, the work may increase its significance following appropriate revisions.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Jin et al. describe SMARTR, an image analysis strategy optimized for analysis of dual-activity ensemble tagging mouse reporter lines. The pipeline performs cell segmentation, then registers the location of these cells into an anatomical atlas, and finally, calculates the degree of co-expression of the reporters in cells across brain regions. They demonstrate the utility of the method by labeling two ensemble populations during two related experiences: inescapable shock and subsequent escapable shock as part of learned helplessness.

      Strengths:

      (1) We appreciated that the authors provided all documentation necessary to use their method and that the scripts in their publicly available repository are well commented.

      (2) The manuscript was well-written and very clear, and the methods were generally highly detailed.

      Weaknesses:

      (1) The heatmaps (for example, Figure 3A, B) are challenging to read and interpret due to their size. Is there a way to alter the visualization to improve interpretability? Perhaps coloring the heatmap by general anatomical region could help? We feel that these heatmaps are critical to the utility of the registration strategy, and hence, clear visualization is necessary.

      (2) Additional context in the Introduction on the use of immediate early genes to label ensembles of neurons that are specifically activated during the various behavioral manipulations would enable the manuscript and methodology to be better appreciated by a broad audience.

      (3) The authors mention that their segmentation strategies are optimized for the particular staining pattern exhibited by each reporter and demonstrate that the manually annotated cell counts match the automated analysis. They mention that alternative strategies are compatible, but don't show this data.

      (4) The authors provided highly detailed information for their segmentation strategy, but the same level of detail was not provided for the registration algorithms. Additional details would help users achieve optimal alignment.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript describes a workflow and software package, SMARTR, for mapping and analyzing neuronal ensembles tagged using activity-dependent methods. They showcase this pipeline by analyzing ensembles tagged during the learned helplessness paradigm. This is an impressive effort, and I commend the authors for developing open-source software to make whole-brain analyses more feasible for the community. Software development is essential for modern neuroscience and I hope more groups make the effort to develop open-source, easily usable packages. However, I do have concerns over the usability and maintainability of the SMARTR package. I hope that the authors will continue to develop this package, and encourage them to make the effort to publish it within either the Bioconductor or CRAN framework.

      Strengths:

      This is a novel software package aiming to make the analysis of brain-wide engrams more feasible, which is much needed. The documentation for the package and workflow is solid.

      Weaknesses:

      While I was able to install the SMARTR package, after trying for the better part of one hour, I could not install the "mjin1812/wholebrain" R package as instructed in OSF. I also could not find a function to load an example dataset to easily test SMARTR. So, unfortunately, I was unable to test out any of the packages for myself. Along with the currently broken "tractatus/wholebrain" package, this is a good example of why I would strongly encourage the authors to publish SMARTR on either Bioconductor or CRAN in the future. The high standards set by Bioc/CRAN will ensure that SMARTR is able to be easily installed and used across major operating systems for the long term.

      The package is quite large (several thousand lines, including comments and whitespace). While impressive, this does inherently make the package more difficult to maintain - and the authors currently have not included any unit tests. The authors should add unit tests to cover a large percentage of the package to ensure code stability.

      Why do the authors choose to perform image segmentation outside of the SMARTR package using ImageJ macros? Leading segmentation algorithms such as CellPose and StarMap have well-documented APIs that would be easy to wrap in R. They would likely be faster as well. As noted in the discussion, making SMARTR a one-stop shop for multi-ensemble analyses would be more appealing to a user.

      Given the small number of observations for correlation analyses (n=6 per group), Pearson correlations would be highly susceptible to outliers. The authors chose to deal with potential outliers by dropping any subject per region that was >2 SDs from the group mean. Another way to get at this would be using Spearman correlation. How do these analyses change if you use Spearman correlation instead of Pearson? It would be a valuable addition for the authors to include Spearman correlations as an option in SMARTR.
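
      To make the concern concrete, the following is a minimal, self-contained Python sketch (not SMARTR code, which is written in R) comparing Pearson and Spearman correlation on a small sample with one extreme value. The helper names and the example numbers are hypothetical and invented for illustration; Spearman correlation is computed here as the Pearson correlation of average ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical counts for two regions in n=6 subjects; the last subject
# is an extreme value in region B.
region_a = [1, 2, 3, 4, 5, 6]
region_b = [1, 2, 3, 4, 5, 100]

r_pearson = pearson(region_a, region_b)    # pulled down by the extreme value
r_spearman = spearman(region_a, region_b)  # depends only on rank order
```

      In this invented example the relationship is perfectly monotonic, so the rank-based coefficient is unaffected by the extreme value while the Pearson coefficient is not, which is the behaviour the reviewer is pointing to.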

      I see the authors have incorporated the ability to adjust p-values in many of the analysis functions (and recommend the BH procedure) but did not use adjusted p-values for any of the analyses in the manuscript. Why is this? This is particularly relevant for the differential correlation analyses between groups (Figures 3P and 4P). Based on the un-adjusted p-values, I assume few if any data points will still be significant after adjusting. While it's logical to highlight the regional correlations that strongly change between groups, the authors should caution which correlations are "significant" without adjusting for multiple comparisons. As this package now makes this analysis easily usable for all researchers, the authors should also provide better explanations for when and why to use adjusted p-values in the online documentation for new users.

      The package was developed in R3.6.3. This is several years and one major version behind the current R version (4.4.3). Have the authors tested if this package runs on modern R versions? If not, this could be a significant hurdle for potential users.

    4. Author response:

      Reviewer #1 (Public review):

      Weaknesses:

      (1) The heatmaps (for example, Figure 3A, B) are challenging to read and interpret due to their size. Is there a way to alter the visualization to improve interpretability? Perhaps coloring the heatmap by general anatomical region could help? We feel that these heatmaps are critical to the utility of the registration strategy, and hence, clear visualization is necessary.

      We thank the reviewers for this point on aesthetic improvement, and we agree that clearer visualization of our correlation heatmaps is important. To address this point, we have incorporated the capability of grouping “child” subregions in anatomical order by their more general “parent” region into the package function, plot_correlation_heatmaps(). Parent regions will be visually represented as smaller sub-facets in the heatmaps, and we will be submitting our full revised manuscript with these visual changes.

      (2) Additional context in the Introduction on the use of immediate early genes to label ensembles of neurons that are specifically activated during the various behavioral manipulations would enable the manuscript and methodology to be better appreciated by a broad audience.

      We thank the reviewers for this suggestion and will be revising parts of our Introduction to reflect the broader use and appeal of immediate early genes (IEGs) for studying neural changes underlying behavior.

      (3) The authors mention that their segmentation strategies are optimized for the particular staining pattern exhibited by each reporter and demonstrate that the manually annotated cell counts match the automated analysis. They mention that alternative strategies are compatible, but don't show this data.

      We thank the reviewers for this comment. We also appreciate that integration with alternative strategies is a major point of interest to readers, given that others may be interested in compatibility with our analysis and software package, rather than completely revising their own pre-existing workflows.

      This specific point on segmentation refers to the import_segmentation_custom() function in the package. As there is currently no standard cell segmentation export format adopted by the field, this function still requires some data wrangling into an import format saved as a .txt file. However, we chose not to visually demonstrate this capability in the paper for a few reasons.

      i. A figure showing the broad testing of many different segmentation algorithms, (e.g., Cellpose, Vaa3d, Trainable Weka Segmentation) would better demonstrate the efficacy of segmentation of these alternative approaches, which have already been well-documented. However, demonstrating importation compatibility is more of a demonstration of API interface, which is better shown in website documentation and tutorial notebooks.

      ii. Additionally, showing importation with one well-established segmentation approach is still a demonstration of a single use case. There would be a major burden-of-proof in establishing importation compatibility with all potential alternative platforms, their specific export formats, which may be slightly different depending on post-processing choices, and the needs of the experimenters (e.g., exporting one vs many channels, having different naming conventions, having different export formats). For example, output from Cellpose can take the form of a NumPy file (_seg.npy file), a .png, or Native ImageJ ROI archive output, and users can have chosen up to four channels. Until the field adopts a standardized file format, one flexible enough to account for all the variables of experimental interest, we currently believe it is more efficient to advise external groups on how to transform their specific data to be compatible with our generic import function.

      Internally, in collaborative efforts, we have validated the ability to import datasets generated from completely different workflows for segmentation and registration. We intend on releasing this documentation in coming updates on our package website, which we believe will be more demonstrative on how to take advantage of our analysis package, without adopting our entire workflow.

      (4) The authors provided highly detailed information for their segmentation strategy, but the same level of detail was not provided for the registration algorithms. Additional details would help users achieve optimal alignment.

      We apologize for this lack of detail. The registration strategy depends upon the WholeBrain package for registration to the Allen Mouse Common Coordinate Framework. While this strategy has been published and documented elsewhere, we will be revising our methods to better incorporate details of this approach.

      Reviewer #2 (Public review):

      Weaknesses:

      (1) While I was able to install the SMARTR package, after trying for the better part of one hour, I could not install the "mjin1812/wholebrain" R package as instructed in OSF. I also could not find a function to load an example dataset to easily test SMARTR. So, unfortunately, I was unable to test out any of the packages for myself. Along with the currently broken "tractatus/wholebrain" package, this is a good example of why I would strongly encourage the authors to publish SMARTR on either Bioconductor or CRAN in the future. The high standards set by Bioc/CRAN will ensure that SMARTR is able to be easily installed and used across major operating systems for the long term.

      We thank reviewers for pointing out this weakness; long-term maintenance of this package is certainly a mutual goal. Loading an .RDATA file is accomplished by either double-clicking directly on the file in a directory window, or by using the load() function, (e.g., load("directory/example.RData")). We will explicitly outline these directions in the online documentation and in our full revision.

      Moreover, we will submit our package to CRAN. Currently, SMARTR is not dependent on the WholeBrain package, which remains optional for the registration portion of our workflow. Ultimately, this independence will allow us to maintain the analysis and visualization portion of the package independently, and allow for submission to a more centralized software repository such as CRAN.

      (2) The package is quite large (several thousand lines including comments and whitespace). While impressive, this does inherently make the package more difficult to maintain - and the authors currently have not included any unit tests. The authors should add unit tests to cover a large percentage of the package to ensure code stability.

      We appreciate this feedback and will add unit testing to improve the reliability of our package in the full revision.

      (3) Why do the authors choose to perform image segmentation outside of the SMARTR package using ImageJ macros? Leading segmentation algorithms such as CellPose and StarMap have well-documented APIs that would be easy to wrap in R. They would likely be faster as well. As noted in the discussion, making SMARTR a one-stop shop for multi-ensemble analyses would be more appealing to a user.

      We appreciate this feedback. We believe parts of our response to Reviewer 1, comment 3, are relevant to this point. Interfaces for CellPose and ClusterMap (which processes in situ transcriptomic approaches like STARmap) are both in Python, and there are currently ways to call Python from within R (https://rstudio.github.io/reticulate/index.html). We will certainly explore incorporating these APIs from R. However, we anticipate that this capability would amount to “translation” between programming languages and would not spare users from needing some familiarity with the capabilities of these Python packages, and thus with Python syntax.

      (4) Given the small number of observations for correlation analyses (n=6 per group), Pearson correlations would be highly susceptible to outliers. The authors chose to deal with potential outliers by dropping any subject per region that was >2 SDs from the group mean. Another way to get at this would be using Spearman correlation. How do these analyses change if you use Spearman correlation instead of Pearson? It would be a valuable addition for the authors to include Spearman correlations as an option in SMARTR.

      We thank reviewers for this suggestion and will provide a supplementary analysis of our results using Spearman correlations.
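
      To illustrate the reviewer's concern about outliers at n=6, the following is a minimal sketch in plain Python, not SMARTR code. The two regions and their counts are hypothetical values chosen so that a single extreme subject dominates the Pearson coefficient, while the rank-based Spearman coefficient reflects the trend among the remaining subjects.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical normalized counts for six subjects in two regions.
# The first five subjects trend negatively; the sixth is an outlier in both.
region_a = [10, 11, 12, 13, 14, 40]
region_b = [9, 8, 7, 6, 5, 30]

print(f"Pearson:  {pearson(region_a, region_b):.2f}")
print(f"Spearman: {spearman(region_a, region_b):.2f}")
```

      With these made-up values the Pearson coefficient is strongly positive while the Spearman coefficient is weakly negative, which is the kind of divergence a supplementary Spearman analysis would expose.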

      (5) I see the authors have incorporated the ability to adjust p-values in many of the analysis functions (and recommend the BH procedure) but did not use adjusted p-values for any of the analyses in the manuscript. Why is this? This is particularly relevant for the differential correlation analyses between groups (Figures 3P and 4P). Based on the un-adjusted p-values, I assume few if any data points will still be significant after adjusting. While it's logical to highlight the regional correlations that strongly change between groups, the authors should caution which correlations are "significant" without adjusting for multiple comparisons. As this package now makes this analysis easily usable for all researchers, the authors should also provide better explanations for when and why to use adjusted p-values in the online documentation for new users.

      We appreciate the feedback and will state more explicitly in our paper that our dataset is presented as a demonstrative and exploratory resource for readers and that, as such, we accept a high tolerance for false positives in order to decrease the risk of missing potentially interesting findings. As noted by Reviewer #2, it is still “logical to highlight the regional correlations that strongly change between groups.” We will further clarify in our methods that we chose to present uncorrected p-values when speaking of significance. We will also include more statistical detail in our online documentation regarding FDR correction. Ultimately, the decision to correct for multiple comparisons, and the choice of FDR threshold, should still be informed by standard statistical theory and by the user-defined tolerance for false positives and false negatives. This will be influenced by factors such as the nature and purpose of the study and the quality of the dataset.
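
      For readers unfamiliar with the BH procedure mentioned by the reviewer, the following is a minimal sketch of Benjamini-Hochberg adjustment in plain Python. It is not the SMARTR implementation, and the p-values are hypothetical; the function name `bh_adjust` is our own.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value to the smallest so that a running
    # minimum enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical unadjusted p-values from ten regional comparisons.
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
adj = bh_adjust(raw)

alpha = 0.05
print("significant before adjustment:", sum(p < alpha for p in raw))
print("significant after adjustment: ", sum(p < alpha for p in adj))
```

      Comparing the two counts for a given dataset shows how many nominally significant comparisons survive at the chosen FDR threshold.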

      (6) The package was developed in R3.6.3. This is several years and one major version behind the current R version (4.4.3). Have the authors tested if this package runs on modern R versions? If not, this could be a significant hurdle for potential users.

      We thank reviewers for pointing out concerns regarding versioning. Analysis and visualization capabilities are currently supported using R version 4.1+. The recommendation for R 3.6.3 is primarily for users interested in using the full workflow, which requires installation of the WholeBrain package. We anticipate supporting visualization and network analysis capabilities with updated packages and R versions, and maintaining a legacy version for the full workflow presented in this paper.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors present a new application of the high-content image-based morphological profiling Cell Painting (CP) to single cell type classification in mixed heterogeneous induced pluripotent stem cell-derived mixed neural cultures. Machine learning models were trained to classify single cell types according to either "engineered" features derived from the image or from the raw CP multiplexed image. The authors systematically evaluated experimental (e.g., cell density, cell types, fluorescent channels) and computational (e.g., different models, different cell regions) parameters and convincingly demonstrated that focusing on the nucleus and its surroundings contains sufficient information for robust and accurate cell type classification. Models that were trained on mono-cultures (i.e., containing a single cell type) could generalize for cell type prediction in mixed co-cultures, and describe intermediate states of the maturation process of iPSC-derived neural progenitors to differentiation neurons.

      Strengths:

      Automatically identifying single-cell types in heterogeneous mixed-cell populations holds great promise to characterize mixed-cell populations and to discover new rules of spatial organization and cell-cell communication. Although the current manuscript focuses on the application of quality control of iPSC cultures, the same approach can be extended to a wealth of other applications including an in-depth study of the spatial context. The simple and high-content assay democratizes use and enables adoption by other labs.

      The manuscript is supported by comprehensive experimental and computational validations that raise the bar beyond the current state of the art in the field of high-content phenotyping and make this manuscript especially compelling. These include (i) Explicitly assessing replication biases (batch effects); (ii) Direct comparison of feature-based (a la cell profiling) versus deep-learning-based classification (which is not trivial/obvious for the application of cell profiling); (iii) Systematic assessment of the contribution of each fluorescent channel; (iv) Evaluation of cell-density dependency; (v) Explicit examination of mistakes in classification; (vi) Evaluating the performance of different spatial contexts around the cell/nucleus; (vii) Generalization of models trained on cultures containing a single cell type (mono-cultures) to mixed co-cultures; (viii) Application to multiple classification tasks.

      I especially liked the generalization of classification from mono- to co-cultures (Figure 4C), and quantitatively following the gradual transition from NPC to Neurons (Figure 5H).

      The manuscript is well-written and easy to follow.

      Thank you for the positive appreciation of our work and constructive comments. 

      Weaknesses:

      I am not certain how useful/important the specific application demonstrated in this study is (quality control of iPSC cultures), this could be better explained in the manuscript. 

      To clarify the importance we have added an additional explanation to the introduction (page 3) and also come back to it in the discussion (page 17).

      Text from the introduction:

      “However, genetic drift, clonal and patient heterogeneity cause variability in reprogramming and differentiation efficiency10,11. The differentiation outcome is further strongly influenced by variations in protocol12. This can significantly impact experimental outcomes, leading to inconsistent and potentially misleading results and consequently, it hinders the use of iPSC-derived cell systems in systematic drug screening or cell therapy pipelines. This is particularly true for iPSC-derived neural cultures, as their composition, purity and maturity directly affect gene expression and functional activity, which is essential for modelling neurological conditions13,14. Thus, from a preclinical perspective, there is the need for a fast and cost-effective QC approach to increase experimental reproducibility and cell type specificity15. From a clinical perspective in turn, robust QC is required for safety and regulatory compliance (e.g., for cell therapeutic solutions). This need for improved standardization and QC is underscored by large-scale collaborative efforts such as the International Stem Cell Banking Initiative16, which focusses on clinical quality attributes and provides recommendations for iPSC validation testing for use as cellular therapeutics, or the CorEuStem network, aiming to harmonize iPSC practices across core facilities in Europe.”

      Text from the discussion: 

      “Many groups highlight the difficulty of reproducible neural differentiation and attribute this to culture conditions, cultivation time and variation in developmental signalling pathways in the source iPSC material43,44. Spontaneous neural differentiation has previously been shown to require approximately 80 days before mature neurons arise that can fire action potentials and show neural circuit formation. Although these differentiation processes display a stereotypical temporal sequence34, the exact timing and duration might vary. This variation negatively affects the statistical power when testing drug interventions and thus prohibits the application of iPSC-culture derivatives in routine drug screening. Current solutions (e.g., immunocytochemistry, flow cytometry, …) are often cost-ineffective, tedious, and incompatible with longitudinal/multimodal interrogation. CP is a much more cost-effective solution and ideally suited for this purpose. Routine CP-based QC could add confidence to and save costs for the drug discovery pipeline. We have shown that CP can be leveraged to capture the morphological changes associated with neural differentiation.”

      Another issue that I feel should be discussed more explicitly is how far can this application go - how sensitively can the combination of cell painting and machine learning discriminate between cell types that are more subtly morphologically different from one another?

      Thank you for this interesting question. The fact that an approach based on a subregion not encompassing the whole cell (the “nucleocentric” approach) can predict cell types equally well suggests that the cell shape as such is not the defining factor for accurate cell type profiling. And, while neural progenitors, neurons and glia clearly have vastly different cell shapes, we have shown that cells with closer phenotypes, such as 1321N1 vs. SH-SY5Y or astrocytes vs. microglia, can be distinguished with equal performance. However, triggered by the reviewers’ question, we have now tested additional conditions with more subtle phenotypes, including the classification of 1321N1 vs. two related retinal pigment epithelial cells with much more similar morphology (ARPE and RPE1 cells). We found that the CNN could discriminate these cells equally well and have added the results on page 8 and in Fig. 3D. To address this question from a different angle, we have also performed an experiment in which we changed cell states to assess whether discriminatory power remains high. Concretely, we exposed co-cultures of neurons and microglia to LPS to trigger microglial activation (more subtly visible as cytoskeletal changes and vacuole formation). This revealed that our approach still discriminates both cell types (neurons vs. microglia) with high accuracy, regardless of the microglial state. Furthermore, using a two-step approach, we could also distinguish LPS-treated (assumed to be activated) from unchallenged microglia (assumed to be more homeostatic), albeit with a lower accuracy. This experiment has been added as an extra results section (Cell type identification can be applied to mixed iPSC-derived neuronal cultures regardless of activation state, p12) and Fig. 7c. Finally, we have also added our take on what the possibilities could be for future applications in even more complex contexts such as tissue slice, 3D and live cell applications (page 17-18).

      Regarding evaluations, the use of accuracy, which is a measure that can be biased by class imbalance, is not the most appropriate measurement in my opinion. The confusion matrices are a great help, but I would recommend using a measurement that is less sensitive for class imbalance for cell-type classification performance evaluations.  

      Across all CNNs trained in this manuscript, the sample size of the input classes has always been equalized, ruling out any effects of class imbalance. Nevertheless, to follow the reviewers’ recommendation, we have now used the F-score to document performance as it is insensitive to such imbalance. For clarity, we have now also mentioned the input number (ROIs/class) in every figure.
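
      The difference between accuracy and the F-score under class imbalance can be sketched with a small worked example. The labels below reuse the two cell line names from the manuscript, but the counts and the degenerate classifier are hypothetical and are not results from this study.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy plus precision, recall and F1 for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 95 cells of one type, 5 of the other.
y_true = ["1321N1"] * 95 + ["SH-SY5Y"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["1321N1"] * 100

print(binary_metrics(y_true, y_pred, positive="SH-SY5Y"))
```

      Here accuracy is 0.95 although no minority-class cell is recovered, whereas the F1 score for the minority class is 0. With equalized class sizes, as used for the CNNs in this manuscript, the two measures diverge far less.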

      Another issue is that the performance evaluation is calculated on a subset of the full cell population - after exclusion/filtering. Could there be a bias toward specific cell types in the exclusion criteria? How would it affect our ability to measure the cell type composition of the population?

      As explained in the M&M section, filtering was performed based on three criteria:

      (1) Nuclear size: objects with values below a threshold of 160 are considered to represent debris;

      (2) DAPI intensity: values below a threshold of 500 represent segmentation errors;

      (3) IF staining intensity: gates were set onto the intensity of the fluorescent markers used with post-hoc IF to only retain cells that are unequivocally positive for either marker and to avoid inclusion of double positive (or negative) cells in the ground truth training. 

      One could argue that the last criterion introduces a certain bias in that it does not consider part of the cell population. However, this is also not the purpose of our pioneering study that aims at identifying unique cell types for which ground truth is as pure and reliable as possible. Not filtering out these cells with a ‘dubious’ IF profile (e.g., cells that might be transitioning or are of a different type) would negatively affect the model by introducing noise. It is correct that the predictions are based only on these inputs and so cells of a subsequent test set will only be classified according to these labels. For example, in the neuronal differentiation experiment (Fig. 6G-H), cells are either characterized as NPC or as neurons, which leaves the transitioning (or undefined) cells in either category. Despite this simplification, the model adequately predicted the increase in neuron/NPC ratio with culture age. In future iterations, one could envision defining more refined cell (sub-)types in a population based on richer post-hoc information (e.g., through cyclic immunofluorescence or spatial single cell transcriptomics) or longitudinal follow-up of cell-state transitions using live imaging. This notion has been added to page 17 of the manuscript.
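
      The three filtering criteria can be sketched as a single labelling function. The thresholds 160 and 500 are taken from the response above (their units are not stated there); the IF gate values, the dictionary keys and the class names "A" and "B" are hypothetical placeholders, since the actual gates are not given.

```python
MIN_NUCLEAR_SIZE = 160    # from the response; smaller objects are debris
MIN_DAPI_INTENSITY = 500  # from the response; dimmer objects are segmentation errors


def ground_truth_label(obj, positive_gate=1000.0, negative_gate=200.0):
    """Return "A", "B", or None if the object is excluded from training.

    `positive_gate` and `negative_gate` are hypothetical IF intensity gates.
    """
    if obj["nuclear_size"] < MIN_NUCLEAR_SIZE:
        return None  # criterion 1: debris
    if obj["dapi_intensity"] < MIN_DAPI_INTENSITY:
        return None  # criterion 2: segmentation error
    a_pos = obj["marker_a"] >= positive_gate
    b_pos = obj["marker_b"] >= positive_gate
    a_neg = obj["marker_a"] <= negative_gate
    b_neg = obj["marker_b"] <= negative_gate
    if a_pos and b_neg:
        return "A"
    if b_pos and a_neg:
        return "B"
    # criterion 3: double positive, double negative, or ambiguous intensity
    return None


cells = [
    {"nuclear_size": 120, "dapi_intensity": 900, "marker_a": 1500, "marker_b": 50},
    {"nuclear_size": 300, "dapi_intensity": 900, "marker_a": 1500, "marker_b": 50},
    {"nuclear_size": 300, "dapi_intensity": 900, "marker_a": 1500, "marker_b": 1400},
]
print([ground_truth_label(c) for c in cells])
```

      Objects that return None are the ones the reviewer asks about: they are absent from the ground truth, so any cell-type bias in the third criterion would carry over into the evaluated subset.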

      I am not entirely convinced by the arguments regarding the superiority of the nucleocentric vs. the nuclear representations. Could it be that this improvement is due to not being sensitive/ influenced by nucleus segmentation errors?

      The reviewer has a valid point that segmentation errors may occur. However, the algorithm we have used (Stardist classifier) is very robust to nuclear segmentation errors. To verify the performance, we have now quantified segmentation errors in 20 images for 3 different densities and found a consistently low error rate (0.6-1.6%) without correlation to the culture density. Moreover, these errors include partial imperfections (e.g., a missed protrusion or bleb) as well as over- (one nucleus detected as more) or under- (more nuclei detected as one) segmentations. The latter two will affect both the nuclear and nucleocentric predictions and should thus not affect the prediction performance. In the case of imperfect segmentations, there may be a specific impact on the nucleus-based predictions (which rely on blanking the non-nuclear part), but this alone cannot explain the significantly higher gain in accuracy for nucleocentric predictions (>5%). Therefore, we conclude that segmentation errors may contribute in part, but not exclusively, to the overall improved performance of nucleocentric input models. We have added this notion in the discussion (pages 14-15 and Suppl. Fig. 1E).

      GRADCAM shows cherry-picked examples and is not very convincing.

      To help convince the reviewer and illustrate the representativeness of selected images, we have now randomly selected 10 images for each condition and density (using random seeds to avoid cherry-picking) and added these in Suppl. Fig. 3.

      There are many missing details in the figure panels, figure legend, and text that would help the reader to better appreciate some of the technical details, see details in the section on recommendations for the authors.

      Please see further for our specific adaptations.

      Reviewer #2 (Public Review):

      This study uses an AI-based image analysis approach to classify different cell types in cultures of different densities. The authors demonstrate the superiority of the CNN strategy combined with a nucleocentric cell profiling approach for classifying a variety of cell types. The paper is very clear and well-written. I just have a couple of minor suggestions and clarifications needed for the reader.

      The entire prediction model is based on image analysis. Could the authors discuss the minimal spatial resolution of images required to allow a good prediction? Along the same line, it would be interesting to the reader to know which metrics related to image quality (e.g. signal to noise ratio) allow a good accuracy of the prediction.

      Thank you for the positive and relevant feedback.

      The reviewer has a good point that it is important to portray the imaging conditions that are required for accurate predictions. To investigate this further we have performed additional experiments that give a better view on the operating window in terms of resolution and SNR (manuscript page 7-8 and new figure panels Fig. 3B-C). The initial image resolution was 0.325 µm/pixel. To understand the dependency on resolution we performed training and classifications for image data sets that were progressively binned. We found that a two-fold reduction in resolution did not significantly affect the F-score, but further degradation decreased the performance. At a resolution of 6.0 µm/pixel (20-fold binning), the F-score dropped to 0.79±0.02, comparable to the performance when only the DAPI (nuclear) channel was used as input. The effect of reduced image quality was assessed in a similar manner, by iteratively adding more Gaussian noise to the image. We found that above an SNR of 10 the prediction performance remains consistent but below it starts to degrade. While this exercise provides a first impression of the current confines of our method, we do believe it is plausible that its performance can be extended to even lower-quality images for example by using image restoration algorithms. We have added this notion in the discussion (page 14).
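
      The two degradations described above, progressive binning and added Gaussian noise, can be sketched in plain Python on a nested-list image. This is an illustration only, not the authors' pipeline. In particular, the SNR definition used here (mean pixel intensity divided by the noise standard deviation) is an assumption, as the response does not state which definition was used.

```python
import random
from statistics import mean


def bin_image(image, factor):
    """Average non-overlapping factor x factor blocks.

    Trailing rows or columns that do not fill a complete block are dropped.
    """
    n_rows = len(image) // factor
    n_cols = len(image[0]) // factor
    return [
        [
            mean(
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            )
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]


def noise_sigma_for_snr(image, snr):
    """Noise SD giving the target SNR, assuming SNR = mean signal / noise SD."""
    return mean(px for row in image for px in row) / snr


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    rng = random.Random(seed)
    return [[px + rng.gauss(0.0, sigma) for px in row] for row in image]


# Hypothetical 4 x 4 single-channel image.
img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
binned = bin_image(img, 2)  # two-fold binning halves the resolution
noisy = add_gaussian_noise(img, noise_sigma_for_snr(img, snr=10))
print(binned)
```

      Retraining and scoring a classifier on each degraded copy, as the authors describe, then traces out the operating window in resolution and SNR.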

      The authors show that nucleocentric-based cell feature extraction is superior for feeding the CNN-based model for cell type prediction. Could they discuss what is the optimal size and shape of this ROI to ensure a good prediction? What if, for example, you increase or decrease the size of the ROI by a certain number of pixels?

      To identify the optimal input, we varied the size of the square region around the nuclear centroid from 0.6 to 150 µm for the whole dataset. Within the nuclear-to-cell window (12-30 µm) the change in average F-score is limited, but an important observation is the increasing error and differences in precision and recall with increasing nucleocentric patch sizes, which will become detrimental in cases of class imbalance. The F-score is maximal for a box of 12-18 µm surrounding the nuclear centroid. In this “sweet spot”, the precision and recall are also in balance. Therefore, we have selected this region for the actual density comparison experiment. We have added our results to the manuscript (page 9 and 15).
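
      Extracting a square nucleocentric patch of a given physical size can be sketched as follows. The default pixel size of 0.325 µm/pixel is the initial image resolution quoted in the response to the previous comment; the function name, the zero padding at image borders and the rounding of the half-width are our own assumptions.

```python
def nucleocentric_patch(image, centroid, box_um, pixel_size_um=0.325, fill=0):
    """Square patch of side ~box_um centred on `centroid` (row, col).

    Pixels falling outside the image are replaced by `fill`.
    """
    half = max(1, round(box_um / pixel_size_um / 2))
    row0, col0 = centroid
    n_rows, n_cols = len(image), len(image[0])
    patch = []
    for r in range(row0 - half, row0 + half):
        patch_row = []
        for c in range(col0 - half, col0 + half):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            patch_row.append(image[r][c] if inside else fill)
        patch.append(patch_row)
    return patch


# Hypothetical 100 x 100 image whose pixel value encodes its position.
image = [[r * 100 + c for c in range(100)] for r in range(100)]

# An 18 µm box at 0.325 µm/pixel corresponds to a 56 x 56 pixel patch here.
patch = nucleocentric_patch(image, centroid=(50, 50), box_um=18)
print(len(patch), len(patch[0]))
```

      Sweeping `box_um` over the tested range and retraining on each patch size is one way to reproduce the kind of comparison described above.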

      It would be interesting for the reader to know the number of ROI used to feed each model and know the minimal amount of data necessary to reach a high level of accuracy in the predictions.

      The figures have now been adjusted so that the number of ROIs used as input to feed the model are listed. The minimal number of ROIs required to obtain high level accuracy is tested in Figure 2C. By systematically increasing the number of input ROIs for both RF and CNN, we found that a plateau is reached at 5000 input ROIs (per class) for optimal prediction performance. This is also documented in the results section page 6.

      From Figure 1 to Figure 4 the authors show that the CNN-based approach is efficient in distinguishing 1321N1 vs SH-SY5Y cell lines. The last two figures are dedicated to showing 2 different applications of the techniques: identification of different stages of neuronal differentiation (Figure 5) and different cell types (neurons, microglia, and astrocytes) in Figure 6. It would be interesting, for these two cases as well, to assess the superiority of the CNN-based approach compared to the more classical Random Forest classification. This would reinforce the universal value of the method proposed.

      To meet the reviewer’s request, we have now also compared CNN to RF for the classification of cells in iPSC-derived models (Figures 6 and 7). As expected, the CNN performed better in both cases. We have now added these results in Fig. 6 D and 7 C and pages 12 and 13 of the manuscript.

      Reviewer #3 (Public Review):

      Induced pluripotent stem cells, or iPSCs, are cells that scientists can push to become new, more mature cell types like neurons. iPSCs have a high potential to transform how scientists study disease by combining precision medicine gene editing with processes known as high-content imaging and drug screening. However, there are many challenges that must be overcome to realize this overall goal. The authors of this paper solve one of these challenges: predicting cell types that might result from potentially inefficient and unpredictable differentiation protocols. These predictions can then help optimize protocols.

      The authors train advanced computational algorithms to predict single-cell types directly from microscopy images. The authors also test their approach in a variety of scenarios that one may encounter in the lab, including when cells divide quickly and crowd each other in a plate. Importantly, the authors suggest that providing their algorithms with just the right amount of information beyond the cells' nuclei is the best approach to overcome issues with cell crowding.

      The work provides many well-controlled experiments to support the authors' conclusions. However, there are two primary concerns: (1) The model may be relying too heavily on the background and thus technical artifacts (instead of the cells) for making CNN-based predictions, and (2) the conclusion that their nucleocentric approach (including a small area beyond the nucleus) is superior is not well supported, and it may just be better by random chance. If the authors were to address these two concerns (through additional experimentation), then the work may influence how the field performs cell profiling in the future.

      Thank you very much for confirming the potential value of our work and raising these relevant items. To better support our claims we have now performed additional validations, which we detail below. 

      (1) The model may be relying too heavily on the background and thus technical artifacts (instead of the cells) for making CNN-based predictions 

      To address the first point, we have adapted the GradCAM images to show an overlay of the input crop and GradCAM heatmap to give a better view of the structures that are highlighted by the CNN. We further investigated the influence of the background on the prediction performance. Our finding that a CNN trained on a monoculture retains a relatively high performance on cocultures implies that the CNN uses the salient characteristics of a cell to recognize it in more complex heterogeneous environments. Assuming that the background can vary between experiments, the prediction of a pretrained CNN on a new dataset indicates that cellular characteristics are used for robust prediction.  When inspecting GradCAM images obtained from the nucleocentric CNN approaches (now added in Suppl. Fig. 3), we noticed that the nuclear periphery typically contributed the most (but not exclusively) to the prediction performance. When using only the nuclear region as input, GradCAMs were more strongly (but again not exclusively) directed to the background surrounding the nuclei. To train the latter CNN, we had cropped nuclei and set the background to a value of zero. To rule out that this could have introduced a bias, we have now performed the exact same training and classification, but setting the background to random noise instead (Suppl. Fig. 2). While this effectively diverted the attention of the GradCAM output to the nucleus instead of the background, the prediction performance was unaltered. We therefore assume that irrespective of the background, when using nuclear crops as input, the CNN is dominated by features that describe nuclear size. We observe that nuclear size is significantly different in both cell types (although intranuclear features also still contribute) which is also reflected in the feature map gradient in the first UMAP dimension (Suppl. Fig. 2). This notion has been added to the manuscript (page 9) and Suppl. Fig. 2. 

      (2) The conclusion that their nucleocentric approach (including a small area beyond the nucleus) is not well supported, and may just be better by random chance. 

      To address this second concern, which was also raised by reviewer 2, we have performed a more extensive analysis in which the patch size was varied from 0.6 to 120µm around the nuclear centroid (Fig. 4E and page 9 of the manuscript). We observed that there is little effect of in- or decreasing patch size on the average F-score within the nuclear to cell window, but that the imbalance between the precision and recall increases towards the larger box sizes (>18µm). Under our experimental conditions, the input numbers per class were equal, but this will not be the case in situations where the ground truth is unknown (and needs to be predicted by the CNN). Therefore, a well-balanced CNN is of high importance. This notion has been added to page 15 of the manuscript.
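
      To make the distinction between a stable average F-score and a growing precision/recall imbalance concrete, the following minimal Python sketch may help. It is illustrative only and not taken from the authors' code; the function name and the confusion counts are hypothetical. It shows two classifiers that reach the same F-score while differing strongly in how precision and recall are balanced.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for one class from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, chosen only to illustrate the point.
balanced = precision_recall_f1(tp=80, fp=20, fn=20)  # precision == recall
skewed = precision_recall_f1(tp=60, fp=0, fn=30)     # high precision, lower recall

for name, (p, r, f) in (("balanced", balanced), ("skewed", skewed)):
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

      Both cases yield an F-score of about 0.8, so the F-score alone would not reveal the imbalance; reporting precision and recall separately does.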

      The main advantage of nucleocentric profiling over whole-cell profiling in dense cultures is that it relies on a more robust nuclear segmentation method and is less sensitive to differences in cell density (Suppl. Fig. 1D). In other words, in dense cultures, the segmentation mask will contain similar regional input to the nuclear mask, and the nucleocentric crop will contain more perinuclear information, which contributes to the prediction accuracy. Therefore, at high densities, the performance of the CNN on whole-cell crops decreases owing to poorer segmentation performance. A CNN that uses nucleocentric crops will be less sensitive to these errors. This notion has been added to pages 14-15 of the manuscript.

      Additionally, the impact of this work will be limited, given the authors do not provide a specific link to the public source code that they used to process and analyze their data.

      The source code is now available on the Github page of the DeVos lab, under the following URL: https://github.com/DeVosLab/Nucleocentric-Profiling

      Recommendations for the authors:  

      Reviewing Editor (Recommendations For The Authors):

      Evaluation summary

      The authors present a new application of the high-content image-based morphological profiling Cell Painting (CP) to single cell type classification in mixed heterogeneous induced pluripotent stem cell-derived mixed neural cultures. Machine learning models were trained to classify single cell types according to either "engineered" features derived from the image or from the raw CP multiplexed image. The authors systematically evaluated experimental (e.g., cell density, cell types, fluorescent channels, replication biases) and computational (e.g., different models, different cell regions) parameters and argue that focusing on the nucleus and its surroundings contains sufficient information for robust and accurate cell type classification. Models that were trained on mono-cultures (i.e., containing a single cell type) could generalize for cell type prediction in mixed co-cultures, and describe intermediate states of the maturation process of iPSC-derived neural progenitors to differentiated neurons.

      Strengths:

      Automatically identifying single-cell types in heterogeneous mixed-cell populations is an important application and holds great promise. The simple and high-content assay democratizes use and enables adoption by other labs. The manuscript is supported by comprehensive experimental and computational validations. The manuscript is well-written and easy to follow.

      Weaknesses:

      The conclusion is that the nucleocentric approach (including a small area beyond the nucleus) is not well supported, and may just be better by random chance. If better supported by additional experiments, this may influence how the field performs cell profiling in the future. Model interpretability (GradCAM) analysis is not convincing. The lack of a public source code repository is also limiting the impact of this study. There are missing details in the figure panels, figure legend, and text that would help the reader to better appreciate some of the technical details.

      Essential revisions:

      To reach a "compelling" strength of evidence the authors are requested to either perform a comprehensive analysis of the effect of ROI size on performance, or tune down statements regarding the superior performance of their "nucleocentric" approach. Further addition of a public and reproducible source code GitHub repository will lead to an "exceptional" strength of evidence.

      To answer the main comment, we have performed an experiment in which we varied the size of the nucleocentric patch and quantified CNN performance. We have also evaluated the operational window of our method by varying the resolution and SNR and we have experimented with different background blanking methods. We have expanded our examples of GradCAM images and now also made our source code and an example data set available via GitHub.

      Reviewer #1 (Recommendations For The Authors):

      I think that an evaluation of how the excluded cells affect our ability to measure the cell type composition of the population would be helpful to better understand the limitations and practical measurement noise introduced by this approach. A similar evaluation of the excluded cells can also help to better understand the benefit of nucleocentric vs. cell representations by more convincingly demonstrating the case for the nucleocentric approach. In any case, I recommend discussing in more depth the arguments for using the nucleocentric representation and why it is superior to the nuclear representation.

      The benefits of nucleocentric representation over nuclear and whole-cell representation are discussed more in depth at pages 14-15 of the manuscript. 

      “The nucleocentric approach, which is based on more robust nuclear segmentation, minimizes such mistakes whilst still retaining input information from the structures directly surrounding the nucleus. At higher cell density, the whole-cell body segmentation becomes more error-prone, while also losing morphological information (Suppl. Fig. 1D). The nucleocentric approach is more consistent as it relies on a more robust segmentation and does not blank the surrounding region. This way it also buffers for occasional nuclear segmentation errors (e.g., where blebs or parts of the nucleus are left undetected).”

      It is not entirely clear to me why Figure 5 moves back to "engineered" features after previous figures showed the superiority of the deep learning approach. Especially, where Figure 6 goes again to DL. Dimensionality reduction can be also applied to DL-based classifications (e.g., using the last layer).

      Following up on the reviewer’s interesting comment, we extracted the embeddings from the trained CNN and performed UMAP dimensionality reduction. The results are shown in Fig. 3D, 6F and supplementary figure 1B and added to the manuscript on pages 6, 8 and 12.

      We concluded that unsupervised dimensionality reduction using the feature embeddings could separate cell type clusters, where the distance between the clusters reflected the morphological similarity between the cell lines. 

      I would recommend including more comprehensive GRADCAM panels in the SI to reduce the concern of cherry-picking examples. What is the interpretation of the nucleocentric area?

      A more extensive set of GradCAM images has now been included in the supplementary material (Supplementary figure 3) using the same random seeds for all conditions, thus avoiding any cherry-picking. We interpret the GradCAM maps on the nucleocentric crops as highlighting the structures surrounding the nucleus (reflecting ER, mitochondria, Golgi), indicating their importance in correct cell classification. This was added to the manuscript on pages 9 and 15.

      Missing/lacking details and suggestions in the figure panels and figure legend:

      - Scale bars missing in some of the images shown (e.g., Figure 2F, Figure 3D, Figure 4, Supplementary Figure 4), what are the "composite" channels (e.g., Figure 2F), missing x-label in Figure 3B. 

      These have now been added.

      - Terms that are not clear in the figure and not explained in the legend, such as FITC and cy3 energy (Figure 1C). 

      The figure has been adapted to better show the region, channel and feature. We have now added a Table (Table 5), detailing the definition of each morphological feature that is extracted. On page 27, information on feature extraction is noted.

      - Details that are missing or not sufficiently explained in the figure legends such as what each data point represents and what is Gini importance (Figure 1D) 

      We have added these explanations to the figure legends. The Gini importance or mean decrease in impurity reflects how often this feature is used in decision tree splits across all random forest trees.
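
      As a reading aid, the quantity behind the Gini importance can be sketched as follows. This is an assumption-laden illustration, not the authors' code: the function names and toy labels are hypothetical. In common implementations (for example the `feature_importances_` attribute of scikit-learn random forests), the mean decrease in impurity accumulates the sample-weighted impurity reduction of every split that uses a feature, so it reflects both how often and how effectively the feature is used.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Impurity reduction obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Toy node with two equally frequent (hypothetical) cell types.
parent = ["A"] * 4 + ["B"] * 4
informative = impurity_decrease(parent, ["A"] * 4, ["B"] * 4)
uninformative = impurity_decrease(parent, ["A", "A", "B", "B"], ["A", "A", "B", "B"])
print(informative, uninformative)
```

      A split that separates the two classes perfectly removes all impurity from the node, whereas a split that leaves the class proportions unchanged contributes nothing to the importance of the feature it uses.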

      Is it the std shown in Figure 2C?

      Yes, this has now been added to the legend.  

      It is not fully clear what is single/mixed (Figure 2D)

      Clarification is added to the legend and in the manuscript on page 6.

      explain what is DIV 13-90 in the legend (Figure 5).

      DIV stands for days in vitro, here it refers to the days in culture since the start of the neural induction process. This has been added in the legend.

      and state what are img1-5 (Supplementary Figures 1B-C)

      Clarification has been added to the legend.

      - Supplementary Figure 1. What is the y-axis in panel C and how do the results align with the cell mask in panel B?

      The y-axis represents the intersection over union (IoU). The IoU quantifies the overlap between ground truth (manually segmented ROI) and the ROI detected by the segmentation algorithm. It is defined as the area of the overlapping region over the total area. This clarification has been added to the legend.
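
      The IoU definition above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation; the function name `iou` and the toy masks are assumptions, and "total area" is read here as the area of the union of the two ROIs.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks (2D lists of 0/1)."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    return intersection / union if union else 0.0


# Toy example: a 2x2 ground-truth ROI and a prediction shifted by one column.
ground_truth = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# 2 overlapping pixels out of 6 pixels in the union.
print(iou(ground_truth, predicted))
```

      An IoU of 1 corresponds to perfect agreement with the manual segmentation and an IoU of 0 to no overlap at all.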

      - Supplementary Figure 1 and Methods. Please explain when CellPose and when StarDist were applied.

      Added to supplementary figure and methods at page 24. In the case of nuclear segmentation (nucleus and nucleocentric crops), Stardist was used. For whole-cell crops, cell segmentation using Cellpose was used.

      - Supplementary Figure 4C - the color code is different between nuclear and nucleocentric - this is confusing.

      We have changed the color code to correspond in both conditions in Fig. 1A.

      - Figure 3B - better to have a normalized measure in the x-axis (number of cells per area in um^2)

      We agree and have changed this.

      Suggestions and missing/lacking details in the text:

      • Line #38: "we then applied this" because it is the first time that this term is presented.

      This has been rephrased.

      • Line #88: a few words on what were the features extracted would be helpful.

      A short description has been added to pages 26-27 and a detailed definition of all features has been added in table 5.

      -  Line #91: PCA analysis - the authors can highlight what (known) features were important to PC1 using the linear transformation that defined it.

      The 5 most important features of PC1 were (in order of decreasing importance): channel 1 dissimilarity, channel 1 homogeneity, nuclear perimeter, channel 4 dissimilarity and nuclear area.  

      - Line #92: Order of referencing Supplementary Figure 4 before referencing Supplementary Figure 13.

      The order of the Supplementary images was changed to follow the chronology. 

      • Line #96: Can the authors show the data supporting this claim?

      The unsupervised UMAP shown in fig. 1B is either color coded by cell type (left) or replicate (right). Based on this feature map, we observe clustering along the UMAP1 axis to be associated with the cell type. Variations in cellular morphology associated with the biological replicate are more visible along the UMAP2 axis. When looking at fig. 1C, the feature map reflecting the cellular area shows a gradient along the UMAP1 direction, supporting the assumption that cell area contributes to the cell type separation. On the other hand, the average intensity (Channel 2 intensity) has a gradient within the feature map along the UMAP2 direction. This corresponds to the pattern associated with the inter-replicate variability in panel B.

      - Line #108: what is "nuclear Cy3 energy"?

      This represents the local change of pixel intensities within the ROI in the nucleus in the 3rd channel dimension. This parameter reflects the texture within the nuclear region for the phalloidin and WGA staining. The definitions of all handcrafted features are added in table 5 of the manuscript.

      - Line #110-112: Can the authors show the data supporting this claim?

      The figure has been changed to include the results from a filtered and unfiltered dataframe (exclusion and inclusion of redundant features). Features could be filtered out if the correlation was above a threshold of 0.95. This has been added to page 6 of the manuscript and fig. 1D.  
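
      One common way to implement such a redundancy filter is sketched below. It is not taken from the authors' code: the greedy keep-first strategy, the use of the absolute Pearson correlation, the function names and the toy feature values are all assumptions made for illustration. Only the 0.95 threshold comes from the response above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov / sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |r| with every already kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-cell measurements for three features.
features = {
    "nuclear_area": [1.0, 2.0, 3.0, 4.0, 5.0],
    "nuclear_perimeter": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly proportional to area
    "channel_1_intensity": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_redundant(features))
```

      In this toy example the perimeter is dropped because it is almost perfectly correlated with the area, while the intensity feature is retained.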

      - Line #115-116: please state the size of the mask.

      Added to the text (page 6). We used isotropic image crops of 60µm centred on individual cell centroids.

      - Lines 120-122: more details will make this more clear (single vs. mixed).

      This has been changed on page 6 of the manuscript.

      • Line #142: "(mimics)" - is it a typo?

      Tissue mimics refers to organoids/models that are meant to replicate the physiological behaviour.

      • Line #159: the bounding box for nucleocentric analysis is 15x15um (and not 60), as stated in the Methods.

      Thank you for pointing out this mistake. We have adapted this.

      - Line #165: what is the interpretation of what was important for the nucleocentric classification?

      The colour code in GradCAM images is indicative of the attention of the CNN (the more to the red, the more attention). In fig. 4D and Suppl. Fig. 3 the structures directly surrounding the nucleus receive high attention from the CNN trained on nucleocentric crops. This has been added to the manuscript page 9 and 15.

      • Section starting in line #172: not explicitly stated what model was used (nucleocentric?).

      Added in the legend of fig. 5. For these experiments, the full cell segmentation was still used. 

      - Section starting in line #199: why use a feature-based model rather than nucleocentric? A short sentence would be helpful.

      For CNN training, nucleocentric profiling was used. In response to a legitimate question of one of the reviewers, the feature-based UMAP analysis was replaced with the feature embeddings from the CNN. 

      - Line #213: Fig. 5B does not show transitioning cells.

      Thank you for pointing this out, this was a mistake and has been changed.

      Lines #218-220: not fully clear to some readers (culture condition as a weak label), more details can be helpful.

      We changed this at page 11 of the manuscript for clarity. 

      “This gating strategy resulted in a fractional abundance of neurons vs. total (neurons + NPC) of 36.4% in the primed condition and 80.0% in the differentiated condition (Fig. 6C). We therefore refer to the culture condition as a weak label as it does not take into account the heterogeneity within each condition (well).”

      -  Line #230: "increasing dendritic outgrowth" - what does it mean? Can you explicitly highlight this phenotype in Figure 5G?

      When the cells become more mature during differentiation, the cell body becomes smaller and the neurons form long, thin ramifications. This explanation has been added to page 12 of the manuscript.

      • Line #243: is it the nucleocentric CNN?

      Yes.

      • Lines #304-313, the authors might want to discuss other papers dealing with continuous (non-neural) differentiation state transitions (eg PMID: 38238594).  

      A discussion of the use of morphological profiling for longitudinal follow-up of continuous differentiation states has been added to the manuscript at page 18. 

      - Line #444: cellpose or stardist? How did the authors use both?

      Clarification has been added to supplementary figure 1 and methods at page 24. Stardist was used for nuclear segmentation, whereas Cellpose was used for whole-cell segmentation. 

      • Line #470-474: I would appreciate seeing the performance on the full dataset without exclusions.

      Cells have been excluded based on three criteria: the absence of DAPI intensity, too small a nuclear size and the absence of ground-truth staining. The first two criteria are based on the assumption that ROIs that contain no DAPI signal or are too small are errors in cell segmentation and therefore should not be included in the analysis. The third filtering step was based on the ground-truth IF signal. Not filtering out these cells with a ‘dubious’ IF profile (e.g., cells that might be transitioning or are of a different type) would negatively affect the model by introducing noise. It is correct that the predictions are based only on these inputs, so cells of a subsequent test set will only be classified according to these labels, which might introduce bias. However, the model could predict an increase in the neuron/NPC ratio with culture age in the absence of ground-truth staining (and thus IF-based filtering).

      Reviewer #2 (Recommendations For The Authors):

      Figure 1A: it would be interesting to the reader to see the SH-SY5Y data as well.

      This has been added in fig. 1A.

      Figure 3A: 95-100% image: showing images with the same magnification as the others would help to appreciate the cell density.

      Now fig. 4A. The figure has been changed to make sure all images have the same magnification. 

      Figure Supp 4 (line 132) is referred to before Figure Supp1 (line 152).

      The image order and numbering has been changed to solve this issue.

      Figure Supp 2 & 3 are not referred to in the text.

      This has been adjusted.

      Line 225: a statistical test would help to convince of the accuracy of these results (Figure 5C vs Figure 5F)?

      These figures represent the total ROI counts and thus represent a single number.

      Line 227: Could you explain to the reader, in a few words, what a dual SMAD inhibition is?

      This has been added to the manuscript at page 20. 

      “This dual blockade of SMAD signalling in iPSCs induces neural differentiation by synergistically causing the loss of pluripotency and a push towards the neuroectodermal lineage.”

      Reviewer #3 (Recommendations For The Authors):

      I have a few concerns and several comments that, if addressed, may strengthen conclusions, and increase clarity of an already technically sound paper.

      Concerns

      • The results presented in Figure 3 panel D, may indicate a critical error in data processing and interpretation that the authors must address. The GradCAM method highlights the background as having the highest importance. While it can be argued in the nucleocentric profiling method that GradCAM focuses on the nuclear membrane, the background is highly important even for the nuclear profiling method, which should provide little information. What procedure did the authors use for mask subtraction prior to CNN training? Could the segmentation algorithm be performing differently between cell lines? The authors interpret the GradCAM results to indicate a proxy for nuclear size, but then why did the CNN perform so much better than random forest using hand-crafted features that include this variable? The authors should also present size distributions between cell lines (and across seeding densities, in case one of the cell lines has different compaction properties with increasing density).

      Perhaps clarifying this sentence (lines 166-168) would help as well: "As nuclear area dropped with culture density, the dynamic range decreased, which could explain the increased error rate of the CNN for high densities unrelated to segmentation errors (Suppl. Fig. 4B)." What do the authors mean by "dynamic range" and it is not clear how Supplementary Figure 4B provides evidence for this? 

      The dynamic range refers to the difference between the minimum and maximum nuclear area. We expect the difference to decrease at higher density owing to the crowding that forces all nuclei to take on a more similar (smaller) size.

      More clarification on this has been added to page 9 of the manuscript.

      I certainly understand that extrapolating the GradCAM concern to the remaining single-cell images using only four (out of tens of thousands of options) is also dangerous, but so is "cherry-picking" these cells to visualize. Finally, I also recommend that the authors quantitatively diagnose the extent of the background influence according to GradCAM by systematically measuring background influence in all cells and displaying the results per cell line per density.

      To avoid cherry-picking of GradCAM images, we have now randomly selected 10 images for each condition and density (using fixed random seeds) and added these in Suppl. Fig. 3.

      In answer to this concern, we refer to the response above: 

      “To address the first point, we have adapted the GradCAM images to show an overlay of the input crop and GradCAM heatmap to give a better view of the structures that are highlighted by the CNN. We further investigated the influence of the background on the prediction performance. Our finding that a CNN trained on a monoculture retains a relatively high performance on cocultures implies that the CNN uses the salient characteristics of a cell to recognize it in more complex heterogeneous environments. Assuming that the background can vary between experiments, the prediction of a pretrained CNN on a new dataset indicates that cellular characteristics are used for robust prediction.  When inspecting GradCAM images obtained from the nucleocentric CNN approaches (now added in Suppl. Fig. 3), we noticed that the nuclear periphery typically contributed the most (but not exclusively) to the prediction performance. When using only the nuclear region as input, GradCAMs were more strongly (but again not exclusively) directed to the background surrounding the nuclei. To train the latter CNN, we had cropped nuclei and set the background to a value of zero. To rule out that this could have introduced a bias, we have now performed the exact same training and classification, but setting the background to random noise instead (Suppl. Fig. 2). While this effectively diverted the attention of the GradCAM output to the nucleus instead of the background, the prediction performance was unaltered. We therefore assume that irrespective of the background, when using nuclear crops as input, the CNN is dominated by features that describe nuclear size. We observe that nuclear size is significantly different in both cell types (although intranuclear features also still contribute) which is also reflected in the feature map gradient in the first UMAP dimension (Suppl. Fig. 2). This notion has been added to the manuscript (page 9) and Suppl. Fig. 2.”

      • The data supporting the conclusion about nucleocentric profiling outperforming nuclear and full-cell profiling is minimal. I am picking on this conclusion in particular, because I think it is a super cool and elegant result that may change how folks approach issues stemming from cell density disproportionately impacting profiling. Figures 3B and 3C show nucleocentric slightly outperforming full cell, and the result is not significant. The authors state in lines 168-170: "Thus, we conclude that using the nucleocentric region as input for the CNN is a valuable strategy for accurate cell phenotype identification in dense cultures." This is somewhat of a weak conclusion, that, with additional analysis, could be strengthened and add high value to the community. Additionally, the authors describe the nucleocentric approach insufficiently. In the methods, the authors state (lines 501-503): "Cell crops (60μm whole cell - 15μm nucleocentric/nuclear area) were defined based on the segmentation mask for each ROI." This is not sufficient to reproduce the method. What software did the authors use?

      Presumably, 60μm refers to a box size around cytoplasm? Much more detail is needed. Additionally, I suggest an analysis to confirm the impact of nucleocentric profiling, which would strengthen the authors' conclusions. I recommend systematically varying the subtraction (-30μm, -20μm, -10μm, 5μm, 0, +5μm, +10μm, etc.) and reporting the density-based analysis in Figure 3B per subtraction. I would expect to see some nucleocentric "sweet spot" where performance spikes, especially in high culture density. If we don't see this difference, then the non-significant result presented in Figures 3B and C is likely due to random chance. The authors mention "iterative data erosion" in the abstract, which might refer to what I am recommending, but do not describe this later.

      More detail was added to the methods describing the image crops given as input to the CNN (page 28 of the manuscript). 

      “Crops were defined based on the segmentation mask for each ROI. The bounding box was cropped out of the original image with a fixed patch size (60µm for whole cells, 18µm for nucleus and nucleocentric crops) surrounding the centroid of the segmentation mask. For the whole cell and nuclear crops, all pixels outside of the segmentation mask were set to zero. This was not the case for the nucleocentric crops. Each ROI was cropped out of the original morphological image and associated with metadata corresponding to its ground truth label.”
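
      The cropping rule quoted above can be sketched as follows. This is a simplified, single-channel illustration and not the authors' implementation: the function name, the `um_per_px` pixel-size parameter, the zero-padding at image borders and the toy image are all assumptions. Passing `blank_outside=True` mimics the whole-cell and nuclear crops (pixels outside the mask set to zero), whereas `blank_outside=False` mimics the nucleocentric crop, which keeps the surroundings.

```python
def crop_patch(image, mask, centroid, patch_um, um_per_px, blank_outside=True):
    """Cut a square patch of side `patch_um` centred on `centroid` (row, col).

    `image` and `mask` are 2D lists of equal shape. Pixels that fall outside
    the image are zero-padded (an assumption of this sketch).
    """
    half = int(round(patch_um / um_per_px / 2))
    row0, col0 = centroid
    n_rows, n_cols = len(image), len(image[0])
    patch = []
    for r in range(row0 - half, row0 + half):
        row = []
        for c in range(col0 - half, col0 + half):
            inside_image = 0 <= r < n_rows and 0 <= c < n_cols
            if not inside_image:
                row.append(0)
            elif blank_outside and not mask[r][c]:
                row.append(0)  # blank everything outside the segmentation mask
            else:
                row.append(image[r][c])
        patch.append(row)
    return patch


# Toy 6x6 image with constant intensity and a 2x2 segmentation mask.
image = [[7] * 6 for _ in range(6)]
mask = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)] for r in range(6)]

masked_crop = crop_patch(image, mask, (3, 3), patch_um=4, um_per_px=1.0)
nucleocentric_crop = crop_patch(image, mask, (3, 3), patch_um=4, um_per_px=1.0,
                                blank_outside=False)
```

      With the same patch size, the only difference between the two crops is whether the region around the mask is retained, which is the distinction between the nuclear and nucleocentric inputs described in the response.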

      To address this concern, we also refer to the answer above. 

      “We have performed a more extensive analysis in which the patch size was varied from 0.6 to 120µm around the nuclear centroid (Fig. 4E and page 9 of the manuscript). We observed that there is little effect of in- or decreasing patch size on the average F-score within the nuclear to cell window, but that the imbalance between the precision and recall increases towards the larger box sizes (>18µm). Under our experimental conditions, the input numbers per class were equal, but this will not be the case in situations where the ground truth is unknown (and needs to be predicted by the CNN). Therefore, a well-balanced CNN is of high importance. This notion has been added to page 12 of the manuscript.

      The main advantage of nucleocentric profiling over whole-cell profiling in dense cultures is that it relies on a more robust nuclear segmentation method and is less sensitive to differences in cell density (Suppl. Fig. 1D). In other words, in dense cultures, the segmentation mask will contain similar regional input to the nuclear mask, and the nucleocentric crop will contain more perinuclear information, which contributes to the prediction accuracy. Therefore, at high densities, the performance of the CNN on whole-cell crops decreases owing to poorer segmentation performance. A CNN that uses nucleocentric crops will be less sensitive to these errors. This notion has been added to pages 14-15 of the manuscript.”

      Comments

      • There is a disconnect between the abstract and the introduction. The abstract highlights the nucleocentric model, but then it is not discussed in the introduction, which focuses on quality control. The introduction would benefit from some additional description of the single-cell or whole-image approach to profiling.

      We highlight the importance of QC of complex iPSC-derived neural cultures as an application of morphological profiling. We used single-cell profiling to facilitate cell identification in these mixed cultures, where the whole-image approach would be unable to deal with the heterogeneity within the field of view. In the introduction, we added a description of the whole-image vs. single-cell approach to profiling (page 4). In the discussion (page 18), we further highlight the application of this single-cell profiling approach for QC purposes.

      - Comments on Figure 1. It is unclear how panel B shows "without replicate bias". 

      In response to this comment, we refer to the answer above: “The unsupervised UMAP shown in fig. 1B is either color coded by cell type (left) or replicate (right). Based on this feature map, we observe clustering along the UMAP1 axis to be associated with the cell type. Variations in cellular morphology associated with the biological replicate are more visible along the UMAP2 axis. When looking at fig. 1C, the feature map reflecting the cellular area shows a gradient along the UMAP1 direction, supporting the assumption that cell area contributes to the cell type separation. On the other hand, the average intensity (Channel 2 intensity) has a gradient within the feature map along the UMAP2 direction. This corresponds to the pattern associated with the inter-replicate variability in panel B.” We added this notion to page 5 of the manuscript.

      The paper would benefit from a description of how features were extracted sooner.

      Information on the feature extraction was added to the manuscript at page 27. An additional table (table 5) has been added with the definition of each feature.  

      - Comments on Supplementary Figure 4. The clustering with PCA is only showing 2 dimensions, so it is not surprising UMAP shows more distinct clustering.

      We used two components for UMAP dimensionality reduction, so the data was also visualized in two dimensions. However, we agree that UMAP can show more distinct clustering as this method is non-linear.

      Why is Figure S4 the first referenced Supplementary Figure?

      This has been changed. 

      • Comments on Figure 2. Need discussion of the validation set - how was it determined? Panel E might have the answer I am looking for, but it is difficult to decipher exactly what is being done. The terminology needs to be defined somewhere, or maybe it is inconsistent. It is tough to tell. For example, what exactly are the two categories of model validation (cross-validation and independent testing)?

      Additional clarification has been added to the manuscript at pages 6-7 and figure 2.

      The metric being reported is accuracy for the independent replicate if the other two are used to train?

      Yes. 

      Panel C is a very cool analysis. Panel F needs a description of how those images were selected, randomly?

      Added in the methods section (page 29). GradCAM analysis was used to visualize the regions used by the CNN for classification. This map is specific to each cell. Images are selected randomly out of the full dataset for visualization.  

      They also need scale bars.

      Added to the figures. 

      Panel G would benefit from explicit channel labels (at least a legend would be good!).

      Explanation has been added to the legend. All color code and channel numbering are consistent with fig. 1A. 

      What do the dots and boxplots represent? The legend says, "independent replicates", but independent replicates of, I assume, different model initializations?

      Clarification has been added to the figure legends. For plots showing the performance of a CNN or RF classifier, each dot represents a different model initialization. Each classifier has been initialized at least 3 times. When indicated, the model training was performed with different random seeds for data splitting.

      • Comments on Figure 3. Panel A needs scale bar. See comment on Panel D in concern #1 described above. 

      This has been added.

      • Comments on Supplementary Figure 1. A reader will need a more detailed description in panel C. I assume that the grey bar is the average of the points, and the points represent different single cells?

      How many cells? How were these cells selected? 

      This information on the figure (now Suppl. Fig. 1D), has been added to the legend.

      “Left: Representative images of 1321N1 cells with increasing density alongside their cell and nuclear mask produced using resp. Cellpose and Stardist. Images are numbered from 1-5 with increasing density. Upper right: The number of ROIs detected in comparison to the ground truth (manual segmentation). A ROI was considered undetected when the intersection over union (IoU) was below 0,15. Each bar refers to the image number on the left. The IoU quantifies the overlap between ground truth (manually segmented ROI) and the ROI detected by the segmentation algorithm. It is defined as the area of the overlapping region over the total area. IoU for increasing cell density for cell and nuclear masks is given in the bottom right. Each point represents an individual ROI. Each bar refers to the image number on the left.”
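
      The IoU criterion described in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `iou` and `is_detected` and the toy masks are assumptions made for the example, and only the 0.15 threshold comes from the legend.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Two empty masks: define the overlap as zero.
        return 0.0
    return float(np.logical_and(a, b).sum() / union)

def is_detected(gt_mask, pred_mask, threshold=0.15):
    """A ROI counts as detected when its IoU reaches the threshold."""
    return iou(gt_mask, pred_mask) >= threshold

# Hypothetical toy masks: two 4x4 squares that overlap in a 2x2 corner.
gt = np.zeros((10, 10), dtype=bool)
gt[2:6, 2:6] = True
pred = np.zeros((10, 10), dtype=bool)
pred[4:8, 4:8] = True

score = iou(gt, pred)            # 4 overlapping pixels / 28 pixels in the union
detected = is_detected(gt, pred)
```

      In this toy case the overlap is 4 pixels out of a union of 28, so the ROI falls just below the threshold and would be counted as undetected.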

      • Comments on Figure 4. More details on quenching are needed for a general audience. The markers chosen (EdU and BrdU) are generally not specific to cell type but to biological processes (proliferation), so it is confusing how they are being used as cell-type markers. 

      The base analogues were incorporated into each cell line prior to mixing them, i.e.  when they were still growing in monoculture so they could be labelled and identified after co-seeding and morphological profiling. Additional clarification has been added to the manuscript (page 26) 

      It is also unclear why reducing CV is an important side-effect of finetuning. CV of what? The legend says, "model iterations", but what does this mean? 

      The dots in the violinplot are different CNN initializations. A lower variability between model initializations is an indicator of certainty of the results. Prior to finetuning, the results of the CNN were highly variable leading to a high CoV between the different CNNs. This means the outcome after finetuning is more robust.
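
      As an illustration of the metric, the coefficient of variation across model initializations is the standard deviation of the per-model scores divided by their mean. The sketch below uses made-up accuracy values and a hypothetical helper name; it does not reproduce numbers from the manuscript.

```python
import numpy as np

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    v = np.asarray(values, dtype=float)
    return float(v.std(ddof=1) / v.mean())

# Hypothetical accuracies of three CNN initializations (illustrative only).
before_finetuning = [0.62, 0.81, 0.70]
after_finetuning = [0.93, 0.95, 0.94]

cov_before = coefficient_of_variation(before_finetuning)
cov_after = coefficient_of_variation(after_finetuning)
```

      A lower value after finetuning means the independently initialized models agree more closely with one another.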

      • Comments on Figure 5. This is a very convincing and well-described result, kudos! This provides another opportunity to again compare other approaches (not just nucleocentric). Additionally, since the UMAP space uses hand-crafted features. The authors could consider interpreting the specific morphology features impacted by the striking gradual shift to neuron population by fitting a series of linear models per individual feature. This might confirm (or discover) how exactly the cells are shifting morphology.

      The supervised UMAP on the handcrafted features did not highlight any features contributing to the separation. Using the supervised UMAP, the clustering is dominated by the known cell type. Unsupervised UMAP on the handcrafted features does not show any clustering. In response to a previous comment, we adapted the figure to show UMAP dimensionality reduction using the feature embeddings from the cell-based CNN. This unsupervised UMAP does show good cell type separation, but it does not use any directly interpretable shape descriptors.

      • General comments on Methods. The section on "ground truth alignment" needs more details. Why was this performed? 

      Following sequential staining and imaging rounds, multiple images were captured representing the same cell with different markers. Lifting the plate off the microscope stage and imaging in sequential rounds after several days results in small linear translations in the exact location of each image. These linear translations need to be corrected to align (or register) morphological with ground truth image data within the same ROI. This notion has been added to the manuscript at page 26. 
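
      One common way to estimate such a translation between two imaging rounds is phase correlation. The response does not state which registration method was used, so the sketch below is an assumption made for illustration; the function name `estimate_translation` and the synthetic images are hypothetical, and the approach assumes a pure integer shift with circular boundaries.

```python
import numpy as np

def estimate_translation(reference, moving):
    """Estimate the integer (dy, dx) roll that maps `moving` onto `reference`."""
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    # Normalized cross-power spectrum keeps only the phase difference.
    cross_power = f_ref * np.conj(f_mov)
    cross_power /= np.abs(cross_power) + 1e-12
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts.
    shift = [p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape)]
    return tuple(int(x) for x in shift)

# Synthetic example: the second round is the first round displaced by (3, -5) pixels.
rng = np.random.default_rng(0)
round_1 = rng.random((64, 64))
round_2 = np.roll(round_1, shift=(3, -5), axis=(0, 1))

shift = estimate_translation(round_1, round_2)
aligned = np.roll(round_2, shift=shift, axis=(0, 1))
```

      Real images are not periodic, so in practice the estimate is usually computed on a windowed or cropped region and the shifted borders are discarded.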

      Handcrafted features extracted using what software? 

      The complete analysis was performed in python. All packages used are listed in table 4. Handcrafted features were extracted using the scikit-image package (regionprops and GLCM functions). This has been added to the manuscript at page 27.

      Software should be cited more often throughout the manuscript. 

      Lastly, the GitHub URL points to the DeVosLab organization, but should point to a specific repository. Therefore, I was unable to review the provided code. A well-documented and reproducible analysis pipeline should be included.

      A test dataset and source code are available on GitHub:  https://github.com/DeVosLab/Nucleocentric-Profiling

    2. eLife Assessment

      This study presents an important application of high-content image-based morphological profiling to quantitatively and systematically characterize induced pluripotent stem cell-derived mixed neural cultures cell type compositions. Compelling evidence through rigorous experimental and computational validations support new potential applications of this cheap and simple assay.

    3. Joint Public Review:

      Summary:

      The authors present a new application of the high-content image-based morphological profiling Cell Painting (CP) to single cell type classification in mixed heterogeneous induced pluripotent stem cell-derived mixed neural cultures. Machine learning models were trained to classify single cell types according to either "engineered" features derived from the image or from the raw CP multiplexed image. The authors systematically evaluated experimental (e.g., cell density, cell types, fluorescent channels) and computational (e.g., different models, different cell regions) parameters and convincingly demonstrated that the nucleus and its surroundings contain sufficient information for robust and accurate cell type classification. Models that were trained on mono-cultures (i.e., containing a single cell type) could generalize for cell type prediction in mixed co-cultures, and to describe intermediate states of the maturation process of iPSC-derived neural progenitors to differentiated neurons.

      Strengths:

      Automatically identifying single cell types in heterogeneous mixed cell populations holds great promise to characterize mixed cell populations and to discover new rules of spatial organization and cell-cell communication. Although the current manuscript focuses on the application of quality control of iPSC cultures, the same approach can be extended to a wealth of other applications including in depth study of the spatial context. The simple and high-content assay democratizes use and enables adoption by other labs.

      The manuscript is supported by comprehensive experimental and computational validations that raise the bar beyond the current state of the art in the field of high-content phenotyping and make this manuscript especially compelling. These include (i) Explicitly assessing replication biases (batch effects); (ii) Direct comparison of feature-based (a la cell profiling) versus deep-learning-based classification (which is not trivial/obvious for the application of cell profiling); (iii) Systematic assessment of the contribution of each fluorescent channel; (iv) Evaluation of cell-density dependency; (v) explicit examination of mistakes in classification; (vi) Evaluating the performance of different spatial contexts around the cell/nucleus; (vii) generalization of models trained on cultures containing a single cell type (mono-cultures) to mixed co-cultures; (viii) application to multiple classification tasks.

      Comments on latest version:

      I have consulted with Reviewer #3 and both of us were impressed by the revised manuscript, especially by the clear and convincing evidence regarding the nucleocentric model use of the nuclear periphery and its benefit for the case of dense cultures. However, there are two issues that are incompletely addressed (see below). Until these are resolved, the "strength of evidence" was elevated to "compelling".

      First, the analysis of the patch size is not clearly indicating that the 12-18um range is a critical factor (Fig. 4E). On the contrary, the performance seems to be not very sensitive to the patch size, which is actually a desired property for a method. Still, Fig. 4B convincingly shows that the nucleocentric model is not sensitive to the culture density, while the other models are. Thus, the authors can adjust their text saying that the nucleocentric approach is not sensitive to the patch size and that the patch size is selected to capture the nucleus and some margins around it, making it less prone to segmentation errors in dense cultures.

      Second, the GitHub does not contain sufficient information to reproduce the analysis. Its current state is sparse with documentation that would make reproducing the work difficult. What versions of the software were used? Where should data be downloaded? The README contains references to many different argparse CLI arguments, but sparse details on what these arguments actually are, and which parameters the authors used to perform their analyses. Links to images are broken. Ideally, all of these details would be present, and the authors would include a step-by-step tutorial on how to reproduce their work. Fixing this will lead to an "exceptional" strength of evidence.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 1. In Figure 1, the MafB antibody (Sigma) was used to identify Renshaw cells at P5. However, according to the supplementary Figure 3D, the specificity of the MafB antibody (Sigma) is relatively low. The image of MafB-GFP, V1-INs, and MafB-IR at P5 should be added to the supplementary figure. The specificity of MafB-IR-Sigma in V1 neurons at P5 should be shown. This image also might support the description of the genetically labeled MafB-V1 distribution at P5 (page 8, lines 28-32). 

      We followed the reviewer’s suggestion and moved analyses of the MafB-GFP mouse to a supplemental figure (Fig S3). The characterization of MafB immunoreactivities is now in supplemental Figure S2 and the related text in results was also moved to supplemental to reduce technicalities in the main text. We added confocal images of MafB-GFP V1 interneurons at P5 showing immunoreactivities for both MafB antibodies, as suggested by the reviewer (Fig S2A,B). We agree with the reviewer that this strengthens our comparisons on the sensitivity and specificity of the two MafB antibodies used in this study. 

      As explained in the preliminary response we cannot show lack of immunoreactivity for MafB antibodies in MafB GFP/GFP knockout mice at P5 because MafB global KOs die at birth. This is why we used tissues from late embryos to check MafB immunoreactivities (Figure S2C and S2D). We made this point clearer in the text and supplemental figure legends.

      Comment 2. The proportion of genetically labeled FoxP2-V1 in all V1 is more than 60%, although immunolabeled FoxP2-V1 is approximately 30% at P5. Genetically labeled Otp-V1 included other nonFoxP2 V1 clades (Fig. 8L-M). I wonder whether genetically labeled FoxP2-V1 might include the other three clades. The authors should show whether genetically labeled FoxP2-V1 expresses other clade markers, such as pou6f2, sp8, and calbindin, at P5. 

      We included the requested data in Figure 3E-G. Lineage-labeled Foxp2-V1 neurons in our genetic intersection do not include cells from other V1-clades.

      Reviewer 2:

      Comment 1. The current version of the paper is VERY hard to read. It is often extremely difficult to "see the forest for the trees" and the reader is often drowned in methodological details that provide only minor additions to the scientific message. Non-specialists in developmental biology, but still interested in the spinal cord organization, especially students, might find this article challenging to digest and there is a high risk that they will be inclined to abandon reading it. The diversity of developmental stages studied (with possible mistakes between text and figures) adds a substantial complexity in the reading. It is also not clear at all why authors choose to focus on the Foxp2 V1 from page 9. Naively, the Pou6f2 might have been equally interesting. Finally, numerous discrepancies in the referencing of figures must also be fixed. I strongly recommend an in-depth streamlining and proofreading, and possibly moving some material to supplement (e.g. page 8, and elsewhere).

      The whole text was re-written and streamlined with most methodological discussion (including the section referred to by the reviewer) transferred to supplemental data. Nevertheless, enough details on samples, stats and methods were retained to maintain the rigor of the manuscript. 

      The reasons justifying a focus on Foxp2-V1 interneurons were fully explained in our preliminary response. Briefly, we are trying to elucidate V1 heterogeneity, and prior data showed that this is the most heterogeneous V1 clade (Bikoff et al., 2016), so it makes sense it was studied further. We agree that the Pou6f2 clade is equally interesting and is in fact the subject of several ongoing studies.

      Comment 2. … although the different V1 populations have been investigated in detail regarding their development and positioning, their functional ambition is not directly investigated through gain or loss of function experiments. For the Foxp2-V1, the developmental and anatomical mapping is complemented by a connectivity mapping (Fig 6s, 8), but the latter is fairly superficial compared to the former. Synapses (Fig 6) are counted on a relatively small number of motoneurons per animal, that may, or may not, be representative of the population. Likewise, putative synaptic inputs are only counted on neuronal somata. Motoneurons that lack axo-somatic contacts may still be contacted distally. Hence, while this data is still suggestive of differences between V1 pools, it is only little predictive of function.

      We fully answered the question on functional studies in the preliminary response. Briefly, we are currently conducting these studies using various mouse models that include chronic synaptic silencing using tetanus toxin, acute partial silencing using DREADDs, and acute cell deletion using diphtheria toxin. Each intervention reveals different features of Foxp2-V1 interneuron functions, and each model requires independent validation. Moreover, these studies are being carried out at three developmental stages: embryos, early postnatal period of locomotor maturation and mature animals. Obviously, this is all beyond the goals and scope of the present study. The present study is however the basis for better informed interpretations of results obtained in functional studies.

      Regarding the question on synapse counts, we fully explained in the preliminary response why we believe our experimental designs for synapse counting at the confocal level are among the most thorough that can be found in the literature. We counted a very large number of motoneurons per animal when adding all motor columns and segments analyzed in each animal. Statistical power was also enough to detect fundamental variation in synaptic density among motor columns.

      We focus our analyses on motoneuron cell bodies because analysis of full dendritic arbors on all motor columns present throughout all lumbosacral segments is not feasible. Please see Rotterman et al., 2014 (J. of Neuroscience; doi: 10.1523/JNEUROSCI.4768-13.2014) for evaluation of what this entails for a single motoneuron. We agree with the reviewer that analyses of V1 synapses over full dendrite arbors in specific motoneurons will be very relevant in further studies. These should be carried out now that we know which motor columns are of high interest. Nevertheless, inhibitory synapses exert the most efficient modulation of neuronal firing when they are on cell bodies, and our analyses clearly suggest a difference in cell body inhibitory synapse targeting between different V1 interneuron types that we find very relevant.

      Comment 3. I suggest taking with caution the rabies labelling (Figure 8). It is known that this type of Rabies vectors, when delivered from the periphery, might also label sensory afferents and their postsynaptic targets in the cord through anterograde transport and transneuronal spread (e.g., Pimpinella et al., 2022). Yet I am not sure authors have made all controls to exclude that labelled neurons, presumed here to be premotoneurons, could rather be anterogradely labelled from sensory afferents. 

      Over the years, we performed many extensive controls and validation of rabies virus transsynaptic tracing methods. These were presented at two SfN meetings (Gomez-Perez et al., 2015 and 2016; Program Nos. 242.08 and 366.06). Our validation of this technique was fully explained in our preliminary response. We also pointed out that the methods used by Pimpinella et al. have a very different design and therefore their results are not comparable to ours. In this study we injected the virus at P15 into leg muscles, and not directly into the spinal cord. In our hands, and as cited in Pimpinella et al., the rabies virus loses tropism for primary afferents with age when injected in muscle. The lack of primary afferent labeling in key lumbosacral segments (L4 and L5) is now illustrated in a new supplemental figure (Figure S6). This figure also shows some starter motoneurons. As explained in the text and in our previous response, these are few in number because of the reduced infection rate when using this method in mature animals (after P10).  

      Comment 4. The ambition to differentiate neuronal birthdate at a half-day resolution (e.g., E10 vs E10.5) is interesting but must be considered with caution. As the author explains in their methods, animals are caged at 7pm, and the plug is checked the next morning at 7 am. There is hence a potential error of 12h. 

      We agree with the reviewer, and we previously explicitly discussed these temporal resolution caveats. We have now further expanded on this in new text (see middle paragraph in page 5). Nevertheless, the method did reveal the temporal sequence of neurogenesis of V1 clades with close to 12-hour resolution.

      As explained in text and preliminary response this is because we analyzed a sufficient number of animals from enough litters and utilized very stringent criteria to count EdU positives. 

      Moreover, our results fit very well with current literature. The data agree with previous conclusions from Andreas Sagner group (Institut für Biochemie, Friedrich-Alexander-Universität Erlangen-Nürnberg), on spinal interneurons (including V1s) birthdates based on a different methodology (Delile J et al. Development. 2019 146(12):dev173807. doi: 10.1242/dev.173807. PMID: 30846445; PMCID: PMC6602353). In the discussion we compared in detail both the data and methods between the Delile article and our results. We also cite Sagner 2024 review as requested later in the reviewer’s detailed comments. Our results also confirmed our previous report on the birthdates of V1-derived Renshaw cells and Ia inhibitory interneurons (Benito-Gonzalez A, Alvarez FJ J Neurosci. 2012 32(4):1156-70. doi: 10.1523/JNEUROSCI.3630-12.2012. PMID: 22279202; PMCID: PMC3276112). Finally, we recently received a communication notifying us that our neurogenesis sequence of V1s has been replicated in a different vertebrate species by Lora Sweeney’s group (Institute of Science and Technology Austria; direct email from this lab) and we shared our data with them for comparison. This manuscript is currently close to submission. Therefore, we are confident that despite the limitations of EdU birthdating we discussed, the conclusions we offered are strong and are being validated by other groups using different methods and species. We also want to acknowledge the positive comments of reviewer 3 regarding our birthdating study, indicating it is one of the most rigorous he or she has ever seen.

      Reviewer 3:

      Comment 1. My only criticism is that some of the main messages of the paper are buried in technical details. Better separation of the main conclusions of the paper, which should be kept in the main figures and text, and technical details/experimental nuances, which are essential but should be moved to the supplement, is critical. This will also correct the other issue with the text at present, which is that it is too long.

      Similar to our response to comment 1 from Reviewer 2 we followed the reviewers’ recommendations and greatly summarized, simplified and removed technical details from the main text, trying not to decrease rigor.  

      Reviewer #1 (Recommendations For The Authors):

      In Figure 1, the definition of the area to analyze MafB ventral and MafB dorsal is unclear. It should be described.

      This has been clarified in both text and supplemental figure S3.

      “We focused the analyses on the brighter dorsal and ventral MafB-V1 populations defined by boxes of 100 µm dorsoventral width at the level of the central canal (dorsal) or the ventral edge of the gray matter (ventral) (Supplemental Figure S3B).”

      Problems with figure citation.

      We apologize for the mistakes. All have been corrected. 

      Reviewer #2 (Recommendations For The Authors):

      As indicated in the public review, I'd recommend to substantially revise the writing, for clarity. As such, the paper is extremely hard to read. I would also recommend justifying the focus on Foxp2 neurons.

      Also, the scope of the present paper is not clearly stated in the introduction (page 4).

      Done. We also modified the introduction such that the exact goals are more clearly stated.

      I would also recommend toning down the interpretation that V1 clades constitute "unique functional subsets" (discussion and elsewhere). Functional investigation is not performed, and connectomic data is partial and only very suggestive.

      We include the following sentence at the end of the 1st paragraph in the discussion:

      “This result strengthens the conclusion that these V1 clades defined by their genetic make-up might represent distinct functional subtypes, although further validation is necessary in more functionally focused studies.”

      Different post-natal stages are used for different sections of the manuscript. This is often confusing, please justify each stage. From the beginning even, why is the initial birthdating (Figure 1) done here at p5, while the previous characterization of clades was done at p0? I am not sure to understand the justification that this was chosen "to preserve expression of V1 defining TFs". Isn't the sooner the better?

      The birthdating study was carried out at P5. P5 is a good time point because there is little variation in TF expression compared to P0, as demonstrated in the results. Furthermore, later tissue harvesting allows higher replicability since it is difficult to consistently harvest tissue the day a litter is born (P0). Also technically, it is easier to handle P5 tissue compared to P0. The analysis of VGLUT1 synapses was also done at P5 rather than later ages. This has two advantages: TFs immunoreactivities are preserved at this age, and also corticospinal projections have not yet reached the lumbar cord reducing interpretation caveats on the origins of VGLUT1 synapses in the ventral horn (although VGLUT1 synapses are still maturing at this age, see below).

      Other parts of the study focus on different ages selected to be most adequate for each purpose. To best study synaptic connectivity, it is best to study mature spinal cords after synaptic plasticity of the first week. For the tracing study we thoroughly explain in the text the reasons for the experimental design (see also below in detailed comments). For counting Foxp2-V1 interneurons and comparing them to motor columns we analyze mature animals. For testing our lineage labeling we use animals of all ages to confirm the consistency of the genetic targeting strategy throughout postnatal development and into adulthood.

      Figure 5: wouldn't it be worth quantifying and illustrating cellular densities, in addition to the average number of Foxp2 neurons, across lumbar segments (panel D & E)? Indeed, the size of - and hence total number of cells within - each lumbar segment might not be the same, with a significant "enlargement" from L2 to L4 (this is actually visible on the transverse sections). Hence, if the total number of cells is higher in these enlarged segments, but the total number of Foxp2-V1 is not, it may mean that this class is proportionally less abundant.

      We believe the critical parameter is the ratio of Foxp2-V1s to motoneurons. This informs how Foxp2-V1 interneurons vary according to the size of the motor columns and the number of motoneurons overall.

      The question asked by the reviewer would best be answered by estimating the proportion of Foxp2-V1 neurons to all NeuN labeled interneurons. This is because interneuron density in the spinal cord varies in different segments. We are not sure what this additional analysis will contribute to the paper.

      Why, in the Rabies tracing scheme (Fig 8), the Rabies injection is performed at p15? As the authors explain in the text, rabies uptake at the neuromuscular junction is weak after p10. It is not clear to me why such experiments weren't done all at early postnatal stages, with a "classical" co-injection of TVA and Rabies.

      First, we do not need TVA in this experiment because we are using B19-G coated virus and injecting it into muscles, not into the spinal cord directly.

      Second, enhanced tracing occurs when the AAV is injected a few days before rabies virus. This is because AAV transgene expression is delayed with respect to rabies virus infection and replication. We have performed full time courses and presented these data in one abstract to SfN: Gomez-Perez et al., 2015 Program Nos. 242. We believe full description of these technical details is beyond the scope of this manuscript that has already been considered too technical.

      Third, the justification of P15 timing of injections for anterograde primary afferent labeling and retrograde monosynaptic labeling of interneurons is fully explained in the text. 

      “To obtain transcomplementation of RVDG-mCherry with glycoprotein in LG motoneurons, we first injected the LG muscle with an AAV1 expressing B19-G at P4. We then performed RVDG and CTB injections at P15 to optimize muscle targeting and avoid cross-contamination of nearby muscles. Muscle specificity was confirmed post-hoc by dissection of all muscles below the knee. Analyses were done at P22, a timepoint after developmental critical windows through which Ia (VGLUT1+) synaptic numbers increase and mature on V1-IaINs (Siembab et al., 2010)” 

      Furthermore, CTB starts to decrease in intensity 7 days after injection because of intracellular degradation, and rabies virus labeling disappears because of cell death. Both limit the post-injection time available for analyses.

      Likewise, I am surprised not to see a single motoneuron in the rabies tracing (Fig 8), neither on histology nor on graphs. How can authors be certain that there was indeed rabies uptake from the muscle at this age, and that all labelled cells, presumed to be preMN, are not actually sensory neurons? It is known that Rabies vectors, when delivered from the periphery, might also label sensory afferents and their post-synaptic targets through anterograde transport and transneuronal spread (e.g., Pimpinella et al., 2022). This potential bias must be considered.

      This is fully explained in our previous response to the second reviewer’s general comments. We have also added a confocal image showing starter motoneurons as requested (Figure S6A).

      Please carefully inspect the references to figures and figure panels, which I suspect are not always correct.

      Thank you. We carefully revised the manuscript to correct these deficiencies and we apologize for them.

      Reviewer #3 (Recommendations For The Authors):

      Figure 1: Data here is absolutely beautiful and provides one of the most thorough studies, in terms of timepoints, number of animals analyzed, and precision of analysis, of EdU-based birth timing that has been published for neuron subtypes in the spinal cord so far. My only suggestion is to color code the early and late born populations (in for example, different shades of green for early; and blue for late, to better emphasize the differences between them). It is very difficult to differentiate between the purple, red and black colors in G-I, which this would also fix. The antibody staining for Pou6f2 (F) is also difficult to see; gain could be increased on these images or insets added for clarity.

      The choice of colors is adapted for optimal visualization by people with different degrees of color blindness. Shades of individual colors are always more difficult to discriminate. This is personally verified by the senior corresponding author of this paper who has some color discrimination deficits. Moreover, each line has a different symbol for the same purpose of easing differentiation.

      Figure 2: This is also a picture-perfect figure showing further diversity by birth time even within a clade. One small aesthetic comment is that the arrows are quite unclear and block the data. Perhaps the contours themselves could be subdivided by region and color coded by birth time-such that for example the dorsal contours that emerge in the MafB clade at E11 are highlighted in their own color. Some quantification of the shift in distribution as well as the relative number of neurons within each spatially localized group would also be useful. For MafB, for example, it looks as though the ventral cells (likely Renshaw) are generated at all times in the contour plots; in the dot plots however, it looks like the most ventral cells are present at e10.5. This is likely because the contours are measuring fractional representations, not absolute number. An independent measure of absolute number of ventral and dorsal, by for example, subdividing the spinal cord into dorsoventral bins, would be very useful to address this ambiguity.

      We believe density plots already convey the message of the shift in positions with birthdate. We are not sure how we can quantify this more accurately than showing the differences in cellular density plots. We used dorsoventral and mediolateral binning in our first paper decades ago (Alvarez et al., 2005). This has now been replaced by more rigorous density profiles that better describe cell distributions. Unfortunately, to obtain the most accurate density profiles we need to pool all cells from all animals, precluding statistical comparisons. This is because for some groups there are very few cells per animal (for example early born Sp8 or Foxp2 cells).

      Figure 3 and Figure 4: These, and all figures that compare the lineage trace and antibody staining, should be moved to the supplement in my opinion-as they are not for generalist readers but rather specialists that are interested in these exact tools. In addition, the majority of the text that relates to these figures should be transferred to the supplement as well. Figure 5: Another great figure that sets the stage for the analysis of FoxP2V1-to-MN synaptic connectivity, and provides basic information about the rostrocaudal distribution of this clade, by analyzing settling position by level. I have only minor comments. The grid in B obscures the view of the cells and should be removed. The motor neuron cell bodies in C would be better visible if they were red.

      We moved some of the images to supplemental (see new supplemental Fig S4). However, we also added new data to the figure as requested by reviewers (Fig 3E-G). We preserved our analyses of Foxp2 and non-Foxp2 V1s across ages and spinal segments because we think this information is critical to the paper. Finally, we want to prevent misleading readers into believing that Foxp2 is a marker that is unique to V1s. Therefore, we also preserved Figures 3H to 3J showing the non-V1 Foxp2 population in the ventral horn. 

      Figure 6: Very careful and quantitative analysis of V1 synaptic input to motor neurons is presented here. For the reader, a summary figure (similar to B but with V1s too) that schematizes V1 FoxP2 versus Renshaw cell connectivity with LMC, MMC, and PGC motor neurons at one level would be useful.

      Thanks for the suggestion. A summary figure has now been included (Figure 5G). 

      Figure 7: The goal of this figure is to highlight intra-clade diversity at the level of transcription factor expression (or maintenance of expression), birth timing and cell body position culminating in the clear and concise diagram presented in G. In panels A-F however, it takes extra effort to link the data shown to these I-IV subtypes. The figure should be restructured to better highlight these links. One option might be to separate the figure into four parts (one for each type): with the individual spatial, birth timing and TF data for each population extracted and presented in each individual part.

      We agree with the reviewer that this is a very busy figure. We tried to re-structure the figure following the suggestions of the reviewer and also several alternative options. All resulted in designs that were more difficult to follow than the original figure. We apologize for its complexity, but we believe this is the best organization to describe all the data in the simplest form.

      Figure 8: in A-D, the main point of the figure - that V1FoxP2Otp preferentially receive proprioceptive synapses is buried in a bunch of technical details. To make it easier for the reader, please:

      (1) add a summary as in B of the %FoxP2-V1 Otp+ cells (82%) with Vglut1 synapses to make the point stronger that the majority of these cells have synapses.

      We added this graph by extending the previous graph to include lineage labeled Foxp2-V1s with OTP or Foxp2 immunoreactivity. It is now Figure 7B.

      (2) Additionally, add a representative example that shows large numbers of proximal synapses on a FoxP2-V1 Otp+ cell.

      The image we presented before as Figure 8A was already immunostained for OTP, so we just added the OTP channel to the images. Now all this information is in panels that are subparts of Figure 7A.

      (3) Move the comparison between FoxP2-V1 and FoxP2AB+V1s to the supplement.

      We preserved the quantitative data on Foxp2-V1 lineage cells with Foxp2-immunoreactivity but made this a standalone figure, so it is not as busy.

      (4) Move J-M description of antibody versus lineage trace of Otp to supplement as ending with this confuses the main message of the paper (see comment above).

      All results for the Otp-V1 mouse model have now been placed in a supplemental figure (Figure 5S).

      Discussion: A more nuanced and detailed discussion of how the temporal pattern of subtype generation presented here aligns with the established temporal transcription factor code (nicely summarized in Sagner 2024) would be helpful to place their work in the broader context of the field.

      This aspect of the discussion was expanded on pages 20 and 21. We replaced the earlier cited review (Sagner and Briscoe, 2019, Development) with the updated Sagner 2024 review and further discussed the data in the context of the field and neurogenesis waves throughout the neural tube, not only the spinal cord. We previously carefully compared our data with the spinal cord data from Sagner’s group (Delile et al., 2019, Development). We have now further expanded this comparison in the discussion.

    2. eLife Assessment

      This study provides a valuable description of subtypes of V1 neurons, including birthdates and connections to motor neurons. V1 neurons are one of the main groups of inhibitory neurons in the spinal cord. The methods of data collection and analysis are convincing. This work will interest developmental biologists and neuroscientists working on spinal circuits.

    3. Reviewer #1 (Public review):

      To understand spinal locomotor circuits, we need to reveal how various types of spinal interneurons work in them. So far, the general roles of the cardinal groups of spinal interneurons (dI6, V0, V1, V2a, V2b, and V3) in locomotion have been studied but not fully understood. Each group is believed to contain some subgroups with more detailed functional differences. However, each character and function of these subgroups has yet to be elucidated.

      In this study, Worthy et al. investigated V1 neurons, one of the main groups of inhibitory neurons in the spinal cord. Previous reports proposed four major clades in V1 neurons defined by the expression of transcription factors (MafA/MafB, Foxp2, Sp8, and Pou6f2). The authors investigated the birth time for V1 neurons in each of the four clades and showed the postnatal location in the spinal cord with different birthdates. Next, the authors investigated the Foxp2-V1 population in detail using genetically labeled Foxp2-V1 mice. They found some Foxp2-V1 neurons located near LMC motor neurons that innervate limbs. They showed that most of the synapses of V1 neurons on the cell bodies of LMC motor neurons were from Foxp2-V1 and Renshaw cells, and the proportion of Foxp2-V1 synapses in V1 synapses on motor neurons was relatively high in LMC compared to other motor columns. They also proposed that Foxp2-V1 can be further classified according to the expression of transcription factors Otp and Foxp4. The results of this paper are well supported by the data obtained using widely used methods.

      This study will be helpful for future analyses of the development and function of V1 neurons. In particular, the discovery of strong synaptic connections between Foxp2-V1 and LMC motor neurons will be beneficial in analyzing the role of V1 neurons in motor circuits that generate movement of the limbs.

    4. Reviewer #2 (Public review):

      Summary:

      This work brings important information regarding the composition of interneurons in the mammalian spinal cord, with a developmental perspective. Indeed, for the past decades, tools inspired from developmental biology have opened up promising avenues for challenging the functional heterogeneity in the spinal cord. They rely on the fact that neurons sharing similar mature properties also share a largely similar history of expression of specific transcription factor (TF) genes during embryonic and postnatal development. For instance, neurons originating from p1 progenitors and expressing the TF Engrailed-1 form the V1 neuronal class. While such "cardinal" neuronal classes defined by one single TF indeed share numerous features - e.g., for the case of V1 neurons, a ventral positioning, an inhibitory nature and ipsilateral projections - there is accumulating evidence for a finer-grained diversity and specialization in each class which is still largely obscure. The present work studies the heterogeneity of V1 interneurons and describes multiple classes based on their birthdate, final positioning, and expression of additional TFs. It brings in particular a solid characterization of the Foxp2-expressing V1 interneurons, for which the authors also delve into the connectivity and, hence, possible functional implications. The work will be of interest to developmental biologists and those interested in the organization of the locomotor spinal network.

      Strengths:

      This study has deeply analyzed the diversity of V1 neurons by intersecting multiple criteria: TF expression, birthdate, location in the spinal cord, diversity along the rostro-caudal axis, and for some subsets, connectivity. This illustrates and exemplifies the absolute need to not consider cardinal classes, defined by one single TF, as homogeneous. Rather, it highlights the limits of single-TF classification and exemplifies the existence of further diversity within the cardinal class.

      Experiments are generally well performed with a satisfactory number of animals and adequate statistical tests.

      The authors have also paid close attention to potential differences in cell-type classification when distinguishing neurons currently expressing a given TF (e.g., using antibodies) from those defined as having once expressed that TF (e.g., defined by a lineage-tracing strategy). This ambiguity is a frequent source of discrepancy of findings across studies.

      Furthermore, there is a risk in developmental studies to overlook the fact that the spinal cord is functionally specialized rostro-caudally, and to generalize features that may only be applicable to a specific segment and hence to a specific motor pool. While motoneurons share the same dorso-ventral origin and appear homogenous on a ChAT staining, specific clusters are dedicated to specific muscle groups, e.g., axial, hypaxial or limb muscles. Here, the authors make the important distinction between different lumbar levels and detail the location and connectivity of their neurons of interest with respect to specific clusters of MN.

      Finally, the authors are fully transparent on inter-animal variability in their representation and quantification. This is crucial to avoid the overgeneralization of findings but to rather provide a nuanced understanding of the complexities of spinal circuits.

      Weaknesses:

      The different V1 populations have been investigated in detail regarding their development and positioning, but their functional role is not directly investigated through gain or loss of function experiments in the present study. While the putative inputs onto motoneurons are interesting and suggestive of differences between V1 pools, they are only a little predictive of function.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      In this manuscript by Napoli et al., the authors study the intracellular function of cytosolic S100A8/A9, a myeloid cell soluble protein that operates extracellularly as an alarmin and whose intracellular function is not well characterized. Here, the authors utilize state-of-the-art intravital microscopy to demonstrate that adhesion defects observed in cells lacking S100A8/A9 (Mrp14-/-) are not rescued by exogenous S100A8/A9, thus highlighting an intrinsic defect. Based on this result, subsequent efforts were employed to characterize the nature of those adhesion defects.

      The authors thank reviewer #1 for his/her insightful comments and suggestions. Please find our point-by-point responses below.

      (1) Ex vivo characterization of the function of S100A8/A9 in adhesion, spreading, and calcium signaling requires at least one rescue experiment to support the direct role of these proteins in the biological processes under study.

      We thank the reviewer for this comment. We agree that rescue experiments would be helpful to confirm the direct role of intracellular S100A8/A9 in adhesion, spreading, and Ca2+ signaling. Although transfection of primary cells, especially neutrophils, poses challenges due to their short half-life, we now have undertaken additional in vitro rescue experiments. Specifically, we used extracellular S100A8/A9 and coated Ibidi flow chambers with E-selectin, ICAM-1 and CXCL1 alone or alongside S100A8/A9, and measured rolling and adhesion of blood neutrophils. Our data reveal that extracellular S100A8/A9 can induce increased adhesion in WT neutrophils but fails to rescue the adhesion defect in Mrp14-/- neutrophils (Author response image 1). This result corroborates our in vivo findings, emphasizing that the observed adhesion defect is due to the lack of intracellular S100A8/A9.

      Author response image 1.

      Extracellular S100A8/A9 does not rescue the adhesion defect in Mrp14-/- neutrophils. Analysis of the number of adherent leukocytes FOV⁻¹ normalized to the WBC of WT and Mrp14-/- mice. Whole blood was harvested through a carotid artery catheter and perfused with a high precision pump at constant shear rate using flow chambers coated with either E-selectin, ICAM-1 and CXCL1 or E-selectin, ICAM-1, CXCL1 and S100A8/A9. [mean+SEM, n=5 mice per group, 12 (WT) and 14 (Mrp14-/-) flow chambers, 2way ANOVA, Sidak’s multiple comparison]. ns, not significant; *p≤0.05, **p≤0.01, ***p≤0.001.

      (2) There is room for improvement in the analysis of signaling pathways presented in Figures 3 H and I. Western blots and analyses are not convincing, in particular for p-Pax.

      We acknowledge the reviewer's concern regarding the clarity of the signaling pathway analysis, particularly the western blots for p-Paxillin. To address this, we have repeated the western blot experiments using murine neutrophils. Our new data confirm the defective paxillin phosphorylation upon CXCL1 stimulation and ICAM-1 binding in the absence of cytosolic S100A8/A9. We have now integrated these new findings with the original data and included the updated results in the manuscript (Figure 3I revised). These enhanced analyses provide a more robust and convincing demonstration of the signaling defects in Mrp14-/- neutrophils.

      (3) At least one western blot showing a knockdown of S100A8/A9 should be included towards the beginning of the result section.

      We appreciate the reviewer's suggestion to include a western blot demonstrating the knockout of S100A8/A9 early in the results section. In a recent publication by our group, we have already demonstrated the absence of S100A8/A9 at the protein level in Mrp14-/- neutrophils via western blotting ([1], please refer to Extended Data Fig. 1h). We agree that visual confirmation of the absence of S100A8/A9 protein is crucial for establishing the validity of our study.

      (4) The Ca2+ measurements at LFA-1 nanoclusters using the Mrp14-/- Lyz2xGCaMP5 are interesting. It is understood that the authors are correcting calcium levels by normalizing by LFA-1 cluster areas and that seems fine to me. The issue is that the total calcium signal seems decreased in Mrp14-/- cells compared to WT cells (Fig. 4E)...why is total Ca2+ low? Please discuss.

      We thank the reviewer for this insightful comment. Indeed, our observations reveal reduced overall Ca2+ levels in Mrp14-/- neutrophils compared to WT neutrophils. Initially, we noticed a general decrease in Ca2+ intensity (Author response image 2A-B) and lifetime in Mrp14-/- neutrophils (Author response image 2C-D). Further analysis indicated that these differences in Ca2+ levels are localized specifically to the LFA-1 nanocluster sites. In contrast, the cytosolic Ca2+ levels outside of the LFA-1 nanocluster areas were comparable between Mrp14-/- and WT neutrophils (Figure 4H-J). This suggests that the reduced total Ca2+ levels observed in Mrp14-/- neutrophils are primarily due to the impaired Ca2+ supply at the LFA-1 nanocluster areas. Our data support the notion that cytosolic S100A8/A9 plays a crucial role in actively supplying Ca2+ to LFA-1 nanoclusters during neutrophil crawling. In the absence of S100A8/A9, the increase in overall Ca2+ levels (summing both inside and outside LFA-1 nanocluster areas) is minimal, further highlighting the specific role of S100A8/A9 in maintaining localized Ca2+ concentrations at these crucial sites.

      Author response image 2.

      Overall Ca2+ levels in WT and Mrp14-/- neutrophils. (A) Representative confocal images of neutrophils from WT Lyz2xGCaMP5 and Mrp14-/- Lyz2xGCaMP5 mice, labeled with Lyz2 tdTomato marker. The images illustrate overall cytosolic Ca2+ levels during neutrophil crawling in flow chambers coated with E-selectin, ICAM-1, and CXCL1 (scale bar=10μm). (B) Quantitative analysis of total cytosolic Ca2+ intensity in single cells from WT Lyz2xGCaMP5 and Mrp14-/- Lyz2xGCaMP5 neutrophils measured over three time intervals: min 0-1, 5-6 and 9-10 [mean+SEM, n=5 mice per group, 56 (WT) and 54 (Mrp14-/-) neutrophils, 2way ANOVA, Sidak’s multiple comparison]. (C) Representative traces and (D) single cell analysis of total Ca2+ lifetime over the first 5 minutes in WT Lyz2xGCaMP5 and Mrp14-/- Lyz2xGCaMP5 neutrophils crawling on E-selectin, ICAM-1, and CXCL1 coated flow chambers recorded with FLIM microscopy [mean+SEM, n=3 mice per group, 111 (WT) and 95 (Mrp14-/-) neutrophils, 2way ANOVA, Sidak’s multiple comparison]. ns, not significant; *p≤0.05, **p≤0.01, ***p≤0.001.

      (5) Even if the calcium level outside LFA-1 nanoclusters is not significant (Figure 4J), the data at min 9-10 in Figure 4J seems to be affected by a single event that may be an outlier. Additional data may be needed here.

      We appreciate the reviewer’s attention to this detail. To address the concern regarding a potential outlier in the Ca2+ level measurements at 9-10 minutes in Figure 4J, we rigorously tested the dataset using the GraphPad outlier calculator. The analysis revealed that no data point was statistically identified as an outlier. Given that the current dataset is robust and the statistical analysis confirms the integrity of the data, we believe that the results accurately reflect the biological variability observed in our experiments. Therefore, we have not added additional data points at this stage but remain open to discussing this further.

      (6) Finally, even though there is less calcium at LFA-1 clusters, that does not necessarily mean that "cytosolic S100A8/A9 plays an important role in Ca2+ "supply" at LFA-1 adhesion spots" as proposed. S100A8/A9 may play an indirect role in calcium availability. The analysis of the subcellular localization of S100A8/A9 at LFA-1 clusters together with calcium dynamics in stimulated WT cells would help support the authors' interpretation, which although possibly correct, seems speculative at this point.

      We thank the reviewer for this insightful comment and fully agree that additional evidence regarding the subcellular localization of S100A8/A9 would strengthen our conclusions. Although live cell imaging of intracellular S100A8/A9 was initially challenging due to technical limitations, we have now performed additional experiments to address this issue. We conducted end-point measurements where we allowed WT neutrophils to crawl on E-selectin, ICAM-1, and CXCL1 coated flow chambers for 10 minutes. Following this, we fixed and permeabilized the cells to stain intracellular S100A9, along with LFA-1 and a cell tracker for segmentation. Confocal microscopy and subsequent single-cell analysis revealed a significant enrichment of S100A8/A9 at LFA-1 positive nanocluster areas compared to the surrounding cytosol (Figure 4K and 4L, new). This finding supports our hypothesis that S100A8/A9 plays a direct role in the localized supply of Ca2+ at LFA-1 adhesion spots, thus facilitating efficient neutrophil crawling under shear stress. These new data have been included in the revised manuscript, providing stronger evidence for our proposed mechanism.

      Reviewer #2:

      Napoli et al. provide a compelling study showing the importance of cytosolic S100A8/9 in maintaining calcium levels at LFA-1 nanoclusters at the cell membrane, thus allowing the successful crawling and adherence of neutrophils under shear stress. The authors show that cytosolic S100A8/9 is responsible for retaining stable and high concentrations of calcium specifically at LFA-1 nanoclusters upon binding to ICAM-1, and imply that this process aids in facilitating actin polymerisation involved in cell shape and adherence. The authors show early on that S100A8/9 deficient neutrophils fail to extravasate successfully into the tissue, thus suggesting that targeting cytosolic S100A8/9 could be useful in settings of autoimmunity/acute inflammation where neutrophil-induced collateral damage is unwanted.

      The authors appreciate reviewer #2's insightful comments and suggestions. Below are our detailed responses:

      (1) Extravasation is shown to be a major defect of Mrp14-/- neutrophils, but the Giemsa staining in Figure 1H seems to be quite unspecific to me, as neutrophils were determined by nuclear shape and granularity. It would have perhaps been more clear to use immunofluorescence staining for neutrophils instead as seen in Supplementary Figure 1A (staining for Ly6G or other markers instead of S100A9).

      We acknowledge the reviewer's concern. However, Giemsa staining is a well-established method in hematology, histology, cytology, and bacteriology, widely recognized for its ability to distinguish leukocyte subsets based on nuclear shape and cytoplasmic characteristics. This method is extensively documented in the literature [2-5]. Its advantages are the easy morphological discrimination of leukocytes based on nuclear and cytoplasmic shape and conformation (Author response image 3).

      Author response image 3.

      Giemsa staining of extravasated leukocyte subsets. (A) Representative image of Giemsa-stained cremaster muscle tissue post-TNF stimulation. The image clearly differentiates leukocyte subsets (white arrow = neutrophils, yellow arrow = eosinophils, red arrow = monocytes). Scale bar = 50µm.

      (2) The representative image for Mrp14-/- neutrophils used in Figure 4K to demonstrate Ripley's K function seems to be very different from that shown above in Figures 4C and 4F.

      The reviewer correctly observed that the cell in Figure 4K is different from those in Figures 4C and 4F. This is intentional, as Figure 4K is meant to show a representative image that accurately reflects the overall results of the experiments. We assure the reviewer that all cells analyzed in Figures 4C and 4F were also included in the analysis for Figure 4K.

      (3) Although the authors have done well to draw a path linking cytosolic S100A8/9 to actin polymerisation and subsequently the arrest and adherence of neutrophils in vitro, the authors can be more explicit with the analysis - for example, is the F-actin co-localized with the LFA-1 nanoclusters? Does S100A8/9 localise to the membrane with LFA-1 upon stimulation? Lastly, I think it would have been very useful to close the loop on the extravasation observation with some in vitro evidence to show that neutrophils fail to extravasate under shear stress.

      We thank the reviewer for this comment and questions. 

      Concerning the co-localization of F-actin with LFA-1 nanoclusters and S100A8/9 localization: We appreciate the reviewer's interest in the co-localization between F-actin and LFA-1. Unfortunately, due to the limitations of our GCaMP5 mouse model (with neutrophils labeled with td-Tomato and eGFP for LyzM and Ca2+), we could only stain for either LFA-1 or F-actin at a time. However, in our F-actin movies, we observed that F-actin predominantly localizes at the rear of the cell, while LFA-1 is more uniformly distributed at the plasma membrane.

      Regarding S100A8/A9 localization, as mentioned in response to Reviewer 1's sixth point, we have now conducted endpoint measurements. We stained neutrophils with cell tracker green CMFDA and LFA-1, allowed them to crawl on E-selectin, ICAM-1, and CXCL1-coated flow chambers, and then performed intracellular S100A9 staining after fixation and permeabilization. Our analysis shows higher S100A9 intensity at LFA-1 positive areas compared to LFA-1 negative areas (Figure 4K and 4L, new). This indicates that S100A8/A9 indeed concentrates Ca2+ at LFA-1 nanoclusters, supporting adhesion and post-arrest modification events under flow.

      Regarding the extravasation defect under shear stress: To address the reviewer's suggestion, we performed transwell migration assays under static conditions. Our results show no significant difference in transmigration between WT and Mrp14-/- neutrophils without flow, indicating that the extravasation defect in Mrp14-/- neutrophils is shear-dependent. This supports our hypothesis that S100A8/A9-mediated Ca2+ supply at LFA-1 nanoclusters is critical under flow conditions (Author response image 4).

      Author response image 4.

      Static transmigration assay. (A) Transmigration of WT and Mrp14-/- neutrophils in static transwell assays (3 µm pore size, 45 min migration time) showing spontaneous migration (PBS) or migration towards CXCL1. [mean+SEM, n=3 mice per group, 2way ANOVA, Sidak’s multiple comparison]. ns, not significant; *p≤0.05, **p≤0.01, ***p≤0.001.

      Additional References

      (1) Pruenster, M., et al., E-selectin-mediated rapid NLRP3 inflammasome activation regulates S100A8/S100A9 release from neutrophils via transient gasdermin D pore formation. Nature Immunology, 2023. 24(12): p. 2021-2031.

      (2) Kuwano, Y., et al., Rolling on E- or P-selectin induces the extended but not high-affinity conformation of LFA-1 in neutrophils. Blood, 2010. 116(4): p. 617-24.

      (3) Porse, B., Mouse Hematology – A Laboratory Manual. European Journal of Haematology, 2010. 84(6): p. 554-554.

      (4) Frommhold, D., et al., Protein C concentrate controls leukocyte recruitment during inflammation and improves survival during endotoxemia after efficient in vivo activation. Am J Pathol, 2011. 179(5): p. 2637-50.

      (5) Braach, N., et al., RAGE Controls Activation and Anti-Inflammatory Signalling of Protein C. PLOS ONE, 2014. 9(2): p. e89422.

    2. eLife Assessment

      This important study investigates the contribution of cytosolic S100A8/A9 to neutrophil migration to inflamed tissues. The authors provide convincing evidence for how the loss of cytosolic S100A8/A9 specifically affects the ability of neutrophils to crawl and subsequently adhere under shear stress. This study will be of interest in fields where inflammation is implicated, such as autoimmunity or sepsis.

    3. Reviewer #1 (Public review):

      Summary:

      In this manuscript by Napoli et al., the authors study the intracellular function of cytosolic S100A8/A9, a myeloid cell soluble protein that operates extracellularly as an alarmin and whose intracellular function is not well characterized. Here, the authors utilize state-of-the-art intravital microscopy to demonstrate that adhesion defects observed in cells lacking S100A8/A9 (Mrp14-/-) are not rescued by exogenous S100A8/A9, thus highlighting an intrinsic defect. Based on this result, subsequent efforts were employed to characterize the nature of those adhesion defects.

      Strengths:

      The authors convincingly show that Mrp14-/- neutrophils have normal rolling but defective adhesion caused by impaired CD11b activation (deficient ICAM1 binding). The analysis of cellular spreading (defective in Mrp14-/- cells) is also sound. The manuscript then focuses on selective signaling pathways and calcium measurements. Overall, this is a straightforward study of biologically important proteins and mechanisms.

      Weaknesses:

      Some suggestions are included below to improve this manuscript.

    4. Reviewer #2 (Public review):

      Summary:

      Napoli et al. provide a compelling study showing the importance of cytosolic S100A8/9 in maintaining calcium levels at LFA-1 nanoclusters at the cell membrane, thus allowing the successful crawling and adherence of neutrophils under shear stress. The authors show that cytosolic S100A8/9 is responsible for retaining stable and high concentrations of calcium specifically at LFA-1 nanoclusters upon binding to ICAM-1, and imply that this process aids in facilitating actin polymerisation involved in cell shape and adherence. The authors show early on that S100A8/9 deficient neutrophils fail to extravasate successfully into the tissue, thus suggesting that targeting cytosolic S100A8/9 could be useful in settings of autoimmunity/acute inflammation where neutrophil-induced collateral damage is unwanted.

      Strengths:

      Using multiple complementary methods from imaging to western blotting and flow cytometry, including extracellular supplementation of S100A8/9 in vivo, the authors conclusively prove that a defect in intracellular S100A8/9, rather than extracellular S100A8/9, was responsible for the loss in neutrophil adherence, and pinpoint that S100A8/9 aids in calcium stabilisation and retention at the plasma membrane.

      Weaknesses:

      (1) Extravasation is shown to be a major defect of Mrp14-/- neutrophils, but the Giemsa staining in Figure 1H seems to be quite unspecific to me, as neutrophils were determined by nuclear shape and granularity, which could be affected by the angle at which the nucleus is viewed. It would have perhaps been cleaner/clearer to use immunofluorescence staining for neutrophils instead as seen in Supplementary Figure 1A (staining for Ly6G or other markers instead of S100A9).

      Addressed issues:

      (1) The representative image for Mrp14-/- neutrophils used in Figure 4K to demonstrate the Ripley's K function seems to be very different from that shown above in Figure 4C and 4F. In their response to reviewers, the authors reassure that all data has been included in the analysis.

      (2) In the initial submission the authors needed to provide a more direct linkage between cytosolic S100A8/9 and actin polymerisation, which subsequently results in the arrest and adherence of neutrophils. The authors did an additional experiment indicating the co-localization of S100A8/9 with LFA-1, indicating that the spatial localisation of S100A8/9 does shift towards the membrane with activation. Further, the authors confirm that the defect is apparent only under conditions of shear stress, as transwell migration of Mrp14-/- neutrophils is not affected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study characterized the cellular and molecular mechanisms of spike timing-dependent long-term depression (t-LTD) at the synapses between excitatory afferents from lateral (LPP) and medial (MPP) perforant pathways to granule cells (GC) of the dentate gyrus (DG) in mice.

      Strengths:

      The electrophysiological experiments are thorough. The experiments are systematically reported and support the conclusions drawn.

      This study extends current knowledge by elucidating additional plasticity mechanisms at PP-GC synapses, complementing existing literature.

      We thank the reviewer for the positive assessment of our work and the constructive suggestions to improve the manuscript.

      Weaknesses:

      To more conclusively define the pivotal role of astrocytes in modulating t-LTD at MPP and LPP GC synapses through SNARE protein-dependent glutamate release, as posited in this study, the authors could adopt additional methods, such as alternative mouse models designed to regulate SNARE-dependent exocytosis, as well as optogenetic or chemogenetic strategies for precise astrocyte manipulation during t-LTD induction. This would provide more direct evidence of the influence of astrocytic activity on synaptic plasticity.

      We thank the reviewer for the suggestion. As stated in the manuscript and in figure 4, we already used two different approaches (aBAPTA to interfere with astrocyte calcium signalling, and dnSNARE mice, which have impaired vesicular release) to determine the involvement of astrocytes in the discovered forms of LTD, and both approaches clearly indicated the requirement of astrocytes for t-LTD. In BAPTA-treated astrocytes and in dnSNARE mice, t-LTD was prevented. Notwithstanding this, and as suggested by the reviewer, we used two additional approaches to confirm astrocyte participation. We loaded astrocytes with the light chain of the tetanus toxin (TeTxLC), which is known to block exocytosis by cleaving the vesicle-associated membrane protein, an important part of the SNARE complex (Schiavo et al., 1992, Nature 359, 832-835). In this experimental condition, we observed a clear lack of t-LTD at both (lateral and medial) pathways, thus confirming the requirement of astrocytes, the SNARE complex and vesicular release for both types of t-LTD. In addition, to gain more insight into whether glutamate is released by astrocytes, we blocked glutamate release from astrocytes by loading them with Evans blue, which is known to interfere with glutamate uptake into vesicles because it inhibits the vesicular glutamate transporter (VGLUT). In this experimental condition, t-LTD was again prevented, indicating that t-LTD requires Ca2+-dependent exocytosis of glutamate from astrocytes.

      Reviewer #2 (Public Review):

      Summary:

      This work reports the existence of spike timing-dependent long-term depression (t-LTD) of excitatory synaptic strength at two synapses of the dentate gyrus granule cell, which are differently connected to the entorhinal cortex via either the lateral or medial perforant pathways (LPP or MPP, respectively). Using patch-clamp electrophysiological recording of t-LTD in combination with either pharmacology or a genetically modified mouse model, the authors provide information on the differences in the molecular mechanism underlying this t-LTD at the two synapses.

      Strengths:

      The two synapses analyzed in this study have been understudied. This new data thus provides interesting new information on a plasticity process at these synapses, and the authors demonstrate subtle differences in the underlying molecular mechanisms at play. Experiments are in general well controlled and provide robust data that are properly interpreted.

      We thank the reviewer for the positive assessment of our work and the constructive suggestions to improve the manuscript.

      Weaknesses:

      • Caution should be taken in the interpretation of the results to extrapolate to adult brain as the data were obtained in P13-21 days old mice, a period during which synapses are still maturing and highly plastic.

      We thank the reviewer for noticing this. In fact, our experiments were intentionally performed in young animals (P13-21), knowing that this is a critical period of plasticity. We indicate this in the methods, results, and discussion sections (in the discussion, in some detail).

      • In experiments where the drug FK506 or thapsigargin are loaded intracellularly, the concentrations used are as high as for extracellular application. Could there be an error of interpretation when stating that the targeted actors are necessarily in the post-synaptic neuron? Is it not possible for the drug to diffuse out of the cell as it is evident that it can enter the cell when applied extracellularly?

      We thank the reviewer for raising this point. While it is possible that these compounds cross cell membranes, leaving the loaded cell and reaching other cells would, in principle, require a relatively long time. Additionally, to have any effect, the compound would have to reach other cells at the same concentration as in the pipette, or at least at a relatively high one. Furthermore, even if a compound is able to cross the cell membrane during an experiment, it would then be exposed to the extracellular fluid, where it would be diluted and most probably washed out. For all these reasons, we do not consider this very plausible. Notwithstanding this, and as suggested, we have repeated the experiments using lower concentrations of thapsigargin (1 µM) and FK506 (1 µM), and have obtained the same results. These data are now included in figure 3 and in the text.

      • The experiments implicating glutamate release from astrocytes in t-LTD would require additional controls to better support the conclusions made by the authors. As the data stand, it is not clear, how the authors identified astrocytes to load BAPTA and if dnSNARE expression in astrocytes does not indirectly perturb glutamate release in neurons.

      We thank the reviewer for raising this point. We now indicate how astrocytes were identified for BAPTA loading. We reply to this in detail under the “Recommendations for the authors” from reviewer 2.

      Significance:

      While this is the first report of t-LTD at these synapses, this plasticity process has been mechanistically well investigated at other synapses in the hippocampus and in the cortex. Nevertheless, this new data suggests that mechanistic differences in the induction of t-LTD at these two DG synapses could contribute to the differences in the physiological influence of the LPP and MPP pathways.

      Reviewer #3 (Public Review):

      Coatl et al. investigated the mechanisms of synaptic plasticity of two important hippocampal synapses, the excitatory afferents from lateral and medial perforant pathways (LPP and MPP, respectively) of the entorhinal cortex (EC) connecting to granule cells of the hippocampal dentate gyrus (DG). They find that these two different EC-DG synaptic connections in mice show a presynaptically expressed form of long-term depression (LTD) requiring postsynaptic calcium, eCB synthesis, CB1R activation, astrocyte activity, and metabotropic glutamate receptor activation. Interestingly, LTD at MPP-GC synapses requires ionotropic NMDAR activation whereas LTD at LPP-GC synapses is NMDAR independent. Thus, they discovered two novel forms of t-LTD that require astrocytes at EC-GC synapses. Although plasticity of EC-DG granule cell (GC) synapses has been studied using classical protocols, these are the first analyses of the synaptic plasticity induced by spike timing-dependent protocols at these synapses. Interestingly, the data also indicate that t-LTD at each type of synapse requires different group I mGluRs, with LPP-GC synapses dependent on mGluR5 and MPP-GC t-LTD requiring mGluR1.

      Using a detailed analysis of the coefficient of variation of the EPSP slopes and of miniature responses, together with complementary approaches (failure rate, PPRs, CV, and mEPSP frequency and amplitude analysis), the authors demonstrate a decrease in the probability of neurotransmitter release and a presynaptic locus for these two forms of LTD at both types of synapses. By using elegant electrophysiological experiments and taking advantage of the conditional dominant-negative (dn) SNARE mice, in which doxycycline administration blocks exocytosis and impairs vesicle release by astrocytes, they demonstrate that both LTD forms require the release of gliotransmitters from astrocytes. These data add in an interesting way to the ongoing discussion on whether LTD induced by STDP participates in refining synapses, potentially weakening excitatory synapses under the control of different astrocytic networks. The conclusions of this paper are mostly well supported by data, but some aspects of the results must be clarified and extended.

      We thank the reviewer for the positive assessment of our work and the constructive suggestions to improve the manuscript.

      (1) It should be clarified whether present results are obtained with or without the functional inhibitory synapse activation. It is not clear if GABAergic synapses are blocked or not. If GABAergic synapses are not blocked authors must discuss whether the LTD of the EPSPs is due to a decrease in glutamatergic receptor activation or an increase in GABAergic receptor activation. Moreover, it should be recommended to analyze not only the EPSPs but also the EPSCs to address whether the decrease in synaptic transmission is caused by a decrease in the input resistance or by a decrease in the space constant (lambda).

      We thank the reviewer for raising these points. GABAergic inhibition was not blocked in our experiments. The observed forms of t-LTD seem to be due to a decrease in glutamate release probability, as indicated in the manuscript, mediated by the mechanism we uncover and describe here. To determine whether GABA receptors have any role in these forms of t-LTD, we repeated the experiments in the presence of the GABAA and GABAB receptor antagonists bicuculline and SCH50911, respectively. Blocking GABA receptors did not prevent or affect t-LTD at LPP- or MPP-GC synapses, which was still present with a magnitude similar to that of controls. These results indicate that these receptors are not involved in these forms of t-LTD. They are now included in the results section (page 8) and as a new figure S1. In our experiments, no changes in input resistance or space constant were observed and, importantly, no changes were observed in the amplitude/slope of EPSPs in the control pathway, which does not undergo the plasticity protocol and which we routinely include in our experiments.

      (2) Authors show that Thapsigargin loaded in the postsynaptic neuron prevents the induction of LTD at both synapses. Analyzing the effects of blocking postsynaptic IP3Rs (Heparin in the patch pipette) and Ryanodine receptors (Ruthenium red in the patch pipette) is recommended for a deeper analysis of the mechanism implicated in the induction of these novel forms of LTD in the hippocampus.

      We thank the reviewer for this suggestion. We repeated the experiments loading the postsynaptic cell with heparin or ruthenium red via the patch pipette. In these experimental conditions, we observed that t-LTD was not affected by the heparin treatment (ruling out a role for IP3Rs), but that it was prevented by the ruthenium red treatment (indicating the requirement of ryanodine receptors). We now include these data in the text (page 12) and in Figure 3a, b, e, f.

      (3) Authors nicely demonstrate that CB1R activation is required in these forms of LTD by blocking CB1Rs with AM251, however an interesting unanswered question is whether CB1R activation is sufficient to induce this synaptic plasticity. This reviewer suggests studying whether applying puffs of the CB1R agonist, WIN 55,212-2, could induce these forms of LTD.

      We thank the reviewer for this suggestion. We repeated the experiments adding WIN 55,212-2 as suggested. Activation of CB1Rs by puffs of the agonist WIN 55,212-2 onto the astrocyte directly induced LTD at both LPP- and MPP-GC synapses. We now include these data in the text (page 14) and in Figure 3c, d, g, h.

      (4) Finally, adding a last figure with a cartoon summarizing the proposed model of action in these novel forms of LTD would add a positive value and would help the reading of the manuscript, especially in those aspects related with the discussion of the results.

      We thank the reviewer for the suggestion. We now include a figure showing the proposed mechanisms (Figure 5).

      The extension of these results would improve the manuscript, which provides interesting results showing two novel forms of presynaptic t-LTD in the brain synapses with different action mechanisms probably implicated in the different aspects of information processing.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There are just a few aspects that could be clarified to bolster the authors' conclusions.

      The authors centered the conclusion of their study on the role of astrocytic activity in regulating these two forms of plasticity (see title). To strengthen the evidence that astrocytes are key regulators of t-LTD at MPP and LPP GC synapses by regulating SNARE protein-dependent glutamate release, additional complementary approaches should be considered, such as other mouse models enabling the control of SNARE-dependent exocytosis and/or optogenetic/chemogenetic tools to selectively manipulate astrocytes during the induction of t-LTD, thereby directly assessing the impact of astrocytic activity on synaptic plasticity. Implementing calcium imaging or glutamate sensors to visualize the dynamics of astrocytic calcium signaling and glutamate release during t-LTD could also be considered.

      We thank the reviewer for the suggestion. As stated in the manuscript and in figure 4, we already used two different approaches (aBAPTA to interfere with astrocyte calcium signalling, and dnSNARE mice, which have impaired vesicular release) to determine the involvement of astrocytes in the discovered forms of LTD, and both approaches clearly indicated the requirement of astrocytes for t-LTD. In BAPTA-treated astrocytes and in dnSNARE mice, t-LTD was prevented. Notwithstanding this, and as suggested by the reviewer, we used two additional approaches to confirm astrocyte participation. We loaded astrocytes with the light chain of the tetanus toxin (TeTxLC), which is known to block exocytosis by cleaving the vesicle-associated membrane protein, an important part of the SNARE complex (Schiavo et al., 1992, Nature 359, 832-835). In this experimental condition, we observed a clear lack of t-LTD at both (lateral and medial) pathways, thus confirming the requirement of astrocytes, the SNARE complex and vesicular release for both types of t-LTD. In addition, to gain more insight into whether glutamate is released by astrocytes, we blocked glutamate release from astrocytes by loading them with Evans blue, which is known to interfere with glutamate uptake into vesicles because it inhibits the vesicular glutamate transporter (VGLUT). In this experimental condition, t-LTD was again prevented, indicating that t-LTD requires Ca2+-dependent exocytosis of glutamate from astrocytes. This information is now included in the text (pages 14 and 15) and in figure 4.

      • How were astrocytes identified to be loaded with BAPTA? The author should clarify this methodological aspect and provide confocal images of patched astrocytes situated 50-100 µm from the recorded neuron.

      We thank the reviewer for the comment. We now include this information in the Methods section (page 6) and in figure S3. Astrocytes were identified by their rounded morphology under differential interference contrast microscopy, and were characterized by low membrane potential, low membrane resistance and passive responses (they do not show action potentials) to both negative and positive current injection.

      • Please provide confocal images of EGFP expression in the DG astrocytes of dnSNARE mice both on and off Dox, to verify transgene expression in astrocytes

      We thank the reviewer for this suggestion. We now include an image of GFP expression in DG astrocytes of off-Dox dnSNARE mice (Fig. S3). None of the pups or mice were fed doxycycline at any point from birth, meaning that the transgenes were continuously expressed and exocytosis should therefore be blocked in astrocytes.

      Minor points:

      Lines 250-253: It is mentioned that TTX is added at baseline, washed out for the t-LTD experiment, and then reapplied post t-LTD. I suggest clarifying the timing and rationale for this application for a broad audience.

      We thank the reviewer for the suggestion. We now include some information related to the timing and rationale of the experiment phases (page 9).

      The discussion is quite detailed and provides a comprehensive overview of the study's findings. To enhance clarity and impact, the authors might consider the following:

      • add subheadings and bullet points for key findings. This will improve readability.

      • this section could benefit from streamlining to avoid redundancy.

      • some sentences could be made more concise without losing meaning.

      We thank the reviewer for these suggestions. We now include subheadings in the discussion section to improve readability and have made some sentences more concise and simple without losing meaning.

      In figure legends, consistency with capitalization should be maintained, for example in the statistical significance notation ("***P < 0.001" or "***p < 0.001").

      We now include p<0.001 in the figure legend 4 for consistency.

      Reviewer #2 (Recommendations For The Authors):

      Major:

      • All results were obtained in young still quite immature synapses. To strengthen the significance of the findings, the authors could repeat some of the main experiments in adult mice (8 weeks and beyond). If not, they should state clearly that these mechanisms were only evidenced in early post-natal conditions.

      We thank the reviewer for noticing this. In fact, our experiments were intentionally performed in young animals (P13-21), knowing that this is a critical period of plasticity. As the reviewer suggests, we indicate this in the methods (page 5), results (page 8), and discussion (page 19) sections (in the discussion, in some detail).

      • Lines 246-249 and fig 1f,p: Authors need to perform a statistical test on these two graphs to support their claim that 'A plot of CV-2 versus the change in the mean evoked EPSP slope (M) before and after t-LTD mainly yielded points below the diagonal line at LPP-GC and MPP-GC synapses'.

      This may not have been clear in the previous version. We found an error in one of the graphs (some points were missing), which we have corrected. In addition, as suggested by the reviewer, we performed a regression analysis that confirms the stated conclusions. This is now included in the text (page 9). Thus, we have added the mean values ± SEM to the text, together with the linear regression of the data, for LPP-GC (Mean = 0.607 ± 0.054 vs 1/CV² = 0.439 ± 0.096, R² = 0.337; n = 14) and MPP-GC synapses (Mean = 0.596 ± 0.056 vs 1/CV² = 0.461 ± 0.090, R² = 0.168; n = 13). Data falling on the dotted horizontal line, 1/CV² = 1, indicate no change in the probability of release; in contrast, data falling below the dotted diagonal line are suggestive of a change in the release probability parameters (for review, see Brock et al., 2020, Front Synaptic Neurosci 12, 11).
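
      The coefficient-of-variation analysis discussed in this exchange can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the function name `cv_analysis`, the use of the sample variance, and the example EPSP slope values are assumptions made here. It computes the normalized mean M (post/baseline) and the normalized 1/CV² (post/baseline) for one cell, which are the two coordinates plotted against each other.

```python
import numpy as np

def cv_analysis(baseline, post):
    """Return (M, r) for one recording.

    M = mean(post) / mean(baseline)            (normalized mean EPSP slope)
    r = (1/CV^2 of post) / (1/CV^2 of baseline), with 1/CV^2 = mean^2 / variance.
    Hypothetical helper for illustration only.
    """
    baseline = np.asarray(baseline, dtype=float)
    post = np.asarray(post, dtype=float)
    m = post.mean() / baseline.mean()
    inv_cv2_base = baseline.mean() ** 2 / baseline.var(ddof=1)
    inv_cv2_post = post.mean() ** 2 / post.var(ddof=1)
    return m, inv_cv2_post / inv_cv2_base

# Made-up EPSP slopes (arbitrary units) before and after a pairing protocol.
baseline = [1.0, 1.2, 0.8, 1.0]
post = [0.5, 0.9, 0.3, 0.7]
m, r = cv_analysis(baseline, post)
below_diagonal = r < m  # point lies below the line r = M

# Pure scaling of every response leaves CV unchanged, so r stays at 1.
m_scaled, r_scaled = cv_analysis(baseline, [0.5 * x for x in baseline])
```

      In this sketch, a depressed response whose point lies below the diagonal (r < M) is the pattern the response describes as suggestive of a change in release probability, whereas uniform scaling of all responses leaves 1/CV² unchanged and places the point on the horizontal line.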

      • We are not sure that the experiment with the MK801 provided in the patch pipet can be interpreted correctly (Figure 2 a,b and e,f). How sure are the authors that, when applying MK801 in the patch pipet, it can reach its binding site within the pore? The concentration of MK801 is also very high (500 microM) and used at the same concentration extracellularly and intracellularly. Why did the authors not use lower concentration when applied intracellularly?

      We thank the reviewer for raising this point. MK801 in the pipette does reach the pore when loaded postsynaptically: when we record NMDA currents from postsynaptic neurons loaded with MK801, these currents are blocked. We now include in the text a control experiment showing the effect of postsynaptic MK801 on NMDA currents (page 10). NMDA currents were recorded at +40 mV, with AMPARs and GABARs blocked by NBQX and bicuculline. Regarding the concentration, it has been described that the affinity at the internal site is much lower (by several orders of magnitude) than at the extracellular site (Sun et al., 2018, Neuropharmacology 143, 122-129), and the concentrations used here have been used extensively in previous studies. It is clear that the concentrations used in the present work blocked NMDAR currents but did not prevent LTD.

      • Linked to the point above, for the intracellular application of FK506 and thapsigargin, the concentrations used extracellularly and intracellularly are identical. The authors could have used lower concentrations for the intracellular application. Also, how can they be sure of the correct interpretation of these data as the drug essentially reaching a post-synaptic target when applied intracellularly? If the drug can enter the neuron, why could it not diffuse out of the neuron especially when loaded at a high concentration? Maybe using a lower concentration when applied intracellularly could at least partially address this issue.

      It is evident that it can enter the cell when applied extracellularly?

      We thank the reviewer for raising this point. While it is possible that these compounds cross cell membranes, leaving the loaded cell and reaching other cells would, in principle, require a relatively long time. Additionally, to have any effect, the compound would have to reach other cells at the same concentration as in the pipette, or at least at a relatively high one. Furthermore, even if a compound is able to cross the cell membrane during an experiment, it would then be exposed to the extracellular fluid, where it would be diluted and most probably washed out. For all these reasons, we do not consider this very plausible. Notwithstanding this, we have repeated the experiments using lower concentrations of thapsigargin (1 µM) and FK506 (1 µM) and have obtained the same results. These data are now included in figure 3, and the numbers in the text have been updated (pages 12-13).

      • The data supporting the possibility of glutamate release by astrocytes as a main source of glutamate to promote t-LTD needs to be strengthened. In experiment Figure a-h, it is not clear how the authors recognize astrocytes to patch. No details are provided in the methods or in the main text. If we understand correctly, it is only by performing a current steps protocol to ensure that the patched cell did not produce action potentials. If this was the case, the authors need to be more specific and provide details of this protocol. More importantly, the one trace that was provided in Figures 4a and 4f suggests, albeit by a rough estimation that we made with a ruler, that the highest current step only depolarized the cell to about -40 mV. This is not sufficient to ensure that the recorded cell is not a neuron. The authors should increase their steps to high depolarizing currents to ensure that the patched cell is not a neuron. Better yet, they should load the cell with a dye to process the slice after the electrophysiological recording for immunohistochemistry to ensure that it was indeed an astrocyte. Alternatively, they can try to aspirate the cell content at the end of the recording to perform a qPCR for astrocyte markers, e.g., GFAP.

      We thank the reviewer for the comment. We now include information on how astrocytes were identified (a point also raised by reviewer 1) in the Methods section (page 6) and in figure S3. Astrocytes were identified by their rounded morphology under differential interference contrast microscopy and by eGFP fluorescence (astrocytes from dnSNARE mice), and were characterized by low membrane potential, low membrane resistance and passive responses (they do not show action potentials) to both negative and positive current injection.

      We agree with the reviewer that the step protocol in figures 4a and 4f might not have been completely clear. We have revised this and now show more clearly that we applied pulses that depolarized astrocytes beyond -20 mV, with no action potentials found at any point. We also now include this in figure S3.

      • Related to the point above, the use of the model expressing dnSNARE in astrocytes is elegant. Yet, to really interpret the data obtained in these slices as a lack of vesicle release (and most importantly glutamate) we think that the authors should ensure that glutamate release from nearby neurons is not impacted. They could patch nearby neurons in dnSNARE slices and test PPR or synaptic fatigue when stimulating either the LPP or MPP. The authors should avoid overinterpretation of these results. As it stands, it is not evident that dnSNARE expression does not perturb other mechanisms within the astrocyte that in turn perturb pre-synaptic glutamate release. Adding back glutamate as puffs does not help to disentangle this issue.

      To gain more insight into whether glutamate is released by astrocytes, we blocked glutamate release from astrocytes by loading them with Evans blue, which is known to interfere with glutamate uptake into vesicles because it inhibits the vesicular glutamate transporter (VGLUT). In this experimental condition, as indicated above, t-LTD was prevented, indicating that t-LTD requires Ca2+-dependent exocytosis of glutamate from astrocytes. This is included in the text (page 15) and in figure 4d, e, i, j.

      In addition, we loaded astrocytes with the light chain of the tetanus toxin (TeTxLC) which is known to block exocytosis by cleaving the vesicle-associated membrane protein, an important part of the SNARE complex (Schiavo et al., 1992, Nature 359, 832-835). In this experimental condition, we observed a clear lack of t-LTD at both (lateral and medial) pathways, thus confirming the requirement of astrocytes and the SNARE complex and vesicular release for both types of t-LTD. These data indicate that t-LTD requires Ca2+-dependent exocytosis of glutamate from astrocytes. This information is now included in the text, page 14 and in figure 4.

      Minor points:

      • line 107, did the authors mean t-LTP and t-LTD? we don't understand STDP mentioned here.

      We meant to say t-LTP. This is now corrected.

      • line 108: should STDP be replaced by t-LTD as the authors only focused on this plasticity mechanism.

      We agree, we indicate now t-LTD.

      • line 131-132 : it is not clear when the animals were fed with doxycycline. If it was from birth, then the 'not' should be removed. Otherwise the authors should clearly state when the doxycycline was provided.

      DOX was not provided, which means that the transgene was continuously expressed and exocytosis should therefore be blocked in astrocytes. We state this more clearly on page 5, in the methods section.

      • line 223 : which hippocampal synapses? needs to be stated

      As suggested, this is now stated in the text, for both hippocampal and cortical synapses. The synapses are Schaffer collateral (SC)-CA1 synapses for the hippocampus and L4-L2/3 synapses for the cortex (page 8).

      • line 273: what do the authors mean when writing 'from'? We don't understand the data provided on this line.

      We thank the reviewer for noticing this. That refers to the average amplitude of NMDAR-mediated currents before and after D-AP5 or MK801. We now express this in a clearer way (page 10, from 57±8 pA to 6±5 pA).

      • line 286 : why do the authors point out work on GluN2B and GluN3A only here when they first investigate GluN2A contribution to t-LTD? what about previous data on GluN2A?

      We have now rephrased this to make it clear. We wanted to indicate that the available data suggest that presynaptic NMDARs at MPP-GC synapses contain GluN2B and GluN3A subunits and that, to our knowledge, no data indicate that they contain GluN2A subunits.

      • line 428 : what do the authors mean by 'not least' ?

      This is a typo and we have removed that from the text.

      Reviewer #3 (Recommendations For The Authors):

      My only suggestion for improving data presentation in the manuscript would be to split some figures of the paper. In my opinion, the figures are too dense and therefore difficult to follow for the broad audience of eLife readers. In addition, a real image of the recorded dentate granule cells in the slice showing also the location of the real stimulation electrodes would significantly improve the presentation of Figure 1.

      We thank the reviewer for the suggestion, but we would prefer to keep the figures as they are organized. While we agree that in some cases they are rather large, this layout makes it easier to compare the lateral and medial pathways, so we think it is better to keep the information for the two pathways in the same figure. Nevertheless, we have tried to make the figures clearer by using a columnar organization for each pathway, which we think makes the pathways easier to compare. As the reviewer suggests, we now include in Figure 1 a real image of the recorded dentate granule cells in the slice, also showing the location of the stimulation electrodes; we agree that this improves the presentation of the figure and thank the reviewer for the suggestion.

    2. eLife Assessment

      This valuable study reports the existence of specific spike-timing dependent synaptic plasticity processes at two excitatory synapses of the dentate gyrus granule cells. These synapses link the entorhinal cortex and the dentate gyrus but via different circuits. With state-of-the-art patch-clamp electrophysiological analysis, the authors provide convincing information on the molecular mechanisms underlying these 2 forms of synaptic plasticity showing a critical role for astrocytes in both alongside some features distinctive to each pathway. These results will be of interest to neuroscientists as they uncover detailed plasticity mechanisms involving the hippocampus.

    3. Reviewer #1 (Public review):

      Summary:

      The study characterized the cellular and molecular mechanisms of spike timing-dependent long-term depression (t-LTD) at the synapses between excitatory afferents from lateral (LPP) and medial (MPP) perforant pathways to granule cells (GC) of the dentate gyrus (DG) in mice.

      Strengths:

      The electrophysiological experiments are thorough. The experiments are systematically reported and support the conclusions drawn.

      This study extends current knowledge by elucidating additional plasticity mechanisms at PP-GC synapses, complementing existing literature.

      Comments on the revised version:

      The revised study introduces two additional approaches to confirm astrocyte involvement in t-LTD: loading astrocytes with tetanus toxin light chain to inhibit exocytosis, and using Evans blue to block vesicular glutamate uptake. These new findings further reinforce the conclusion that t-LTD relies on Ca2+-dependent glutamate exocytosis from astrocytes.

    4. Reviewer #2 (Public review):

      Summary:

      This work reports the existence of spike timing-dependent long-term depression (t-LTD) of excitatory synaptic strength at two synapses of the dentate gyrus granule cell, which are differently connected to the entorhinal cortex via either the lateral or medial perforant pathways (LPP or MPP, respectively). Using patch-clamp electrophysiological recording of t-LTD in combination with either pharmacology or a genetically modified mouse model, the authors provide information on the differences in the molecular mechanism underlying this t-LTD at the two synapses.

      Strengths:

      The two synapses analyzed in this study have been understudied. This new data thus provides interesting new information on a plasticity process at these synapses, and the authors demonstrate subtle differences in the underlying molecular mechanisms at play. Experiments are in general well controlled and provide robust data that are properly interpreted. The data provided to demonstrate that glutamate release from astrocytes is necessary for these plasticity mechanisms are strong. This is particularly interesting as another example of how astrocytes regulate synapse plasticity.

      Weaknesses:

      This work was performed at young synapses and the highlighted mechanisms are therefore pertinent to this age, as acknowledged by the authors. We currently don't know if these mechanisms are still at play at the adult synapse.

      Significance:

      While this is the first report of t-LTD at these synapses, this plasticity process has been mechanistically well investigated at other synapses in the hippocampus and in the cortex. Nevertheless, this new data suggests that mechanistic differences in the induction of t-LTD at these two DG synapses could contribute to the differences in the physiological influence of the LPP and MPP pathways.

    5. Reviewer #3 (Public review):

      Coatl et al. investigated the mechanisms of synaptic plasticity of two important hippocampal synapses, the excitatory afferents from lateral and medial perforant pathways (LPP and MPP, respectively) of the entorhinal cortex (EC) connecting to granule cells of the hippocampal dentate gyrus (DG). They find that these two different EC-DG synaptic connections in mice show a presynaptically expressed form of long-term depression (LTD) requiring postsynaptic calcium, eCB synthesis, CB1R activation, astrocyte activity, and metabotropic glutamate receptor activation. Interestingly, LTD at MPP-GC synapses requires ionotropic NMDAR activation whereas LTD at LPP-GC synapses is NMDAR independent. Thus, they discovered two novel forms of t-LTD that require astrocytes at EC-GC synapses. Although plasticity of EC-DG granule cell (GC) synapses has been studied using classical protocols, these are the first analyses of the synaptic plasticity induced by spike timing-dependent protocols at these synapses. Interestingly, the data also indicate that t-LTD at each type of synapse requires a different group I mGluR, with LPP-GC t-LTD dependent on mGluR5 and MPP-GC t-LTD requiring mGluR1.

      Through a detailed analysis of the coefficient of variation of the EPSP slopes, miniature responses, and other approaches (failure rate, PPRs, CV, and mEPSP frequency and amplitude analysis), the authors demonstrate a decrease in the probability of neurotransmitter release and a presynaptic locus for these two forms of LTD at both types of synapses. By using elegant electrophysiological experiments and taking advantage of the conditional dominant-negative (dn) SNARE mice, in which doxycycline administration blocks exocytosis and impairs vesicle release by astrocytes, they demonstrate that both LTD forms require the release of gliotransmitters from astrocytes. These data add in an interesting way to the ongoing discussion on whether LTD induced by STDP participates in refining synapses, potentially weakening excitatory synapses under the control of different astrocytic networks. The conclusions of this paper are well supported by data.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The reviewers found this manuscript to present convincing evidence for associative and non-associative behaviors elicited in male and female mice during a serial compound stimulus Pavlovian fear conditioning task. The work adds to ongoing efforts to identify multifaceted behaviors that reflect learning in classic paradigms and will be valuable to others in the field. The reviewers do note areas that would benefit from additional discussion and some minor gaps in data reporting that could be filled by additional analyses or experiments.

      We thank the reviewers and the editors for their thoughtful and constructive critiques of our manuscript. We have updated our manuscript with data from additional experiments as suggested by the reviewers, and we have significantly edited the text and figures to reflect these additions. Our detailed, point-by-point responses are below.

      Reviewer #1 (Public Review):

      The main goal of the study was to tease apart the associative and non-associative elements of cued fear conditioning that could influence which defensive behaviors are expressed. To do this, the authors compared groups conditioned with paired, unpaired, or shock only procedures followed by extinction of the cue. The cue used in the study was not typical; serial presentation of a tone followed by a white noise was used in order to assess switches in behavior across the transition from tone to white noise. Many defensive behaviors beyond the typical freezing assessments were measured, and both male and female mice were included throughout. The authors found changes in behavioral transitions from freezing to flight during conditioning as the tone transitioned into white noise, and a switch in freezing during extinction such that it became high during the white noise as flight behavior decreased. Overall, this was an interesting analysis of transitions in defensive behaviors to a serially presented cue consisting of two auditory stimuli during conditioning and then extinction.

      We thank the Reviewer for their supportive insight.

      There are some concerns regarding the possibility that the white noise is more innately aversive than the tone, inducing more escape-like behaviors compared to a tone, especially since the shock only group also showed increased escape-like behaviors during the white noise versus tone. This issue would have been resolved by adding a control group where the order of the auditory stimuli was reversed (white noise->tone).

      We appreciate this concern, and we have added two additional groups to address this possibility. We have conducted the same experimental paradigm with 2 reverse-SCS groups (WN—tone), one with paired (new PA-R group), and one with unpaired (new UN-R group), presentations to shock during conditioning. These experiments revealed that during conditioning day 2 in both reverse order groups, WN causes reductions in freezing and increases in locomotor activity (see revised Figure 2D), an effect that is stronger in the UN-R compared to the PA-R group. This locomotor effect is neither darting nor escape jumping in the PA-R group (revised Figure 3G, I; Figure 4G). In the UN-R group, WN induces more activity than the PA-R group (Figure 2D), including some jumping at WN onset (Figure 3H), but no darting (Figure 4G). It is worth noting that WN does not elicit defensive behavior before conditioning at the sound intensity we use (75dB; see Fadok et al. 2017, Borkar et al. 2020, Borkar et al. 2024). Together, these results suggest that WN is an inherently more salient stimulus than tone, and it can elicit defensive behaviors in shock-sensitized mice through non-associative mechanisms. Indeed, stimulus salience is a key factor in this paradigm for inducing activity (see Hersman et al. 2020).

      While the more complete assessment of defensive behaviors beyond freezing is welcomed, the main conclusions in the discussion are overly focused on the paired group and the associative elements of conditioning, which would likely not be surprising to the field. If the goal, as indicated in the title, was to tease apart the associative and non-associative elements of conditioning and defensive behaviors, there needs to be a more emphasized discussion and explicit identification of the non-associative findings of their study, as this would be more impactful to the field.

      We have rewritten the Discussion to provide a greater emphasis on the findings of the study that are more related to non-associative mechanisms. For example, we argue that cue-salience and changes in stimulus intensity can induce non-associative increases in locomotor behavior and tail rattling in shock-sensitized mice.

      Reviewer #2 (Public Review):

      Summary:

      The authors examined several defensive responses elicited during Pavlovian conditioning using a serial compound stimulus (SCS) as the conditioned stimulus (CS) and a shock unconditioned stimulus (US) in male and female mice. The SCS consisted of tone pips followed by white noise. Their design included 3 treatment groups that were either exposed to the CS and US in a paired fashion, in an unpaired fashion, or only exposed to the shock US. They compared freezing, jumping, darting, and tail rattling across all groups during conditioning and extinction. During conditioning, strong freezing responses to the tone pips followed by strong jumping and darting responses to the white noise were present in the paired group but less robust or not present in the unpaired or shock only groups. During extinction, tone-induced freezing diminished while the jumping was replaced by freezing and darting in the paired group. Together, these findings support the idea that associative pairings are necessary for conditioned defensive responses.

      Strengths:

      The study has strong control groups including a group that receives the same stimuli in an unpaired fashion and another control group that only receives the shock US and no CS to test the associative value of the SCS to the US. The authors examine a wide variety of defensive behaviors that emerge during conditioning and shift throughout extinction: in addition to the standard freezing response, jumping, darting, and tail rattling were also measured.

      We thank the Reviewer for their supportive appraisal of this study’s strengths.

      Weaknesses:

      This study could have greater impact and significance if additional conditions were added (e.g., using other stimuli of differing salience during the SCS), and determining the neural correlates or brain regions that are differentially recruited during different phases of the task across the different groups.

      In the revised manuscript, we have conducted experiments with 2 reverse-SCS groups (WN—tone): one with paired (new PA-R group), and one with unpaired (new UN-R group), presentations to shock during conditioning. These experiments revealed that during conditioning day 2 in both reverse order groups, WN causes reductions in freezing and increases in locomotor activity (see revised Figure 2D), an effect that is stronger in the UN-R compared to the PA-R group. This locomotor effect is neither darting nor escape jumping in the PA-R group (revised Figure 3G, I; Figure 4G). In the UN-R group, WN induces more activity than the PA-R group (Figure 2D), including some jumping at WN onset (Figure 3H), but no darting (Figure 4G). Indeed, stimulus salience is a key factor in this paradigm for inducing activity (see Hersman et al. 2020). Together, these results suggest that WN is an inherently more salient stimulus than tone, and it can elicit defensive behaviors in shock-sensitized mice through non-associative mechanisms. It is worth noting that WN does not elicit defensive behavior before conditioning at the sound intensity we use (75dB; see Fadok et al. 2017, Borkar et al. 2020, Borkar et al. 2024).

      We agree that determining the neuronal correlates and brain regions that are involved in defensive ethograms at various stages within this paradigm is of great importance, but we feel that those experiments are beyond the scope of the current study, which is focused on identifying behavioral differences based on associative and non-associative factors.

      Reviewer #1 (Recommendations For The Authors):

      In LINES 72-73, authors say they used a "truly random procedure" as one of their control groups. Then in LINES 113-116, they describe this group as "unpaired" where the "SCS could not reliably predict footshock". Combined, it is unclear if this group is random or unpaired. The "truly random procedure" is defined, by the cited Rescorla paper, as "the two events are programmed entirely randomly and independently in such a way that some "pairings" of CS and US may occur by chance alone". So, truly random would indicate that the shock may occur during the cue, while unpaired indicates the shock was explicitly unpaired from the cue. If the authors used a random procedure, the groups need to be labeled as random, not unpaired, and the # of cues that happened to coincide with footshock per animal needs to be reported somewhere. If the authors used an unpaired procedure (which appears to be the case based on 40-60s ITI between SCS and footshock being reported), it needs to be clearer and consistent throughout that it was explicitly unpaired, as well as removing the claim in LINE 72-73 that they used a "truly random procedure".

      We did indeed use an explicitly unpaired procedure. We have adjusted the text and figures to better reflect this, and we removed any mentions of randomness with regards to the presentations of SCS and footshock.

      Despite the lack of significant sex differences, it would still be helpful if data panels with individual data points (e.g. Fig 2E-J), were presented as identifiable by sex (e.g. closed vs open circles for males vs females).

      The revised manuscript now compares four or five groups per figure, making data presentation complicated. Providing the individual data points in each panel reduces figure clarity; therefore, we feel it is best to present the data as box-and-whisker plots without them. However, the source data files for each figure are available to the reader and the data are clearly labeled to be identifiable by sex.

      Is it not odd that all groups showed similar levels of contextual freezing during the 3min baseline? If shocks are unsignaled in the UN and SO groups, one would expect higher levels of contextual freezing compared to a paired group.

      We are not certain why one would expect higher levels of contextual freezing in the UN and SO groups compared to the PA group at the beginning of conditioning day 2. Another study also looked at baseline freezing in a contextual fear group (which is the same as shock only in our study) and in an auditory cued fear conditioning group within the conditioning context, and their data show that freezing during the baseline period is equivalent between groups (Sachella et al., 2022).

      During baseline on Extinction Day 1, it does seem that the unpaired and SO groups tend to have higher freezing levels compared to the paired groups. Author response image 1 shows baseline freezing during the first 3 minutes of extinction day 1. After two days of conditioning in the conditioned flight paradigm, contextual freezing either is, or trends toward being, significantly higher in the UN, UN-R, and SO groups than in the PA and PA-R groups.

      Author response image 1.

      Baseline Freezing levels for all groups during the first extinction session. Baseline period is defined as the first 180 seconds of the session, before any auditory stimulus was presented. PA, Paired; UN, Unpaired; SO, Shock Only; PA-R, Paired Reverse; UN-R, Unpaired Reverse. *p<0.05, **p<0.01, ****p<0.0001.

      Do the tone and WN elicit similar levels of defensive behaviors in a naïve mouse? Or have the authors tested WN followed by tone? Is there a potential issue that the WN may be innately aversive which is then amplified with training? i.e. does a tone preferentially induce freezing while WN induces active behaviors, regardless of which sensory stimulus is temporally closer to the shock? If the change in behavior is really due to the pairing and temporal proximity to shock, then there should be increased jumps, etc to the tone if trained with WN->tone.

      WN can indeed be used as an aversive stimulus under certain conditions and at sufficiently high decibel levels. In the conditioned flight paradigm, WN is presented at 75dB, which is below the threshold for eliciting an acoustic startle response in a C57BL/6J mouse (Fadok et al. 2009). Also, during pre-exposure, when animals are naïve to the SCS, tone and WN stimuli do not elicit defensive behaviors (see Fadok et al. 2017, Borkar et al. 2020, 2024).

      As suggested by the Reviewer, during revision we have included reverse-SCS paired (PA-R) and unpaired (UN-R) groups to test for the role of stimulus salience and stimulus order on defensive ethograms. During conditioning day 2, the PA-R group exhibited little freezing to the WN, with a slightly elevated activity index, and they exhibited robust freezing during tone (revised Figure 2A-H). The activity during the WN in the PA-R group was significantly lower than that of the PA group (Figure 2L). The PA-R group also did not respond to WN with escape jumps or darting (Figure 3I, 4G). The UN-R group displayed greater activity during the WN than the UN and PA-R groups, but less activity than the PA group (Figure 2D, H). The UN-R group did not dart but this group displayed some jumping at WN onset (Figure 3H), like what was observed in the UN group.

      These data suggest that WN has inherent, salient properties that can induce some non-associative activity after the mouse has been sensitized by shock (see also Hersman et al. 2020 for more detailed analysis of stimulus salience in the conditioned flight paradigm). However, only in the PA group is robust flight behavior (comprised of high numbers of escape jumps and darting) observed. Therefore, both stimulus salience and temporal order are important for eliciting transitions from freezing to flight.

      Fig 3G/4G are hard for me to understand. The figure legends say they're survival graphs but the y-axis labels "Latency to initial jump/dart (% of cohort)" confuses me. What is the purpose of these graphs? Perhaps they are not needed. Or consider presenting them similar to Fig 7C, D as those were more intuitive and faster for me to grasp.

      We had intended these plots to show that a greater proportion of the paired group jumps and darts during WN compared to the unpaired group, and that the percentage of the cohort that jumps and darts increases across conditioning trials. Because these graphs were not clear, we have removed them, and we have replaced them with graphs comparing total cohort percentages that jumped (Figure 3I) or darted (Figure 4G) over the whole CD2 session.

      For the extinction data, I did not see within group analyses for within or between session fear extinction to the tone. So, for the paired group, were the last 4 trials of Ext 1 significantly lower than the first 4 trials? If not, then they did not show within-session extinction. Also, for the paired group, were the last 4 trials of Ext 1 significantly different than the first 4 trials of Ext 2? This would test for long-term retention and spontaneous recovery.

      In the original submission and in the revised manuscript, we calculated a delta change score for freezing during tone in the early versus late blocks of 4 trials, and then we statistically compared these differences across groups (Figure 5C, D). This allowed us to assess between-group differences in changes to tone-evoked freezing during extinction. Freezing to tone did decrease significantly over the first extinction session for the paired group (Early Ext1 vs Late Ext1, paired t-test, t(31) = 6.23, p<0.0001), and when comparing late Ext1 and early Ext2, we found that tone-evoked freezing did significantly increase (Late Ext1 vs Early Ext2, paired t-test, t(31) = 5.26, p<0.0001). This increase in cue-induced freezing between days of extinction is characteristic of C57BL/6J mice (Hefner et al., 2008). Our study did not test for more distal timepoints, so we cannot comment on the efficacy of long-term retention or spontaneous recovery.
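
      For readers unfamiliar with this kind of within-group comparison, the following is a minimal sketch of how a per-animal delta change score and a paired t statistic could be computed. It is not the authors' analysis code: the function names, the sign convention (early minus late, so that a decrease in freezing gives a positive value), and the freezing percentages are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def delta_scores(early, late):
    """Per-animal change score: early-block value minus late-block value.

    The sign convention is an assumption made for this sketch.
    """
    if len(early) != len(late):
        raise ValueError("early and late need one value per animal")
    return [e - l for e, l in zip(early, late)]


def paired_t(early, late):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    d = delta_scores(early, late)
    n = len(d)
    if n < 2:
        raise ValueError("need at least two animals")
    sd = statistics.stdev(d)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(d) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical freezing percentages (one value per animal), not study data.
early_ext1 = [80.0, 75.0, 90.0, 60.0, 85.0]
late_ext1 = [50.0, 55.0, 60.0, 45.0, 50.0]

t_stat, df = paired_t(early_ext1, late_ext1)
print(f"t({df}) = {t_stat:.2f}")
```

      A p-value would additionally require the t distribution with the reported degrees of freedom, which a statistics package would normally supply.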

      For the conditioning and extinction data across Figs 2, 5 and 6, what I gather from them is that freezing is high to the tone and low to the WN during conditioning, and then low to the tone, and high to the WN across extinction. Then for activity levels I see they are low to the tone and high to the WN during conditioning, and then low to the WN during extinction. The piece that is missing is what are activity levels like to the tone during extinction. Are they low like in conditioning and remain low in extinction? Or do they increase across extinction as freezing decreases? As I was going through these graphs I drew myself out step function summaries of the freezing and activity levels between tone/WN for conditioning vs extinction; maybe the authors could consider a summary figure.

      We thank the Reviewer for their interest. We found that within the paired group, activity to tone remained low throughout both days of extinction (though increased within each session) and did not return to normal activity levels. We present this data in Author response image 2. We thank the Reviewer for the suggestion of a summary figure, but we feel there are too many axes of classification (between-group, within-group, multiple behaviors, tone/WN, conditioning/extinction) to coherently present our findings in a single figure.

      Author response image 2.

      Trial-by-trial plot of activity index during the tone period of SCS across both extinction sessions for the PA group. SCS, Serial compound stimulus; Ext, extinction; PA, Paired.

      In the discussion (LINE 592-3), they discuss that shock sensitization in the SO group may prime a stressed animal to dart more readily to WN upon stimulus transition. Should this not also happen during the transition of silence to tone? What is special about a transition between two auditory stimuli that would result in panic-like behavior in an animal that only received shock presentations? This also gets back to an earlier concern above regarding the potential innate aversiveness of the WN.

      After 2 days of shock sensitization, we observe that mice exhibit freezing to the tone during the first three trials of extinction day 1 (Figure 5A). This non-associative freezing response is like that observed in other studies of non-associative fear processing (please see Kamprath and Wotjak, 2004). As trials progress during extinction day 1, mice do become mildly activated during the tone (Author response image 3). The transition to WN in the shock-only group during extinction induces non-associative darting responses, but it does not induce escape jumping behavior (Figure 7).  We hypothesize that the innate salience of the WN is a vital factor contributing to these escalated responses. The importance of stimulus salience in conditioned flight was also demonstrated by Hersman et al., 2020 for SCS conditioning, and by Furuyama et al., 2023 for single tone conditioning.  Just as with conditional freezing responses (Kamprath and Wotjak, 2004), we believe that conditional flight is controlled by summative components, one being associative and the other non-associative.

      Author response image 3.

      Trial-by-trial plot of activity index during the tone period of SCS across both extinction sessions for the SO group. SCS, Serial compound stimulus; Ext, extinction; SO, Shock Only.

      In the discussion (LINE 583), they say that the development of explosive defensive behaviors is "not achievable with traditional single-cue Pavlovian conditioning paradigms". The authors should include a caveat here that the current study did not compare their results to a group of mice that received just WN-shock pairings.

      We thank the reviewer for this comment. This statement was meant to highlight that traditional paradigms do not offer an element of signaling the temporal imminence of threat, only its inevitability. It was not our intention to state that defensive escape behaviors were unachievable in single-cue conditioning paradigms, and we regret not making this clear. Indeed, the supplement of Fadok et al. 2017 shows that WN-shock conditioning is capable of inducing flight, Furuyama et al. 2023 shows that tone-shock conditioning is capable of inducing flight under specific parameters, and Gruene et al. 2015 demonstrates that single CS-US pairings induce conditional darting behaviors in female rats. We have adjusted the text to better reflect our intent.  

      Minor comment to LINE 613-5: Speaking as someone who has done fear conditioning in both mice and rats, tail rattling may be specific to mice (I have seen this often) and likely not observable in rats (never seen it).

      We thank the Reviewer for this information. We have adjusted our text to mainly discuss mouse-specific tail rattling.

      Reviewer #2 (Recommendations For The Authors):

      The research questions in this study are novel and bring new insight to the field. However, there are some issues that can be addressed to improve the overall quality of the study, namely, the reader is left wanting to know more, especially about how neural circuits contribute to these different defensive behaviors during this task. Below are some recommendations for the authors that would greatly improve the impact and significance of this study.

      (1) What are the neural correlates or circuits recruited during these different defensive behaviors across the course of conditioning and extinction? How might they differ between the PA and UN groups? What differences might emerge when an animal is shifting their defensive behavior from freezing to darting, for example? Answering these questions would require intensive additional experiments, therefore more discussion of possible neural mechanisms that might be recruited during this task would be appreciated, given the scope of the subject area.

      We agree that understanding the neural circuits recruited during these behaviors and across conditioning and extinction is of vital importance. We are actively working on these questions, and we have published on the role of central amygdala circuits (Fadok et al. 2017) as well as on top-down control of flight by the medial prefrontal cortex (Borkar et al. 2024). Because the current manuscript is focused on learning mechanisms influencing defensive behavior, we would prefer to focus our discussion on that, rather than speculating on possible neural mechanisms. However, we have added a statement in the Discussion (LINES 706-707) emphasizing that future studies should investigate the neuronal mechanisms contributing to threat associations and different defensive behaviors.

      (2) Were any vocalizations observed during conditioning or extinction phases? If not, could you speculate how type and occurrence of vocalizations might correlate with the different defensive responses observed?

      Audible vocalizations were only observed during footshock presentations (squeaks). Unfortunately, we do not have the proper specialized recording equipment to monitor the full spectrum of mouse vocalizations, especially those in the ultrasonic range. Thus, we cannot speculate on the nuances of vocalizations in mice with respect to this behavioral paradigm. To the best of our knowledge, mice have not been reported to emit specific ultrasonic calls during conditioned threat like those of rats. That said, it would be of interest to determine if mice emit different vocalizations during different defensive behaviors.

      (3) The transition from freezing to flight during the SCS is thought to be due to the close proximity of threat imminence between the WN CS and shock US. What if you switched the order of the SCS stimuli to WN followed by tone stimuli? If the salience of the WN stimulus is truly driving the jumping behavior, then it would be observed even if the WN stimulus preceded the pure tone stimulus and that would bring additional evidence that it is the associative value of the stimuli rather than its salience that's driving the defensive behaviors. What do you predict you would observe in rodents that were given a WN-tone SCS paired and unpaired in the same design of this study?

      As suggested by the reviewer, we collected data from reverse-SCS paired and unpaired groups and reported our findings within the manuscript. Our detailed findings are also discussed above. Overall, we find that a combination of stimulus salience and temporal proximity, and a summation of non-associative and associative mechanisms, are necessary to elicit explosive flight behavior (escape jumping and darting).

      References

      Borkar CD, Dorofeikova M, Le QE, Vutukuri R, Vo C, Hereford D, Resendez A, Basavanhalli S, Sifnugel N, Fadok JP (2020) Sex differences in behavioral responses during a conditioned flight paradigm. Behavioural Brain Research 389:112623.

      Borkar CD, Stelly CE, Fu X, Dorofeikova M, Le QE, Vutukuri R, Vo C, Walker A, Basavanhalli S, Duong A, Bean E, Resendez A, Parker JG, Tasker JG, Fadok JP (2024) Top-down control of flight by a non-canonical cortico-amygdala pathway. Nature 625: 743-749.

      Fadok JP, Krabbe S, Markovic M, Courtin J, Xu C, Massi L, Botta P, Bylund K, Müller C, Kovacevic A, Tovote P, Lüthi A (2017) A competitive inhibitory circuit for selection of active and passive fear response. Nature 542:96-100.

      Furuyama T, Imayoshi A, Iyobe T, Ono M, Ishikawa T, Ozaki N, Kato N, Yamamoto R (2023) Multiple factors contribute to flight behaviors during fear conditioning. Scientific Reports 13:10402. 

      Gruene TM, Flick K, Stefano A, Shea SD, Shansky RM (2015) Sexually divergent expression of active and passive conditioned fear responses in rats. eLife 4:e11352.

      Hefner K, Whittle N, Juhasz J, Norcross M, Karlsson RM, Saksida LM, Bussey TJ, Singewald N, Holmes A (2008) Impaired Fear Extinction Learning and Cortico-Amygdala Circuit Abnormalities in a Common Genetic Mouse Strain. Journal of Neuroscience 28:8074-8085.

      Hersman S, Allen D, Hashimoto M, Brito SI, Anthony T (2020) Stimulus salience determines defensive behaviors elicited by aversively conditioned serial compound auditory stimuli. eLife 9:e53803.

      Kamprath K and Wotjak CT (2004) Nonassociative learning processes determine expression and extinction of conditioned fear in mice. Learning & Memory 11:770-786.

      Sachella TE, Ihidoype MR, Proulx CD, Pafundo DE, Medina JH, Mendez P & Piriz J (2022) A novel role for the lateral habenula in fear learning. Neuropsychopharmacology 47:1210-1219.

    2. eLife Assessment

      This study is deemed to be an important work that carefully deconstructs multi-faceted conditioned fear behavior in mice. The well-controlled experiments provide convincing data that will be of interest to other researchers in the field.

    3. Reviewer #1 (Public review):

      Summary

      The main goal of the study was to tease apart the associative and non-associative elements of cued fear conditioning that could influence which defensive behaviors are expressed. To do this, the authors compared groups conditioned with paired, unpaired, or shock only procedures followed by extinction of the cue. The cue used in the study was not typical; serial presentation of a tone followed by a white noise (or reversed) was used in order to assess switches in behavior across the transition from tone to white noise. Many defensive behaviors beyond the typical freezing assessments were measured, and both male and female mice were included throughout. The authors found changes in behavioral transitions from freezing to flight during conditioning as the tone transitioned into white noise, and a switch in freezing during extinction such that it became high during the white noise as flight behavior decreased. Overall, this was an interesting analysis of transitions in defensive behaviors to a serially presented cue consisting of two auditory stimuli during conditioning and then extinction.

      Strengths

      The highlights in this study were the significant switches in freezing and escape-like behaviors as the cue transitioned between the two auditory stimuli during fear conditioning, and then adjustment of those behaviors across extinction.

      These main findings were a result of thorough behavioral analyses with key control groups (reversed stimulus order, unpaired conditioning, and shock only groups), assessing freezing, jumping, darting and tail rattling to try to parse out associative versus non-associative features of the behavioral profiles.

      Weaknesses

      While the detailed analyses of defensive behaviors in mice in a situation of signaled imminent threat add valuable knowledge for those studying fear conditioning, the caveat is that it is unclear how broadly applicable these findings will be. It makes sense that similar transitions in defensive behaviors will occur across organisms, but each organism and each psychiatric disorder will have unique profiles.

    4. Reviewer #2 (Public review):

      Summary:

      The authors examined several defensive responses elicited during Pavlovian conditioning using a serial compound stimulus (SCS) as the conditioned stimulus (CS) and a shock unconditioned stimulus (US) in male and female mice. The SCS consisted of tone pips followed by white noise. Their design included conditions in which mice were exposed to the CS and US in a paired fashion, in an unpaired fashion, or only exposed to the shock US, as well as paired and unpaired conditions that reversed the order of the SCS. They compared freezing, jumping, darting, and tail rattling across all groups during conditioning and extinction. During conditioning, strong freezing responses to the tone pips followed by strong jumping and darting responses to the white noise were present in the paired group but less robust or not present in the unpaired or shock only groups. During extinction, tone-induced freezing diminished while the jumping was replaced by freezing and darting in the paired group. Together, these findings support the idea that associative pairings are necessary for conditioned defensive responses.

      Strengths:

      The study has strong control groups including a group that receives the same stimuli in an unpaired fashion and another control group that only receives the shock US and no CS to test the associative value of the SCS to the US. The authors examine a wide variety of defensive behaviors that emerge during conditioning and shift throughout extinction: in addition to the standard freezing response, jumping, darting, and tail rattling were also measured.

      The revised version has greatly strengthened this study by including additional control groups (e.g., reversing the order of the compound stimuli in both paired and unpaired conditions).

    1. Author response:

      The following is the authors’ response to the current reviews.

      We thank the Reviewer for all their effort and suggestions over multiple drafts. Their comments have encouraged us to read and think more deeply about the issue under discussion (BLA spiking in response to CS/US inputs), and to find the papers whose contents we think provide a potential solution. We agree that there is more to understand about the mechanisms underlying associative learning in the BLA. We offer our paper as providing a new way of understanding the role of circuit dynamics (rhythms) in guiding associative learning via STDP. As we pointed out in our response to the previous review, the issue highlighted by the Reviewer is an issue for the entire field of associative learning in BLA: our discussion of the issue suggests why the experimentally observed BLA spiking in response to CS inputs, performed in the absence of US inputs (as done in the papers cited by the Reviewer), may not be what occurs in the presence of the US. Since our explanation involves the role of neuromodulators, such as ACh and dopamine, the suggestion is open to further testing.


      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Public Review’s only objection: “Deficient in this study is the construction of the afferent drive to the network, which does not elicit activities that are consistent with those observed to similar stimuli. It still remains to be demonstrated that their mechanism promotes plasticity for training protocols that emulate the kinds of activities observed in the BLA during fear conditioning.”

      Recommendations for the Authors: “The authors have successfully addressed most of my concerns. I commend them for their thorough response. The one nagging issue is the unrealistic activation used to drive CS and US activation in their network. While I agree that their stimulus parameters are consistent with a contextual fear task, or one that uses an olfactory CS, this was not the focus of their study as originally conceived. Moreover, the types of activation observed in response to auditory cues, which is the focus of their study, do not follow what is reported experimentally. Thus, I stand by the critique that the proposed mechanism has not been demonstrated to work for the conditioning task which the authors sought to emulate (Krabbe et al. 2019). Frustratingly, addressing this is simple: run the model with ECS neurons driven so that they fire bursts of action potentials every ~1 sec for 30 sec, and with the US activation noncontiguous with that. If the model does not produce plasticity in this case, then it suggests that the mechanisms embedded in the model are not sufficient, and more work is needed to identify them. While 'memory' effects are possible that could extend the temporal contiguity of the CS and US, the authors need to provide experimental evidence for this occurring in the BLA under similar conditions if they want to invoke it in their model. 

      (1) Fair response. I accept the authors' arguments and changes.

      (2) The authors rightly point out that the simulated afferents need not perfectly match the time courses of the peripheral inputs, since the amygdala receives them indirectly via the thalamus, cortex, etc. However, it is known how amygdala neurons respond to such stimuli, so it behooves the authors to incorporate that fact into their model.

      Quirk et al. 1997 show that the response to the tone plummets after the first 100 ms in Figs 5A and 6B. The Herry et al. 2007 paper emphasizes the transient response to tone pips, with spiking falling back to a low-rate Poisson firing baseline outside of the time when the pip is delivered.

      Regarding potential metabotropic glutamate activation, the stimulus in Whittington et al. 1995 was electrical stimulation at 100 Hz that would synchronously activate a large volume of tissue, which is far outside the physiological norm. I appreciate that metabotropic glutamate receptors may play a role here, but ultimately the model depends upon spiking activity for the plastic process to occur, and to the best of my knowledge the spiking activity in BLA in response to a sustained, unconditioned tone is brief (see also Quirk, Repa, and LeDoux 1995). Perhaps a better justification for the authors would be Bordi and LeDoux 1992, which found that 18% of auditory responsive neurons showed a 'sustained' response, but the sustained response neurons appear to show much weaker responses than those with transient ones (Fig 2). I am willing to say that their paper IS relevant to contextual fear, but that is not what the authors set out to do.

      (3) Fair response. 

      (4) Very good response! 

      Minor points: All points were addressed.”

      We thank Reviewer 1 (R1) for the positive feedback and also for pointing out that, in R1’s opinion, there is still a nagging issue related to the activation in response to CS we modeled. In (Krabbe et al., 2019), CS is a pulsed input and US is delivered right after the CS offset. The current objection of R1 is that instead, we are modeling CS and US as continuous and overlapping. R1 suggested that we use the actual inputs and see whether they produce the desired outputs. The answer is simple: it will not work because we need the effects of CS and US on pyramidal cells to overlap. We note that the fear learning community appears to agree with us that such contingency is necessary for synaptic plasticity (Sun et al., 2020; Palchaudhuri et al., 2024). To the best of our understanding, the source of that overlap is not understood in the community, and the gap has been much noticed (Sun et al., 2020). We do note, however, that STDP may not be the only kind of plasticity in fear learning (Li et al., 2009; Kim et al., 2013, 2016).

      It is important to emphasize that it is not the aim of our paper to model the origin of the overlap. Rather, our intent is to demonstrate the roles of brain rhythms in producing the appropriate timing for STDP, assuming that ECS and F cells can continue to be active after the offset of CS and US, respectively. This assumption is very close to how the field now treats the plasticity, even for auditory fear conditioning (Sun et al., 2020). Thus, our methodology does not contradict known results. However, the question raised by R1 is indeed very interesting, if not the point of our paper. Hence, below we give details about why our hypothesis is reasonable.

      Several papers (Quirk, Repa and LeDoux, 1995; Herry et al, 2007; Bordi and Ledoux 1992) show that the pips in auditory fear conditioning increase the activity of some BLA neurons: after an initial transient, the overall spike rate is still higher than baseline activity. As R1 points out, we did not model the transient increase in BLA spiking activity that occurs in response to each pip in the auditory fear conditioning paradigm. However, we did model the low-level sustained activity that occurs in between pips of the CS in the absence of US (Quirk, Repa and LeDoux, 1995, Fig. 2) and after CS offset (see Fig. 2B, left hand part of our manuscript). We read the data of Quirk et al., 1995 as suggesting that the low-level activity can be sustained for some indefinite time after a pip (cut off of recording was at 500 ms with no noticeable decrease in activity). As such, even if the pips and the US do not overlap in time, as in (Krabbe et al., 2019), the spiking of the ECS can be sustained after CS offset and thus overlap with US, a condition necessary in our model for plasticity through STDP. In Herry et al., 2007 Fig. 3 shows that BLA neurons respond to a pip at the population level with a transient increase in spiking and return to a baseline Poisson firing rate. However, a subset of cells continues to fire at an increased-over-baseline rate after the transient effect wears off (Fig. 3C, top few neurons) and this increased rate extends to the end of the recording time (here ~ 300 ms). These are the cells we consider to be ECS in our model. In Quirk et al., 1997, Fig. 5A also shows sustained low level activity of neurons in BLA in response to a pip. The low-level activity is shown to increase after fear learning, as is also the case in our model since ECS now entrains F so that there are more pyramidal cells spiking in response to CS. 
      The question remains as to whether the spiking is sustained long enough and at a high enough rate for STDP to take place when US is presented sometime after the offset of the CS.

      Experimental recordings cannot speak to the rate of spiking of BLA neurons during US due to recording interference from the shock. However, evidence seems to suggest that ECS activity should increase during the US due to the release of acetylcholine (ACh) from neurons in the basal forebrain (BF) (Rajebhosale et al., 2024). Pyramidal cells of the BLA robustly express M1 muscarinic ACh receptors (Muller et al., 2013; McDonald and Mott, 2021). Thus, ACh from BF should elicit a depolarization in pyramidal cells. Indeed, the pairing of ACh with even low levels of spiking of BLA neurons results in a membrane depolarization that can last 7 – 10 s (Unal et al., 2015). This should induce higher spiking rates and more sustained activity in the ECS and F neurons during and after the presentation of US, thus ensuring a concomitant activation of ECS and fear (F) neurons necessary for STDP to take place. Other modulators, including dopamine, may also play a role in producing the sustained activity. Activation of US leads to increased dopamine release in the BLA (Harmer and Phillips, 1999; Suzuki et al., 2002). D1 receptors are known to increase the membrane excitability of BLA projection neurons by lowering their spiking threshold (Kröner et al., 2005). Thus, the activation of the US can lead to continued and higher firing rates of ECS and F. The effect of dopamine can last up to 20 minutes (Kröner et al., 2005). For CS-positive neurons, the ACh modulation coming from the firing of US may lead to a temporary extension of firing that is then amplified and continued by dopaminergic effects.

      Hence, we suggest that the problem raised by R1 may be solved by considering the roles of ACh and dopamine in the BLA. The involvement of neuromodulators is consistent with the suggestion of (Sun et al., 2020). The model we have may be considered a “minimal” model that puts in by hand the overlap in activity due to the neuromodulation without explicitly modeling it. As R1 says, it is important for us to give the motivation for our hypotheses. We have used the simplest way to model overlap without assumptions about timing specificity in the overlap.
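
      The overlap argument can be illustrated with a short interval calculation. This is only a sketch, not the biophysical model: the function names are hypothetical, and the durations (a 30 s CS, a 2 s US delivered at CS offset, and 7 s of persistent activity, the lower end of the 7 – 10 s depolarization reported by Unal et al., 2015) are illustrative values chosen from numbers mentioned in this exchange.

```python
def effect_window(onset_s, offset_s, persistence_s=0.0):
    """Interval during which a stimulus still influences spiking.

    persistence_s is an assumed extension of activity after stimulus
    offset (for example, from neuromodulation); it is not measured here.
    """
    return (onset_s, offset_s + persistence_s)


def overlap_duration(window_a, window_b):
    """Length in seconds of the intersection of two (start, end) intervals."""
    start = max(window_a[0], window_b[0])
    end = min(window_a[1], window_b[1])
    return max(0.0, end - start)


def cs_us_overlap(persistence_s):
    """Overlap of CS and US effects when the US starts at CS offset."""
    cs = effect_window(0.0, 30.0, persistence_s)   # illustrative 30 s CS
    us = effect_window(30.0, 32.0, persistence_s)  # illustrative 2 s US
    return overlap_duration(cs, us)


no_persistence = cs_us_overlap(0.0)    # stimuli alone: no overlap
with_persistence = cs_us_overlap(7.0)  # effects outlast the stimuli
```

      With no persistence the two windows only touch, so the overlap is zero; with persistent activity the CS-driven window extends into the US-driven one, which is the condition the model assumes.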

      To account for these points in the manuscript, we first specified that we consider the effects of the US and CS inputs on the neuronal network as overlapping, while the actual inputs may not overlap. To do that, we added the following text:

      (1) In the introduction: 

      “In this paper, we aim to show 1) How a variety of BLA interneurons (PV, SOM and VIP) lead to the creation of these rhythms and 2) How the interaction of the interneurons and the rhythms leads to the appropriate timing of the cells responding to the US and those responding to the CS to promote fear association through spike-timing-dependent plasticity (STDP). Since STDP requires overlap of the effects of the CS and US, and some conditioning paradigms do not have overlapping US and CS, we include as a hypothesis that the effects of the CS and US overlap even if the CS and US stimuli do not. In the Discussion, we suggest how neuromodulation by ACh and/or dopamine can provide such overlap. We create a biophysically detailed model of the BLA circuit involving all three types of interneurons and show how each may participate in producing the experimentally observed rhythms and interacting to produce the necessary timing for the fear learning.”

      (2) In the Result section “With the depression-dominated plasticity rule, all interneuron types are needed to provide potentiation during fear learning”:

      “The 40-second interval we consider has both ECS and F, as well as VIP and PV interneurons, active during the entire period: an initial bout of US is known to produce a long-lasting fear response beyond the offset of the US (Hole and Lorens, 1975) and to induce the release of neuromodulators. The latter, in particular acetylcholine and dopamine that are known to be released upon US presentation (Harmer and Phillips, 1999; Suzuki et al., 2002; Rajebhosale et al., 2024), may induce more sustained activity in the ECS, F, VIP, and PV neurons during and after the presentation of US, thus ensuring a concomitant activation of those neurons necessary for STDP to take place (see “Assumptions and predictions of the model” in the Discussion).”

      (3) In the Discussion section “Synaptic plasticity in our model”:

      “Synaptic plasticity is the mechanism underlying the association between neurons that respond to the neutral stimulus CS (ECS) and those that respond to fear (F), which instantiates the acquisition and expression of fear behavior. One form of experimentally observed long-term synaptic plasticity is spike-timing-dependent plasticity (STDP), which defines the amount of potentiation and depression for each pair of pre- and postsynaptic neuron spikes as a function of their relative timing (Bi and Poo, 2001; Caporale and Dan, 2008). All forms of STDP require that there be an overlap in the firing of the pre- and postsynaptic cells. In some fear learning paradigms, the US and the CS do not overlap. We address this below under “Assumptions and predictions of the model”, showing how the effects of US and CS on the spiking of the relevant neurons can overlap even in the absence of overlap of US and CS.”
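
      A pair-based STDP rule of the kind described in this passage can be sketched as follows. This is a generic exponential-window rule, not the rule used in the manuscript: the amplitudes and time constants are hypothetical, and "depression-dominated" is taken here to mean that the depression lobe outweighs the potentiation lobe, which is an assumption.

```python
import math


def stdp_weight_change(delta_t_ms, a_plus=0.5, a_minus=1.0,
                       tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Weight change for one spike pair, with delta_t_ms = t_post - t_pre.

    All parameter values are placeholders; a_minus > a_plus makes this
    sketch depression-dominated.
    """
    if delta_t_ms > 0:
        # Pre before post: potentiation that decays with the lag.
        return a_plus * math.exp(-delta_t_ms / tau_plus_ms)
    if delta_t_ms < 0:
        # Post before pre: depression that decays with the lag.
        return -a_minus * math.exp(delta_t_ms / tau_minus_ms)
    return 0.0


def total_weight_change(pre_spikes_ms, post_spikes_ms, **params):
    """Sum the pairwise rule over all pre/post spike pairs."""
    return sum(
        stdp_weight_change(t_post - t_pre, **params)
        for t_pre in pre_spikes_ms
        for t_post in post_spikes_ms
    )
```

      Under such a rule, a postsynaptic spike flanked symmetrically by presynaptic spikes yields net depression, which is why the fine timing of pre- and postsynaptic spikes, rather than coactivity alone, decides the sign of the change.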

      To fully present our reasoning about the origin of the overlap of the effects of US and CS, we modified and added to the last paragraph of the Discussion section “Assumptions and predictions of the model”, which now reads as follows:

      “Finally, our model requires the effect of the CS and US inputs on the BLA neuron activity to overlap in time in order to instantiate fear learning through STDP. Such a hypothesis, that learning uses spike-timing-dependent plasticity, is common in the modeling literature (Bi and Poo, 2001; Caporale and Dan, 2008; Markram et al., 2011). Current paradigms of fear conditioning include examples in which the CS and US stimuli do not overlap (Krabbe et al., 2019). Such a condition might seem to rule out the mechanisms in our paper. Nevertheless, the argument below suggests that the effects of the CS and US can cause an overlap in neuronal spiking of ECS, F, VIP, and SOM, even when CS and US inputs do not overlap.

      Experimental recordings cannot speak to the rate of spiking of BLA neurons during US due to recording interference from the shock. However, evidence suggests that ECS activity should increase during the US due to the release of acetylcholine (ACh) from neurons in the basal forebrain (BF) (Rajebhosale et al., 2024). Pyramidal cells of the BLA robustly express M1 muscarinic ACh receptors (McDonald and Mott, 2021). Thus, ACh from BF should elicit a depolarization in pyramidal cells. Indeed, the pairing of ACh with even low levels of spiking of BLA neurons results in a membrane depolarization that can last 7 – 10 s (Unal et al., 2015).   Other modulators, including dopamine, may also play a role in producing the sustained activity. Activation of US leads to increased dopamine release in the BLA (Harmer and Phillips, 1999; Suzuki et al., 2002). D1 receptors are known to increase the membrane excitability of BLA projection neurons by lowering their spiking threshold (Kröner et al., 2005). Thus, neuromodulator release should induce higher spiking rates and more sustained activity in the ECS and F neurons during and after the presentation of US, thus ensuring a concomitant activation of ECS and fear (F) neurons necessary for STDP to take place. Thus, the activation of the US can lead to continued and higher firing rates of ECS and F. The effect of dopamine can last up to 20 minutes (Kröner et al., 2005). For CS-positive neurons, the ACh modulation coming from the firing of US may lead to a temporary extension of firing that is then amplified and continued by dopaminergic effects.

      Hence, we suggest that the problem apparently posed by the non-overlapping US and CS in some paradigms of auditory fear conditioning (Krabbe et al., 2019) may be solved by considering the roles of ACh and dopamine in the BLA. The model we have may be considered a “minimal” model that puts in by hand the overlap in activity due to the neuromodulation without explicitly modeling it. We have used the simplest way to model overlap without assumptions about timing specificity in the overlap. We note that, even though ECS and F neurons have the ability to fire continuously when ACh and dopamine are involved, the participation of the interneurons enforces periodic silence needed for the depression-dominated STDP.”

      In the Discussion (in section “Involvement of other brain structures”), we also acknowledged that the overlap between the effects of US and CS in the BLA may be provided by other brain structures by writing the following:

      “In our model, the excitatory projection neurons and VIP and PV interneurons show sustained activity during and after the US presentation, thus allowing potentiation through STDP to take place. The medial prefrontal cortex and/or the hippocampus may provide the substrates for the continued firing of the BLA neurons after the 2-second US stimulation. We also discuss below that this network sustained activity may originate from neuromodulator release induced by US (see section “Assumptions and predictions of the model” in the Discussion).”

      We also improved our discussion about the (Grewe et al., 2017) paper, which questions Hebbian plasticity in the context of fear conditioning based on several critiques. We included a new section in the Discussion entitled “Is STDP needed in fear conditioning?” to discuss those critiques and how our model may address them, which reads as follows:

      “Is STDP needed in fear conditioning? The study in (Grewe et al., 2017) questions the validity of the Hebbian model in establishing associative learning during fear conditioning. There are several critiques we discuss here. The first critique is that Hebbian plasticity does not explain the experimental finding showing that both upregulation and downregulation of stimulus-evoked responses are present between coactive neurons. The upregulation is provided by our model, so the issue is the downregulation, which is not addressed by our model. However, our model highlights that coactivity alone does not create potentiation; the fine timing of the pre- and postsynaptic spikes determines whether there is potentiation or depression. Here, we find that PING networks are instrumental in setting up the fine timing for potentiation. We suggest that networks not connected to produce the PING may undergo depression when coactive.

      The second critique raised by (Grewe et al., 2017) is that Hebbian plasticity alone does not explain why most of the cells exhibiting enhanced responses to the CS did not react to the US before fear conditioning. They suggest that neuromodulators may provide a third condition (besides the activity of the pre- and postsynaptic neurons) that changes the plasticity rule. Our model also does not explicitly address this experimental finding since it requires F to be initially activated by US in order for the fear association to be established. We agree that the fear cells described in (Grewe et al. 2017) may be depolarized by the US without reaching the spiking threshold; however, with neuromodulation provided during the fear training, the same input can lead to spiking, enabling the conditions for Hebbian plasticity. Our discussions above about how neuromodulators affect excitability are relevant to this point. We do not exclude that other forms of plasticity may play a role during fear conditioning in cells not initially activated by the US, but this is not the topic of our modeling study.

      The third critique raised by (Grewe et al., 2017) is that Hebbian plasticity cannot explain why the majority of cells that were US- and CS-responsive before training have a reduced CS-evoked response afterward. The reduced response happens over multiple exposures of CS without US; this can involve processes similar to those present in fear extinction, which require plasticity in further networks, especially involving the infralimbic cortex (Milad and Quirk, 2002; Burgos-Robles et al., 2007). An extension of our model could investigate such mechanisms. In the fourth critique, (Grewe et al., 2017) suggests that the Hebbian plasticity rule cannot easily account for the reduction of the responses of many CS+-responsive cells, but not of the CS−-responsive cells. We suggest that the circuits involving paradigms similar to fear extinction do not involve the CS- cells.

      Overall, we agree with (Grewe et al., 2017) that neuromodulators play a crucial role in fear conditioning, especially in prolonging the US- and CS-encoding activity, as discussed in the section “Assumptions and predictions of the model” in the Discussion, or even participating in changing the details of the plasticity rule. A possible follow-up of our work involves investigating how fear ensembles form and modify through fear conditioning and later stages. This follow-up work may involve using a tri-conditional rule, as suggested in (Grewe et al., 2017), in which the potential role of neuromodulators is taken into account in the plasticity rule in addition to the pre- and postsynaptic neuron activity. Another direction is to investigate a possible relationship between neuromodulation and a depression-dominated Hebbian rule.”

      Finally, we made additional minor changes to the manuscript:

      (1) In the Result section “Interneurons interact to modulate fear neuron output”, we specified the following:

      “The US input on the pyramidal cell and VIP interneuron is modeled as a Poisson spike train at ~ 50 Hz and an applied current, respectively. In the rest of the paper, we will use the words “US” as shorthand for “the effects of US”.” 

      (2) In the Result section “Interneuron rhythms provide the fine timing needed for depression dominated STDP to make the association between CS and fear”, we also reported the following:

      “Similarly to the US, in the rest of the paper, we will use the words “CS” as shorthand for “the effects of CS”. In our simulations, CS is modeled as a Poisson spike train at ~ 50 Hz, independent of the US input. Thus, we hypothesize that the time structure of the inputs sometimes used for the training (e.g., a series of auditory pips) is not central to the formation of the plasticity in the network.”  
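
      The Poisson inputs described above can be sketched with the standard library alone. This is an illustration under stated assumptions: the function name is hypothetical, the process is homogeneous with exactly 50 Hz (the text says ~ 50 Hz), and the 40 s duration is borrowed from the interval mentioned in the Results text quoted earlier.

```python
import random


def poisson_spike_train(rate_hz, duration_s, seed=None):
    """Spike times (s) of a homogeneous Poisson process on [0, duration_s).

    Inter-spike intervals are drawn from an exponential distribution
    with mean 1 / rate_hz.
    """
    rng = random.Random(seed)
    spikes = []
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        spikes.append(t)
        t += rng.expovariate(rate_hz)
    return spikes


# Different seeds stand in for independent CS and US inputs.
cs_spikes = poisson_spike_train(50.0, 40.0, seed=1)
us_spikes = poisson_spike_train(50.0, 40.0, seed=2)
```

      Because the two trains are drawn independently, neither carries any pip-like temporal structure, which matches the stated hypothesis that the time structure of the training inputs is not central to the plasticity.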

      Reviewer #2 (Public Reviews):

      The authors of this study have investigated how oscillations may promote fear learning using a network model. They distinguished three types of rhythmic activities and implemented an STDP rule to the network aiming to understand the mechanisms underlying fear learning in the BLA. 

      After the revision, the fundamental question, namely, whether the BLA networks can or cannot intrinsically generate any theta rhythms, is still unanswered. The authors added this sentence to the revised version: "A recent experimental paper, (Antonoudiou et al., 2022), suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings under certain conditions, such as reduced inhibitory tone." In the cited paper, the authors studied gamma oscillations, and when they applied 10 µM gabazine to the BLA slices, they observed rhythmic oscillations at theta frequencies. Gabazine at 10 µM does not reduce the GABA-A receptor-mediated inhibition but eliminates it, resulting in rhythmic population bursts driven solely by excitatory cells. Thus, the results by Antonoudiou et al., 2022 contrast with, and do not support, the present study, which claims that rhythmic oscillations in the BLA depend on the function of interneurons. Thus, there is still no convincing evidence that BLA circuits can intrinsically generate theta oscillations in intact brain or acute slices. If one extrapolates from the hippocampal studies, then this is not surprising, as the hippocampal theta depends on extrahippocampal inputs, including, but not limited to, the entorhinal afferents and medial septal projections (see Buzsaki, 2002). Similarly, respiratory-related 4 Hz oscillations are also driven by extrinsic inputs. Therefore, at present, it is unclear which kind of physiologically relevant theta rhythm in the BLA networks has been modelled.

      In our public reply to the Reviewer’s point, we reported the following:

      (1) We kindly disagree that (Antonoudiou et al., 2022) contrasts with our study. (Antonoudiou et al., 2022) is a slice study showing that the BLA theta power (3-12 Hz) increases with gabazine compared to baseline. With all GABAergic currents omitted due to gabazine, the LFP is composed of excitatory currents and intrinsic currents. In our model, the high theta (6-12 Hz) comes from the spiking activity of the SOM cells, which increase their activity if the inhibition from VIP cells is removed. Thus, the model produces high theta in the presence of gabazine (see Fig. 1 in our replies to the Reviewers’ public comments). The model also shows that a PING rhythm is produced without gabazine, and that this rhythm goes away with gabazine because PING requires feedback inhibition from PV to fear cells. Thus, the high theta increase and gamma reduction with gabazine in the (Antonoudiou et al., 2022) paper can be reproduced in our model.

      (2) We agree that (Antonoudiou et al., 2022) alone is not sufficient evidence that the BLA can produce low theta (3-6 Hz); we discussed a new paper (Bratsch-Prince et al., 2024) that provides further evidence of BLA ability to produce low theta and under what circumstances. The authors reported that intrinsic BLA theta is produced in slices with ACh stimulation (without needing external glutamate input) which, in vivo, would be provided by the basal forebrain (Rajebhosale et al., eLife, 2024) in response to salient stimuli. The low theta depends on muscarinic activation of CCK interneurons, a group of interneurons that overlaps with the VIP neurons in our model (Krabbe 2017; Mascagni and McDonald, 2003). We suspect that the low theta produced in (Bratsch-Prince et al., 2024) is the same as the low theta in our model. In future work, we will aim to show that ACh activates the BLA VIP cells, which are essential to the low theta generation in the network.

      In the manuscript, we added to and modified the Discussion section “Where the rhythms originate, and by what mechanisms”. This text aims to better discuss (Antonoudiou et al. 2022) and introduce (Bratsch-Prince et al., 2024) with its connection to our hypothesis that the theta oscillations can be produced within the BLA. The new version is:

      “Where the rhythms originate, and by what mechanisms. A recent experimental paper (Antonoudiou et al., 2022) suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings when inhibition is totally removed due to gabazine application. They draw this conclusion in mice by removing the hippocampus, which can volume conduct to BLA, and noticing that other nearby brain structures did not display any oscillatory activity. In our model, we note that when inhibition is removed, both AMPA and intrinsic currents contribute to the network dynamics and the LFP. Thus, interneurons with their specific intrinsic currents (i.e., D-current in the VIP interneurons, and NaP- and H- currents in SOM interneurons) can indeed affect the model LFP and support the generation of theta and gamma rhythms (Fig. 6G).

      Another slice study, (Bratsch-Prince et al., 2024), shows that BLA is intrinsically capable of producing a low theta rhythm with ACh stimulation and without needing external glutamate input. ACh is produced in vivo by the basal forebrain in response to US (Rajebhosale et al., 2024). Although we did not explicitly include the BF and ACh modulation of BLA in our model, we implicitly include the effect of ACh in BLA by increasing the activity of the VIP cells, which then produce the low theta rhythm. Indeed, low theta in the BLA is known to depend on the muscarinic activation of CCK interneurons, a group of interneurons that overlaps with the class of VIP neurons in our model (Mascagni and McDonald, 2003; Krabbe et al., 2018). 

      Although the BLA can produce these rhythms, this does not rule out that other brain structures also produce the same rhythms through different mechanisms, and these can be transmitted to the BLA. Specifically, it is known that the olfactory bulb produces and transmits the respiratory-related low theta (4 Hz) oscillations to the dorsomedial prefrontal cortex, where it organizes neural activity (Bagur et al., 2021). Thus, the respiratory-related low theta may be captured by BLA LFP because of volume conduction or through the BLA's extensive communications with the prefrontal cortex. Furthermore, high theta oscillations are known to be produced by the hippocampus during various brain functions and behavioral states, including during spatial exploration (Vanderwolf, 1969) and memory formation/retrieval (Raghavachari et al., 2001), which are both involved in fear conditioning. Similarly to the low theta rhythm, the hippocampal high theta can manifest in the BLA. It remains to be understood how these other rhythms may interact with the ones described in our paper. However, we emphasize that there is also evidence (as discussed above) that these rhythms arise within the BLA.”

      Reviewer #2 (Recommendations for the Authors):

      (1) Three different types of VIP interneurons with distinct firing patterns have been revealed in the BLA (Rhomberg et al., 2018). Does the generation of rhythmic activities depend on the firing features of VIP interneurons? Does it matter whether VIP interneurons fire bursts of action potentials or discharge more regularly?

      (2) The authors used data for modeling SST interneurons obtained e.g., in the hippocampus. However, there are studies in the BLA where the intrinsic characteristics of SST interneurons have been reported (Unal et al., 2020; Guthman et al., 2020; Vereczki et al., 2021). Have the authors considered using results of studies that were conducted in the BLA? 

      We thank the Reviewer for their questions, which have helped us further improve our manuscript in response to similar queries from Reviewer 3 in the previous review round. More in detail:

      (1) Although other electrophysiological types exist (Sosulina et al., 2010), we hypothesized that the electrophysiological type of VIP neurons that display intrinsic stuttering is the type that would be involved in mediating low theta oscillations during fear conditioning. This is because VIP intrinsic stuttering in cortical neurons is thought to involve the D-current, which helps create low theta bursting oscillations in the neuronal spiking patterns (Chartove et al., 2020). We think that the other subtypes of VIP interneurons are not essential for the low theta oscillatory dynamics observed during fear conditioning and, thus, did not provide an essential constraint for the phenomena we are trying to capture. VIP interneurons in our network must fire bursts at low theta to be effective in creating the pauses in ECS and F spiking needed for potentiation; single spikes at theta are not sufficient to create these pauses.

      (2) In our model, we used results from a study conducted in the BLA (Sosulina et al., 2010). SOM cells in the BLA display several physiological types. We chose to include in our model the type showing early adaptation in response to a depolarizing current and inward (outward) rectification upon the initiation (release) of a hyperpolarizing current. We hypothesize that this type can produce high theta oscillations, a prominently observed rhythm in the BLA. Unal et al. (2020) found two populations of SOM cells in the BLA, which had previously been recorded by Sosulina et al. (2010), including the type we chose to model. This SOM cell type shows a low-threshold spiking profile characterized by spike frequency adaptation and a voltage sag indicative of the H-current used in our model. Guthman et al. (2020) also found a population of SOM cells with hyperpolarization-induced sag.

      Our model also uses a NaP-current for which there is no data in the BLA. However, it is known to exist in hippocampal SOM cells and that NaP- and H- currents can produce such a high theta in hippocampal cells. It is a standard practice in modeling to use the best possible replacement for unknown currents. Of course, it is unfortunate to have to do this. We also note that models can be considered proof of principle, that can be proved or disproved by further experimental work. Both (Guthman et al., 2020) and (Vereczki et al., 2021) also uncover further heterogeneity among BLA SOM interneurons involving more than electrophysiology. We hypothesize that such a level of heterogeneity revealed by these three studies is not key to the question we are asking (where crucial ingredients are the rhythms) and, therefore, was not included in our minimal model.

      We modified the Discussion section titled “Assumptions and predictions of the model” as follows:

      “Our model, which is a first effort towards a biophysically detailed description of the BLA rhythms and their functions, does not include the neuron morphology, many other cell types, conductances, and connections that are known to exist in the BLA; models such as ours are often called “minimal models” and constitute most biologically detailed models. For example, although there is considerable variability in the activity patterns of both VIP cells and SOM cells (Sosulina et al., 2010; Guthman et al., 2020; Ünal et al., 2020; Vereczki et al., 2021), our focus was specifically on those subtypes that generate critical rhythms within the BLA. Such minimal models are used to maximize the insight that can be gained by omitting details whose influence on the answers to the questions addressed in the model are believed not to be qualitatively important. We note that the absence of these omitted features constitutes hypotheses of the model: we hypothesize that the absence of these features does not materially affect the conclusions of the model about the questions we are investigating. Of course, such hypotheses can be refuted by further work showing the importance of some omitted features for these questions and may be critical for other questions. Our results hold when there is some degree of heterogeneity of cells of the same type, showing that homogeneity is not a necessary condition.”

      (3) The authors may double-check the reference list, as e.g., Cuhna-Reis et al., 2020 is not listed. 

      We thank the Reviewer for spotting this. We checked the reference list and all the references are now listed.

      Finally, we wanted to acknowledge that we made other changes to the manuscript unrelated to the reviewers’ questions with the purpose of gaining clarity. More specifically:

      (1) We included a section titled “Significance” after the abstract and keywords, which reads as follows:

      “Our paper accounts for the experimental evidence showing that amygdalar rhythms exist, suggests network origins for these rhythms, and points to their central role in the mechanisms of plasticity involved in associative learning. It is one of the few papers to address high-order cognition with biophysically detailed models, which are sometimes thought to be too detailed to be adequately constrained. Our paper provides a template for how to use information about brain rhythms to constrain biophysical models. It shows in detail, for the first time, how multiple interneurons help to provide time scales necessary for some kinds of spike-timing-dependent plasticity (STDP). It spells out the conditions under which such interactions between interneurons are needed for STDP and why. Finally, our work helps to provide a framework by which some of the discrepancies in the fear learning literature might be reevaluated. In particular, we discuss issues about Hebbian plasticity in fear learning; we show in the context of our model how neuromodulation might resolve some of those issues. The model addresses issues more general than that of fear learning since it is based on interactions of interneurons that are prominent in the cortex, as well as the amygdala.”

      (2) The Result section “Physiology of the interneuron types is critical to their role in depression-dominated plasticity”, which is now titled “Mechanisms by which interneurons contribute to potentiation in depression-dominated plasticity”, now reads as follows:

      “Mechanisms by which interneurons contribute to potentiation during depression-dominated plasticity. The PV cell is necessary to induce the correct pre-post timing between ECS and F needed for long-term potentiation of the ECS to F conductance. In our model, PV has reciprocal connections with F and provides lateral inhibition to ECS. Since the lateral inhibition is weaker than the feedback inhibition, PV tends to bias ECS to fire before F. This creates the fine timing needed for the depression-dominated rule to instantiate plasticity. If we used the classical Hebbian plasticity rule (Bi and Poo, 2001) with gamma frequency inputs, this fine timing would not be needed and ECS to F would potentiate over most of the gamma cycle, and thus we would expect random timing between ECS and F to lead to potentiation (Fig. S4). In this case, no interneurons are needed (See Discussion “Synaptic plasticity in our model” for the potential necessity of the depression-dominated rule). 

      In this network configuration, the pre-post timing for ECS and F is repeated robustly over time due to coordinated gamma oscillations (PING, as shown in Fig. 4A, Fig. 1C) arising through the reciprocal interactions between F and PV (Feng et al., 2019). PING can arise only when PV is in a sufficiently low excitation regime such that F can control PV activity (Börgers et al., 2005), as in Fig. 4A. However, although such a low excitation regime establishes the correct fine timing for potentiation, it is not sufficient to lead to potentiation (Fig. 4A, Fig. S2C): the depression-dominated rule leads to depression rather than potentiation unless the PING is periodically interrupted. During the pauses, made possible only in the full network by the presence of VIP and SOM, the history-dependent build-up of depression decays back to baseline, allowing potentiation to occur on the next ECS/F active phase. (The detailed mechanism of how this happens is in the Supplementary Information, including Fig. S2). Thus, a network without the other interneuron types cannot lead to potentiation. Though a low excitation level for a PV cell is necessary to produce a PING, a higher excitation level is necessary to produce a pause in the ECS and F. This higher excitation level is consistent with the experimental literature showing a strong activation of PV after the onset of CS (Wolff et al., 2014). The higher excitation happens when the VIP cell is silent, whereas a low excitation level is achieved when the VIP cell fires and partially inhibits the PV cell (Fig. 4B, Fig. S2D). The interruption in the ECS and F activity requires the participation of another interneuron, the SOM cell (Figs. 2B, S2): the pauses in inhibition from the VIP periodically interrupt ECS and F firing by releasing PV and SOM from inhibition and thus indirectly silencing ECS and F. 
      Without these pauses, depression dominates (see SI section “ECS and F activity patterns determine overall potentiation or depression”).”

      We also removed a supplementary figure (Fig. S2).

      (3) We wanted to state clearly, and motivate, our choice to extend the low theta range to 2-6 Hz and the high theta range to 6-14 Hz, compared with the 3-6 Hz and 6-12 Hz ranges, respectively, used in the BLA experimental literature. Our main reason for extending the ranges was that the peaks of low and high theta power in the VIP and SOM cells, respectively (the cells that generate these oscillations), occurred at the borders of the experimental ranges. Thus, in order to include the peaks of the model LFP, we lowered the lower bound of the low theta range by 1 Hz and raised the upper bound of the high theta range by 2 Hz.
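
      The band choice described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the test signal, the sampling rate, and the helper names `power_spectrum` and `band_peak` are assumptions made for illustration only. It builds a synthetic signal with components at 3.5 Hz and 12 Hz (the approximate peak frequencies reported for the model VIP and SOM cells) and locates the spectral peak inside the extended 2-6 Hz and 6-14 Hz ranges.

```python
import math

def power_spectrum(signal, fs):
    """Naive DFT power spectrum for frequencies 0..fs/2 (illustrative, O(n^2))."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append((re * re + im * im) / n)
    return freqs, power

def band_peak(freqs, power, lo, hi):
    """Return (peak_frequency, peak_power) among bins with lo <= f <= hi."""
    in_band = [(p, f) for f, p in zip(freqs, power) if lo <= f <= hi]
    p, f = max(in_band)
    return f, p

# Extended ranges stated in the response (Hz).
LOW_THETA = (2.0, 6.0)
HIGH_THETA = (6.0, 14.0)

# Hypothetical test signal: 4 s sampled at 100 Hz (0.25 Hz resolution).
fs = 100.0
n = 400
signal = [
    math.sin(2 * math.pi * 3.5 * t / fs) + 0.5 * math.sin(2 * math.pi * 12.0 * t / fs)
    for t in range(n)
]

freqs, power = power_spectrum(signal, fs)
low_peak_f, low_peak_p = band_peak(freqs, power, *LOW_THETA)
high_peak_f, high_peak_p = band_peak(freqs, power, *HIGH_THETA)
print(low_peak_f, high_peak_f)
```

      With a 6-12 Hz band, a 12 Hz peak would sit exactly on the band edge, so any analysis window centred on it would be cut in half; extending the band to 14 Hz keeps the peak and its neighbourhood inside the range.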

      We present a new supplementary figure (Fig. S1) containing the power spectra of VIP, which is the source of low theta in our model, and SOM interneuron, which is the source of high theta:

      We mention Fig. S1 in the Result section “Rhythms in the BLA can be produced by interneurons”, where we added the following text:

      “In the baseline condition, the condition without any external input from the fear conditioning paradigm (Fig. 1B, top), our VIP neurons exhibit short bursts of gamma activity (~38 Hz) at low theta frequencies (~2-6 Hz) (peaking at ~3.5 Hz) (see Fig. S1A).”

      “In our baseline model, SOM cells have a natural frequency of ~12 Hz (Fig. 1B, middle; Fig. S1B), which is at the upper limit of the experimental high theta range; this motivates our choice to extend the high theta range up to 14 Hz in order to include the peak.”

      Knowing the natural frequencies of VIP and SOM interneurons from the Result section “Rhythms in the BLA can be produced by interneurons”, we specified more clearly that we quantify the change of power in the low and high theta range around the power peaks in those ranges. Specifically, we changed some sentences in the first paragraph of the Result section “Increased low-theta frequency is a biomarker of fear learning” as follows:

      “We find that fear conditioning leads to an increase in low theta frequency power of the network spiking activity compared to the pre-conditioned level (Fig. 6 A,B); there is no change in the high theta power. We also find that the LFP, modeled as the linear sum of all the AMPA, GABA, NaP-, D-, and H- currents in the network, similarly reveals a low theta power increase when considering the peak of the low theta power, and no significant variation in the high theta power again when considering the peak of the high theta power (Fig. 6 C,D,E).”
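
      For readers who want the LFP proxy written out, one possible reading of "the linear sum of all the AMPA, GABA, NaP-, D-, and H- currents in the network" is given below. The unit weights, the index i running over all model cells, and the convention that a current absent from a given cell type is zero are assumptions; the response does not specify them.

```latex
\mathrm{LFP}(t) \;=\; \sum_{i \,\in\, \text{cells}}
  \Big[ I_{\mathrm{AMPA},i}(t) + I_{\mathrm{GABA},i}(t)
      + I_{\mathrm{NaP},i}(t) + I_{\mathrm{D},i}(t) + I_{\mathrm{H},i}(t) \Big]
```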

      Finally, we made a few other small changes:

      In the Introduction, we mention the following: “We also note that there is not uniformity on the exact frequencies associated with low and high theta, e.g., (Lorétan et al., 2004) used 2-6 Hz for low theta. Here, we use 2-6 Hz for the low theta range and 6-14 Hz for the high theta range.”

      In Fig. 6DE (reported below point 3)), we reran the statistics using a smaller interval for high theta (11.5-13 Hz) to focus around the peak. Our initial result showing significant change in low theta between pre and post fear conditioning and no change in high theta still holds.

      In Fig. 6 of the Result section “Increased low-theta frequency is a biomarker of fear learning”, we switched the order of panels F and G. This change allows us to first focus on the AMPA currents, which are the major contributors to the low theta power increase, and to specify which AMPA current drives that increase. After that, we present the power spectrum of the GABA currents as well.

      The corresponding text in the Result section, now reads as follows:

      “We find that fear conditioning leads to an increase in low theta frequency power of the network spiking activity compared to the pre-conditioned level (Fig. 6 A,B); there is no change in the high theta power. We also find that the LFP, modeled as the linear sum of all the AMPA, GABA, NaP-, D-, and H- currents in the network, similarly reveals a low theta power increase when considering the peak of the low theta power, and no significant variation in the high theta power again when considering the peak of the high theta power (Fig. 6 C,D,E). These results are consistent with the experimental findings in (Davis et al., 2017). Specifically, the newly potentiated AMPA synapse from ECS to F ensures F is active after fear conditioning, thus generating strong currents in the PV cells to which it has strong connections (Fig. 6F). It is the AMPA currents to the PV interneurons that are directly responsible for the low theta increase; it is the newly potentiated ECS to F synapse that paces the AMPA currents in the PV interneurons to go at low theta. Thus, the low theta increase is due to added excitation provided by the new learned pathway.”

      (4) In the Discussion section “Assumptions and predictions of the model”, we specified the following:

      “Our model predicts that blockade of D-current in VIP interneurons (or silencing VIP interneurons) will both diminish low theta and prevent fear learning. Finally, the model assumes the absence of significantly strong connections from the excitatory projection cells ECS to PV interneurons, unlike the ones from F to PV. Including those synapses would alter the PING rhythm created by the interactions between F and PV, which is crucial for fine timing between ECS and F needed for LTP.”

      (5) Finally, to broaden the potential interest of our study, we added the following sentences:

      At the conclusion of the abstract:

      “The model makes use of interneurons commonly found in the cortex and, hence, may apply to a wide variety of associative learning situations.”

      At the conclusion of the introduction:

      “Finally, we note that the ideas in the model may apply very generally to associative learning in the cortex, which contains similar subcircuits of pyramidal cells and interneurons: PV, SOM and VIP cells.” 

      Also, changes in the emphasis of the paper led us to remove the following from the abstract: “Finally, we discuss how the peptide released by the VIP cell may alter the dynamics of plasticity to support the necessary fine timing.”

    2. eLife Assessment

      This valuable modeling study explores how biophysical properties of different interneuron subtypes in the basolateral amygdala (BLA) enable production of oscillations that facilitate functions such as spike-timing-dependent plasticity. Simulated networks provide solid evidence that highlights the importance of interactions between interneurons for some forms of spike-timing dependent plasticity. This work will likely be of interest to investigators studying interactions among interneurons, rhythms in the amygdala, and mechanisms of plasticity thought to underlie associative learning.

    3. Reviewer #1 (Public review):

      Plasticity in the basolateral amygdala (BLA) is thought to underlie the formation of associative memories between neutral and aversive stimuli, i.e. fear memory. Concomitantly, fear learning modifies the expression of BLA theta rhythms, which may be supported by local interneurons. Several of these interneuron subtypes, PV+, SOM+, and VIP+, have been implicated in the acquisition of fear memory. However, it was unclear how they might act synergistically to produce BLA rhythms that structure the spiking of principal neurons so as to promote plasticity. Cattani et al. explored this question using small network models of biophysically detailed interneurons and principal neurons.

      Using this approach, the authors had four principal findings:

      (1) Intrinsic conductances in VIP+ interneurons generate a slow theta rhythm that periodically inhibits PV+ and SOM+ interneurons, while disinhibiting principal neurons.

      (2) A gamma rhythm arising from the interaction between PV+ and principal neurons establishes the precise timing needed for spike-timing-dependent plasticity.

      (3) Removal of any of the interneuron subtypes abolishes conditioning-related plasticity.

      (4) Learning-related changes in principal cell connectivity enhance expression of slow theta in the local field potential.

      The strength of this work is that it explores the role of multiple interneuron subtypes in the formation of associative plasticity in the basolateral amygdala. The authors use biophysically detailed cell models that capture many of their core electrophysiological features, which helps translate their results into concrete hypotheses that can be tested in vivo. Moreover, they try to align the connectivity and afferent drive of their model with those found experimentally.

      A drawback to this study is the construction of the afferent drive to the network, which does not elicit activities that are consistent with the majority of those observed to similar stimuli. The authors discuss this issue in depth, and provide potential mechanisms that may overcome it.

      Setting aside the issues with the conditioning protocol, the study offers a model for the generation of multiple rhythms in the BLA that is ripe for experimental testing. The most promising avenue would be in vivo experiments testing the role of local VIP+ neurons in the generation of slow theta. That would go a long way to resolving whether BLA theta is locally generated or inherited from medial prefrontal cortex or ventral hippocampus afferents.

      The broader importance of this work is that it illustrates that we must examine the function of neurons not just in terms of their behavioral correlates, but by their effects on the microcircuit they are embedded within. No one cell type is instrumental in producing fear learning in the BLA. Each contributes to the orchestration of network activity to produce plasticity. Moreover, this study reinforces a growing literature highlighting the crucial role of theta and gamma rhythms in BLA function.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The manuscript could be improved by addressing the following issues.

      (1) Fig. 3: The analgesic effects after astrocyte ablation appear to recover after one week. Is this due to repopulation of astrocytes?

      Although we did not detect astrocyte proliferation, we hypothesize that the recovery is related to microglial phagocytosis of astrocyte debris after ablation. Microglia are known to phagocytose cell debris. Diphtheria toxin-mediated ablation caused death and fragmentation of the astrocytes labeled by AAV2/5-GfaABC1D-Cre. We hypothesize that microglia phagocytosed the astrocyte fragments and were thereby stimulated to activate type I interferon signaling. Once phagocytosis of the debris ended, type I interferon signaling also declined, and this decline may be accompanied by the recurrence of pain.

      (2) Fig. 3: Please justify the large sample size of n=30-36. Is this sample size based on previous studies or statistical estimation?

      The number of mice was based on our previous report [1], and the larger number of mice also helps ensure that the pain data are reliable. Not only did we explore differences between the sexes, but we also needed to obtain samples at different time points for different experiments.

      (3) Please try to plot individual data points for some critical time points to demonstrate data distribution. It is also helpful to plot male and female data points separately for some time points.

      Individual data points have been plotted as requested and added to the supplementary material.

      (4) It is unclear if the same number of males and females were used in this study, as females were typically used for SCI studies. I wonder if you can use repeated measures Two-Way ANOVA for statistical analysis.

      According to our observations, the numbers of males and females were not the same, but both were sufficient for statistical analysis. In addition, breeding transgenic mice yields both male and female mice, and using both makes rational use of the animals. Indeed, previous studies have shown that female mice are more commonly used in pain studies. Although we did not observe a sex difference in this study, previous studies have reported that sex is one of the factors underlying differences in pain. Following your suggestion, we adopted two-way ANOVA for the statistical analysis and updated the statistical methods section accordingly; the results were consistent with the previous ones, so we did not modify the statistics shown in the figures.

      (5) Fig. 3C, D: The effects of astrocyte ablation on mechanical pain are mild, compared to thermal pain. Electronic von Frey apparatus may be difficult for mice. It works very well for rats and large animals.

      Since all the animals in this study were mice, we cannot comment on how the electronic von Frey apparatus performs in rats and large animals. In our experience, however, it is very suitable for mouse experiments. Notably, our electronic von Frey apparatus can achieve an accuracy of 0.01 g, which allows us to collect very sensitive pain measurements that may be more accurate and intuitive than those obtained previously.

      (6) Fig. 2B: In the figure legend it states n = 3 biological repeats. There are many more dots in each column. Are these individual animals or spinal cord sections?

      As described in our Methods, n = 3 biological repeats means three biological repeats per group, i.e., three mice/group with three IF per mouse. We took three or more values in each ascending tract (depending on the partition size of the different ascending tracts of the lumbar enlargement). This is why Figure 2 shows more data points, which also makes the data more reliable.

      (7) Fig. 4C: It appears that GFAP is increased by toxin treatment. Please explain this result.

      This figure was calculated for astrocyte activation in the lesion area (T9-10), but not for the lumbar enlargement.

      Reviewer #2 (Recommendations For The Authors):

      Specific Comments:

      RNA-Sequencing Analysis: The strength of the RNA-sequencing data in elucidating the impact of astrocyte elimination is compelling. While the focus on IFN signaling is well-supported, the manuscript overlooks other differentially expressed genes. A deeper analysis or at least a discussion of these genes could enrich the study's conclusions, offering a more holistic view of the underlying mechanisms.

      We did not examine the other differentially expressed genes in depth; we focused on the most significantly differentially expressed genes because they have a more significant effect on pain.

      Q2: Figure Presentation: Consolidating Figures 1-3 could increase the clarity of the result presentation, reducing distractions from the main narrative. Certain aspects, such as the comparison of different tracts in Figure 2B and the body weight data in Figure 3C, seem tangential and might be better suited for supplementary materials.

      The comparison of astrocyte activation in different ascending tracts of lumbar enlargements explained the relationships between astrocyte activation and pain, and laid the foundation for the subsequent astrocyte elimination. The weight data is also important, reflecting not only the changes in the overall recovery process after spinal cord injury, but also the effect of astrocyte elimination on the overall effect of mice. Thus, the weight data together with the pain test results will be more intuitive for the reader to understand the change of overall conditions of mice after astrocyte elimination.

      Q3: Schematic Clarity: The schematic in Figure 1A is confusing, particularly in distinguishing between transgenic mice and viral constructs. The inconsistent naming of Cre recombinase (alternatively referred to as Cre, CRE, and sometimes DRE) further complicates understanding. Standardizing these elements would greatly enhance clarity for the readers.

      As described in the Methods, Gt(ROSA)26Sorem1(CAG-LSL-RSR-tdTomato-2A-DTR)Smoc mice contain both a Loxp-stop-Loxp sequence and a Rox-stop-Rox sequence. During breeding, crossing Gt(ROSA)26Sorem1(CAG-LSL-RSR-tdTomato-2A-DTR)Smoc mice with C57BL/6JSmoc-Tg(CAG-Dre)Smoc mice removes the Rox-stop-Rox sequence; the offspring can then be crossed with mice expressing Cre recombinase, or treated with AAV2/5-GfaABC1D-Cre, to remove the Loxp-stop-Loxp sequence and induce expression of tdTomato and DTR.

      Q4: Pathway Analysis: The discussion of the signal pathway analysis in Figure 8 leans heavily on speculation without direct evidence from the study. Distinguishing clearly between findings and literature-derived hypotheses is crucial. A more detailed discussion that properly cites sources for each pathway element would strengthen the manuscript.

      In response to your comment, we have added this figure to the supplementary figures.

      Q5: Statistical Analysis: The use of one-way ANOVA, despite presenting data in groups, is misaligned with the data's structure. Employing two-way ANOVA followed by post-hoc comparisons is appropriate for statistical analysis.

      According to your suggestions, we adopted the Two-Way ANOVA for statistical analysis and updated it in the part of statistical methods, but the statistical results are consistent with the previous ones. Therefore, we did not modify the statistical results of the pictures.

    2. eLife Assessment

      This important study demonstrated that ablation of astrocytes in the lumbar spinal cord not only reduced neuropathic pain but also caused microglia activation. The findings presented add considerable value to the current understanding of the role of astrocyte elimination in neuropathic pain, offering convincing evidence that supports existing hypotheses and insights into the interactions between astrocytes and microglial cells, likely through IFN-mediated mechanisms.

    3. Reviewer #1 (Public Review):

      Summary:

      In this study the authors demonstrated that ablation of astrocytes in lumbar spinal cord not only reduced neuropathic pain but also caused microglia activation. Furthermore, RNA sequencing and bioinformatics revealed an activation of STING/type I IFNs signal pathway in spinal cord microglia after astrocyte ablation.

      Strengths:

      The findings are novel and interesting and provide new insights into astrocyte-microglia interaction in neuropathic pain. This study may also offer a new therapeutic strategy for the treatment of debilitating neuropathic pain in patients with SCI.

      Weaknesses:

      The authors have provided a satisfactory response to the comments on sample size, statistics, and the sex of the animals. The statistical analysis was reworked.

    4. Reviewer #2 (Public Review):

      Summary:

      In the manuscript, Zhao et al. have carried out a thorough examination of the effects of targeted ablation of resident astrocytes on behavior, cellular responses, and gene expression after spinal cord injury. Employing transgenic mouse models alongside pharmacogenetic techniques, the authors have successfully achieved the selective removal of these resident astrocytes. This intervention led to a notable reduction in neuropathic pain and induced a shift in microglial cell reactivation states within the spinal cord, significantly altering transcriptome profiles predominantly associated with interferon (IFN) signaling pathways.

      Strengths:

      The findings presented add considerable value to the current understanding of the role of astrocyte elimination in neuropathic pain, offering convincing evidence that supports existing hypotheses and valuable insights into the interactions between astrocytes and microglial cells, likely through IFN-mediated mechanisms. This contribution is highly relevant and suggests that further exploration in this direction could yield meaningful results.

      Weaknesses:

      The authors have satisfactorily addressed the comments regarding further clarifications and statistical methods.

    1. eLife Assessment

      The study is valuable to the field, introducing a new model to test BM-periosteal stem cell function in vivo. The authors' findings suggested that periosteal stem cells are linked to hematopoietic regeneration. More comparisons with the conventional model and direct examination of periosteal stem cell factors in hematopoietic regeneration are missing. The observations are solid, however, the limitations in their experimental model made the overall impact incomplete; there is potential for improvements to be made in this area.

    2. Reviewer #1 (Public review):

      The manuscript under review investigates the role of periosteal stem cells (P-SSC) in bone marrow regeneration using a whole-bone subcutaneous transplantation model. While the model is somewhat artificial, the findings were interesting, suggesting the migration of periosteal stem cells into the bone marrow and their potential to become bone marrow stromal cells. This indicates a significant plasticity of P-SSC consistent with previous reports using fracture models (Cell Stem Cell 29:1547, Dev Cell 59:1192).

      Major Concerns

      (1) The authors assert that the periosteal layer was completely removed in their model, which is crucial for their conclusions. To substantiate this claim, it is recommended that the authors provide evidence of the successful removal of the entire periosteal stem cell (P-SSC) population. A colony-forming assay, with and without periosteal removal, could serve as a suitable method to demonstrate this.

      (2) The observation that P-SSCs do not express Kitl or Cxcl12, while their bone marrow stromal cell (BM-MSC) derivatives do, is a key finding. To strengthen this conclusion, the authors are encouraged to repeat the experiment using Cxcl12 or Scf reporter alleles. Immunofluorescence staining that confirms the migration of periosteal cells and their transformation into Cxcl12- or Scf-reporter-positive cells would significantly enhance the paper's key conclusion.

      (3) On page 8, line 20, the authors' statement regarding the detection of Periostin+ cells outside the periosteum layer could be misinterpreted due to the use of the periostin antibody. Given that periostin is an extracellular matrix protein, the staining may not accurately represent Periostin-expressing cells but rather the presence of periostin in the extracellular matrix. The authors should revise this section for greater precision.

    3. Reviewer #2 (Public review):

      Summary:

      The authors have established a femur graft model that allows the study of hematopoietic regeneration following transplantation. They have extensively characterized this model, demonstrating the loss of hematopoietic cells from the donor femur following transplantation, with recovery of hematopoiesis from recipient cells. They also show evidence that BM MSCs present in the graft following transplantation are graft-derived. They have utilized this model to show that following transplantation, periosteal cells respond by first expanding, then giving rise to more periosteal SSCs, and then migrating into the marrow to give rise to BM MSCs.

      Strengths:

      These studies are notable in several ways:

      (1) Establishment of a novel femur graft model for the study of hematopoiesis;

      (2) Use of lineage tracing and surgery models to demonstrate that periosteal cells can give rise to BM MSCs.

      Weaknesses:

      There are a few weaknesses. First, the authors do not definitively demonstrate the requirement of periosteal SSC movement into the BM cavity for hematopoietic recovery. Hematopoiesis recovers significantly before 5 months, even before significant P-SSC movement has been shown, and hematopoiesis recovers significantly even when periosteum has been stripped. Second, it is not clear how the periosteum is changing in the grafts. Which cells are expanding is unclear, and it is not clear if these cells have already adopted a more MSC-like phenotype prior to entering the marrow space. Indeed, given the presence of host-derived endothelial cells in the BM, these studies are reminiscent of prior studies from this group and others that re-endothelialization of the marrow may be much more important for determining hematopoietic regeneration, rather than the P-SSC migration. Third, the studies exploring the preferential depletion of BM MSCs vs P-SSCs are difficult to interpret. The single metabolic stress condition chosen was not well-justified, and the use of purified cell populations to study response to stress ex vivo may have introduced artifacts into the system.

    4. Reviewer #3 (Public review):

      Summary:

      Marchand, Akinnola, et al. describe the use of a novel model to study BM regeneration. Here, they harvest intact femurs and subcutaneously graft them into recipient mice. Similar to standard BM regeneration models, there is a rapid decrease in cellularity followed by a gradual recovery over 5 months within the grafts. At 5 months, these grafts have robust HSC activity, similar to HSCs isolated from the host femur. They find that periosteum skeletal stem cells (p-SSCs) are the primary source of BM-MSCs within the grafted femur and that these cells are more resistant to the acute stress of grafting the femur.

      Strengths:

      This is an interesting manuscript that describes a novel model to study BM regeneration. The model has tremendous promise.

      Weaknesses:

      The authors claim that grafting intact femurs subcutaneously is a model of BM regeneration and can be used as a replacement for gold standard BM regeneration assays such as sublethal chemo/irradiation. However, there isn't enough explanation as to how this model is equivalent or superior to the traditional models. For instance, the authors claim that this model allows for the study of "BM regeneration in vivo in response to acute injury using genetic tools." This can and has been done numerous times with established, physiologically relevant BM regeneration models. The onus is on the authors to discuss or perform the necessary experiments to justify the use of this model. For example, standard BM regeneration models involve systemic damage that is akin to therapies that require BM regeneration. How is studying the current model that provides only an acute injury more relevant and useful than other models? As it stands, it seems as if the authors could have done all the experiments demonstrating the importance of these p-SSCs in the traditional myelosuppressive BM regeneration models to be more physiologically relevant. Along these lines, the use of a standard BM regeneration model (e.g., sublethal chemo/irradiation) as a critical control is missing and should be included. Even if the control doesn't demonstrate that p-SSCs can contribute to the BM-MSC during regeneration, it will still be important because it could be the justification for using the described model to specifically study p-SSCs' regulation of BM regeneration.

      The authors perform some analysis that suggests that grafting a whole femur mimics BM regeneration, but there are many experiments missing from the manuscript that will be necessary to support the use of this model. To demonstrate that this new model mimics current BM regeneration models, the authors need to perform a careful examination of the early kinetics of hematopoietic recovery post-transplant. Complete blood counts should be performed on the grafts, focusing on white blood cells (particularly neutrophils), red blood cells, platelets, all critical indicators of BM regeneration. This analysis should be done at early time points that include weekly analysis for a minimum of 28 days following the graft. Additionally, understanding how and when the vasculature recovers is critical. This is particularly important because it is well-established that if there is a delay in vascular recovery, there is a delay in hematopoietic recovery. As mentioned above, a standard BM regeneration model should be used as a control.

      The contribution of donor and host cells to the BM regeneration of the graft is interesting. Particularly, the chimerism of the vasculature. One can assume that for the graft to undergo BM regeneration, there needs to be the delivery of nutrients into the graft via the vasculature. The chimerism of the vascular network suggests that host endothelial cells anastomose with the graft. Host mice should have their vascular system labeled with a dye such as dextran to determine if anastomosis has occurred. If not, the authors need to explain how this graft survives up to 5 months. If anastomosis does occur, then it is very surprising that the hematopoietic system of the graft is not a chimera because this would essentially be a parabiosis model. This needs to be explained.

      Most of the data presented for the resistance of p-SSCs to stress suggests DNA damage response. Do p-SSCs demonstrate a higher ability to resolve DNA damage? Do they accumulate less DNA damage? Staining for DNA damage foci or performing comet assays could be done to further define the mechanism of stress resistance properties of p-SSCs.

      Given the importance of BM-MSCs in hematopoiesis and that the majority of the emerging BM-MSCs appear to be derived from p-SSCs, the authors should perform experiments to determine if p-SSC-derived BM-MSCs are critical regulators of BM regeneration. For example, the authors could test this by crossing the Postn-creER mice with iDTR mice to ablate these cells and see if recovery is inhibited or delayed. This should be done with the described periosteum-wrapped femur graft model as well as a control BM regeneration model. Demonstrating that the deletion of these cells affects BM regeneration in both models would further justify the physiological relevance and utility of the femur graft model.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      We appreciate the valuable and constructive comments of Reviewer #1 on our manuscript. We have addressed the public review comments from Reviewer #1 in our response to the recommendations for the authors, as the public review comments largely overlap with those recommendations.

      Reviewer #1 (Recommendations For The Authors):

      (1.1) Figure 1 did not use a mock-infected control for the development of R-loops but only a time before infection. I think it would have been a good control to have that after the same time of infection non-infected cells did not show increases in R-loops and this is not a product of the cell cycle.

      We prepared our DRIPc-seq library using cell extracts harvested at 0, 3, 6, and 12 h post-infection (hpi), all at the same post-seeding time point. Each sample was infected with HIV-1 virus in a time-dependent manner. Therefore, it is unlikely that the host cellular R-loop induction observed in our DRIPc-seq results was due to R-loop formation during the cell cycle. In Lines 93–95 of the Results section of the revised manuscript, we have provided a more detailed description of our DRIPc-seq library experimental scheme. Thank you. 

      (1.2) Figure 2 should have included a figure showing the proportion of DRIPc-seq peaks located in different genome features relative to one another instead of whether they were influenced by time post-infection. Figure 2C was performed in HeLa cells, but primary T cell data would have been more relevant as primary CD4+ T cells are more relevant to HIV infection.

      We have included a new figure presenting the relative proportion of DRIPc-seq peaks mapped to different genomic features at each hpi (Fig. 2C of the revised manuscript). We found that the proportion of DRIPc-seq peaks mapped to various genomic compartments remained consistent over the hours following the HIV-1 infection. This further supports our original claim that HIV-1 infection does not induce R-loop enrichment at specific genomic features but that the accumulation of R-loops after HIV-1 infection is widely distributed.

      We considered HeLa cells to be the primary in vitro infection model; therefore, we conducted RNA-seq only on HeLa cells. However, we agree with the reviewer's opinion that data from primary CD4+ T cells may be more physiologically relevant. Nevertheless, as demonstrated in the new figure (Fig. 2C of the revised manuscript), HIV-1 infection did not significantly alter the proportion of R-loop peaks mapped to specific genomic compartments, such as gene body regions, in HeLa, primary CD4+ T, and Jurkat cells. Therefore, we anticipate no clear correlation between changes in gene expression levels and R-loop peak detection upon HIV-1 infection, even in primary T cells. Thank you.

      (1.3) Figure 5G is very hard to see when printed, is there a change in brightness or contrast that could be used? The arrows are helpful but they don't seem to be pointing to much.

      We have highlighted the intensity of the PLA foci and magnified the images in Fig. 5G in the revised manuscript. While editing the images according to your suggestion, we found a misannotation regarding the multiplicity of infection in the number of PLA foci per nucleus quantification analysis graph in Fig. 5G of the original manuscript. We have corrected this issue and hope that it is now much clearer. 

      (1.4) The introduction provided a good background for those who may not have a comprehensive understanding of DNA-RNA hybrids and R-loops, but the rationale that integration in non-expressed sequence implies that R-loops may be involved is very weak and was not addressed experimentally. A better rationale would have been to point out that, although integration in genes is strongly associated with gene expression, the association is not perfect, particularly in that some highly expressed genes are, nonetheless, poor integration targets.

      In accordance with the reviewer's comment, we revised the Introduction. We have deleted the statement and reference in the introduction "... the most favored region of HIV-1 integration is an intergenic locus, ...”, which may overstate the relevance of the R-loop in HIV-1 integration events in non-expressed sequences. Instead, we introduced a more recent finding that high levels of gene expression do not always predict high levels of integration, together with the corresponding citation (Lines 46–47 of the revised manuscript), according to the reviewer’s suggestion in the reviewer's public review 2)-(a).

      (1.5) The discussion was seriously lacking in connecting their conclusions regarding R-loop targeting of integration to how integration works at the structural level, where it is very clear that concerted integration on the two DNA strands ca 5 bp apart is essential to correct, 2-ended integration. It is very difficult to visualize how this would be possible with the triple-stranded R-loop as a target. The manuscript would be greatly strengthened by an experiment showing concerted integration into a triple-stranded structure in vitro using PICs or pure integrase.

      We believe there has been a misunderstanding of our interpretation regarding the putative role of R-loop structures in the HIV-1 integration site mechanism because of some misleading statements in our original manuscript. Based primarily on our current data, we believe that R-loop structures are bound by HIV-1 integrase proteins and lead to HIV-1 viral genome integration into the vicinity regions of the host genomic R-loops. On carefully revising our manuscript, we found that the title, abstract, and discussion of our original manuscript include phrases, such as “HIV-1 targets R-loops for integration,” which may overstate our finding on the role of R-loops in HIV-1 integration site selection. We replaced these phrases. For example, we used phrases, such as, “HIV-1 favors vicinity regions of R-loop for the viral genome integration,” in the revised manuscript. We apologize for the inconvenience caused by the unclear and nonspecific details of our findings.

      Using multiple biochemical experiments, we successfully demonstrated the interaction between the cellular R-loop and HIV-1 integrase proteins in cells and in vitro (Fig. 5 of the revised manuscript). However, we could not validate whether the center of the triple-stranded R-loops is the extraction site of HIV-1 integration, where the strand transfer reaction by integrase occurs. This is because an R-loop can be multi-kilobase in size (1, 2); therefore, we displayed a large-scale genomic region (30-kb windows) to present the integration sites surrounding the R-loop centers. Nevertheless, we believe that we validated R-loop-mediated HIV-1 integration in R-loop-forming regions using our pgR-poor and pgR-rich cell line models. When infected with HIV-1, pgR-rich cells, but not pgR-poor cells, showed higher infectivity upon R-loop induction in designated regions following DOX treatment (Fig. 3C and 3D of the revised manuscript). In addition, we quantified site-specific integration events in R-loop regions, and found that a greater number of integration events occurred in designated regions of the pgR-rich cellular genome upon R-loop induction by DOX treatment, but not in pgR-poor cells (Fig. 3E–G of the revised manuscript). 

      We agree with the reviewer that an experiment showing the concerted integration of purified PICs into a triple-stranded structure in vitro would greatly strengthen our manuscript. We attempted the purification of viral DNA (vDNA)-bound PICs using either Sso7d-tagged HIV-1 integrase proteins or non-tagged HIV-1 integrase proteins (F185K/C280S) procured from the NIH HIV reagent program (HRP-20203), following the method described by Passos et al., Science, 2017; 355 (89-92) (3). Despite multiple attempts, we could not purify the nucleic acid-bound protein complexes for in vitro integration assays. However, we believe that pgR-poor and pgR-rich cell line models provide a strong advantage in specificity of our primer readouts. Compounded with our in cellulo observation, we believe that our work provides strong evidence for a causative relationship between R-loop formation/R-loop sites and HIV-1 integration.

      Additionally, in the Discussion section of the revised manuscript, we have expanded our discussion on the role of genomic R-loops contributing in molding the host genomic environment for HIV-1 integration site selection, and the potential explanation on how R-loops are driving integration over long-range genomic regions. Thank you. 

      (1.6) There are serious concerns with the quantitation of integration sites used here, which should be described in detail following line 503 but isn't. In Figure 3, E-G, they are apparently shown as reads per million, while in Figure 4B as "sites (%)" and in 4C as "log10 integration frequency." Assuming the authors mean what they say, they are using the worst possible method for quantitation. Counting reads from restriction enzyme-digested, PCR-digested DNA can only mislead. At the numbers provided (MOI 0.6, 10 µg DNA assayed) there would be about 1 million proviruses in the samples assayed, so the probability of any specific site being used more than once is very low, and even less when one considers that a 10% assay efficiency is typical of integration site assays. Although the authors may obtain millions of reads per experiment, the number of reads per site is an irrelevant value, determined only by technical artefacts in the PCR reactions, most significantly the length of the amplicons, a function of the distance from the integration site to the nearest MstII site, further modified by differences in Tm. Better is to collapse identical reads to 1 per site, as may have been done in Figure 4B, however, the efficiency of integration site detection will still be inversely related to the length of the amplicon. Indeed, if the authors were to plot the read frequency against distance to the nearest MstII site, it is likely that they would get plots much like those in Figure 4B.

      Detailed methods for integration site sequencing data processing are described in the Materials and Methods section of the revised manuscript (Lines 621–631 of the revised manuscript). We primarily followed the HIV-1 integration site sequencing data processing methods previously described by Li et al., mBio, 2020; 11(5) (4).

      While it may be correct that the HIV-1 integration event cannot occur more than once at a given site, our Fig. 3E, 4C, and 4D of the revised manuscript present the number of integration-site sequencing read counts expressed in reads-per-million (RPM) units or as log10-normalized values. Based on the number of mapped reads from the integration site sequencing results, we can infer that there was an integration event at this site, whether it was a single or multiple event.

      We believe that the original y-axis label, “Integration frequency,” may be misleading, as it can be interpreted as the probability of any specific site being used for HIV-1 integration. Therefore, we changed it to “number of mapped read” for clarity (Fig. 3E–G, 4C and 4D, and the corresponding figure legends of the revised manuscript). We apologize for any confusion. Thank you.
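      The difference between the two quantities discussed in point (1.6), reads per million (RPM) and collapsed unique sites, can be illustrated with a minimal sketch. This is not the authors' pipeline (which follows Li et al. as cited above); the function names and the `(chromosome, position, strand)` site representation are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical representation: each mapped read is reduced to the
# integration site it supports, as (chromosome, position, strand).

def reads_per_million(site_reads):
    """Scale per-site read counts to reads per million mapped reads (RPM)."""
    counts = Counter(site_reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n * 1_000_000 / total for site, n in counts.items()}

def unique_sites(site_reads):
    """Collapse identical reads so each integration site is counted once."""
    return set(site_reads)

# Toy data (invented): three reads at one site, one read at another.
reads = [("chr1", 100, "+")] * 3 + [("chr2", 500, "-")]
rpm = reads_per_million(reads)
sites = unique_sites(reads)
```

      In this toy input the first site carries three times the RPM of the second, yet both contribute exactly one entry after collapsing, which is the distinction the reviewer draws between read depth and the number of integration events.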

      Other points:

      (1.7) Overall: There are numerous grammatical and usage errors, especially in agreement of subject and verb, and missing articles, sometimes multiple times in the same sentence. These must be corrected prior to resubmission.

      The revised manuscript was edited by a professional editing service. Thank you.

      (1.8) Line 126-134: A striking result, but it needs more controls, as discussed above, including a dose-response analysis.

      We determined the doses of the NVP and RAL inhibitors in HeLa cells by identifying the minimum dose of each drug that provided a sufficient inhibitory effect on HIV-1 infection (Author response image 1). The primary objective of this experiment was to determine R-loop formation while reverse transcription or integration in the HIV-1 life cycle was blocked; therefore, we do not think that a dose-dependent analysis of the inhibitors is required.

      Author response image 1.

      (A and B) Representative flow cytometry histograms of VSV-G-pseudotyped HIV-1-EGFP-infected HeLa cells at an MOI of 1, harvested at 48 hpi. The cells were treated with DMSO, the indicated doses of nevirapine (NVP) (A) or indicated doses of raltegravir (RAL) (B) for 24 h before infection. 

      (1.9) Line 183: Please tell us what ECFP is and why it was chosen. Is there a reference for its failure to form R-loops?

      Ibid: The human AIRN gene is a very poor target for HIV integration in PBMC.

      A high GC skew value (> 0) is a predisposing factor for R-loop formation at the transcription site. This is because a high GC skew causes a newly synthesized RNA strand to hybridize to the template DNA strand, and the non-template DNA strand remains looped out in a single-stranded conformation (5) (Ref 36 in the revised manuscript). The ECFP sequence possessed a low GC skew value, as previously used for an R-loop-forming negative sequence (6) (Ref 17 of the revised manuscript). We have added this description and the corresponding references to Lines 188–192 of the revised manuscript.  

      The human AIRN gene (RefSeq DNA sequence: NC_000006.12) sequence possesses a GC skew value of -0.04, in a window centered at base 2186, while the mouse AIRN (mAIRN) sequence is characterized by a GC skew value of 0.213. The ECFP sequence gave a GC skew value of -0.086 in our calculation. We anticipated that the human AIRN gene region does not form a stable R-loop, and in fact, it did not harbor R-loop enrichment upon HIV-1 infection in our DRIPc-seq data analysis of multiple cell types (Author response image 2).

      Author response image 2.

      Genome browser screenshot over the chromosomal regions in 20-kb windows centered on human AIRN showing results from DRIPc-seq in the indicated HIV-1-infected cells (blue, 0 hpi; yellow, 3 hpi; green, 6 hpi; red, 12 hpi).
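      For readers unfamiliar with the GC skew values quoted in the response to point (1.9), the standard definition is (G - C) / (G + C) over a sequence window. A minimal sketch follows. The window and step sizes, the function names, and the assumption that the input is the non-template strand are illustrative choices, not details taken from the authors' calculation.

```python
def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C) for a DNA sequence; 0.0 if it has no G or C."""
    s = seq.upper()
    g = s.count("G")
    c = s.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)

def windowed_gc_skew(seq: str, window: int, step: int):
    """Compute GC skew in sliding windows; returns (start, skew) pairs.

    Window and step sizes are free parameters here (assumptions), since the
    response does not state which values were used.
    """
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

# Toy sequence (invented): a G-rich half followed by a C-rich half.
profile = windowed_gc_skew("GGGGCCCC", window=4, step=4)
```

      A positive value indicates G enrichment on the analyzed strand, which is the condition the response associates with R-loop formation; a value near or below zero, as reported for ECFP and human AIRN, does not.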

      (1.10) Line 190: You haven't shown dependence. Associated is a better word.

      Thank you for the suggestion. We have changed “R-loop-dependent site-specific HIV-1 integration events...” to “R-loop-associated site-specific HIV-1 integration events...” (Line 198 of the revised manuscript) according to the reviewer’s suggestion in the revised manuscript. 

      (1.11) Line 239: What happened to P1? What is the relationship of the P and N regions to genes?

      We have added superimpositions of the P1 chromatin region on DRIPc-seq and the HIV-1 integration frequency to Figure 4C of the revised manuscript. We observed a relevant integration event within the P1 R-loop region, but to a lesser extent than in the P2 and P3 R-loop regions, perhaps because the P1 region has relatively less R-loop enrichment than the P2 and P3 regions, as examined by DRIP-qPCR in S3A Fig. of the revised manuscript.

      Genome browser screenshots with annotations of accommodating genes in the P and N regions are shown in S2A–E Fig. of the revised manuscript, and RNA-seq analysis of the relative gene expression levels of the P1-3 and N1,2 R-loop regions are shown in S4 Table of the revised manuscript. Thank you.

      (1.12) Line 261: But the binding affinity of integrase to the R-loop is somewhat weaker than to double-stranded DNA according to Figure 5A.

      Nucleic acid substrates were loaded at the same molarity, and the percentage of the unbound fraction was calculated by dividing the intensity of the unbound fraction in each lane by the intensity of the unbound fraction in the lane with 0 nM integrase in the binding reaction. The calculated percentages of the unbound fraction from three independent replicate experiments are shown in Fig. 5A, right of the revised manuscript. In our analysis and measurements, the integrase proteins showed higher binding affinities to the R-loop and R-loop comprising nucleic acid structures than to dsDNA in vitro. We hope that this explanation clarifies this point. 
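      The normalization described above (unbound band intensity in each lane divided by the unbound band intensity in the 0 nM integrase lane) can be written out as a short sketch. The function name and the intensity values are hypothetical and serve only to illustrate the arithmetic; they are not data from Fig. 5A.

```python
def percent_unbound(unbound_intensities, zero_protein_intensity):
    """Express each lane's unbound-band intensity as a percentage of the
    unbound band in the 0 nM integrase lane."""
    if zero_protein_intensity <= 0:
        raise ValueError("reference lane intensity must be positive")
    return [100.0 * x / zero_protein_intensity for x in unbound_intensities]

# Invented intensities for lanes with increasing integrase concentration;
# the first value is the 0 nM reference lane itself.
lanes = [200.0, 100.0, 50.0]
result = percent_unbound(lanes, zero_protein_intensity=lanes[0])
```

      Under this normalization, a substrate whose unbound fraction falls faster as the integrase concentration increases is the one bound with higher apparent affinity, which is how the comparison between the R-loop and dsDNA substrates is read.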

      (1.13) Line 337: "accumulate". This is a not uncommon misinterpretation of the results of studies on the distribution of intact proviruses in elite controllers. The only possible correct interpretation of the finding is that proviruses form everywhere else but cells containing them are eliminated, most likely by the immune system.

      Thank you for the suggestion. We have changed the Line 337 of the original manuscript to “... HIV-1 proviruses in heterochromatic regions are not eliminated but selected by immune system,” in Lines 361-363 of the revised manuscript. 

      (1.14) Line 371 How many virus particles per cell does this inoculum amount to?

      We determined the amount of GFP reporter viruses required to transduce ∼50% of WT Jurkat T cells, corresponding to an approximate MOI of 0.6. We repeatedly obtained 30–50% of VSV-G-pseudotyped HIV-1-EGFP positively infected cells for HIV-1 integration site sequencing library construction for Jurkat T cells.
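      The relationship between the fraction of reporter-positive cells and the MOI can be sketched under the common assumption that infection events per cell follow a Poisson distribution, so that the infected fraction is 1 - exp(-MOI). This assumption and the function names are ours, not stated in the response, and the result estimates infectious (transducing) units per cell rather than physical virus particles.

```python
import math

def moi_from_fraction_infected(fraction: float) -> float:
    """Estimate MOI from the fraction of infected cells, assuming Poisson
    infection statistics: fraction = 1 - exp(-MOI)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -math.log(1.0 - fraction)

def fraction_infected_from_moi(moi: float) -> float:
    """Expected fraction of cells with at least one infection event."""
    return 1.0 - math.exp(-moi)

moi_at_half = moi_from_fraction_infected(0.5)    # equals ln 2
fraction_at_06 = fraction_infected_from_moi(0.6)
```

      Under this model, 30–50% positive cells corresponds to an MOI between roughly 0.36 and 0.69, which is consistent with the approximate value of 0.6 given above.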

      (1.15) Line 503 and Figures 3 and 4: There must be a clear description of how integration events are quantitated.

      Detailed methods for integration site sequencing data processing are described in the Materials and Methods section of the revised manuscript (Lines 621–631 of the revised manuscript). We primarily followed the HIV-1 integration site sequencing data processing methods previously described in Li et al., mBio, 2020; 11(5) (4).

      Reviewer #2 (Public Review):

      Retroviral integration in general, and HIV integration in particular, takes place in dsDNA, not in R-loops. Although HIV integration can occur in vitro on naked dsDNA, there is good evidence that, in an infected cell, integration occurs on DNA that is associated with nucleosomes. This review will be presented in two parts. First, a summary will be provided giving some of the reasons to be confident that integration occurs on dsDNA on nucleosomes. The second part will point out some of the obvious problems with the experimental data that are presented in the manuscript.

      We appreciate your comments. We have carefully addressed the concerns expressed as follows (your comments are in italics):  

      (2.1) 2017 Dos Passos Science paper describes the structure of the HIV intasome. The structure makes it clear that the target for integration is dsDNA, not an R-loop, and there are very good reasons to think that structure is physiologically relevant. For example, there is data from the Cherepanov, Engelman, and Lyumkis labs to show that the HIV intasome is quite similar in its overall structure and organization to the structures of the intasomes of other retroviruses. Importantly, these structures explain the way integration creates a small duplication of the host sequences at the integration site. How do the authors propose that an R-loop can replace the dsDNA that was seen in these intasome structures?

      We do appreciate the current understanding of the HIV-1 integration site selection mechanism and the known structure of the dsDNA-bound intasome. Our study proposes an R-loop as another contributor to HIV-1 integration site selection. Recent studies providing new perspectives on HIV-1 integration site targeting motivated our current work. For instance, Ajoge et al., 2022 (7) indicated that a guanine-quadruplex (G4) structure formed in the non-template DNA strand of the R-loop influences HIV-1 integration site targeting. Additionally, I. K. Jozwik et al., 2022 (8) showed retroviral integrase protein structure bound to B-to-A transition in target DNA. R-loop structures are a prevalent class of alternative non-B DNA structures (9). We acknowledge the current understanding of HIV-1 integration site selection and explore how R-loop interactions may contribute to this knowledge in the Discussion section of our manuscript. 

      Primarily based on our current data, we believe that R-loop structures are bound by HIV-1 integrase proteins and lead to HIV-1 viral genome integration into the vicinity regions of the host genomic R-loops, but we do not claim that R-loops completely replace dsDNA as the target for HIV-1 integration. An R-loop can be multi-kilobase in size and the R-loop peak length widely varies depending on the immunoprecipitation and library construction methods (1, 2); therefore, we could not validate whether the center of triple-stranded R-loops is the extraction site of HIV-1 integration where the strand transfer reaction by integrase occurs. We accordingly replaced phrases such as, “HIV-1 targets R-loops for integration,” which may overstate our finding on the role of R-loops in HIV-1 integration site selection, with phrases, such as, “HIV-1 favors vicinity regions of R-loop for the viral genome integration,” in the revised manuscript. We apologize for the inconvenience caused by the unclear and non-specific details of our findings. Nevertheless, we believe that we validated R-loop-mediated HIV-1 integration in R-loop-forming regions using our pgR-poor and pgR-rich cell line models. We quantified site-specific integration events in the R-loop regions, and found that a greater number of integration events occurred in designated regions of the pgR-rich cellular genome upon R-loop induction by DOX treatment, but not in pgR-poor cells (Fig. 3E–G of the revised manuscript).

      dsDNA may have been the sole target of the intasome demonstrated in vitro, possibly because only dsDNA has been considered as a substrate for in vitro intasome assembly. We hope that our work will initiate and advance future investigations on target-bound intasome structures by considering R-loops as potential new targets for integrase proteins and intasomes.

      (2.2) As noted above, concerted (two-ended) integration can occur in vitro on a naked dsDNA substrate. However, there is compelling evidence that, in cells, integration preferentially occurs on nucleosomes. Nucleosomes are not found in R loops. In an infected cell, the viral RNA genome of HIV is converted into DNA within the capsid/core which transits the nuclear pore before reverse transcription has been completed. Integration requires the uncoating of the capsid/core, which is linked to the completion of viral DNA synthesis in the nucleus. Two host factors are known to strongly influence integration site selection, CPSF6 and LEDGF. CPSF6 is involved in helping the capsid/core transit the nuclear pore and associate with nuclear speckles. LEDGF is involved in helping the preintegration complex (PIC) find an integration site after it has been released from the capsid/core, most commonly in the bodies of highly expressed genes. In the absence of an interaction of CPSF6 with the core, integration occurs primarily in the lamin-associated domains (LADs). Genes in LADs are usually not expressed or are expressed at low levels. Depending on the cell type, integration in the absence of CPSF6 can be less efficient than normal integration, but that could well be due to a lack of LEDGF (which is associated with expressed genes) in the LADs. In the absence of an interaction of IN with LEDGF (and in cells with low levels of HRP2) integration is less efficient and the obvious preference for integration in highly expressed genes is reduced. Importantly, LEDGF is known to bind histone marks, and will therefore be preferentially associated with nucleosomes, not R-loops. LEDGF fusions, in which the chromatin binding portion of the protein is replaced, can be used to redirect where HIV integrates, and that technique has been used to map the locations of proteins on chromatin. 
Importantly, LEDGF fusions in which the chromatin binding component of LEDGF is replaced with a module that recognizes specific histone marks direct integration to those marks, confirming integration occurs efficiently on nucleosomes in cells. It is worth noting that it is possible to redirect integration to portions of the host genome that are poorly expressed, which, when taken with the data on integration into LADs (integration in the absence of a CPSF6 interaction) shows that there are circumstances in which there is reasonably efficient integration of HIV DNA in portions of the genome in which there are few if any R-loops.

      Although R-loops may not wrap around nucleosomes, long and stable R-loops likely cover stretches of DNA corresponding to multiple nucleosomes (10). For example, R-loops are associated with high levels of histone marks, such as H3K36me3, which LEDGF recognizes (2, 11). R-loops dynamically regulate the chromatin architecture. Possibly by altering nucleosome occupancy, positioning, or turnover, R-loop structures relieve superhelical stress and are often associated with open chromatin marks and active enhancers (2, 10). These features are also distributed over HIV-1 integration sites (12). In the Discussion section of the revised manuscript, we explored the R-loop molding mechanisms in the host genomic environment for HIV-1 integration site selection and its potential collaborative role with LEDGF/p75 and CPSF6 governing HIV-1 integration site selection. 

      In carefully revising our original manuscript in light of the reviewer's comment, we recognized the need to tone down our statements. We found that the title, abstract, and discussion of our original manuscript include phrases, such as “HIV-1 targets R-loops for integration,” which may overstate our findings on the role of R-loops in HIV-1 integration site selection. We replaced these phrases. For example, we now use phrases such as “HIV-1 favors vicinity regions of R-loop for the viral genome integration” in the revised manuscript. We apologize for the inconvenience caused by the unclear and non-specific details of our findings.

      (2.3) Given that HIV DNA is known to preferentially integrate into expressed genes and that R-loops must necessarily involve expressed RNA, it is not surprising that there is a correlation between HIV integration and regions of the genome to which R loops have been mapped. However, it is important to remember that correlation does not necessarily imply causation.

      We understand the reviewer's concern regarding the possibility of a coincidental correlation between the R-loop regions and HIV-1 integration sites, particularly when the interpretation of this correlation is primarily based on a global analysis. 

      Therefore, we designed pgR-poor and pgR-rich cell lines, which we believe are suitable models for distinguishing between integration events driven by transcription and those driven by the presence of R-loops. Although the two cell lines showed comparable levels of transcription at the designated region upon DOX treatment via TRE promoter activation (Fig. 3B of the revised manuscript), only pgR-rich cells formed R-loops at the designated regions (Fig. 3C of the revised manuscript). When infected with HIV-1, pgR-rich cells, but not pgR-poor cells, showed higher infectivity after DOX treatment (Fig. 3D of the revised manuscript). Moreover, we quantified site-specific integration events in the R-loop regions, and found that a greater number of integration events occurred in designated regions of the pgR-rich cellular genome upon R-loop induction by DOX treatment, but not in pgR-poor cells (Fig. 3E of the revised manuscript). Therefore, we concluded that transcriptional activation without an R-loop (in pgR-poor cells) may not be sufficient to drive HIV-1 integration. We believe that our work provides strong evidence for a causative relationship between R-loop formation/R-loop sites and HIV-1 integration. We hope that our explanation addresses your concerns. Thank you.

      If we consider some of the problems in the experiments that are described in the manuscript:

      (2.4) In an infected individual, cells are almost always infected by a single virion and the infecting virion is not accompanied by large numbers of damaged or defective virions. This is a key consideration: the experiments behind the claim that infection by HIV affects R-loop formation in cells were done with a VSVg vector, under conditions in which there appear to have been about 6000 virions per cell. Although most of the virions prepared in vitro are defective in some way, that does not mean that a large fraction of the defective virions cannot fuse with cells. In normal in vivo infections, HIV has evolved in ways that avoid signaling its presence to the infected cell. To cite an example, carrying out reverse transcription in the capsid/core prevents the host cell from detecting (free) viral DNA in the cytoplasm. The fact that the large effect on R-loop formation which the authors report still occurs in infections done in the absence of reverse transcription strengthens the probability that the effects are due to the massive amounts of virions present, and perhaps to the presence of VSVg, which is quite toxic. To have physiological relevance, the infections would need to be carried out with virions that contain HIV even under circumstances in which there is at most one virion per cell.

      Our virus production and in vitro and ex vivo HIV-1 infection experimental conditions, designed for infecting cell types such as HeLa cells and primary CD4+ T cells with VSV-G-pseudotyped HIV, were based on a comprehensive review of numerous references. At the very beginning of this study, we tested HIV-1-specific host genomic R-loop induction using empty virion particles (virus-like particles, VLP) or other types of viruses (non-retrovirus, SeV; retroviruses, FMLV and FIV), all produced with a VSV G protein donor. We could not include a control omitting the VSV G protein or using natural HIV-1 envelope protein to prevent viral spread in culture. We observed that despite all types of virus stocks being prepared using VSV-G, only cells infected with HIV-1 viruses showed R-loop signal enrichment (Author response image 3). Therefore, we omitted the control for the VSV G protein in subsequent analyses, such as DRIPc-seq. We have also revised our manuscript to provide a clearer description of the experimental conditions. In particular, we now clearly state that we used VSV-G-pseudotyped HIV-1 in this study, throughout the abstract, results, and discussion sections of the revised manuscript. Thank you.

      Author response image 3.

      (A) Dot blot analysis of the R-loop in gDNA extracts from HIV-1 infected U2OS cells with MOI of 0.6 harvested at 6 hpi. The gDNA extracts were incubated with or without RNase H in vitro before membrane loading (anti-S9.6 signal). (B) Dot blot analysis of the R-loop in gDNA extracts from HeLa cells infected with 0.3 MOI of indicated viruses. The infected cells were harvested at 6 hpi. The gDNA extracts were incubated with or without RNase H in vitro before membrane loading (anti-S9.6 signal).

      HIV-1 co-infection may also be expected in cell-free HIV-1 infections. However, it was previously suggested that the average number of infection events per infected cell ranges from 1.02 to 1.65, based on a mathematical model that estimates the frequency of multiple infections with the same virus (Figure 4c of Ito et al., Sci. Rep, 2017; 6559) (13).
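      To give a rough sense of scale for the multiplicity argument, the following is a minimal sketch that assumes infection events per cell follow a simple Poisson distribution with mean equal to the MOI. This is a textbook simplification and not the model of Ito et al.; the function names are hypothetical and chosen only for illustration. The MOI values in the loop are the ones quoted in this response (0.3, 0.6, and 2).

```python
import math


def mean_events_per_infected_cell(moi: float) -> float:
    """Mean number of infection events among cells with at least one event.

    Assumes events per cell are Poisson-distributed with mean `moi`
    (an illustrative assumption, not the model used in the cited study).
    """
    if moi <= 0:
        raise ValueError("moi must be positive")
    return moi / (1.0 - math.exp(-moi))


def fraction_multiply_infected(moi: float) -> float:
    """Fraction of infected cells that carry two or more infection events."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    p0 = math.exp(-moi)   # probability of zero events
    p1 = moi * p0         # probability of exactly one event
    return (1.0 - p0 - p1) / (1.0 - p0)


if __name__ == "__main__":
    for moi in (0.3, 0.6, 2.0):
        mean_events = mean_events_per_infected_cell(moi)
        multi = fraction_multiply_infected(moi)
        print(f"MOI {moi}: mean events per infected cell = {mean_events:.2f}, "
              f"fraction with >= 2 events = {multi:.2f}")
```

      Under this assumption, the MOI measured by marker expression constrains only productive infection events; it does not count defective particles that may still fuse with cells, which is the quantity the reviewer is concerned about.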

      (2.5) Using the Sso7d version of HIV IN in the in vitro binding assays raises some questions, but that is not the real question/problem. The real problem is that the important question is not what/how HIV IN protein binds to, but where/how an intasome binds. An intasome is formed from a combination of IN bound to the ends of viral DNA. In the absence of viral DNA ends, IN does not have the same structure/organization as it has in an intasome. Moreover, HIV IN (even Sso7d, which was modified to improve its behavior) is notoriously sticky and hard to work with. If viral DNA had been included in the experiment, intasomes would need to be prepared and purified for a proper binding experiment. To make matters worse, there are multiple forms of multimeric HIV IN and it is not clear how many HIV INs are present in the PICs that actually carry out integration in an infected cell.

      As the reviewer has noted, HIV IN, even with Sso7d tagging, is difficult to work with. We attempted the purification of viral DNA (vDNA)-bound PICs using either Sso7d-tagged HIV-1 integrase proteins or non-tagged HIV-1 integrase proteins (F185K/C280S), procured from the NIH HIV reagent program (HRP-20203), following the method described by Passos et al., Science, 2017; 355 (89-92) (3). Despite multiple attempts, we were unable to purify the vDNA-bound IN protein complexes for in vitro assays. However, through multiple biochemical experiments, we believe that we have successfully demonstrated the interaction between cellular R-loops and HIV-1 integrase proteins both in cells and in vitro (Fig. 5A–F of the revised manuscript). We also observed a close association between integrase proteins and host cellular R-loops in HIV-1-infected cells, using a fluorescent recombinant virus (HIV-IN-EGFP) with intact IN-EGFP PICs (Fig. 5G of the revised manuscript).

      (2.6) As an extension of comment 2, the proper association of an HIV intasome/PIC with the host genome requires LEDGF and the appropriate nucleic acid targets need to be chromatinized.

      The interaction between cellular R-loops and HIV-1 integrase proteins in HeLa cells endogenously expressing LEDGF/p75 was examined using reciprocal immunoprecipitation assays in Fig. 5C–F, S6B, and S6D Fig. of the revised manuscript. In addition, as discussed in more detail in our response to comment [2-8], we observed a close association between host cellular R-loops and HIV-1 integrase proteins by PLA in HIV-1-infected HeLa cells.

      (2.7) Expressing any form of IN, by itself, in cells to look for what IN associates with is not a valid experiment. A major factor that helps to determine both where integration takes place and the sites chosen for integration is the transport of the viral DNA and IN into the nucleus in the capsid core. However, even if we ignore that important part of the problem, the IN that the authors expressed in HeLa cells won't be bound to the viral DNA ends (see comment 2), even if the fusion protein would be able to form an intasome. As such, the IN that is expressed free in cells will not form a proper intasome/PIC and cannot be expected to bind where/how an intasome/PIC would bind.

      As discussed in more detail in our response to comment [2-8], we believe that our PLA experiment using the pVpr-IN-EGFP virus, which has previously been examined for virion integrity, as well as the IN-EGFP PICs (14), demonstrated a close association between host cellular R-loops and HIV-1 integrase proteins in HIV-1-infected cells. 

      (2.8) As in comment 1, for the PLA experiments presented in Figure 5 to work, the number of virions used per cell (which differs from the MOI measured by the number of cells that express a viral marker) must have been high, which is likely to have affected the cells and the results of the experiment. However, there is the additional question of whether the IN-GFP fusion is functional. The fact that the functional intasome is a complex multimer suggests that this could be a problem. There is an additional problem, even if IN-GFP is fully functional. During a normal infection, the capsid core will have delivered copies of IN (and, in the experiments reported here, the IN-GFP fusion) into the nucleus that are not part of the intasome. These "free" copies of IN (here IN-GFP) are not likely to go to the same sites as an intasome, making this experiment problematic (comment 4).

      The HIV-IN-EGFP virus stock was produced by polyethylenimine-mediated transfection of HEK293T cells with 6 µg of pVpr-IN-EGFP, 6 µg of HIV-1 NL4-3 noninfectious molecular clone (pD64E; NIH AIDS Reagent Program 10180), and 1 µg of pVSV-G, as previously described (14) and as detailed in the Materials and Methods section of our manuscript. The pVpr-IN-EGFP vector used to produce the HIV-1-IN-EGFP virus stock was provided by the Anna Cereseto group (Albanese et al., PLOS ONE, 2008; 6(6); Ref 34 of the revised manuscript). It was previously reported that the HIV-1-IN-EGFP virions produced by IN-EGFP trans-incorporation through Vpr are intact and infective viral particles (Figure 1 of Albanese et al., PLOS ONE, 2008; 6(6)). Therefore, we believe that the HIV-IN-EGFP used in our PLA experiments was functional.

      Additionally, by labeling and visualizing newly synthesized cDNA, Albanese et al. showed that the EGFP signal of HIV-IN-EGFP virions colocalizes with the viral matrix (p17MA) and capsid (p24CA) proteins as well as with the cDNA produced by reverse transcriptase (14). In addition, the fluorescent recombinant virus (HIV-IN-EGFP) was structurally intact at the nuclear level (Figure 6 of Albanese et al., PLOS ONE, 2008; 6(6)). Therefore, given the integrity of the HIV-IN-EGFP virions and of the IN-EGFP PICs, we believe that our PLA results are unlikely to be misleading in the way the reviewer suggests.

      Furthermore, the in vitro HIV-1 infection setting of our PLA experiments was carefully determined based on multiple studies that performed image-based assays on HIV-1-infected cells. For instance, Albanese et al. infected 4 × 10⁴ cells with viral loads equivalent to 1.5 or 3 µg of HIV-1 p24 for their immunofluorescence analysis in their previous report (14). We titrated the fluorescent HIV-1 virus stocks by determining the multiplicity of infection (MOI) and by quantifying the HIV-1 p24 antigen content (Author response image 4). By our calculation, we infected 5 × 10⁴ HeLa cells with viral loads equivalent to 1.3 µg of HIV-1 p24, which is indicated as an MOI of 2 in Fig. 5G of our manuscript, for our PLA experiments.

      Image-based assays often require an enhanced signal for statistical robustness. For example, Achuthan et al. infected cells with VSV-G-pseudotyped HIV-1 at an approximate MOI of 350 for vDNA and PIC visualization (15). Therefore, we believe that the experimental conditions for our PLA experiments, which we carefully designed based on frequently cited previous studies, are reasonable. We hope that our discussion sufficiently addresses the reviewer’s concern.

      Author response image 4.

      Gating strategy used to determine HIV-1-infectivity in HeLa cells at 48 hpi. Cells were infected with a known p24 antigen content in the stock of the VSV-G-pseudotyped HIV-1-EGFP-virus. The percentages of GFP-positive cell population are indicated.

      (2.9) In the Introduction, the authors state that the site of integration affects the probability that the resulting provirus will be expressed. Although this idea is widely believed in the field, the actual data supporting it are, at best, weak. See, for example, the data from the Bushman lab showing that the distribution of integration sites is the same in cells in which the integrated proviruses are, and are not, expressed. However, given what the authors claim in the introduction, they should be more careful in interpreting enzyme expression levels (luciferase) as a measure of integration efficiency in experiments in which they claim proviruses are integrated in different places.

      We thank the reviewer for the constructive comment. We have changed the statement in Lines 41–42 in the Introduction section of our original manuscript to “The chromosomal landscape of HIV-1 integration influences proviral gene expression, persistence of integrated proviruses, and prognosis of antiretroviral therapy.” (Lines 39-41 of the revised manuscript). We believe that this change tones down the implied link between the site of integration and the provirus expression level.

      The piggyBac transposase randomly inserts the “cargo (transposon)” into TTAA chromosomal sites of the target genome, generating efficient insertions at different genomic loci (16, 17). We believe that this random insertion of the pgR-poor/rich vector by the piggyBac system prevents locus bias in vector insertion from confounding our interpretation of R-loop-mediated HIV-1 integration. Therefore, Figure 3 in our manuscript does not claim any relationship between the site of integration and the resulting provirus expression levels. Instead, as noted in Line 214 of the revised manuscript, using the luciferase reporter HIV-1 virus, we attempted to examine HIV-1 infection in cells with an "extra number of R-loops” in the host cellular genome. We observed that pgR-rich cells showed higher luciferase activity upon DOX treatment than pgR-poor cells (Fig. 3D of the revised manuscript). We believe that this is because a greater number of HIV-1 integration events may occur in pgR-rich cells, where DOX-inducible de novo R-loop regions are introduced. This has been further examined in Fig. 3E–G of the revised manuscript. We hope this explanation clarifies Figure 3. Thank you.

      (2.10) Using restriction enzymes to create an integration site library introduces biases that derive from the uneven distribution of the recognition sites for the restriction enzymes.

      As described in the Materials and Methods section, we adopted a sequencing library construction method using a previously established protocol (18, 19). Although we recognize the advantages of DNA fragmentation by sonication, in in vitro or ex vivo HIV-1 infection settings, where the multiplicity of infection is carefully determined based on multiple references, more copies of integrated viral sequences are expected than in samples from infected patients (18). Therefore, in these settings, restriction enzyme-based DNA fragmentation and ligation-mediated PCR sequencing are well-established methods that provide significant data sources for HIV-1 integration site sequencing (15, 20-22). Furthermore, our data showing the proportion of integration sites over R-loop regions (Fig. 4B of the revised manuscript) are presented alongside the respective random controls (i.e., proportion of integration sites within the 30-kb windows centered on randomized DRIPc-seq peaks, gray dotted lines; control comparisons between randomized integration sites with DRIPc-seq peaks, black dotted lines; and randomized integration sites with randomized DRIPc-seq peaks, gray solid lines), which do not show such a correlation between the HIV-1 integration sites and nearby areas of the R-loop regions. Therefore, we believe that our results from the integration site sequencing data analysis are unlikely to be biased.
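      The window-based comparison with randomized controls described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the manuscript: coordinates are treated as positions on a single linear chromosome, each peak is reduced to its center, and the function names (`fraction_in_windows`, `randomized_peak_control`) and the toy coordinates are hypothetical.

```python
import bisect
import random


def fraction_in_windows(sites, centers, half_window=15_000):
    """Fraction of integration sites within `half_window` bp of any peak center.

    A 30-kb window centered on a peak corresponds to half_window = 15,000.
    """
    if not sites:
        return 0.0
    centers = sorted(centers)
    hits = 0
    for s in sites:
        i = bisect.bisect_left(centers, s)
        # Check the nearest center at or to the right, then the one to the left.
        right_ok = i < len(centers) and centers[i] - s <= half_window
        left_ok = i > 0 and s - centers[i - 1] <= half_window
        if right_ok or left_ok:
            hits += 1
    return hits / len(sites)


def randomized_peak_control(sites, n_centers, genome_length,
                            n_iter=200, half_window=15_000, seed=0):
    """Mean fraction of sites in windows when peak centers are placed at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_iter):
        random_centers = [rng.randrange(genome_length) for _ in range(n_centers)]
        total += fraction_in_windows(sites, random_centers, half_window)
    return total / n_iter


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    peak_centers = [1_000_000, 5_000_000]
    integration_sites = [1_000_100, 5_010_000, 9_000_000]
    observed = fraction_in_windows(integration_sites, peak_centers)
    control = randomized_peak_control(integration_sites, len(peak_centers),
                                      genome_length=10_000_000)
    print(f"observed fraction: {observed:.3f}; randomized-peak control: {control:.3f}")
```

      Note that a control of this kind tests for enrichment near peaks relative to random placement; it does not by itself correct for the uneven genomic distribution of restriction sites that the reviewer raises, which would require randomized sites matched for distance to the nearest restriction site.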

      Reviewer #3 (Public Review):

      In this manuscript, Park and colleagues describe a series of experiments that investigate the role of R-loops in HIV-1 genome integration. The authors show that during HIV-1 infection, R-loop levels on the host genome accumulate. Using a synthetic R-loop prone gene construct, they show that HIV-1 integration sites target sites with high R-loop levels. They further show that integration sites on the endogenous host genome are correlated with sites prone to R-loops. Using biochemical approaches, as well as in vivo co-IP and proximity ligation experiments, the authors show that HIV-1 integrase physically interacts with R-loop structures.

      My primary concern with the paper is with the interpretations the authors make about their genome-wide analyses. I think that including some additional analyses of the genome-wide data, as well as some textual changes can help make these interpretations more congruent with what the data demonstrate. Here are a few specific comments and questions:

      We are grateful for the time and effort spent on our behalf and for the reviewer’s appreciation of the novelty of our work, in particular, R-loop induction by HIV-1 infection and the correlation between host R-loops and the genomic sites of HIV-1 integration. In the following sections, we provide our responses to your comments and suggestions. Your comments are in italics. We have carefully addressed the following issues.

      (3.1) I think Figure 1 makes a good case for the conclusion that R-loops are more easily detected in HIV-1 infected cells by multiple approaches (all using the S9.6 antibody). The authors show that their signals are RNase H sensitive, which is a critical control. For the DRIPc-Seq, I think including an analysis of biological replicates would greatly strengthen the manuscript. The authors state in the methods that the DRIPc pulldown experiments were done in biological replicates for each condition. Are the increases in DRIPc peaks similar across biological replicates? Are genomic locations of HIV-1-dependent peaks similar across biological replicates? Measuring and reporting the biological variation between replicate experiments is crucial for making conclusions about increases in R-loop peak frequency. This is partially alleviated by the locus-specific data in Figure S3A. However, a better understanding of how the genome-wide data varies across biological replicates will greatly enhance the quality of Figure 1.

      DRIPc-seq experiments were conducted with two biological replicates. To define consensus DRIPc-seq peaks using these two replicates, we used two methods applicable to ChIP-seq analysis: the irreproducible discovery rate (IDR) method and sequencing data pooling. We found that the sequencing data pooling method yielded significantly more DRIPc-seq peaks than consensus peak identification through IDR, and we decided to utilize R-loop peaks from pooled sequencing data for our downstream analyses, as described in the figure legends and Materials and Methods of the revised manuscript. 

      As noted by the reviewer, it is important to verify whether the increasing trend in the number of R-loop peaks and genomic locations of HIV-1 dependent R-loops were consistently observed across the two biological replicates. Therefore, we independently performed R-loop calling on each replicate of the sequencing data of primary CD4+ T cells from two individual donors to verify that the increase in R-loop numbers was consistent (Author response image 5). Additionally, the overlap of the R-loop peaks between the two replicates was statistically significant across the genome (Author response table 1). Thank you.

      Author response image 5.

      Bar graph indicating DRIPc-seq peak counts for HIV-1-infected primary CD4+ T cells harvested at the indicated hours post infection (hpi). Pre-immunoprecipitated samples were untreated (−) or treated (+) with RNase H, as indicated. Each dot corresponds to an individual data set from two biologically independent experiments.

      Author response table 1.

      DRIPc-seq peak length and Chi-square p-value in CD4+ T cells from individual donors 1 and 2

      (3.2) I think that the conclusion that R-loops "accumulate" in infected cells is acceptable, given the data presented. However, in line 134 the authors state that "HIV-1 infection induced host genomic R-loop formation". I suggest being very specific about the observation. Accumulation can happen by (a) inducing a higher frequency of the occurrence of individual R-loops and/or (b) stabilizing existing R-loops. I'm not convinced the authors present enough evidence to claim one over the other. It is altogether possible that HIV-1 infection stabilizes R-loops such that they are more persistent (perhaps by interactions with integrase?), and therefore more easily detected. I think rephrasing the conclusions to include this possibility would alleviate my concerns.

      We thank the reviewer for the thoughtful discussion of our manuscript. Following the reviewer's suggestion, we have changed Line 134 to “HIV-1 infection induces host genomic R-loop enrichment” (Lines 132-133 of the revised manuscript) and added a new concluding sentence offering a possible explanation for the R-loop signal enrichment upon HIV-1 infection (Lines 133–135 of the revised manuscript).

      (3.3) A technical problem with using the S9.6 antibody for the detection of R-loops via microscopy is that it cross-reacts with double-stranded RNA. This has been addressed by the work of Chedin and colleagues (as well as others). It is absolutely essential to treat these samples with an RNA:RNA hybrid-specific RNase, which the authors did not include, as far as their methods section states. Therefore, it is difficult to interpret all of the immunofluorescence experiments that depend on S9.6 binding.

      We understand the reviewer's concern regarding the cross-reactivity of the S9.6 antibody with more abundant dsRNA, particularly in imaging applications. We carefully designed the experimental and analytical methods for R-loop detection using microscopy. For example, we pre-extracted the cytoplasmic fraction before staining with the S9.6 antibody and quantified the R-loop signal by subtracting the nucleolar signal. Both of these steps were taken to eliminate the possibility of misdetecting R-loops via microscopy because of the prominent cytoplasmic and nucleolar S9.6 signals, which primarily originate from ribosomal RNA. In addition, we included R-loop negative control samples in our microscopy analysis that were subjected to intensive RNase H treatment (60 U/mL RNase H for 36 h) and observed a significant reduction in the S9.6 signal (Figure 1E of the revised manuscript). RNase H-treated samples served as essential and widely accepted negative controls for R-loop detection.

      We would like to point out that recent studies have reported strong intrinsic specificity of the S9.6 antibody for DNA:RNA hybrid duplexes over dsDNA and dsRNA, along with structural elucidation of how the S9.6 antibody recognizes hybrids (23, 24). Therefore, our interpretation of host cellular R-loop enrichment after HIV-1 infection using S9.6 antibodies in multiple biochemical approaches is well supported. Nevertheless, we agree with the reviewer's opinion that additional negative controls for the detection of R-loops via microscopy, such as RNase T1- and RNase III-treated samples, could improve the robustness and accuracy of R-loop imaging data (25).

      (3.4) Given that there is no clear correlation between expression levels and R-loop peak detection, combined with the data that show increased detection of R-loop frequency in non-genic regions, I think it will be important to show that the R-loop forming regions are indeed transcribed above background levels. This will help alleviate possible concerns that there are technical errors in R-loop peak detection.

      Figures S5D and S5E in the revised manuscript show the relative gene expression levels of the R-loop-forming positive regions (P1-3) and the referenced R-loop-positive loci (RPL13A and CALM3). The gene expression levels of these R-loop-forming regions were significantly higher than those of the ECFP or mAIRN genes without DOX treatment, which can be considered background levels of transcription in cells. Thank you.

      (3.5) In Figures 4C and D the hashed lines are not defined. It is also interesting that the integration sites do not line up with R-loop peaks. This does not necessarily directly refute the conclusions (especially given the scale of the genomic region displayed), but should be addressed in the manuscript. Additionally, it would greatly improve Figure 4 to have some idea about the biological variation across replicates of the data presented in 4A.

      We thank the reviewer for the thoughtful comment on our study. First, we added an annotation for the dashed lines in the figure legends of Figures 4C and 4D in the revised manuscript.

      We agree with the reviewer's interpretation of the relationship between the integration sites and R-loop peaks. Primarily based on our current data, we believe R-loop structures are bound by HIV-1 integrase proteins and lead HIV-1 viral genome integration into the “vicinity” regions of the host genomic R-loops. We displayed a large-scale genomic region (30-kb windows) to present integration sites surrounding R-loop centers because an R-loop can be multi-kilobase in size (1, 2). Depending on the immunoprecipitation and library construction methods, the R-loop peaks varied in size, and the peak length showed a wide distribution (Figure 3B of Malig et al., 2020, Figure 1B of Sanz et al., 2016, and Figure 2A of the revised manuscript). Therefore, presenting integration site events within a wide window of R-loop peaks could be more informative and better reflect the current understanding of R-loop biology.

      R-loop formation recruits diverse chromatin-binding protein factors, such as H3K4me1, p300, CTCF, RAD21, and ZNF143 (Figure 6A and 6B of Sanz et al., 2016) (26), which allow R-loops to exhibit enhancer and insulator chromatin states that can act as distal regulatory elements (26, 27). We have demonstrated physical interactions between host cellular R-loops and HIV-1 integrase proteins (Figure 5 of the revised manuscript). Therefore, we believe that this ‘distal regulatory element-like feature’ of the R-loop can be a potential explanation for how R-loops drive integration over long-range genomic regions.

      According to your suggestion, we added this explanation to the relevant literature in the Discussion section of the revised manuscript.

      Author response image 6 represents the biological variation across replicates of the data shown in Figure 4A. The integration site sequencing data for Jurkat cells were adopted from SRR12322252 (4), which consists of the integration site sequencing data of HIV-1-infected wild type Jurkat cells with one biological replicate. We hope that our explanations and discussion have successfully addressed your concerns. Thank you.

      Author response image 6.

      Bar graphs showing the quantified number of HIV-1 integration sites per Mb pair in total regions of 30-kb windows centered on DRIPc-seq peaks from HIV-1 infected HeLa cells and primary CD4+ T cells (magenta) or non-R-loop region in the cellular genome (gray). Each dot corresponds to an individual data set from two biologically independent experiments.

      (3.6) The authors do not adequately describe the Integrase mutant that they use in their biochemical experiments in Figure 5A. Could this impact the activity of the protein in such a way that interferes with the interpretation of the experiment? The mutant is not used in subsequent experiments for Figure 5 and so even though the data are consistent with each other (and the conclusion that Integrase interacts with R-loops) a more thorough explanation of why that mutant was used and how it impacts the biochemical activity of the protein will help the interpretation of the data presented in Figure 5.

      We appreciate the reviewer’s suggestions. In our EMSA analysis, we purified and used Sso7d-tagged HIV-1 integrase proteins with an active-site amino acid substitution, E152Q. First, we used the Sso7d-tagged HIV-1 integrase protein because previous studies have suggested that the fusion of small domains, such as Sso7d (DNA binding domain), can significantly improve the solubility of HIV integrase proteins without affecting their ability to assemble with substrate nucleic acids or their enzymatic activity (Figure 1B of Li et al., PLOS ONE, 2014; 9(8)) (28, 29). Second, we used an integrase protein with the active-site substitution E152Q in our mobility shift assay because the primary goal of this experiment was to examine the ability of the protein to bind or form a complex with different nucleic acid substrates. We reasoned that abolishing the enzymatic activity of the integrase protein, such as the 3'-processing that cleaves DNA substrates, would be more appropriate for our experimental objective. This Sso7d-tagged HIV-1 integrase with the E152Q mutation has also been used to elucidate the structural model of the integrase complex with a nucleic acid substrate by cryo-EM (3) and has been shown not to disturb substrate binding. Based on the reviewer’s comments, we have added a description of the E152Q mutant integrase protein in Lines 268–270 of the revised manuscript. Thank you.

      Reviewer #3 (Recommendations For The Authors):

      The paper suffers from many grammatical errors, which sometimes interfere with the interpretations of the experiments. In the view of this reviewer, the manuscript must be carefully revised prior to publication. For example, lines 247-248 "Intasomes consist of HIV-1 viral cDNA and HIV-1 coding protein, integrases." It is unclear from this sentence whether there are multiple integrases or multiple proteins that interact with the viral genome to facilitate integration. This makes the subsequent experiments in Figure 5 difficult to interpret. There are many other examples, too numerous to point out individually.

      We thoughtfully revised the original manuscript, making the best efforts to provide clearer details of our findings. We believe that we have made substantial changes to the manuscript, including Lines 247–248 of the original manuscript that the reviewer noted. Furthermore, the revised manuscript was edited by a professional editing service. Thank you.

      (1) M. Malig, S. R. Hartono, J. M. Giafaglione, L. A. Sanz, F. Chedin, Ultra-deep Coverage Single-molecule R-loop Footprinting Reveals Principles of R-loop Formation. J Mol Biol 432, 2271-2288 (2020).

      (2) L. A. Sanz et al., Prevalent, Dynamic, and Conserved R-Loop Structures Associate with Specific Epigenomic Signatures in Mammals. Mol Cell 63, 167-178 (2016).

      (3) D. O. Passos et al., Cryo-EM structures and atomic model of the HIV-1 strand transfer complex intasome. Science 355, 89-92 (2017).

      (4) W. Li et al., CPSF6-Dependent Targeting of Speckle-Associated Domains Distinguishes Primate from Nonprimate Lentiviral Integration. mBio 11,  (2020).

      (5) P. A. Ginno, Y. W. Lim, P. L. Lott, I. Korf, F. Chedin, GC skew at the 5' and 3' ends of human genes links R-loop formation to epigenetic regulation and transcription termination. Genome Res 23, 1590-1600 (2013).

      (6) S. Hamperl, M. J. Bocek, J. C. Saldivar, T. Swigut, K. A. Cimprich, Transcription-Replication Conflict Orientation Modulates R-Loop Levels and Activates Distinct DNA Damage Responses. Cell 170, 774-786 e719 (2017).

      (7) H. O. Ajoge et al., G-Quadruplex DNA and Other Non-Canonical B-Form DNA Motifs Influence Productive and Latent HIV-1 Integration and Reactivation Potential. Viruses 14,  (2022).

      (8) I. K. Jozwik et al., B-to-A transition in target DNA during retroviral integration. Nucleic Acids Res 50, 8898-8918 (2022).

      (9) F. Chedin, C. J. Benham, Emerging roles for R-loop structures in the management of topological stress. J Biol Chem 295, 4684-4695 (2020).

      (10) F. Chedin, Nascent Connections: R-Loops and Chromatin Patterning. Trends Genet 32, 828-838 (2016).

      (11) P. B. Chen, H. V. Chen, D. Acharya, O. J. Rando, T. G. Fazzio, R loops regulate promoter-proximal chromatin architecture and cellular differentiation. Nat Struct Mol Biol 22, 999-1007 (2015).

      (12) A. R. Schroder et al., HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110, 521-529 (2002).

      (13) Y. Ito et al., Number of infection events per cell during HIV-1 cell-free infection. Sci Rep 7, 6559 (2017).

      (14) A. Albanese, D. Arosio, M. Terreni, A. Cereseto, HIV-1 pre-integration complexes selectively target decondensed chromatin in the nuclear periphery. PLoS One 3, e2413 (2008).

      (15) V. Achuthan et al., Capsid-CPSF6 Interaction Licenses Nuclear HIV-1 Trafficking to Sites of Viral DNA Integration. Cell Host Microbe 24, 392-404 e398 (2018).

      (16) X. Li et al., piggyBac transposase tools for genome engineering. Proc Natl Acad Sci U S A 110, E2279-2287 (2013).

      (17) Y. Cao et al., Identification of piggyBac-mediated insertions in Plasmodium berghei by next generation sequencing. Malar J 12, 287 (2013).

      (18) E. Serrao, P. Cherepanov, A. N. Engelman, Amplification, Next-generation Sequencing, and Genomic DNA Mapping of Retroviral Integration Sites. J Vis Exp,  (2016).

      (19) K. A. Matreyek et al., Host and viral determinants for MxB restriction of HIV-1 infection. Retrovirology 11, 90 (2014).

      (20) G. A. Sowd et al., A critical role for alternative polyadenylation factor CPSF6 in targeting HIV-1 integration to transcriptionally active chromatin. Proc Natl Acad Sci U S A 113, E1054-1063 (2016).

      (21) B. Lucic et al., Spatially clustered loci with multiple enhancers are frequent targets of HIV-1 integration. Nat Commun 10, 4059 (2019).

      (22) P. K. Singh, G. J. Bedwell, A. N. Engelman, Spatial and Genomic Correlates of HIV-1 Integration Site Targeting. Cells 11,  (2022).

      (23) C. Bou-Nader, A. Bothra, D. N. Garboczi, S. H. Leppla, J. Zhang, Structural basis of R-loop recognition by the S9.6 monoclonal antibody. Nat Commun 13, 1641 (2022).

      (24) Q. Li et al., Cryo-EM structure of R-loop monoclonal antibody S9.6 in recognizing RNA:DNA hybrids. J Genet Genomics 49, 677-680 (2022).

      (25) J. A. Smolka, L. A. Sanz, S. R. Hartono, F. Chedin, Recognition of RNA by the S9.6 antibody creates pervasive artifacts when imaging RNA:DNA hybrids. J Cell Biol 220,  (2021).

      (26) L. A. Sanz, F. Chedin, High-resolution, strand-specific R-loop mapping via S9.6-based DNA-RNA immunoprecipitation and high-throughput sequencing. Nat Protoc 14, 1734-1755 (2019).

      (27) M. Merkenschlager, D. T. Odom, CTCF and cohesin: linking gene regulatory elements with their targets. Cell 152, 1285-1297 (2013).

      (28) M. Li, K. A. Jurado, S. Lin, A. Engelman, R. Craigie, Engineered hyperactive integrase for concerted HIV-1 DNA integration. PLoS One 9, e105078 (2014).

      (29) M. Li et al., A Peptide Derived from Lens Epithelium-Derived Growth Factor Stimulates HIV-1 DNA Integration and Facilitates Intasome Structural Studies. J Mol Biol 432, 2055-2066 (2020).

    2. eLife Assessment

      This study presents two main findings regarding HIV-1 genomic integration. The first, based on convincing evidence in primary cell models, is that HIV-1 induces R loop formation, though the viral driver of this process remains undefined. The second, based on model cell systems with limited physiological relevance to HIV-1, is that a portion of HIV-1 genomes integrates in the vicinity of where R loops form. This finding has the potential to offer fundamental insight into HIV-1 integration, but the strength of the presented evidence was viewed as incomplete and needing additional validation by more direct experimental methods in order to understand what the mechanistic relationship between the formation of R loops and HIV-1 integration is.

    3. Reviewer #1 (Public review):

      (1) Significance of findings and strength of evidence.

      (a) The work presented in this manuscript is intended to support the authors' novel idea that HIV DNA integration strongly favors "triple-stranded" R-loops in DNA formed either during transcription of many, but not all, genes or by strand invasion of silent DNA by transcripts made elsewhere, and that HIV infection promotes R-loop formation mediated by incoming virions in the absence of reverse transcription. The authors were able to demonstrate a reverse transcription-independent increase in R-loop formation early during HIV infection, while also demonstrating increased integration into sequences that contain R-loop structures. Furthermore, this manuscript also identifies that R-loops are present in both transcriptionally active and silent regions of the genome and that HIV integrase interacts with R-loops. Although the work presented supports a correlation between R-loop formation and HIV DNA integration, it does not prove the authors' hypothesis that R-loops are directly targeted for integration. Direct experimentation, such as in vitro integration into defined DNA targets, will be required. Further, the authors provide no explanation as to how current sophisticated structural models of concerted retroviral DNA integration into both strands of double-stranded DNA targets can accommodate triple-stranded structures. Finally, there are serious technical concerns with interpretation of the integration site analyses.

      This resubmitted manuscript has corrected some of the issues raised by the previous reviews - particularly the quality of the English - but otherwise the text and figures remain very much the same, and concerns about the conclusions drawn regarding integration site specificity remain. The manuscript also still suffers from a lack of the experimental detail necessary to understand the results as presented. In many cases, explanations given privately in the rebuttal to the earlier reviews need to be made available to all readers, not just the reviewers.

      (2) Public review with guidance for readers around how to interpret the work, highlighting important findings but also mentioning caveats.

      (a) Introduction: The authors provide an excellent introduction to R-loops, but they base the rationale for this study on mis-citation of earlier studies regarding integration in transcriptionally silent regions of the genome. The "most favored locus" cited in the very old reference 6 comprises only 5 events and has not been reproduced in more recent, much larger datasets. For example, see the study of over 300,000 sites in ref 14. The laundry list of IN interactors in lines 43-44 is based on old experiments. It is now quite clear that the only direct interaction of importance is with LEDGF, and that should be discussed here. Also discussed should be the role of the capsid in nuclear entry and targeting. For example, one of the references cited, as well as a mention in the discussion (Line 326), concerns CPSF-6, which is now known to modulate nuclear entry and specificity by interacting with capsid, not integrase. The statement on lines 46-47 that some highly expressed genes are, nonetheless, poor targets for integration is correct, but the experiment cited was done in PBMC with wild-type HIV-1, and it is possible that those genes were expressed in non-target cells like B-cells or monocytes.

      (b) Figure 1: Demonstrates models for HIV infections in both cell lines and primary human CD4+ T cells. R-loop formation was determined through a method called DRIPc-seq, which uses an antibody specific for DNA-RNA hybrid structures and sequences these regions of the genome, with RNaseH treatment as a control to show that when RNA-DNA hybrids are absent no R-loops are detected. In these models of in vitro and ex vivo infection, the authors show that R-loop formation increases following HIV infection between 6 hr. post-infection and 12 hrs. post-infection, depending on the cell model. However, these figures lack a mock-infected control for each cell model to assess R-loop formation at the same time points. They would also benefit from a control showing that virus entry is necessary, such as omitting the VSV G protein donor.

      (c) Figure 2: This figure shows that cells infected with HIV show more R-loops as well as longer sequences containing R-loop structures. Panel B shows that these R-loops were distributed throughout different genomic features, such as both genic and intergenic regions of the genome. However, the data are presented in such a way that it is impossible to determine the proportion of R-loops in each type of genomic feature. The reader has no way to tell, for example, the proportion of R-loops in genic vs intergenic DNA and how this value changes with time. Furthermore, increased R-loop formation due to HIV infection showed poor correlation with gene expression, suggesting that R-loops were not forming due to transcriptional activation, although the difference between 0 and the remaining timepoints is not apparent, nor is the meaning of the absurd p values.

      The experiments presented in Figures 1 and 2 show that treatment of cells with VSV G-pseudotyped HIV-1 leads to a significant increase in R-loops in all parts of the genome. Accumulation of R-loops so soon after infection, as well as its resistance to RT and integration inhibitors, rules out the involvement of newly synthesized viral DNA or any newly made viral protein (Figure S3). Rather, some component(s) of the virion, possibly protease, or an accessory gene product such as Vpr or Vif, must be directly responsible (although the authors neglect to draw this conclusion in the description of these experiments, lines 125-135, leaving it hanging until the Discussion).

      On the whole, and as a non-expert in this area, I find the overall conclusions of this part of the study convincing, but, as pointed out in one of the earlier reviews, the virologic significance of early effects seen at high multiplicity of infection (likely hundreds of particles per cell) needs to be taken with a grain of salt. At a minimum, this point should be discussed. Also, the study would have been greatly strengthened by a simple experiment to identify the virion protein responsible for the effect.

      Based on the results in the first two figures, the authors hypothesize that R-loop induction early in infection plays an important role in HIV replication, specifically by interacting with the intasome and thus directing integration to regions of the host genome favorable for expression of the provirus. Experiments to test this idea and probe the mechanism are described in the remaining 3 figures, which, despite comments in the previous reviews, are unchanged from the previous version and still suffer from serious defects in experimental design and interpretation.

      (d) Figure 3: This figure shows the use of cell lines carrying R-loop inducible (mAIRN) or non-inducible (ECFP) genes to model association of HIV integration with R-loop structures. The authors demonstrate the functional validation of R-loop induction in the cell line model. Additionally, when R-loops are induced with doxycycline, there is a significant increase in HIV integration in the R-loop-forming vector sequence. This result shows a correlation between expression and integration that is much stronger in the R-loop-forming gene than in the unreferenced ECFP gene but does not prove that integration directly targets R-loops. It is possible, for example, that some feature of the DNA sequence, such as base composition, affects both integration and R-loop formation independently. As described more fully below, there is also a serious concern regarding the method used to quantitate the integration frequencies. As before, there are a number of problems here.

      (1) The authors use a classic, but suboptimal, integration site assay comprising restriction enzyme digestion followed by PCR to assess integration site distribution, and (despite statements to the contrary in the rebuttal) read counts to quantitate relative frequencies of target site use. See the legend and axis labels in Fig 3E, F, and G. This approach leads to serious bias in the ability to detect and count the use of integration sites that are either too close to or too far from the sites of cleavage and can lead to artefactual misrepresentation of their chromosomal distribution.

      (2) The result shown in Figure 3D is uninterpretable. It is simply not possible that the 3-fold increase in luciferase activity is due to the addition of 25 10-kb sequences leading to a 3-fold increase in integration frequency into the target sequence, particularly when panel E shows that the measured frequency is on the order of 20 reads per million. Something else must be going on here.

      (3) Panels 3F and G show the read count distribution in the introduced target sequences plotted in a completely nonstandard way, and they are explained so poorly that I could not be sure what the authors were trying to show. The numbers on the bottom of the 2 plots appear to represent the only sites of integration seen in the 10-kb region studied. If so, this is not the expected result for the authors' claim of greatly increasing regional integration. As can easily be seen in the figures of ref 14, high frequency gene targets are characterized by large numbers of sites, not by more frequent targeting of small numbers of sites as implied by the figures.

      (e) Figure 4: This figure shows evidence of increased HIV integration within regions of the genome containing R-loops, with an additional preference for integration within the R-loop and a decrease in the frequency of integration further from the R-loop. Identifying a preference for R-loops is very intriguing, but the authors also demonstrate that integration does occur when R-loops are not present. Also, Panel A, which shows that regions of cell DNA that form R-loops have a higher frequency of integration sites than those that do not, should be controlled for the level of gene expression of the two types of region. The result shown cannot be interpreted to mean that R-loops have anything to do with integration targeting. It is already well-established that about 80% of HIV integration sites are in expressed genes, which comprise about 20% of the genome. Since a gene must be expressed to contain an R-loop, the non-R-loop fraction will contain the 80% of the genome that is a 20-fold poorer target, giving the result shown, whether R-loops are involved or not. The rather weak correlation between R-loop locations and integration site distribution in Fig 4C and D hardly seems consistent with the curves seen in 4B. Can the authors refute the hypothesis that the apparent correlation is simply because both integration and R-loop formation frequency must correlate with level of gene expression and therefore their correlation with one another cannot be used to infer causality? As pointed out in prior reviews, R-loops themselves cannot be targets for integration. In their rebuttal, the authors agree and have made slight modifications to their conclusion in the text, now concluding that integration favors the vicinity of an R-loop. Why then do the peaks in the correlation curves in Fig 4B center exactly on the center of the R-loops? It seems that this result would be more consistent with integration and R-loop formation favoring the same sites, but for different reasons (base composition, for example).

      (f) Figure 5: In this figure the authors demonstrate that HIV integrase binds to R-loops through a number of protein assays, but they do not show that this binding is associated with enzymatic activity. EMSA of integrase identified increased binding to DNA-RNA over dsDNA. Additionally, precipitation of RNA-DNA hybrids pulled down HIV integrase. A proximity ligation assay detecting R-loops and HIV integrase showed co-localization within the nucleus of HeLa cells. HeLa cells were probably used due to their efficiency of transduction but are not a physiologically relevant cell type. Figure 5 suffers greatly in interpretability from the failure of the authors to use assembled intasomes, since the DNA binding properties are likely to be quite different. The authors' excuse that they were unable to prepare intasomes (which needs to be included in the text, not just in the rebuttal) explains but does not justify the use of monomeric IN protein. Figure 5A shows that the IN binding is NOT specific to R-loops, since any single-stranded DNA binds equally. The authors should make this point in the text.

      The experiment using integrase overexpression in cells brings up some déjà vu to a retrovirologist. There is some history in retrovirology of experiments like this having been used to draw conclusions (like the role of integrase in nuclear import) that have since proven to be wrong. Also, Fig 5G is not interpretable quantitatively, since the distribution of neither IN nor R-loops is probed, and we have no idea what proportion of each is in the PLA spots. Overall, this section would be much more convincing if it also included some direct experimentation, such as in vitro integration using intasomes, or infection of cells with viral mutants (or in the presence of inhibitors) affecting the function of whatever virion protein is found to be important for R-loop formation.

      (g) Discussion: In the discussion, the authors address how their work relates to previous evidence of HIV integration by association of LEDGF/p75 and CPSF6. They also cite that LEDGF/p75 has possible R-loop binding capabilities. They also discuss what possible mechanisms are driving increases in R-loop formation during HIV infection, pointing to possible HIV accessory proteins. They also state that how HIV integrates in transcriptionally silent regions is still unknown and point out that they were able to show that R-loops appear in many different regions of the genome, but they did not show that R-loops in transcriptionally inactive regions are integration targets. More seriously, they failed to make a connection between their work and current understanding of the biochemical and structural mechanism of the integration reaction.
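
      As a rough check on the gene-expression confound raised under (e), the sketch below works through the arithmetic using the review's round figures (about 80% of integration sites in expressed genes, which make up about 20% of the genome). The function name, and the assumption that these fractions are exact, are illustrative only. With these inputs the per-base site density in expressed regions comes out roughly 16-fold higher than in the rest of the genome, the same order of magnitude as the difference quoted above, before R-loops are considered at all.

```python
def relative_density(site_fraction: float, genome_fraction: float) -> float:
    """Integration-site density relative to a uniform distribution over the genome."""
    return site_fraction / genome_fraction


# Round figures quoted in the review, treated as exact purely for illustration.
sites_in_expressed = 0.80   # fraction of integration sites in expressed genes
genome_expressed = 0.20     # fraction of the genome occupied by expressed genes

expressed = relative_density(sites_in_expressed, genome_expressed)
silent = relative_density(1 - sites_in_expressed, 1 - genome_expressed)
fold_difference = expressed / silent

print(f"expressed regions: {expressed:.2f}x uniform")
print(f"remaining genome:  {silent:.2f}x uniform")
print(f"fold difference:   {fold_difference:.1f}")
```

      Any comparison of R-loop-containing and R-loop-free regions would need to exceed this expression-driven baseline to support a specific role for R-loops.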

    4. Reviewer #3 (Public review):

      In this manuscript, Park and colleagues describe a series of experiments that investigate the role of R-loops in HIV-1 genome integration. The authors show that during HIV-1 infection, R-loop levels on the host genome increase. Using a synthetic R-loop-prone gene construct, they show that HIV-1 integration targets sites with high R-loop levels. They further show that integration sites on the endogenous host genome are correlated with sites prone to R-loops. Using biochemical approaches, as well as in vivo co-IP and proximity ligation experiments, the authors show that HIV-1 integrase physically interacts with R-loop structures.

      The major strength of this work is that the investigators use multiple independent experimental systems and multiple cell types to support their conclusions, including in vivo and biochemical experiments. Furthermore, their use of genome-wide analyses helps to support their conclusion that HIV targets genomic regions enriched with R-loops versus those lacking such enrichment.

      This work may have a significant impact on the field of HIV genomic integration by elucidating why transcription levels are not the sole determinant of HIV integration sites.

    1. eLife Assessment

      This important study aimed to identify how chronic heat exposure affects subsequent behavior and brain function. This work positively expands the field of thermoregulation. The data were collected using a myriad of next-generation approaches, including extensive behavior testing, thermal monitoring, electrophysiology, circuit mapping, and manipulations. As a result, the strength of evidence is mostly solid; however, a few weaknesses left some of the conclusions incompletely supported. These largely center on the question of how unique these effects are to thermal stress (as opposed to other forms of stress), a lack of statistical analyses and rigor in some of the experiments and figures, and the specificity of the POA-pPVT pathway compared to other inputs to the PVT in the control of observed effects.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Cao et al. examines an important but understudied question of how chronic exposure to heat drives changes in affective and social behaviors. It has long been known that temperature can be a potent driver of behaviors and can lead to anxiety and aggression. However, the neural circuitry that mediates these changes is not known. Cao et al. take on this question by integrating optical tools of systems neuroscience to record and manipulate bulk activity in neural circuits, in combination with a creative battery of behavior assays. They demonstrate that chronic daily exposure to heat leads to changes in anxiety, locomotion, social approach, and aggression. They identify a circuit from the preoptic area (POA) to the posterior paraventricular thalamus (pPVT) in mediating these behavior changes. The POA-PVT circuit increases activity during heat exposure. Further, manipulation of this circuit can drive affective and social behavioral phenotypes even in the absence of heat exposure. Moreover, silencing this circuit during heat exposure prevents the development of negative phenotypes. Overall the manuscript makes an important contribution to the understudied area of how ambient temperature shapes motivated behaviors.

      Strengths

      The use of state-of-the-art systems neuroscience tools (in vivo optogenetics and fiber photometry, slice electrophysiology), chronic temperature-controlled experiments, and a rigorous battery of behavioral assays to determine affective phenotypes. The optogenetic gain of function of affective phenotypes in the absence of heat, and loss of function in the presence of heat are very convincing manipulation data. Overall a significant contribution to the circuit-level instantiation of temperature-induced changes in motivated behavior, and creative experiments.

      Weaknesses

      (1) There is no quantification of cFos/rabies overlap shown in Figure 2, and no report of whether the POA-PVT circuit has a higher percentage of Fos+ cells than the general POA population. Similarly, there is no quantification of cFos in POA recipient PVT cells for Figure 2 Supplement 2.

      (2) The authors do not address whether stimulation of POA-PVT also increases core body temperature in Figure 3 or its relevant supplements. This seems like an important phenotype to make note of and could be addressed with a thermal camera or telemetry.

      (3) In Figure 3G: is Day 1 vs Day 22 "pre-heat" significant? The statistics are not shown, but this would be the most conclusive comparison to show that POA-PVT cells develop persistent activity after chronic heat exposure, which is one of the main claims the authors make in the text. This analysis is necessary in order to make the claim of persistent circuit activity after chronic heat exposure.

      (4) In Figure 4, the control virus (AAV1-EYFP) is a different serotype and reporter than the ChR2 virus (AAV9-ChR2-mCherry). This discrepancy could lead to somewhat different baseline behaviors.

      (5) In Figure 5G, N for the photometry data: the authors assess the maximum z-score as a measure of the strength of calcium response, however the area under the curve (AUC) is a more robust and useful readout than the maximum z score for this. Maximum z-score can simply identify brief peaks in amplitude, but the overall area under the curve seems quite similar, especially for Figure 5N.

      (6) For Fig 5V: the authors run the statistics on behavior bouts pooled from many animals, but it is better to do this analysis as an animal average, not by compiling bouts. Compiling bouts over-inflates the power and can yield significant p values that would not exist if the analysis were carried out with each animal as an n of 1.

      (7) In general this is an excellent analysis of circuit function but leaves out the question of whether there may be other inputs to pPVT that also mediate the same behavioral effect. Future experiments that use activity-dependent Fos-TRAP labeling in combination with rabies can identify other inputs to heat-sensitive pPVT cells, which may have convergent or divergent functions compared to the POA inputs.
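
      Points (5) and (6) can be illustrated with a small sketch. All traces and bout durations below are invented for illustration and are not data from the study; the helper names are hypothetical. The first part shows how a brief transient can have a higher peak z-score yet a smaller area under the curve than a sustained response. The second shows how pooling bouts across animals inflates n and shrinks the standard error relative to treating each animal as one observation.

```python
from math import sqrt
from statistics import mean, stdev


def peak_z(trace):
    """Maximum z-score within the response window."""
    return max(trace)


def auc_z(trace, dt):
    """Area under the z-scored trace, computed with the trapezoidal rule."""
    return sum((a + b) / 2 * dt for a, b in zip(trace, trace[1:]))


def sem(values):
    """Standard error of the mean."""
    return stdev(values) / sqrt(len(values))


# Part 1: peak versus AUC on two made-up z-scored traces.
dt = 0.1                                        # hypothetical sampling interval (s)
brief_spike = [0.0] * 10 + [4.0] + [0.0] * 10   # tall, short transient
sustained = [0.0] + [1.0] * 19 + [0.0]          # lower, prolonged response

for name, trace in (("brief_spike", brief_spike), ("sustained", sustained)):
    print(name, "peak:", peak_z(trace), "AUC:", round(auc_z(trace, dt), 2))

# Part 2: pooled bouts versus per-animal averages on made-up bout durations (s).
bouts_by_animal = {
    "animal_A": [1.0, 1.2, 0.8, 1.0],
    "animal_B": [2.0, 2.2, 1.8, 2.0],
    "animal_C": [3.0, 3.2, 2.8, 3.0],
}

pooled_bouts = [x for bouts in bouts_by_animal.values() for x in bouts]
animal_means = [mean(bouts) for bouts in bouts_by_animal.values()]

print("pooled:     n =", len(pooled_bouts), "SEM =", round(sem(pooled_bouts), 3))
print("per animal: n =", len(animal_means), "SEM =", round(sem(animal_means), 3))
```

      Because bouts from the same animal are not independent, the per-animal summary is the more conservative unit of analysis; a mixed-effects model with animal as a random effect would be an alternative if bout-level information needs to be retained.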

    3. Reviewer #2 (Public review):

      Summary

      The study by Cao et al. highlights an interesting and important aspect of heat and thermal biology: the effect of repetitive, long-term heat exposure and its impact on brain function.

      Even though peripheral, sensory temperature sensors and afferent neuronal pathways conveying acute temperature information to the CNS have been well established, it is largely unknown how persistent, long-term temperature stimuli interact with and shape CNS function, and how these thermally-induced CNS alterations modulate efferent pathways to change physiology and behavior. This study is therefore not only novel but, given global warming, also timely.

      The authors provide compelling evidence that neurons of the paraventricular thalamus change plastically over three weeks of episodic heat stimulation and they convincingly show that these changes affect behavioral outputs such as social interactions, and anxiety-related behaviors.

      Strengths

      (1) It is impressive that the assessed behaviors can be (i) recruited by optogenetic fiber activation and (ii) inhibited by optogenetic fiber inhibition when mice are exposed to heat. Technically, when/how long is the fiber inhibition performed? It says in the text "3 min on and 3 min off". Is this only during the 20-minute heat stimulation or also at other times?

      (2) It is interesting that the frequency of activity in pPVT neurons, as assessed by fiber photometry, stays increased after long-term heat exposure (day 22) when mice are back at normal room temperature. This appears similar to a previous study that found long-term heat exposure to transform POA neurons plastically to become tonically active (https://www.biorxiv.org/content/10.1101/2024.08.06.606929v1 ). Interestingly, the POA neurons that become tonically active by persistent heat exposure described in the above study are largely excitatory, and thus these could drive the activity of the pPVT neurons analyzed in this study.

      (3) How can it be reconciled that the majority of the inputs from the POA are found to be largely inhibitory (Fig. 2H)? Is it possible that this result stems from the fact that non-selective POA-to-pPVT projections are labelled by the approach used in this study and not only those pathways activated by heat? These points would be nice to discuss.

      (4) It is very interesting that no LTP can be induced after chronic heat exposure (Figures K-M); the authors suggest that "the pathway in these mice were already saturated" (line 375). Could this hypothesis be tested in slices by employing a protocol to extinguish pre-existing (chronic heat exposure-induced) LTP? This would provide further strength to the findings/suggestion that an important synaptic plasticity mechanism is at play that conveys behavioral changes upon chronic heat stimulation.

      (5) It is interesting that long-term heat does not increase parameters associated with depression (Figure 1N-Q). What about acute heat stress: are those depression parameters increased acutely? It would be interesting to learn if "depression indicators" increase acutely but then adapt (as a consequence of heat acclimation) or if they are not changed at all and are also low during acute heat exposure.

      Weaknesses/suggestions for improvements

      (1) The introduction and general tenet of the study is, to us, a bit too one-sided/biased: generally, repetitive heat exposure --heat acclimation-- paradigms are known to not only be detrimental to animals and humans but also convey beneficial effects in allowing the animals and humans to gain heat tolerance (by strengthening the cardiovascular system, reducing energy metabolism and weight, etc.).

      (2) The point is well taken that the authors want to correlate their model (90 minutes of heat exposure per day) to heat waves. Nevertheless, and to more fully appreciate the entire biology of repetitive/chronic/persistent heat exposure (heat acclimation), it would be helpful to the general readership if the authors would also include these other aspects in their introduction (and/or discussion) and compare their 90-minute heat exposure paradigm to other heat acclimation paradigms. For example, many past studies (using mice or rats) have used more subtle temperatures but stimulated the animals permanently (and not only for 90 minutes) over several days and weeks (for example, see PMID: 35413138). This can have several beneficial effects related to cardiovascular fitness, energy metabolism, and other aspects. In this regard: the 38°C used in this study is a very high temperature for mice, in particular when they are placed there directly from normal ambient temperatures (22°C-24°C), which are cold/coolish for mice, without acclimating slowly. Since the accuracy of temperature measurement is given as +/- 2°C, it could also be 40°C, a temperature that non-heat-acclimated C57BL/6 mice will not survive for long.

      The authors could consider discussing that this very strong, short episodic heat-stress model used here in this study may emphasize detrimental effects of heat, while more subtle long-term persistent exposure may be able to make animals adapt to heat, become more tolerant, and perhaps even prevent the detrimental cognitive effects observed in this study (which would be interesting to assess in a follow-up study).

      (3) Line 140: It would help to be clear in the text that the behaviors are measured 1 day after the acute heat exposure - this is mentioned in the legend to the figure, but we believe it is important to stress this point also in the text. Similarly, this is also relevant for chronic heat stimulation: it needs to be made very clear that the behavior is measured 1 day after the last heat stimulus. If the behaviors had been measured during the heat stimulus, the results would likely be very different.

      (4) Figure 2D and Figure 2-figure supplement 1: since there is quite some baseline cFos activity in the pPVT region, we believe it is important to include some control (room temperature) mice with anterograde labelling; in our view, it is difficult/not possible to conclude, based on Figure 2-figure supplement 2C, that nearly 100% of the cFos-positive cells are contacted by POA fibre terminals (line 168). By eye there are several green cells that don't have any red label on (or next to) them; additionally, even if there is a little bit of red signal next to a green cell, this is not definitive proof of a synaptic contact. It is therefore advisable to revisit the quantification and also revisit the interpretation/wording about synaptic contacts.

      In relation to the above: Figure 2h suggests that all neurons are connected (the majority receiving inhibitory inputs), is this really the case, is there not a single neuron out of the 63 recorded pPVT neurons that does not receive direct synaptic input from the POA?

      (5) It would be nice to characterize the POA population that connects to the pPVT; it is possible/likely that not only warm-responsive POA neurons connect to that region but also others. The current POA-to-pPVT optogenetic fibre stimulations (Figure 4) are not selective for preoptic warm-responsive neurons; since the POA subserves many different functions, this optogenetic strategy will likely activate other pathways. The referees acknowledge that molecular analysis of the POA population would be a major undertaking. Instead, this could be acknowledged in the discussion, for example in a section like "limitation of this study".

      (6) Figure 3a, the strategy to express GCaMP in a Cre-dependent manner: it seems that the GCaMP8f signal would be contaminated by EGFP (coming from the Cre virus injected into the POA). The excitation peak for both is close to 490 nm, and the emission peaks of GCaMP8f (510-520 nm) and EGFP (507-510 nm) also overlap heavily. We presume that the high background (EGFP) fluorescence signal would preclude sensitive calcium detection via GCaMP8f; how did the authors tackle this problem?

      (7) How did the authors perform the social interaction test (Figures 1F, G)? Was the intruder mouse male or female? If it was a male mouse would the interaction with the female mouse be a form of mating behavior? If so, the interpretation of the results (Figures 1F, G) could be "episodic heat exposure over the course of 3 weeks reduces mating behavior".

    4. Reviewer #3 (Public review):

      In this study, Cao et al. explore the neural mechanisms by which chronic heat exposure induces negative valence and hyperarousal in mice, focusing on the role of the posterior paraventricular nucleus (pPVT) neurons that receive projections from the preoptic area (POA). The authors show that chronic heat exposure leads to heightened activity of the POA projection-receiving pPVT neurons, potentially contributing to behavioral changes such as increased anxiety level and reduced sociability, along with heightened startle responses. In addition, using electrophysiological methods, the authors suggest that increased membrane excitability of pPVT neurons may underlie these behavioral changes. The use of a variety of behavioral assays enhances the robustness of their claim. Moreover, while previous research on thermoregulation has predominantly focused on physiological responses to thermal stress, this study adds a unique and valuable perspective by exploring how thermal stress impacts affective states and behaviors, thereby broadening the field of thermoregulation. However, a few points warrant further consideration to enhance the clarity and impact of the findings.

      (1) The authors claim that behavior changes induced by chronic heat exposure are mediated by the POA-pPVT circuit. However, it remains unclear whether these changes are unique to heat exposure or if this circuit represents a more general response to chronic stress. It would be valuable to include control experiments with other forms of chronic stress, such as chronic pain, social defeat, or restraint stress, to determine if the observed changes in the POA-pPVT circuit are indeed specific to thermal stress or indicative of a more universal stress response mechanism.

      (2) The authors use the term "negative emotion and hyperarousal" to interpret behavioral changes induced by chronic heat (consistently throughout the manuscript, including the title and lines 33-34). However, the term "emotion" is broad and inherently difficult to quantify, as it encompasses various factors, including both valence and arousal (Tye, 2018; Barrett, L. F. 1999; Schachter, S. 1962). Therefore, the reviewer suggests the authors use a more precise term to describe these behaviors, such as valence. Additionally, in lines 117 and 137-139, replacing "emotion" with "stress responses," a term that aligns more closely with the physiological observations, would provide greater specificity and clarity in interpreting the findings.

      (3) Related to the role of POA input to pPVT:

      a) The authors showed increased activity in pPVT neurons that receive projections from the POA (Figure 3), and these neurons are necessary for heat-induced behavioral changes (Figures 4N-W). However, is the POA input to the pPVT circuit truly critical? Since recipient pPVT neurons can receive inputs from various brain regions, the reviewer suggests that experiments directly inhibiting the POA-to-pPVT projection itself are needed to confirm the role of POA input. Alternatively, the authors could show that the increased activity of pPVT neurons due to chronic heat exposure is not observed when the POA is blocked. If these experiments are not feasible, the reviewer suggests that the authors consider toning down the emphasis on the role of the POA throughout the manuscript and discuss this as a limitation.

      b) In the electrophysiology experiments shown in Figures 6A-I, the authors conducted in vitro slice recordings on pPVT neurons. However, the interpretation of these results (e.g., "The increase in presynaptic excitability of the POA to pPVT excitatory pathway suggested plastic changes induced by the chronic heat treatment.", lines 349-350) appears to be an overclaim. It is difficult to conclude that the increased excitability of pPVT neurons due to heat exposure is specifically caused by inputs from the POA. To clarify this, the reviewer suggests the authors conduct experiments targeting recipient neurons in the pPVT, with anterograde labeling from the POA to validate the source of excitatory inputs.

      (4) The authors focus on the excitatory connection between the POA and pPVT (e.g., "Together, our results indicate that most of the pPVT-projecting POA neurons responded to heat treatment, which would then recruit their downstream neurons in the pPVT by exerting a net excitatory influence.", lines 169-171). However, are the POA neurons projecting to the pPVT indeed excitatory? This is surprising, considering i) the electrophysiological data shown in Figures 2E-K that inhibitory current was recorded in 52.4% of pPVT neurons by stimulation of POA terminal, and ii) POA projection neurons involved in modulating thermoregulatory responses to other brain regions are primarily GABAergic (Tan et al., 2016; Morrison and Nakamura, 2019). The reviewer suggests showing whether the heat-responsive POA neurons projecting to the pPVT are indeed excitatory (This could be achieved by retrogradely labeling POA neurons that project to the pPVT and conducting fluorescence in situ hybridization (FISH) assays against Slc32a1, Slc17a6, and Fos to label neurons activated by warmth). Alternatively, demonstrate, at least, that pPVT-projecting POA neurons are a distinct population from the GABAergic POA neurons that project to thermoregulatory regions such as DMH or rRPa. This would clarify how the POA-pPVT circuit integrates with the previously established thermoregulatory pathways.

    1. eLife Assessment

      This valuable manuscript reports a large-scale, data-driven, biophysically detailed model of the non-barrel primary somatosensory cortex and generates numerous predictions that can further our understanding of how the multiscale organization of the cortex shapes neural activity. While the approach is solid, many of the findings are obtained using a much smaller portion of the model, which, together with the broad scope of the work, makes the narrative somewhat confusing and the strength of findings not entirely clear.

    2. Reviewer #1 (Public review):

      This paper presents a model of the whole somatosensory non-barrel cortex of the rat, with 4.2 million morphologically and electrically detailed neurons, with many aspects of the model constrained by a variety of data. The paper focuses on simulation experiments, testing a range of observations. These experiments are aimed at understanding how the multiscale organization of the cortical network shapes neural activity.

      Strengths:

      (1) The model is very large and detailed. With 4.2 million neurons and 13.2 billion synapses, as well as the level of biophysical realism employed, it is a highly comprehensive computational representation of the cortical network.

      (2) Large scope of work - the authors cover a variety of properties of the network structure and activity in this paper, from dendritic and synaptic physiology to multi-area neural activity.

      (3) Direct comparisons with experiments, shown throughout the paper, are laudable.

      (4) The authors make a number of observations, like describing how high-dimensional connectivity motifs shape patterns of neural activity, which can be useful for thinking about the relations between the structure and the function of the cortical network.

      (5) Sharing the simulation tools and a "large subvolume of the model" is appreciated.

      Weaknesses:

      (1) A substantial part of this paper - the first few figures - focuses on single-cell and single-synapse properties, with high similarity to what was shown in Markram et al., 2015. Details may differ, but overall it is quite similar.

      (2) Although the paper is about the model of the whole non-barrel somatosensory cortex, out of all figures, only one deals with simulations of the whole non-barrel somatosensory cortex. Most figures focus on simulations that involve one or a few "microcolumns". Again, it is rather similar to what was done by Markram et al., 2015 and constitutes relatively incremental progress.

      (3) With a model like this, one has an opportunity to investigate computations and interactions across an extensive cortical network in an in vivo-like context. However, the simulations presented are not addressing realistic specific situations corresponding to animals performing a task or perceiving a relevant somatosensory stimulus. This makes the insights into the roles of cell types or connectivity architecture less interesting, as they are presented for relatively abstract situations. It is hard to see their relationship to important questions that the community would be excited about - theoretical concepts like predictive coding, biophysical mechanisms like dendritic nonlinearities, or circuit properties like feedforward, lateral, and feedback processing across interacting cortical areas. In other words, what do we learn from this work conceptually, especially, about the whole non-barrel somatosensory cortex?

      (4) Most comparisons with in vivo-like activity are done using experimental data for whisker deflection (plus some from the visual stimulation in V1). But this model is for the non-barrel somatosensory cortex, so exactly the part of the cortex that has less to do with whiskers (or vision). Is it not possible to find any in vivo neural activity data from the non-barrel cortex?

      (5) The authors almost do not show raw spike rasters or firing rates. I am sure most readers would want to decide for themselves whether the model makes sense, and for that, the first thing to do is to look at raster plots and distributions of firing rates. Instead, the authors show comparisons with in vivo data using highly processed, normalized metrics.

      (6) While the authors claim that their model with one set of parameters reproduces many experimentally established metrics, that is not entirely what one finds. Instead, they provide different levels of overall stimulation to their model (adjusting the target "P_FR" parameter, with values from 0 to 1, and other parameters), and that influences results. If I get this right (the figures could really be improved with better organization and labeling), simulations with P_FR closer to 1 provide more realistic firing rate levels for a few different cases; however, P_FR of 0.3 and possibly above tends to cause highly synchronized activity, which the authors call bursting but which could also be called epileptic-like activity in the network.

      (7) The authors mention that the model is available online, but the "Resource availability" section does not describe that in substantial detail. As they mention in the Abstract, it is only a subvolume that is available. That might be fine, but more detail in appropriate parts of the paper would be useful.

    3. Reviewer #2 (Public review):

      Summary:

      This paper is a companion to Reimann et al. (2022), presenting a large-scale, data-driven, biophysically detailed model of the non-barrel primary somatosensory cortex (nbS1). To achieve this unprecedented scale of a bottom-up model, approximately 140 times larger than the previous model (Markram et al., 2015), they developed new methods to account for inputs from missing brain areas, among other improvements. Isbister et al. focus on detailing these methodological advancements and describing the model's ability to reproduce in vivo-like spontaneous, stimulus-evoked, and optogenetically modified activity.

      Strengths:

      The model generated a series of predictions that are currently impossible in vivo, as summarized in Table S1. Additionally, the tools used in this study are made available online, fostering community-based exploration. Together with the companion paper, this study makes significant contributions by detailing the model's constraints, validations, and potential caveats, which are likely to serve as a basis for advancing further research in this area.

      Weaknesses:

      That said, I have several suggestions to improve clarity and strengthen the validation of the model's in vivo relevance.

      Major:

      (1) For the stimulus-response simulations, the authors should also reference, analyze, and compare data from O'Connor et al. (2010; https://pubmed.ncbi.nlm.nih.gov/20869600/) and Yu et al. (2016; https://pubmed.ncbi.nlm.nih.gov/27749825/) in addition to Yu et al. (2019), which is the only data source the authors consider for an awake response. The authors mentioned bias in spike rate measurements, but O'Connor et al. used cell-attached recordings, which do not suffer from activity-based selection bias (in addition, they also performed Ca2+ imaging of L2/3). This was done in the exact same task as Yu et al., 2019, and they recorded from over 100 neurons across layers. Combining this data with Yu et al., 2019 would provide a comprehensive view of activity across layers and inhibitory cell types. Additionally, Yu et al. (2016) recorded VPM neurons in the same task, alongside whole-cell recordings in L4, showing that L4 PV neurons filter movement-related signals encoded in thalamocortical inputs during active touch. This dataset is more suitable for extracting VPM activity, as it was collected under the same behavior and from the same species (unlike Diamond et al., 1992, which used anesthetized rats). Furthermore, this filtering is an interesting computation performed by the network the authors modeled. The validation would be significantly strengthened and more biologically interesting if the authors could also reproduce the filtering properties, membrane potential dynamics, and variability in the encoding of touch across neurons, not just the latency (which is likely largely determined by the distance and number of synapses).

      (2) The authors mention that in the model, the response of the main activated downstream area was confined to L6. Is this consistent with in vivo observations? Additionally, is there any in vivo characterization of the distance dependence of spiking correlation to validate Figure 8I?

      (3) Across the figures, activity is averaged across neurons within layers and E or I cell types, with a limited description of single-cell type and single-cell responses. Were there any predictions regarding the responses of particular cell types that significantly differ from others in the same layer? Such predictions could be valuable for future investigations and could showcase the advantages of a data-driven, biophysically detailed model.

      (4) 2.4: Are there caveats to assuming the OU process as a model for missing inputs? Inputs to the cortex are usually correlated and low-dimensional (i.e., communication subspace between cortical regions), but the OU process assumes independent conductance injection. Can (weakly) correlated inputs give rise to different activity regimes in the model? Can you add a discussion on this?

      (5) 2.6: The network structure is well characterized in the companion paper, where the authors report that correlations in higher dimensions were driven by a small number of neurons with high participation ratios. It would be interesting to identify which cell types exhibit high node participation in high-dimensional simplices and examine the spiking activity of cells within these motifs. This could generate testable predictions and inform theoretical cell-type-specific point neuron models for excitatory/inhibitory balanced networks and cortical processing.

      Minor:

      (1) Since the previous model was published in 2015, the neuroscience field has seen significant advancements in single-cell and single-nucleus sequencing, leading to the clustering of transcriptomic cell types in the entire mouse brain. For instance, the Allen Institute has identified ~10 distinct glutamatergic cell types in layer 5, which exceeds the number incorporated into the current model. Could you discuss 1) the relationship between the modeled me-types and these transcriptomic cell types, and 2) how future models will evolve to integrate this new information? If there are gaps in knowledge in order to incorporate some transcriptome cell types into your model, it would be helpful to highlight them so that efforts can be directed toward addressing these areas.

      (2) For the optogenetic manipulation, it would be interesting if the model could reproduce the paradoxical effects (for example, Mahrach et al. reported paradoxical effects caused by PV manipulation in S1; https://pubmed.ncbi.nlm.nih.gov/31951197/). This seems a more relevant and non-trivial network phenomenon than the V1 manipulation the authors attempted to replicate.

    1. eLife Assessment

      This study provides abundant valuable scRNA-Seq data that profiles fibroblasts involved in myocardium and coronary vasculature development. However, the evidence supporting the authors' claims is currently incomplete. The inclusion of additional citations, more in-depth discussions, and further analyses or experiments to validate the scRNA-Seq data would have significantly strengthened the study. Nonetheless, the scRNA-Seq expression data will be a resource that is of value to researchers in the field.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Deng et al reports single-cell expression analysis of developing mouse hearts and examines the requirements for cardiac fibroblasts in heart maturation. Much of this work is overlapping with previous studies, but the single-cell gene expression data may be useful to investigators in the field. The significance and scope of new findings are limited and major conclusions are largely based on correlative data.

      Strengths:

      The strengths of the manuscript are the new single-cell datasets and comprehensive approach to ablating cardiac fibroblasts in pre and postnatal development in mice.

      Weaknesses:

      There are several major weaknesses in the analysis and interpretation of the results.

      (1) The major conclusions regarding collagen signaling and heart maturation are based on gene expression patterns and are not functionally validated. The potential downstream signaling pathways were not examined and known structural contributions of fibrillar collagen to heart maturation are not discussed.

      (2) The heterogeneity of fibroblast populations and contributions to multiple structures in the developing heart are not well-considered in the analysis. The developmental targeting of fibroblasts will likely affect multiple structures in the embryonic heart and other organs. Lethality is described in some of these studies, but additional analysis is needed to determine the effects on heart morphogenesis or other organs beyond the focus on cardiomyocyte maturation being reported. In particular, the endocardial cushions and developing valves are likely to be affected in the prenatal ablations, but these structures are not included in the analyses.

      (3) ECM complexity and extensive previous work on specific ECM proteins in heart development and maturation are not incorporated into the current study. Different types of collagen (basement membrane Col4, filamentous Col6, and fibrillar Col1) are known to be expressed in fibroblast populations in the developing heart and have been studied extensively. Much also has been reported for other ECM components mentioned in the current work.

    3. Reviewer #2 (Public review):

      This study aims to elucidate the role of fibroblasts in regulating myocardium and vascular development through signaling to cardiomyocytes and endothelial cells. This focus is significant, given that fibroblasts, cardiomyocytes, and vascular endothelial cells are the three primary cell types in the heart. The authors employed a Pdgfra-CreER-controlled diphtheria toxin A (DTA) system to ablate fibroblasts at various embryonic and postnatal stages, characterizing the resulting cardiac defects, particularly in myocardium and vasculature development. scRNA-seq analysis of the ablated hearts identified collagen as a crucial signaling molecule from fibroblasts that influences the development of cardiomyocytes and vascular endothelial cells.

      This is an interesting manuscript; however, there are several major issues, including an over-reliance on the scRNA-seq data, which shows inconsistencies between replicates. Some of the major issues are described below.

      (1) The CD31 immunostaining data (Figures 3B-G) indicate a reduction in endothelial cell numbers following fibroblast deletion using PdgfraCreER+/-; RosaDTA+/- mice. However, the scRNA-seq data show no percentage change in the endothelial cell population (Figure 4D). Furthermore, while the percentage of Vas_ECs decreased in ablated samples at E16.5, the results at E18.5 were inconsistent, showing an increase in one replicate and a decrease in another, raising concerns about the reliability of the RNA-seq findings.

      (2) Similarly, while the percentage of Ven_CMs increased at E18.5, it exhibited differing trends at E16.5 (Figure 4E), further highlighting the inconsistency of the scRNA-seq analysis with the other data.

      (3) Furthermore, the authors noted that the ablated samples had slightly higher percentages of cardiomyocytes in the G1 phase compared to controls (Figures 4H, S11D), which aligns with the enrichment of pathways related to heart development, sarcomere organization, heart tube morphogenesis, and cell proliferation. However, it is unclear how this correlates with heart development, given that the hearts of ablated mice are significantly smaller than those of controls (Figure 3E). Additionally, the heart sections from ablated samples used for CD31/DAPI staining in Figure 3F appear much larger than those of the controls, raising further inconsistencies in the manuscript.

      (4) The manuscript relies heavily on the scRNA-seq dataset, which shows inconsistencies between the two replicates. Furthermore, the morphological and histological analyses do not align with the scRNA-seq findings.

      (5) There is a lack of mechanistic insight into how collagen, as a key signaling molecule from fibroblasts, affects the development of cardiomyocytes and vascular endothelial cells.

      (6) In Figure 1B, Col1a1 expression is observed in the epicardial cells (Figure 1A, E11.5), but this is not represented in the accompanying cartoon.

      (7) What is the genotype of the control animals used in the study?

      (8) Do the PdgfraCreER+/-; RosaDTA+/- mice survive after birth when induced at E15.5, and do they exhibit any cardiac defects?

    4. Reviewer #3 (Public review):

      The authors investigated fibroblasts' communication with key cell types in developing and neonatal hearts, with a focus on the critical roles of fibroblast-cardiomyocyte and fibroblast-endothelial cell networks in cardiac morphogenesis. They tried to map the spatial distribution of these cell types and reported the major pathways and signaling molecules driving the communication. They also used a Cre-DTA system to ablate Pdgfra-labeled cells and observed myocardial and endothelial cell defects during development. They screened the pathways and genes using sequencing data of ablated hearts. Lastly, they reported compensatory collagen expression in long-term ablated neonate hearts. Overall, this study provides us with important insight into fibroblasts' roles in cardiac development and will be a powerful resource for collagens and ECM-focused research.

      Strengths:

      The authors utilized appropriate analysis tools to investigate multiple databases of single-cell sequencing and Multi-seq. They identified significant pathways and cellular and molecular interactions of fibroblasts. Additionally, they compared some of their analytic findings with a human database, and identified several groups of ECM genes with varying roles in mice.

      Weaknesses:

      This study is based largely on sequencing data analysis. At the bench, they used a very strident technique to study fibroblast functions by ablating one of the major cell populations of the heart. Considering the importance of the fibroblast population, intriguing in vivo findings were expected. Also, they analyzed the downstream genes in ablated hearts, but did not perform experimental validation for any of the targets.

    1. eLife Assessment

      This useful study presents a biologically realistic, large-scale cortical model of the rat's non-barrel somatosensory cortex, investigating synaptic plasticity of excitatory connections under varying patterns of external activations and characterizing relations between network architecture and plasticity outcomes. While the model demonstrates several interesting phenomena, the results are less explanatory of causal relationships and more observational in nature; hence the evidence supporting the main conclusions remains incomplete.

    2. Reviewer #1 (Public review):

      This paper investigates the dynamics of excitatory synaptic weights under a calcium-based plasticity rule, in long (up to 10 minutes) simulations of a 211,000-neuron biophysically detailed model of a rat cortical network.

      Strengths

      (1) A very detailed network model, with a large number of neurons, connections, synapses, etc., and with a huge number of biological considerations implemented in the model.

      (2) A carefully developed calcium-based plasticity rule, which operates with biologically relevant variables like calcium concentration and NMDA conductances.

      (3) The study itself is detailed and thorough, covering many aspects of the cellular and network anatomy and properties and investigating their relationships to plasticity.

      (4) The model remains stable over long periods of simulations, with the plasticity rule maintaining reasonable synaptic weights and not pushing the network to extremes.

      (5) The variety of insights the authors derive in terms of relationships between the cellular and network properties and dynamics of the synaptic weights are potentially interesting for the field.

      (6) Sharing the model and the associated methods and tools is a big plus.

      Weaknesses

      (1) Conceptually, there seems to be a missed opportunity here in that it is not clear what the network learns to do. The authors present 10 different input patterns, the network does some plasticity, which is then analyzed, but we do not know whether the learning resulted in anything functionally significant. Did the network learn to discriminate the patterns much better than at the beginning, to capture or anticipate the timing of pattern presentation, detect similarities between patterns, etc.? This is important to understand if one wants to assess the significance of synaptic changes due to plasticity. For example, if the network did not learn much new functionally, relative to its initial state, then the observed plasticity could be considered minor and possibly insufficient. In that case, were the network to learn something substantial, one would potentially observe much more extensive plasticity, and the results of the whole study could change, possibly including the stability of the network. While this could be a whole separate study, this issue is of central importance, and it is hard to judge the value of the results when we do not know what the network learned to do, if anything.

      (2) In this study, plasticity occurs only at E-to-E connections but not at others. However, it is well known that inhibitory connections in the cortex exhibit at the very least a substantial short-term plasticity. One would expect that not including these phenomena would have substantial consequences on the results.

      (3) Lines 134-135: "We calibrated layer-wise spontaneous firing rates and evoked activity to brief VPM inputs matching in vivo data from Reyes-Puerta et al. (2015)."

      (4) Can the authors show these results? It is an important comparison, and so it would be great to see firing rates (ideally, their distributions) for all the cell types and layers vs. experimental data, for the evoked and spontaneous conditions.

      (5) That being said, the Reyes-Puerta et al. paper reports firing rates for the barrel cortex, doesn't it? Whereas here, the authors are simulating a non-barrel cortex. Is such a comparison appropriate?

      (6) Comparison with STDP on pages 5-7 and Figure 2: if I got this right, the authors applied STDP to already generated spikes, that is, did not run a simulation with STDP. That seems strange. The spikes they use here were generated by the system utilizing their calcium-based plasticity rule. Obviously, the spikes would be different if STDP was utilized instead. The traces of synaptic weights would then also be different. The comparison therefore is not quite appropriate, is it?

      (7) Section 2.3 and Figure 5: I am not sure this analysis adds much. The main finding is that plasticity occurs more among cells in assemblies than among all cells. But isn't that expected given what was shown in the previous figures? Specifically, the authors showed that for cells that fire more, plasticity is more prominent. Obviously, cells that fire little or not at all won't belong to any assemblies. Therefore, we expect more plasticity in assemblies.

      (8) Section 2.4 and Figure 6: It is not clear that the results truly support the formulation of the section's title ("Synapse clustering contributes to the emergence of cell assemblies, and facilitates plasticity across them") and some of the text in the section. What I can see is that the effect on rho is strong for non-clustered synapses (Figure 6C and Figure S8A). In some cases, it is substantially higher than what is seen for clustered synapses. Furthermore, the wording "synapse clustering contributes to the emergence of cell assemblies" suggests some kind of causal role of clustered synapses in determining which neurons form specific cell assemblies. I do not see how the data presented supports that. Overall, it appears that the story about clustered synapses is quite complicated, with both clustered and non-clustered synapses driving changes in rho across the board.

      (9) Section 2.5 and Figure 7: Can we be certain that it is the edge participation that is a particularly good predictor of synaptic changes and/or strength, as opposed to something simpler? For example, could it be the overall number of synapses, excitatory synapses, or something along these lines, that the source and/or target neurons receive, that determine the rho dynamics? And then, I do not understand the claim that edge participation allows one to "delineate potentiation from depression". The only related data I can find is in Figure 7A3, about which the authors write "this effect was stronger for potentiation than depression". But I don't see what they mean. For both depression and facilitation, the changes observed are in the range of ~12% of probability values. And even if the effect is stronger, does it mean one can "delineate" potentiation from depression better? What does it mean, to "delineate"? If it is some kind of decoding based on the edge participation, then the authors did not show that.

      (10) "test novel predictions in the MICrONS (2021) dataset, which while pushing the boundaries of big data neuroscience, was so far only analyzed with single cells in focus instead of the network as a whole (Ding et al., 2023; Wang et al., 2023)." That is incorrect. For example, the whole work of Ding et al. analyzes connectivity and its relation to the neuron's functional properties at the network level.

    3. Reviewer #2 (Public review):

      Summary:

      This paper aims to understand the effects of plasticity in shaping the dynamics and structure of cortical circuits, as well as how that depends on aspects such as network structure and dendritic processing.

      Strengths:

      The level of biological detail included is impressive, and the numerical simulations appear to be well executed. Additionally, they have done a commendable job in open-sourcing the model.

      Weaknesses:

      The main result of this work is that activity in their network model remains stable without the need for a homeostatic mechanism. However, as the authors acknowledge, this has been demonstrated in previous studies (e.g., Higgins et al. 2014). In those studies, stability was attributed to calcium-based rules combined with calcium concentrations at in vivo levels and background neuronal activity. Since the authors use the same calcium-based rule, it is unclear what new result, if any, is being presented. If the authors are suggesting that the mechanism in their simulations differs, that should be stated clearly, and evidence supporting that claim should be provided.

      The other findings discussed in the paper are related to a characterization of the dependency of plastic changes on network structure. While this analysis is potentially interesting, it has the following limitations.

      First, I believe the authors should include an analysis of the generality and specificity of their results. All the findings seem to be derived from a single run of the simulation. How do the results vary with different network initializations, simulation times, or parameter choices?

      Second, the presentation of the results is difficult to follow. The characterization comes across as a long list of experiments, making it hard to identify a central message or distinguish key findings from minor details. The authors provide little intuition about why certain outcomes arise, and the complexity of the simulation makes it challenging - if not impossible - to determine which model elements are essential for specific results and which mechanisms drive emergent properties. Additionally, the text often lacks crucial details. For instance, the description of k-edge participation should be expanded, and an explanation of what this method quantifies should be included. Overall, I believe the authors should focus on a smaller set of significant results and provide a more in-depth discussion.

      The comparison of the model with the MICrONS dataset could be improved. In Figure 7B, the authors should show how the same quantification looks in a network model without plasticity. In Figure 8B, the data aligns with the model before plasticity, so it's unclear how this serves as a verification of the theoretical predictions.

    4. Reviewer #3 (Public review):

      Summary:

      Ecker et al. utilized a biologically realistic, large-scale cortical model of the rat's non-barrel somatosensory cortex, incorporating a calcium-dependent plasticity rule to examine how various factors influence synaptic plasticity under in vivo-like conditions. Their analysis characterized the resulting plastic changes and revealed that key factors, including the co-firing of stimulus-evoked neuronal ensembles, the spatial organization of synaptic clusters, and the overall network topology, play an important role in affecting the extent of synaptic plasticity.

      Strengths:

      The detailed, large-scale model employed in this study enables the evaluation of diverse factors across various levels that influence the extent of plastic changes. Specifically, it facilitates the assessment of synaptic organization at the subcellular level, network topology at the macroscopic level, and the co-activation of neuronal ensembles at the activity level. Moreover, modeling plasticity under in vivo-like conditions enhances the model's relevance to experiments.

      Weaknesses:

      (1) The authors claimed that, under in vivo-like conditions and in the presence of plasticity, firing rates and weight distributions remain stable without additional homeostatic mechanisms during a 10-minute stimulation period. However, the weights do not reach the steady state immediately after the 10-minute stimulation. Therefore, extended simulations are necessary to substantiate the claim.

      (2) Another major limitation of the paper lies in its lack of mechanistic insights into the observed phenomena (particularly on aspects that are typically impossible to assess in traditional simplified models, like layer-specific and layer-to-layer pathways-specific plasticity changes), as well as the absence of discussions on the potential computational implications of the corresponding observed plastic changes.

  2. Oct 2024
    1. eLife Assessment

      This important study explores the interplay between gene dosage and gene mutations in the evolution of antibiotic resistance. The authors provide solid evidence to connect proteostasis with gene duplication during experimental evolution in a model system. If the experiments are found to be rigorous and reproducible, then this paper will be of high interest to other researchers studying antibiotic resistance, proteostasis, and bacterial evolution.

    2. Reviewer #1 (Public review):

      Summary:

      The study by Jena et al. addresses important questions on the fundamental mechanisms of genetic adaptation, specifically, whether adaptation proceeds via changes of copy number (gene duplication and amplification, "GDA") or by point mutation. While this question has been worked on (for example by Tomanek and Guet), the authors add several important aspects relating to resistance against antibiotics, and they clarify the ability of Lon protease to reduce duplication formation (previous work was more indirect).

      A key finding Jena et al. present is that point mutations after significant competition displace GDA. A second one is that alternative GDA constantly arise and displace each other (see work on GDA-2 in Figure 3). Finally, the authors found epistasis between resistance alleles that was contingent on lon. Together this shows an intricate interplay of lon proteolysis for the evolution and maintenance of antibiotic resistance by gene duplication.

      Strengths:

      The study has several important strengths: (i) the work on GDA stability and competition of GDA with point mutations is a very promising area of research and the authors contribute new aspects to it, (ii) rigorous experimentation, (iii) very clearly written introduction and discussion sections. To me, the best part of the data is that deletion of lon stimulates GDA, which has not been shown with such clarity until now.

      Weaknesses:

      The minor weaknesses of the manuscript are a lack of clarity in parts of the results section (Point 1) and the methods (Point 2).

    3. Reviewer #2 (Public review):

      Summary:

      In this strong study, the authors provide robust evidence for the role of proteostasis genes in the evolution of antimicrobial resistance, and moreover, for stabilizing the proteome in light of gene duplication events.

      Strengths:

      This strong study offers an important interaction between findings involving GDA, proteostasis, experimental evolution, protein evolution, and antimicrobial resistance. Overall, I found the study to be relatively well-grounded in each of these literatures, with experiments that spoke to potential concerns from each arena. For example, the literature on proteostasis and evolution is a growing one that includes organisms (even micro-organisms) of various sorts. One of my initial concerns involved whether the authors properly tested the mechanistic bases for the role of Lon in promoting duplication events. The authors assuaged my concern with a set of assays (Figure 8).

      More broadly, the study does a nice job of demonstrating the agility of molecular evolution, with responsible explanations for the findings: gene duplications are a quick-fix, but can be out-competed relative to their mutational counterparts. Without Lon protease to keep the proteome stable, the cell allows for less stable solutions to the problem of antibiotic resistance.

      The study does what any bold and ambitious study should: it contains large claims and uses multiple sorts of evidence to test those claims.

      Weaknesses:

      While the general argument and conclusion are clear, this paper is written for a bacterial genetics audience that is familiar with the manner of bacterial experimental evolution. From the language to the visuals, the paper is written in a boutique fashion. The figures are even difficult for me - someone very familiar with proteostasis - to understand. I don't know if this is the fault of the authors or the modern culture of publishing (where figures are increasingly packed with information and hard to decipher), but I found the figures hard to follow with the captions. But let me also consider that the problem might be mine, and so I do not want to unfairly criticize the authors.

      For a generalist journal, more could be done to make this study clear, and in particular, to connect to the greater community of proteostasis researchers. I think this study needs a schematic diagram that outlines exactly what was accomplished here, at the beginning. Diagrams like this are especially important for studies like this one that offer a clear and direct set of findings, but conduct many different sorts of tests to get there. I recommend developing a visual abstract that would orient the readers to the work that has been done.

      Next, I will make some more specific suggestions. In general, this study is well done and rigorous, but doesn't adequately address a growing literature that examines how proteostasis machinery influences molecular evolution in bacteria.

      While this paper might properly test the authors' claims about protein quality control and evolution, the paper does not engage a growing literature in this arena and is generally not very strong on the use of evolutionary theory. I recognize that this is not the aim of the paper, however, and I do not question the authors' authority on the topic. My thoughts here are less about the invocation of theory in evolution (which can be verbose and not relevant), and more about engagement with a growing literature in this very area.

      The authors mention Rodrigues 2016, but there are many other studies that should be engaged when discussing the interaction between protein quality control and evolution.

      A 2015 study demonstrated how proteostasis machinery can act as a barrier to the usage of novel genes: Bershtein, S., Serohijos, A. W., Bhattacharyya, S., Manhart, M., Choi, J. M., Mu, W., ... & Shakhnovich, E. I. (2015). Protein homeostasis imposes a barrier to functional integration of horizontally transferred genes in bacteria. PLoS genetics, 11(10), e1005612

      A 2019 study examined how Lon deletion influenced resistance mutations in DHFR specifically: Guerrero RF, Scarpino SV, Rodrigues JV, Hartl DL, Ogbunugafor CB. The proteostasis environment shapes higher-order epistasis operating on antibiotic resistance. Genetics. 2019 Jun 1;212(2):565-75.

      A 2020 study did something similar: Thompson, Samuel, et al. "Altered expression of a quality control protease in E. coli reshapes the in vivo mutational landscape of a model enzyme." Elife 9 (2020): e53476.

      And there's a new review (preprint) on this very topic that speaks directly to the various ways proteostasis shapes molecular evolution: Arenas, Carolina Diaz, Maristella Alvarez, Robert H. Wilson, Eugene I. Shakhnovich, and C. Brandon Ogbunugafor. "Proteostasis is a master modulator of molecular evolution in bacteria."

      I am not simply attempting to list studies that should be cited, but rather, this study needs to be better situated in the contemporary discussion on how protein quality control is shaping evolution. This study adds to this list and is a unique and important contribution. However, the findings can be better summarized within the context of the current state of the field. This should be relatively easy to implement.

    4. Reviewer #3 (Public review):

      Summary:

      This paper investigates the relationship between the proteolytic stability of an antibiotic target enzyme and the evolution of antibiotic resistance via increased gene copy number. The target of the antibiotic trimethoprim is dihydrofolate reductase (DHFR). In Escherichia coli, DHFR is encoded by folA and the major proteolysis housekeeping protease is Lon (lon). In this manuscript, the authors report the results of the experimental evolution of a lon mutant strain of E. coli in response to sub-inhibitory concentrations of the antibiotic trimethoprim and then investigate the relationship between proteolytic stability of DHFR mutants and the evolution of folA gene duplication. After 25 generations of serial passaging in a fixed concentration of trimethoprim, the authors found that folA duplication events were more common during the evolution of the lon strain, than the wt strain. However, with continued passaging, some folA duplications were replaced by a single copy of folA containing a trimethoprim resistance-conferring point mutation. Interestingly, the evolution of the lon strain in the setting of increasing concentrations of trimethoprim resulted in evolved strains with different levels of DHFR expression. In particular, some strains maintained two copies of a mutant folA that encoded an unstable DHFR. In a lon+ background, this mutant folA did not express well and did not confer trimethoprim resistance. However, in the lon- background, it displayed higher expression and conferred high-level trimethoprim resistance. The authors concluded that maintenance of the gene duplication event (and the absence of Lon) compensated for the proteolytic instability of this mutant DHFR. In summary, they provide evidence that the proteolytic stability of an antibiotic target protein is an important determinant of the evolution of target gene copy number in the setting of antibiotic selection.

      Strengths:

      The major strength of this paper is identifying an example of antibiotic resistance evolution that illustrates the interplay between the proteolytic stability and copy number of an antibiotic target in the setting of antibiotic selection. If the weaknesses are addressed, then this paper will be of interest to microbiologists who study the evolution of antibiotic resistance.

      Weaknesses:

      Although the proposed mechanism is highly plausible and consistent with the data presented, the analysis of the experiments supporting the claim is incomplete and requires more rigor and reproducibility. The impact of this finding is somewhat limited given that it is a single example that occurred in a lon strain and compensatory mutations for evolved antibiotic resistance mechanisms are described. In this case, it is not clear that there is a functional difference between the evolution of copy number versus any other mechanism that meets a requirement for increased "expression demand" (e.g. promoter mutations that increase expression and protein stabilizing mutations).

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      […] Strengths:

      The study has several important strengths: (i) the work on GDA stability and competition of GDA with point mutations is a very promising area of research and the authors contribute new aspects to it, (ii) rigorous experimentation, (iii) very clearly written introduction and discussion sections. To me, the best part of the data is that deletion of lon stimulates GDA, which has not been shown with such clarity until now.

      Weaknesses:

      The minor weaknesses of the manuscript are a lack of clarity in parts of the results section (Point 1) and the methods (Point 2).

      We thank the reviewer for their comments and suggestions on our manuscript. We also appreciate the succinct summary of key findings that the Reviewer has taken cognisance of in their assessment, in particular the association of the Lon protease with the propensity for GDAs as well as its impact on their eventual fate. Going ahead, we plan to revise the manuscript for greater clarity as suggested by Reviewer #1.

      Reviewer #2 (Public review):

      […] The study does what any bold and ambitious study should: it contains large claims and uses multiple sorts of evidence to test those claims.

      Weaknesses:

      While the general argument and conclusion are clear, this paper is written for a bacterial genetics audience that is familiar with the manner of bacterial experimental evolution. From the language to the visuals, the paper is written in a boutique fashion. The figures are even difficult for me - someone very familiar with proteostasis - to understand. I don't know if this is the fault of the authors or the modern culture of publishing (where figures are increasingly packed with information and hard to decipher), but I found the figures hard to follow with the captions. But let me also consider that the problem might be mine, and so I do not want to unfairly criticize the authors.

      For a generalist journal, more could be done to make this study clear, and in particular, to connect to the greater community of proteostasis researchers. I think this study needs a schematic diagram that outlines exactly what was accomplished here, at the beginning. Diagrams like this are especially important for studies like this one that offer a clear and direct set of findings, but conduct many different sorts of tests to get there. I recommend developing a visual abstract that would orient the readers to the work that has been done.

      Next, I will make some more specific suggestions. In general, this study is well done and rigorous, but doesn't adequately address a growing literature that examines how proteostasis machinery influences molecular evolution in bacteria.

      While this paper might properly test the authors' claims about protein quality control and evolution, the paper does not engage a growing literature in this arena and is generally not very strong on the use of evolutionary theory. I recognize that this is not the aim of the paper, however, and I do not question the authors' authority on the topic. My thoughts here are less about the invocation of theory in evolution (which can be verbose and not relevant), and more about engagement with a growing literature in this very area.

      The authors mention Rodrigues 2016, but there are many other studies that should be engaged when discussing the interaction between protein quality control and evolution.

      A 2015 study demonstrated how proteostasis machinery can act as a barrier to the usage of novel genes: Bershtein, S., Serohijos, A. W., Bhattacharyya, S., Manhart, M., Choi, J. M., Mu, W., ... & Shakhnovich, E. I. (2015). Protein homeostasis imposes a barrier to functional integration of horizontally transferred genes in bacteria. PLoS genetics, 11(10), e1005612

      A 2019 study examined how Lon deletion influenced resistance mutations in DHFR specifically: Guerrero RF, Scarpino SV, Rodrigues JV, Hartl DL, Ogbunugafor CB. The proteostasis environment shapes higher-order epistasis operating on antibiotic resistance. Genetics. 2019 Jun 1;212(2):565-75.

      A 2020 study did something similar: Thompson, Samuel, et al. "Altered expression of a quality control protease in E. coli reshapes the in vivo mutational landscape of a model enzyme." Elife 9 (2020): e53476.

      And there's a new review (preprint) on this very topic that speaks directly to the various ways proteostasis shapes molecular evolution:

      Arenas, Carolina Diaz, Maristella Alvarez, Robert H. Wilson, Eugene I. Shakhnovich, and C. Brandon Ogbunugafor. "Proteostasis is a master modulator of molecular evolution in bacteria."

      I am not simply attempting to list studies that should be cited, but rather, this study needs to be better situated in the contemporary discussion on how protein quality control is shaping evolution. This study adds to this list and is a unique and important contribution. However, the findings can be better summarized within the context of the current state of the field. This should be relatively easy to implement.

      We thank the reviewer for their encouraging assessment of our manuscript. We appreciate that the manuscript may not be accessible for a general readership in its present form. We plan to revise the manuscript, in part by modifying figures and adding schematics, to afford greater clarity. We also appreciate the concern regarding situating this study in the context of other published work that relates proteostasis and molecular evolution. Indeed, this was a particularly difficult aspect for us given the different kinds of literature that were needed to make sense of our study. We plan on revising the manuscript by incorporating the references that the Reviewer has pointed out.

      Reviewer #3 (Public review):

      […] Strengths:

      The major strength of this paper is identifying an example of antibiotic resistance evolution that illustrates the interplay between the proteolytic stability and copy number of an antibiotic target in the setting of antibiotic selection. If the weaknesses are addressed, then this paper will be of interest to microbiologists who study the evolution of antibiotic resistance.

      Weaknesses:

      Although the proposed mechanism is highly plausible and consistent with the data presented, the analysis of the experiments supporting the claim is incomplete and requires more rigor and reproducibility. The impact of this finding is somewhat limited given that it is a single example that occurred in a lon strain and compensatory mutations for evolved antibiotic resistance mechanisms are described. In this case, it is not clear that there is a functional difference between the evolution of copy number versus any other mechanism that meets a requirement for increased "expression demand" (e.g. promoter mutations that increase expression and protein stabilizing mutations).

      We thank the reviewer for their in-depth assessment of our work and appreciate their concerns regarding reproducibility and rigor in analysis of our data. We will incorporate this feedback and provide the necessary clarifications in the revised version of our manuscript.

    1. eLife Assessment

      This valuable work explores the timely idea that aperiodic activity in human electrophysiology recordings shows changes in response to task events, which may be relevant for performance, and that these changes could be misinterpreted as oscillatory changes. While it is a timely and interesting topic in principle, in the present form, the analytic approach is incomplete. Further, the data offer inadequate support for the conclusions related to theta without demonstrations that the task evokes theta power. Impressions were split, but there was consensus that the Discussion should be tempered and that revisions would improve the manuscript.

    2. Reviewer #1 (Public review):

      Summary:

      Frelih et al. investigated both periodic and aperiodic activity in EEG during working memory tasks. In terms of periodic activity, they found post-stimulus decreases in alpha and beta activity, while in terms of aperiodic activity, they found a bi-phasic post-stimulus steepening of the power spectrum, which was weakly predictive of performance. They conclude that it is crucial to properly distinguish between aperiodic and periodic activity in event-related designs as the former could confound the latter. They also add to the growing body of research highlighting the functional relevance of aperiodic activity in the brain.

      Strengths:

      This is a well-written, timely paper that could be of interest to the field of cognitive neuroscience, especially to researchers investigating the functional role of aperiodic activity. The authors describe a well-designed study that looked at both the oscillatory and non-oscillatory aspects of brain activity during a working memory task. The analytic approach is appropriate, as a state-of-the-art toolbox is used to separate these two types of activity. The results support the basic claim of the paper that it is crucial to properly distinguish between aperiodic and periodic activity in event-related designs as the former could confound the latter. They also add to the growing body of research highlighting the functional relevance of aperiodic activity in the brain. Commendably, the authors include replications of their key findings on multiple independent data sets.

      Weaknesses:

      The authors also claim that their results speak to the interplay between oscillatory and non-oscillatory activity, and crucially, that task-related changes in the theta frequency band - often attributed to neural oscillations in the field - are in fact only a by-product of non-oscillatory changes. I believe these claims are too bold and are not supported by compelling evidence in the paper. Some control analyses - e.g., contrasting the scalp topographies of purported theta and non-oscillatory effects - could help strengthen the latter argument, but it may be safest to simply soften these two claims.

      In terms of the methodology used, I suggest the authors make it clearer to readers that the primary results were obtained on a sample of middle-aged to older adults, some with subjective cognitive complaints, and note that while stimulus-locked event-related potentials (ERPs) were removed from the data prior to analyses, response-locked ERPs were not. This could potentially confound aperiodic findings. Contrasting the scalp topographies of response-related ERPs and the identified aperiodic components, especially the latter one, could bring some clarity here too.

      I also found certain parts of the introduction to be somewhat confusing.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Frelih et al. investigate the relationship between aperiodic neural activity, as measured by EEG, and working memory performance, and compare this to the more commonly analyzed periodic, and in particular theta, measures that are often associated with such tasks. To do so, they analyze a primary dataset of 57 participants engaging in an n-back task, as well as a replication dataset, and use spectral parameterization to measure periodic and aperiodic features of the data, across time. In doing so, they find both periodic and aperiodic features that relate to the task dynamics, but importantly the aperiodic component appears to explain away what otherwise looks like theta activity in a more traditional analysis. This study, therefore, helps to establish that aperiodic activity is a task-relevant dynamic feature in working memory tasks, and may be the underlying change in many other studies that reported 'theta' changes but did not use methods that could differentiate periodic and aperiodic features.

      Strengths:

      Key strengths of this paper include that it addresses an important question - that of properly adjudicating which features of EEG recordings relate to working memory tasks - and in doing so provides a compelling answer, with important implications for considering prior work and contributing to understanding the neural underpinnings of working memory. I do not find any significant faults or errors with the design, analysis, and main interpretations as presented by this paper, and as such, find the approach taken to be valid and well-enacted. The use of multiple variants of the working memory task, as well as a replication dataset significantly strengthens this manuscript, by demonstrating a degree of replicability and generalizability. This manuscript is also an important contribution to motivating best practices for analyzing neuro-electrophysiological data, including in relation to using baselining procedures.

      Weaknesses:

      Overall, I do not find any obvious weaknesses in this manuscript and its analyses that challenge the key results and conclusions. There are some minor points in the reporting of the methods and conclusions that I believe could be improved (details in the suggestions for authors). One aspect that could be improved is that while the figures demonstrate the main findings convincingly, the results as written could have more detailed quantifications of the analyzed effects (including, for example, more on the model results, effect sizes, and quantifications of the different features), in order to more fully report the dynamics of the analyzed features and to provide the reader with more information on the findings.

    4. Reviewer #3 (Public review):

      Summary:

      Using a specparam (1/f) analysis of task-evoked activity, the authors propose that "substantial changes traditionally attributed to theta oscillations in working memory tasks are, in fact, due to shifts in the spectral slope of aperiodic activity." This is a very bold and ambitious statement, and the field of event-related EEG would benefit from more critical assessments of the role of aperiodic changes during task events. Unfortunately, the data shown here does not support the main conclusion advanced by the authors.

      Strengths:

      The field of event-related EEG would benefit from more critical assessments of the role of aperiodic changes during task events. The authors perform a number of additional control analyses, including different types of baseline correction, ERP subtraction, as well as replication of the experiment with two additional datasets.

      Weaknesses:

      The authors did not first show that their first task successfully evoked theta power, nor that specparam is capable of quantifying the background around a short theta burst, nor that theta effects are different between baseline corrected vs. spectral parameterized quantifications.

    5. Author Response:

      We would like to thank the reviewers for your comprehensive and insightful reviews of our manuscript. We highly value your constructive comments and suggestions and are preparing revisions that will enhance both the clarity and robustness of our study. Below is an outline of the changes we will implement in response to the points you raised.

      All three reviewers expressed concerns regarding the robustness of our conclusions about the relationship between task-related theta activity and aperiodic changes. We will revise the manuscript to present these conclusions more cautiously, stating that the findings indicate a potential contribution of aperiodic activity to what is traditionally interpreted as theta activity. While our results emphasize the importance of distinguishing between periodic and aperiodic components, further research is necessary to fully understand this relationship. We will conduct additional control analyses, including a comparison of the scalp topographies of theta and aperiodic components, to better understand the relationship between aperiodic and periodic (theta) activity.

      In response to Reviewer #1's request for greater transparency in our reporting of methodological details, we will provide key clarifications. We will add a clear statement noting that the primary results are based on data from middle-aged to older adults, some of whom had subjective cognitive complaints (SCC). However, it is important to note that no differences were observed between the SCC group and the control group regarding periodic or aperiodic changes in power. Additionally, the main findings were replicated in a sample of middle-aged adults.

      To address potential confounding factors, we will include an analysis contrasting response-related ERPs with the identified aperiodic components. However, we do not entirely agree with the assertion that this will necessarily clarify the results. ERPs are not inherently distinct from aperiodic (or periodic) activity; they may reflect changes in aperiodic (or periodic) power. In our view, examining aperiodic and periodic power, ERPs, or time-frequency decomposition with baseline correction provides different perspectives on the same data. Nonetheless, the combined analyses and their results are intended to guide future researchers toward the most suitable approach for interpreting this data.

      Reviewer #3 raised concerns regarding the task's effectiveness in evoking theta power and the ability of the spectral parameterization method (specparam) to adequately quantify background activity around theta bursts. To address these concerns, we will include additional visualizations demonstrating that the task reliably elicited theta (and delta) activity. Regarding the reviewer's concerns about specparam and theta bursts, it is important to clarify that specparam, in the form we used, does not incorporate time information; rather, it can be applied to any power spectral density (PSD), independent of how the PSD is derived. Specparam’s performance depends on the methods used to estimate frequency content. For time-frequency decomposition, we employed superlets (https://doi.org/10.1038/s41467-020-20539-9), which have been shown to resolve short bursts of activity more effectively than other methods. To our knowledge, superlets provide the highest resolution in terms of both time and frequency. Moreover, to improve stability, we performed spectral parameterization on trial-averaged power (in contrast to the approach in https://doi.org/10.7554/eLife.77348). Nonetheless, we will conduct a simulation to test whether specparam can reliably resolve low-frequency peaks over the 1/f activity.
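
      To make the idea of such a simulation concrete, the following is a minimal sketch only, not the planned analysis and not specparam itself. It assumes a noisy log-power spectrum built from a 1/f background plus a single Gaussian "theta" peak, fits the aperiodic line in log-log space while skipping an assumed peak band, and then reads off the peak height above that line. All parameter values (offset, exponent, peak frequency, noise level, excluded band) and function names are hypothetical choices for illustration.

```python
import math
import random


def simulate_log_psd(freqs, offset, exponent, peak_cf, peak_height, peak_sd, noise_sd, rng):
    """Simulate log10 power: a 1/f^exponent background, one Gaussian peak, and noise."""
    log_power = []
    for f in freqs:
        aperiodic = offset - exponent * math.log10(f)
        peak = peak_height * math.exp(-((f - peak_cf) ** 2) / (2.0 * peak_sd ** 2))
        log_power.append(aperiodic + peak + rng.gauss(0.0, noise_sd))
    return log_power


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_aperiodic(freqs, log_power, exclude_band):
    """Fit the aperiodic line in log-log space, ignoring the suspected peak band."""
    lo, hi = exclude_band
    xs = [math.log10(f) for f in freqs if not lo <= f <= hi]
    ys = [p for f, p in zip(freqs, log_power) if not lo <= f <= hi]
    intercept, slope = fit_line(xs, ys)
    return intercept, -slope  # (offset, exponent)


def peak_height_above_aperiodic(freqs, log_power, offset, exponent, peak_cf):
    """Residual log power at the frequency bin closest to peak_cf."""
    idx = min(range(len(freqs)), key=lambda i: abs(freqs[i] - peak_cf))
    return log_power[idx] - (offset - exponent * math.log10(freqs[idx]))


# Hypothetical ground truth for the simulated spectrum.
TRUE = dict(offset=1.0, exponent=1.5, peak_cf=6.0, peak_height=0.4, peak_sd=1.0)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
freqs = [1.0 + 0.5 * i for i in range(79)]  # 1 to 40 Hz in 0.5 Hz steps
log_psd = simulate_log_psd(freqs, noise_sd=0.02, rng=rng, **TRUE)

est_offset, est_exponent = fit_aperiodic(freqs, log_psd, exclude_band=(3.0, 9.0))
est_height = peak_height_above_aperiodic(
    freqs, log_psd, est_offset, est_exponent, TRUE["peak_cf"]
)
print("estimated exponent:", round(est_exponent, 3))
print("estimated peak height:", round(est_height, 3))
```

      Comparing the estimates with the known ground truth across many seeds, noise levels, and peak heights would indicate when a low-frequency peak is separable from the aperiodic slope. A real test would replace the line fit with specparam applied to spectra obtained from the same time-frequency decomposition as the data.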

      Reviewer #2 suggested that the manuscript would benefit from a more detailed account of the effects. In response, we will include more detailed quantifications of the analyzed effects, such as model error and R² values.

      We believe that the planned revisions will strengthen the manuscript and address the primary concerns raised by the reviewers. We sincerely appreciate your thoughtful feedback and look forward to submitting an improved version of the manuscript soon.

      Once again, thank you for your time and expertise in reviewing our work.

      Sincerely,

      Andraž Matkovič & Tisa Frelih

    1. Reviewer #2 (Public review):

      Summary:

      The authors conduct a causal analysis of years of secondary education on brain structure in late life. They use a regression discontinuity analysis to measure the impact of a UK law change in 1972 that increased the years of mandatory education by 1 year. Using brain imaging data from the UK Biobank, they find essentially no evidence for 1 additional year of education altering brain structure in adulthood.
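
      For readers unfamiliar with the design, a regression discontinuity compares outcomes just either side of a cutoff in a running variable. The sketch below is an illustration only and not the authors' pre-registered pipeline: it assumes a sharp design, a running variable measured in months from a hypothetical cutoff, simulated outcomes, and a fixed bandwidth. The estimate is the gap between two local linear fits evaluated at the cutoff. All names and numbers are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rd_estimate(running, outcome, cutoff, bandwidth):
    """Local linear sharp RD: difference between the two fitted values at the cutoff."""
    left_x, left_y, right_x, right_y = [], [], [], []
    for r, y in zip(running, outcome):
        centered = r - cutoff
        if -bandwidth <= centered < 0:
            left_x.append(centered)
            left_y.append(y)
        elif 0 <= centered <= bandwidth:
            right_x.append(centered)
            right_y.append(y)
    left_intercept, _ = fit_line(left_x, left_y)
    right_intercept, _ = fit_line(right_x, right_y)
    return right_intercept - left_intercept


def simulate(n, true_jump, rng):
    """Simulated outcome with a smooth trend plus a jump of size true_jump at 0."""
    running, outcome = [], []
    for _ in range(n):
        r = rng.uniform(-24.0, 24.0)  # months from the hypothetical cutoff
        treated = 1.0 if r >= 0 else 0.0
        running.append(r)
        outcome.append(0.01 * r + true_jump * treated + rng.gauss(0.0, 0.2))
    return running, outcome


rng = random.Random(1)  # fixed seed so the sketch is reproducible
run_null, out_null = simulate(4000, 0.0, rng)  # no effect of the reform
run_eff, out_eff = simulate(4000, 0.3, rng)    # hypothetical effect of 0.3 units

null_estimate = sharp_rd_estimate(run_null, out_null, cutoff=0.0, bandwidth=12.0)
effect_estimate = sharp_rd_estimate(run_eff, out_eff, cutoff=0.0, bandwidth=12.0)
print("estimate with no true jump:", round(null_estimate, 3))
print("estimate with a true jump of 0.3:", round(effect_estimate, 3))
```

      In practice the bandwidth is chosen by a data-driven rule and standard errors are reported alongside the estimate; this sketch omits both for brevity.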

      Strengths:

      The authors pre-registered the study and the regression discontinuity was very carefully described and conducted. They completed a large number of diagnostic and alternate analyses to allow for different possible features in the data. (Unlike a positive finding, a negative finding is only bolstered by additional alternative analyses).

      Weaknesses:

      While the work is of high quality for the precise question asked, ultimately the exposure (1 additional year of education) is a very modest manipulation and the outcome is measured long after the intervention. Thus a null finding here is completely consistent with educational attainment (EA) in fact having an impact on brain structure, where EA may reflect elements of training after secondary education (e.g. university, post-graduate qualifications, etc.) and not just stopping education at 16 yrs yes/no.

      The work also does not address the impact of the UK Biobank's well-known healthy volunteer bias (Fry et al., 2017) which is yet further magnified in the imaging extension study (Littlejohns et al., 2020). Under-representation of people with low EA will dilute the effects of EA and impact the interpretation of these results.

      References:

      Fry, A., Littlejohns, T. J., Sudlow, C., Doherty, N., Adamska, L., Sprosen, T., Collins, R., & Allen, N. E. (2017). Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. American Journal of Epidemiology, 186(9), 1026-1034. https://doi.org/10.1093/aje/kwx246

      Littlejohns, T. J., Holliday, J., Gibson, L. M., Garratt, S., Oesingmann, N., Alfaro-Almagro, F., Bell, J. D., Boultwood, C., Collins, R., Conroy, M. C., Crabtree, N., Doherty, N., Frangi, A. F., Harvey, N. C., Leeson, P., Miller, K. L., Neubauer, S., Petersen, S. E., Sellors, J., ... Allen, N. E. (2020). The UK Biobank imaging enhancement of 100,000 participants: rationale, data collection, management and future directions. Nature Communications, 11(1), 2624. https://doi.org/10.1038/s41467-020-15948-9

    1. Author response:

      The following is the authors’ response to the original reviews.

      We greatly appreciate reviewer 2's comments, which offer insightful and clearly reasoned assessments of this study, including a much-appreciated reframing and evaluation of the study’s advances in the sleep field. It is a constructive review and provides considerable added value to this study in better defining the biological significance of the findings, including both advances and limitations.

      Reviewer 2 nicely summarized the work as “…highlight(ing) the accumulation and resolution of sleep need centered on the strength of excitatory synapses onto excitatory neurons.” The reviewer succinctly placed one of the main electrophysiological findings in the context of one of the sleep field’s most prevalent views, “that LTP associated with wake, leads to the accumulation of sleep need by increasing neuronal excitability, and by the "saturation" of LTP capacity.” It has been speculated that “This saturation subsequently impairs the capacity for further ongoing learning. This new data provides a satisfying mechanism of this saturation phenomenon (and its restoration by recovery sleep) by introducing the concept of silent synapses.” We want to emphasize that sleep need and its resolution involve more than just homeostasis of excitatory synaptic strength but may also be extended to include homeostasis of excitatory synaptic potential to undergo LTP (a homeostasis of meta-plasticity), with implications for learning and memory.

      Reviewer 2 also identified another advance made by this study, summarized as, “The new snRNAseq dataset indicates the sleep need is primarily seen (at the transcriptional level) in excitatory neurons, consistent with a number of other studies.” References for these studies are nicely provided by the reviewer. Our analysis of this data extends the evidence for transcriptional sleep-need-driven changes, observed by us and others in excitatory neurons, to more particularly involve the excitatory neurons in layers 2-5, targeting intra-telencephalic neurons.

      Reviewer 2 importantly noted, “New snRNAseq analysis indicates that SD drives the expression of synaptic shaping components (SSCs) consistent with the excitatory synapse as a major target for the restorative basis of sleep function”, and that “SD-induced gene expression is also enriched for autism spectrum disorder (ASD) risk genes”. These comments are well appreciated as they emphasize that, beyond identification of the major target cell type of sleep function, the major sleep-target, gene-ontological characteristics are starting to be addressed.

      Reviewer 2 commented on the molecular sleep model, making a key observation that “SD-induced gene expression in excitatory neurons overlaps with genes regulated by the transcription factor MEF2C and HDAC4/5 (Figure 4),” and accurately discusses the significance with respect to the proposed model.

      We are in complete agreement with the observation that the molecular sleep model presented is not “definitively supported by the new data and in this regard should be viewed as a perspective…”. One of the more glaring gaps in supporting evidence is the absence of understanding of the role of HDAC4/5 (part of the SIK3-HDAC4/5 pathway) in sleep need modulation of excitatory synapses. Resolution of this issue might be approached by assessment of the synaptic effects of constitutively nuclear HDAC4/5. The current study provides a first step in the assessment by showing a correlation between HDAC4/5 and MEF2C target genes and a subset of differentially expressed synaptic shaping component (SSC) genes that modulate excitatory synapse strength and phenotype. However, the functional studies have yet to be completed. Complementary studies on SD-induced SSC-DEGs (identified in this study) are also needed for follow-up characterization of their sleep-need-induced functional impact (both strength and meta-plasticity modulation) on the most relevant excitatory synapses (as identified in the current study).

      We agree with both reviewers 1 and 2 that, “Additional work is also needed to understand the mechanistic links between SIK3-HDAC4/5 signaling and MEF2C activity”. Reviewer 2 clarifies the key unresolved issue as, “cnHDAC4/5 suppresses NREM amount and NREM SWA but had no effect on the NREM-SWA increase following SD (Zhou et al., Nature 2022). Loss of MEF2C in CaMKII neurons had no effect on NREM amount and suppressed the increase in NREM-SWA following SD (Bjorness et al., 2020)”. One may conclude with reviewer 2, “These instances indicate that cnHDAC4/5 and loss of MEF2C do not exactly match suggesting additional factors are relevant in these phenotypes.”

      An understanding of the mechanism(s) responsible for the relationship between sleep need and SWA is critical to the evaluation of sleep need’s correlation with sleep DEGs and synaptic transmission, including “additional factors” as suggested by reviewer 2. SWA might result from a decrease of cortical glutamatergic neurotransmission below some threshold, which might occur in response to prolonged waking (possibly in response to waking activity-induced local increases of adenosine?), rather than being a cause of, or being intimately involved in, resolving sleep need.

      An increase of SWA in association with SD can result directly from an acute SD-induced increase in local adenosine concentration. This will elicit an ADORA1-mediated down-regulation of glutamate excitatory neurotransmission in the cortex (Bjorness et al., 2016) and in cholinergic arousal centers (Rainnie et al., 1994; Porkka-Heiskanen et al., 1997; Portas et al., 1997; Li et al., 2023). When MEF2C is derepressed by chronic loss of HDAC4 function, SWA is facilitated (Kim et al., 2022). It is plausible that loss of HDAC4 function contributes to the increased SWA by downscaling glutamate excitatory transmission (independent of sleep need). This is expected to result from derepressed, MEF2C-mediated sleep-gene expression.

      Similarly, over-expression of constitutively active HDAC4 (cnHD4) can contribute to chronic upscaling of cortical glutamate synaptic strength to depress SWA (again, independent of sleep need). Thus, facilitation or depression of SWA correlates with down- or up-scaling effects on cortical glutamate neurotransmission, respectively, even in the absence of a direct effect on sleep need (Figure 4D). Many reagents that reduce the excitability of glutamate pyramidal cells by various mechanisms, including anesthetics like isoflurane, barbiturates or benzodiazepines in addition to those activating ADORA1, increase SWA. Finally, it is important to acknowledge that direct evidence for this proposed link of SWA to cortical glutamate transmission remains in need of further investigation. Thus, SWA may reflect generalized cortical glutamate synaptic activity whether modulated by sleep function or by other agents.

      Still other factors that may play a role in mediating some of the mismatch between cnHD4/5 DEGs and Mef2c-cKO DEGs include the broader over-expression of AAV-cnHD4 compared to CaMKII-driven Cre KO of Mef2c. The cnHD4 overexpression can increase arousal center activity in the hypothalamus and other arousal areas to interfere with SWA, but not to the exclusion of SD-DEG repression resulting from a repression of MEF2C-mediated sleep gene expression.

      The critique by reviewer 1 raises a number of important technical issues with this study. A key, potentially critical issue raised by reviewer 1, is that of our method of experimental sleep deprivation (ESD). The reviewer suggests that “…neuronal activity/induction of plasticity”, peculiar to the ESD methodology employed in this study, “…rather than sleep/wake states are responsible for the observed results…”.  

      In this study, a slow-moving treadmill (SMTM; 0.1 km/hour, as stated in the methods), requiring locomotion to avoid bumping into the back wall of a false-bottomed plexiglass cage, was used to induce ESD. A mouse, in its home cage, typically moves much faster than 0.1 km/hour and the mouse is able to eat and drink freely while in the cage (see file: video 1). Furthermore, our observations using a beam-break cage indicate that mice spontaneously travel comparable or longer distances over 6 hours than the treadmill moves (during the ESD of 6 hours). Finally, our EEG recordings of mice on the active treadmill show 100% waking while it is on (Bjorness et al., 2009), whereas prevention of NREM sleep (including transition time) using the “gentle handling” (GH) technique depends on the diligence of the experimenter.

      The accommodation (one week prior to ESD) included exposure to the treadmill-on for 30 minutes at ~ZT = 2 and ZT = 14 hours (now spelled out in the “Materials & Methods” section). Thus, the likelihood of motor learning seems vanishingly small.

      As with all ESD methods, there must be some associated increase in sensory and motor neuronal activity to drive arousal and prevent transition to sleep. For example, the more widely employed GH method of ESD involves sensory stimulation (tactile and/or auditory) of sufficient intensity to induce postural change from that associated with sleep to that associated with wake (often involving some locomotion). Like the SMTM, both sensory and motor systems are likely to be engaged. Unlike the SMTM method, the stimulation used in GH is variably intermittent from mouse to mouse and from experimenter to experimenter, as it is applied only when the experimenter judges the mouse to be falling asleep. It can even be argued that the varied and unpredictable ways in which these interactions happen cause plastic changes with a higher likelihood than the constant slow motion of a treadmill – the mice know how to walk, after all. In other protocols, novel objects are introduced to the animals – those will certainly trigger plastic processes – something that is avoided by using, for sleep deprivation, a slow-running treadmill to which the mouse has been accommodated.

      The changes induced by the SMTM technique are reproducible, and the technique induces arousal by somatic stimulation of sufficient intensity to induce natural motor activity, as with GH. All ESD methods induce motor activity and it is reasonable to speculate that induced motor activity is essential for effective ESD for the prolonged durations (>4 hours in mice) that elicit high sleep need. Electrophysiological assessments of SD-evoked increases in mEPSC amplitude and frequency using GH-ESD (Liu et al., 2010) are similar in all respects to our observations of the response to SMTM-ESD (Bjorness et al., 2020). Further studies might directly address a comparison of SMTM-ESD to GH-ESD as suggested by reviewer 1 but are regrettably outside the scope and resources of our study.

      The model presented in Figure 4C is consistent with the experimental findings with respect to the observed electrophysiological changes (including loss of silent synapses and increased AMPA/NMDA ratio after ESD of 6 hours) and altered gene expression that includes enrichment of SSC genes, many of which (7 candidates are listed) can affect both AMPA/NMDA ratio and silent synapses. No claim of mechanism linking the changed expression to altered AMPAR or NMDAR activity can be made at this point, even as to polarity of gene expression, related to electrophysiological outcome. Furthermore, some transcripts may involve receptor trafficking while others more directly affect activated receptor function. To help illustrate the complexity of interpreting gene up-regulation, consider the following hypothetical scenario. If a gene like upregulated Grin3a acts rapidly, it may facilitate reduction of NMDAR function (decreasing plasticity) during ESD, whereas upregulation of a gene like Kif17, if acting in a more delayed manner, might enhance NMDAR surface expression and activity (increasing silent synapses) in response to ESD, during recovery sleep. Relevant references, consistent with these various outcomes are supplied in the manuscript but further investigation is clearly needed, or as reviewer 2 so aptly commented, this work “…provides a framework to stimulate further research and advances on the molecular basis of sleep function”.  

      Several issues are raised by reviewer 1 concerning the electrophysiological methodology and statistical assessment. In regard to the former, we closely followed established protocols employed in the frontal neocortex (Myme et al., 2003). We did not include the details for series resistance monitoring. Series resistance values ranged between 8 and 15 MOhm and experiments with changes larger than 25% were not used for further analyses. Thank you for bringing this oversight on our part to our attention. This essential information, which is unfailingly gathered for all our whole cell recordings, is now added to the version of record.

      The -90 mV holding potential was chosen according to precedent (Myme et al., 2003). It increases driving force and permits lower stimulus strength for the same response size – reducing the likelihood for polysynaptic responses. Experiments with multiple response peaks at -90 mV were not included in the analysis. The -90 mV holding potential also increases NMDA receptor Mg++ block resulting in a minimally contaminated AMPA response. This information is now added to our submitted version of record.

      The statistical assessments shown in Table 1 refer to two sets of data measured from 3X2=6 different cohorts for each sleep condition (CS, SD, RS): 1) AMPA & NMDA EPSCs and 2) AMPA/NMDA FR ratios (FRR; now bolded in row 1, second tab, Table S1). As stated in the results section, “A two-way ANOVA analysis showed a significant interaction between AMPA matched to NMDA EPSC response for each neuron, and sleep condition (F (2, 21) = 7.268, p<0.004; Figure 1 A, C, E). When considered independently, neither the effect of sleep condition nor of EPSC subtype reached significance at p<0.05 (Figure 1 C)”.  

      As noted by reviewer 1, we inadvertently dropped one of the data points from the RS FR and FR ratio (FRR) statistical analysis (raw data in the third tab of Table S1, statistical data in fourth and fifth tab and illustrated in figure 1 F). Thanks to this appreciated, rigorous review, we can correct the oversight (using raw data unchanged in Table S1, third tab). The Table S1 and figure 1 F are now corrected for the version of record. For better clarity, we now use two tabs, the fourth and fifth tabs, respectively of Table S1, for separate stat analyses of FR and FRR data.

      The significance of the AMPA/NMDA FRR across sleep conditions was assessed with the Kruskal-Wallis test, a non-parametric method, and the two-stage linear step-up procedure of Benjamini, Krieger, and Yekutieli (BKY) was used to control the FDR across multiple sleep conditions. However, the Kruskal-Wallis test is usually less powerful than tests presuming normal distributions, like the one-way ANOVA and Holm-Sidak’s test. We have now added a re-analysis of FRR across CS, SD and RS conditions using a parametric one-way ANOVA (Table S1, tab5). The results now read, “The difference between sleep conditions and FRR is significant (F (2, 19) = 11.3, Table S1, tab5). Multiple comparisons (Holm-Sidak, Table S1, tab5) indicate the near absence of silent synapses was reversed by either CS or RS (SD/CS; p<0.0011 and SD/RS: p<0.0006; Table S1, tab 5; Figure 1 F).” These analyses compare well to the non-parametric assessment using the Kruskal-Wallis test (significant at p = 0.0006) with BKY correction for multiple comparisons, which gives p <= 0.0262 for CS-SD and p <= 0.0006 for RS-SD (statistics also shown in Table S1, tab5). [Also shown in tab5 is the “standard approach of correcting for family wise error rate”, namely, Dunn’s test. It is more conservative but less powerful than the BKY correction. In general, the tradeoff of greater power for less conservatism is better tolerated when many comparisons are made; however, it can be argued that in the present analysis type 2 errors are also potentially misleading and thus not well tolerated.] The modifications of our statistical analyses, inspired by reviewer 1, did not affect the interpretation of the data or the conclusions.
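
      As an aside for readers unfamiliar with the two-stage linear step-up procedure, the following is a minimal sketch of how it is usually described, not the software routine used for Table S1. Stage one runs the ordinary Benjamini-Hochberg step-up at a reduced level q/(1+q); the number of rejections is used to estimate how many null hypotheses are true; stage two reruns the step-up at a correspondingly relaxed level. The p-values below are hypothetical and are not taken from the study.

```python
def step_up_count(sorted_pvals, level):
    """Benjamini-Hochberg step-up: largest k with p_(k) <= k * level / m."""
    m = len(sorted_pvals)
    count = 0
    for k, p in enumerate(sorted_pvals, start=1):
        if p <= k * level / m:
            count = k
    return count


def bh_reject(pvals, q=0.05):
    """Plain Benjamini-Hochberg rejections, returned in the input order."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    count = step_up_count([pvals[i] for i in order], q)
    reject = [False] * len(pvals)
    for rank, idx in enumerate(order):
        reject[idx] = rank < count
    return reject


def bky_two_stage_reject(pvals, q=0.05):
    """Two-stage linear step-up (Benjamini, Krieger, Yekutieli), input order preserved."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    sorted_p = [pvals[i] for i in order]

    q_stage1 = q / (1.0 + q)
    r1 = step_up_count(sorted_p, q_stage1)
    if r1 == 0:
        return [False] * m  # nothing rejected at stage one: stop
    if r1 == m:
        return [True] * m   # everything rejected at stage one: stop

    m0_hat = m - r1                   # estimated number of true nulls
    q_stage2 = q_stage1 * m / m0_hat  # relaxed level for stage two
    r2 = step_up_count(sorted_p, q_stage2)

    reject = [False] * m
    for rank, idx in enumerate(order):
        reject[idx] = rank < r2
    return reject


# Hypothetical p-values from four pairwise comparisons.
example_pvals = [0.5, 0.001, 0.04, 0.01]
print("BH: ", bh_reject(example_pvals))
print("BKY:", bky_two_stage_reject(example_pvals))
```

      With these made-up inputs the two-stage procedure rejects one more hypothesis than plain Benjamini-Hochberg, which illustrates the gain in power that motivates it relative to family-wise corrections such as Dunn's test.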

      Bjorness TE, Kelly CL, Gao T, Poffenberger V, Greene RW (2009) Control and function of the homeostatic sleep response by adenosine A1 receptors. The Journal of neuroscience : the official journal of the Society for Neuroscience 29:1267-1276.

      Bjorness TE, Dale N, Mettlach G, Sonneborn A, Sahin B, Fienberg AA, Yanagisawa M, Bibb JA, Greene RW (2016) An Adenosine-Mediated Glial-Neuronal Circuit for Homeostatic Sleep. The Journal of neuroscience : the official journal of the Society for Neuroscience 36:3709-3721.

      Bjorness TE, Kulkarni A, Rybalchenko V, Suzuki A, Bridges C, Harrington AJ, Cowan CW, Takahashi JS, Konopka G, Greene RW (2020) An essential role for MEF2C in the cortical response to loss of sleep in mice. Elife 9.

      Kim SJ et al. (2022) Kinase signalling in excitatory neurons regulates sleep quantity and depth. Nature 612:512-518.

      Li B, Ma C, Huang YA, Ding X, Silverman D, Chen C, Darmohray D, Lu L, Liu S, Montaldo G, Urban A, Dan Y (2023) Circuit mechanism for suppression of frontal cortical ignition during NREM sleep. Cell 186:5739-5750 e5717.

      Liu ZW, Faraguna U, Cirelli C, Tononi G, Gao XB (2010) Direct evidence for wake-related increases and sleep-related decreases in synaptic strength in rodent cortex. The Journal of neuroscience : the official journal of the Society for Neuroscience 30:8671-8675.

      Myme CI, Sugino K, Turrigiano GG, Nelson SB (2003) The NMDA-to-AMPA ratio at synapses onto layer 2/3 pyramidal neurons is conserved across prefrontal and visual cortices. Journal of neurophysiology 90:771-779.

      Porkka-Heiskanen T, Strecker RE, Thakkar M, Bjorkum AA, Greene RW, McCarley RW (1997) Adenosine: a mediator of the sleep-inducing effects of prolonged wakefulness. Science 276:1265-1268.

      Portas CM, Thakkar M, Rainnie DG, Greene RW, McCarley RW (1997) Role of adenosine in behavioral state modulation: a microdialysis study in the freely moving cat. Neuroscience 79:225-235.

      Rainnie DG, Grunze HC, McCarley RW, Greene RW (1994) Adenosine inhibition of mesopontine cholinergic neurons: implications for EEG arousal. Science 263:689-692.

    2. eLife Assessment

      This important study showing that sleep deprivation increases functional synapses while depleting silent synapses supports previous findings that excitatory signaling increases during wakefulness. This manuscript focuses in particular on AMPA/NMDA ratios. An interesting, although speculative, aspect of the manuscript is the inclusion of a model for the accumulation of sleep need that is based upon the MEF2C transcription factor but also links to the sleep-regulating SIK3-HDAC4/5 pathway. The authors have clarified some questions raised in the previous review, but the evidence for major claims was still found to be incomplete, requiring additional experimentation.

    3. Reviewer #1 (Public review):

      Summary:

      This manuscript by Vogt et al. examines how the synaptic composition of AMPA and NMDA receptors changes over sleep and wake states. The authors perform whole-cell patch clamp recordings to quantify changes in silent synapse number across conditions of spontaneous sleep, sleep deprivation, and recovery sleep after deprivation. They also perform single nucleus RNAseq to identify transcriptional changes related to AMPA/NMDA receptor composition following spontaneous sleep and sleep deprivation. The findings of this study are consistent with a decrease in silent synapse number during wakefulness and an increase during sleep. However, these changes cannot be conclusively linked to sleep/wake states. Measurements were performed in motor cortex, and sleep deprivation was achieved by forced locomotion, raising the possibility that recent patterns of neuronal activity, rather than sleep/wake states, are responsible for the observed results.

      Strengths:

      This study examines an important question. Glutamatergic synaptic transmission has been a focus of studies in the sleep field, but AMPA receptor function has been the primary target of these studies. Silent synapses, which contain NMDA receptors but lack AMPA receptors, have important functional consequences for the brain. Exploring the role of sleep in regulating silent synapse number is important to understanding the role of sleep in brain function. The electrophysiological approach of measuring the failure rate ratio, supported by AMPA/NMDA ratio measurements, is a rigorous tool to evaluate silent synapse number.
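
      The logic of inferring silent synapses from failure rates can be illustrated with an estimate that is widely used in the literature. Whether the failure rate ratio reported in this manuscript is computed in exactly this way is not stated here, so the sketch below is an assumption-laden illustration rather than a description of the authors' analysis. It assumes independent release sites with equal release probability, so that the failure rate is (1 - p) raised to the number of responding synapses. The log ratio of the AMPA-only failure rate (hyperpolarized holding potential) to the mixed AMPA+NMDA failure rate (depolarized holding potential) then gives the functional fraction, and one minus that is the silent fraction. The function name and example values are hypothetical.

```python
import math


def silent_synapse_fraction(failure_rate_ampa, failure_rate_mixed):
    """Estimate the silent fraction as 1 - ln(F_AMPA) / ln(F_mixed).

    failure_rate_ampa: fraction of failed trials at a hyperpolarized potential,
        where only AMPAR-containing synapses respond.
    failure_rate_mixed: fraction of failed trials at a depolarized potential,
        where AMPAR- and NMDAR-containing synapses both respond.
    """
    for rate in (failure_rate_ampa, failure_rate_mixed):
        if not 0.0 < rate < 1.0:
            raise ValueError("failure rates must lie strictly between 0 and 1")
    return 1.0 - math.log(failure_rate_ampa) / math.log(failure_rate_mixed)


# Hypothetical example: more failures at the hyperpolarized potential
# imply that some synapses lack functional AMPARs.
with_silent = silent_synapse_fraction(0.5, 0.25)
# Equal failure rates imply no detectable silent synapses.
without_silent = silent_synapse_fraction(0.3, 0.3)
print(with_silent, without_silent)
```

      Because the estimate depends on the logarithms of two noisy proportions, it is sensitive to the number of trials per cell, which is one reason the number of cells and animals matters for this kind of measurement.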

      The authors also perform snRNAseq to identify genes differentially expressed in the spontaneous sleep and sleep deprivation groups. This analysis reveals an intriguing pattern of upregulated genes controlled by HDAC4 and Mef2c, along with synaptic shaping component genes and genes associated with autism spectrum disorder, across cell types in the sleep deprivation group. This unbiased approach identifies candidate genes for follow-up studies. The finding that ASD-risk genes are differentially expressed during SD also raises the intriguing possibility that normal sleep function is disrupted in ASD.

      Weaknesses:

      A major consideration for the interpretation of this study is the use of forced locomotion for sleep deprivation. Measurements are made from motor cortex, and therefore the effects observed could be due to differences in motor activity patterns across groups, rather than lack of sleep per se. Considering that other groups have failed to find a difference in AMPA/NMDA ratio in mice with different spontaneous sleep/wake histories (Bridi et al., Neuron 2020), confirmation of these findings in a different brain region would greatly strengthen the study.

      The electrophysiological measurements and statistical analyses raise several questions. Input resistance (cutoffs and actual values) are not provided, making it difficult to assess recording quality. Parametric one-way ANOVAs were used, although the data do not appear to be normally distributed. In addition, for the AMPA/NMDA and FRR measurements (Figures 1E, F), the SD group (rather than the control sleep group) was used as the control group for post-hoc comparisons, but it is unclear why. While the data appear in line with the authors' conclusions, the number of mice (3/group) and cells recorded is low, and adding more would better account for inter-animal variability and increase the robustness of the findings.

      The snRNAseq data are intriguing. However, several genes relevant to the AMPA/NMDA ratio are mentioned, but the encoded proteins would be expected to have variable effects on AMPA/NMDA receptor trafficking and function, making the model presented in Figure 4C oversimplified. A more thorough discussion of the candidate genes and pathways that are upregulated during sleep deprivation, the spatiotemporal/posttranslational control of protein expression, and their effects on AMPA/NMDA trafficking vs function is warranted.

    4. Reviewer #2 (Public review):

      Summary:

      Here Vogt et al., provide new insights into the need for sleep and the molecular and physiological response to sleep loss. The authors expand on their previously published work (Bjorness et al., 2020) and draw from recent advances in the field to propose a neuron-centric molecular model for the accumulation and resolution of sleep need and basis of restorative sleep function. While speculative, the proposed model successfully links important observations in the field and provides a framework to stimulate further research and advances on the molecular basis of sleep function. In my review, I highlight the important advances of this current work, the clear merits of the proposed model, and indicate areas of the model that can serve to stimulate further investigation.

      Strengths:

      Reviewer comment on new data in Vogt et al., 2024:

      Using classic slice electrophysiology, the authors conclude that wakefulness (sleep deprivation (SD)) drives a potentiation of excitatory glutamate synapses, mediated in large part by "un-silencing" of NMDAR-active synapses to AMPAR-active synapses. Using a modern single nuclear RNAseq approach the authors conclude that SD drives changes in gene expression primarily occurring in glutamatergic neurons. The two experiments combined highlight the accumulation and resolution of sleep need centered on the strength of excitatory synapses onto excitatory neurons. This view is entirely consistent with a large body of extant and emerging literature and provides important direction for future research.

      Consistent with prior work, wakefulness/SD drives an LTP-type potentiation of excitatory synaptic strength on principal cortical neurons. It has been proposed that LTP associated with wake leads to the accumulation of sleep need by increasing neuronal excitability and by the "saturation" of LTP capacity. This saturation subsequently impairs the capacity for further ongoing learning. These new data provide a satisfying mechanism for this saturation phenomenon by introducing the concept of silent synapses. The new data show that in well-rested mice, a substantial number of synapses are "silent", containing an NMDAR component but not AMPARs. Silent synapses provide a type of reservoir for learning in that activity can drive their un-silencing, increasing the number of functional synapses. SD depletes this reservoir of silent synapses to essentially zero, explaining how SD can exhaust learning capacity. Recovery sleep led to restoration of silent synapses, explaining how recovery sleep can renew learning capacity. In their prior work (Bjorness et al., 2020) this group showed that SD drives an increase in mEPSC frequency onto these same cortical neurons, but without a clear change in pre-synaptic release probability, implying a change in the number of functional synapses. This prediction is now borne out in this new dataset.

      The new snRNAseq dataset indicates that sleep need is primarily seen (at the transcriptional level) in excitatory neurons, consistent with a number of other studies. First, this conclusion is corroborated by an independent, contemporary snRNAseq analysis recently available as a pre-print (Ford et al., 2023 BioRxiv https://doi.org/10.1101/2023.11.28.569011). A recently published analysis of the effects of SD in Drosophila imaged synapses in every brain region in a cell-type dependent manner (Weiss et al., PNAS 2024), concluding that SD drives brain-wide increases in synaptic strength almost exclusively in excitatory neurons. Further, Kim et al., Nature 2022, heavily cited in this work, show that the newly described SIK3-HDAC4/5 pathway promotes sleep depth via excitatory neurons and not inhibitory neurons.

      The new experiments provided in Figs. 1-3 are expertly conducted and presented. This reviewer has no comments of concern regarding the execution and conclusions of these experiments.

      Reviewer comment on model in Vogt et al., 2024<br /> In the view of this reviewer, the new model proposed by Vogt et al. is an important contribution. The model is not definitively supported by new data, and in this regard should be viewed as a perspective, providing mechanistic links between recent molecular advances, while still leaving areas that need to be addressed in future work. New snRNAseq analysis indicates SD drives expression of synaptic shaping components (SSCs), consistent with the excitatory synapse as a major target for the restorative basis of sleep function. SD-induced gene expression is also enriched for autism spectrum disorder (ASD) risk genes. As pointed out by the authors, sleep problems are commonly reported in ASD, but the emphasis has been on sleep amount. This new analysis highlights the need to understand the impact on sleep's functional output (synapses) to fully understand the role of sleep problems in ASD.

      Importantly, SD-induced gene expression in excitatory neurons overlaps with genes regulated by the transcription factor MEF2C and HDAC4/5 (Fig. 4). In their prior work, the authors show loss of MEF2C in excitatory neurons abolished the SD transcriptional response and the functional recovery of synapses from SD by recovery sleep. Recent advances identified HDAC4/5 as major regulators of sleep depth and duration (in excitatory neurons) downstream of the recently identified sleep-promoting kinase SIK3. In Zhou et al. and Kim et al., Nature 2022, both groups propose a model whereby "sleep-need" signals from the synapse activate SIK3, which phosphorylates HDAC4/5, driving cytoplasmic targeting, allowing for the de-repression and transcriptional activation of "sleep genes". Prior work shows that HDAC4/5 are repressors of MEF2C. Therefore, the "sleep genes" derepressed by HDAC4/5 may be the same genes activated in response to SD by MEF2C. The new model thereby extends the signaling of sleep need at synapses (through SIK3-HDAC4/5) to the functional output of synaptic recovery by expression of synaptic/sleep genes by MEF2C. The model thereby links aspects of expression of sleep need with the resolution of sleep need by mediating sleep function: synapse renormalization.

      Weaknesses:

      Areas for further investigation.<br /> In the discussion section, Vogt et al. explore the links between excitatory synapse strength, arguably the major target of "sleep function", and NREM slow-wave activity (SWA), the most established marker of sleep need. SIK3-HDAC4/5 have major effects on the "depth" of sleep by regulating NREM-SWA. The effects of MEF2C loss of function on NREM-SWA are less obvious, but clearly impact the recovery of glutamatergic synapses from SD. The authors point out how adenosine signaling is well established as a mediator of SWA, but the links between adenosine and glutamatergic strength are far from clear. The mechanistic links between SIK3/HDAC4/5, adenosine signaling, and MEF2C are far from understood. Therefore, the molecular/mechanistic links between a synaptic basis of sleep need and its resolution, on the one hand, and NREM-SWA, on the other, require further investigation.

      Additional work is also needed to understand the mechanistic links between SIK3-HDAC4/5 signaling and MEF2C activity. The authors point out that constitutively nuclear (cn) HDAC4/5 (acting as a repressor) will mimic MEF2C loss of function. This is reasonable; however, there are notable differences in the reported phenotypes of each. Notably, cnHDAC4/5 suppressed NREM amount and NREM-SWA but had no effect on the NREM-SWA increase following SD (Zhou et al., Nature 2022). Loss of MEF2C in CaMKII neurons had no effect on NREM amount and suppressed the increase in NREM-SWA following SD (Bjorness et al., 2020). These instances indicate that cnHDAC4/5 and loss of MEF2C do not exactly match, suggesting that additional factors are relevant to these phenotypes. HDAC4/5 likely have functionally important interactions with other transcription factors, and likewise for MEF2C, suggesting areas for future analysis.

      One emerging theme may be that the SIK3-HDAC4/5 axis is a major regulator of the sleep state, perhaps stabilizing the NREM state once the transition from wakefulness occurs. MEF2C is less involved in regulating sleep per se, and more involved in executing sleep function, by promoting restorative synaptic modifications to resolve sleep need.

      Finally, advances in the roles of the respective SIK3-HDAC4/5 and MEF2C pathways point towards transcription of "sleep genes", as clearly indicated in the model of Fig. 4. Clearly, more work is needed to understand how the expression of such genes ultimately leads to resolution of sleep need by functional changes at synapses. What are these sleep genes and how do they mechanistically resolve sleep need? Thus, the current work provides a mechanistic framework to stimulate further advances in understanding the molecular basis for sleep need and the restorative basis of sleep function.

    1. eLife Assessment

      This useful study sheds light on the species-specific nature of sperm-oocyte interactions by examining sperm binding and penetration of the zona pellucida across various mammalian species. While the evidence remains incomplete, the authors propose that two distinct mechanisms drive mammalian sperm-oocyte recognition and penetration: a specific, zona pellucida (ZP)-mediated mechanism, and a non-specific, oviductal glycoprotein 1 (OVGP1)-mediated mechanism. Upon revision, this study would offer insights to reproductive biologists, potentially improving porcine in vitro fertilization (IVF) - which is particularly susceptible to polyspermy - and enhancing sperm selection processes in human IVF, ultimately leading to better outcomes in assisted reproduction techniques.

    2. Reviewer #1 (Public review):

      Summary:

      This very interesting manuscript first shows that human, murine, and feline sperm penetrate the zona pellucida (ZP) of bovine oocytes recovered directly from the ovary, although first cleavage rates are reduced (Figure 1A). Similarly, bovine sperm can penetrate superovulated murine oocytes recovered directly from the ovary (Figure 1B). However, bovine oocytes incubated with oviduct fluid (30 min) are generally impenetrable by human sperm (Figure 1C).

      Thereafter, the cytoplasm was aspirated from murine oocytes obtained from the ovary (Figure 1D) or oviduct (Figure 1E). Binding and penetration by bovine and human sperm were reduced in both groups relative to homologous (murine) sperm. However, heterologous (bovine and human) sperm penetration was further reduced in oviduct- vs. ovary-derived empty ZP. These compelling data show that outer (ZP), not inner (cytoplasmic), oocyte alterations reduce heterologous sperm penetration as well as homologous sperm binding.

      This was repeated using empty bovine ZP incubated (Figure 2B), or not (Figure 2A) with bovine oviduct fluid. Prior oviduct fluid exposure reduced non-homologous (human and murine) empty ZP penetration, polyspermy, and sperm binding. This demonstrates that species-specific oviduct fluid factors regulate ZP penetrability.

      To test the hypothesis that OVGP1 is responsible, the authors obtained His-tagged bovine and murine OVGP1 and DDK-tagged human OVGP1 proteins. Tagging was to enable purification following overexpression in BHK-21 or HEK293T cells. The authors confirm that these recombinant OVGP1 proteins bound to both murine (Figure 3C) and bovine (Figure 3D) oocytes. Moreover, previous data using oviduct fluid (Figure 1D-E and 2A-B) were mirrored using bovine oocytes supplemented with homologous (bovine) recombinant OVGP1 (Figure 4B) or not (Figure 4A). This confirms the hypothesis, at least in cattle.

      Next, the authors exposed bovine (Figure 6A) and murine (Figure 6B) empty ZP to bovine, murine, and human recombinant OVGP1, in addition to bovine, murine, or human sperm. Interestingly, both species-specific ZP and OVGP1 seem to be required for optimal sperm binding and penetration.

      Lastly, empty bovine (Figures 7A-B) and murine (Figures 7C-D) ZP were treated with neuraminidase, or not, with or without pre-treatment with homologous OVGP1. In each case, neuraminidase reduced sperm binding and penetration. This further demonstrates that both ZP and OVGP1 are required for optimal sperm binding and penetration.

      Strengths:

      The authors convincingly demonstrate that two mechanisms underpin mammalian sperm recognition and penetration, the first being specific (ZP-mediated) and the second non-specific (OVGP1-mediated). This may prove useful for improving porcine in vitro fertilization (IVF), which is notoriously prone to polyspermy, in addition to human IVF, for better intrinsic individual sperm selection.

      Weaknesses:

      In my estimation, the following would improve this manuscript:

      (1) The physiological relevance of these data could be better highlighted. For instance, future work could revolve around incubating oocytes with oviduct fluid (or OVGP1) to reduce polyspermy in porcine IVF, and naturally improve sperm selection in human IVF.

      (2) Biological and technical replicate values for each experiment are unclear - for semen, oocytes, and oviduct fluid pools. I suggest providing these in the Materials and Methods and/or Figure legends.

      (3) Although differences presented in the bar charts seem obvious, providing statistical analyses would strengthen the manuscript.

      (4) Results are presented as ± SEM (line 677); however, I believe standard deviation is more appropriate.

      (5) Given the many independent experimental variables and combinations, a schematic depiction of the experimental design may benefit readers.

      (6) Attention to detail can be improved in parts, as delineated in the "author recommendation" review section.
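
      As a hedged illustration of point (4), the difference between standard deviation and SEM can be sketched as follows. The replicate values are hypothetical and are not taken from the manuscript; the sketch only shows that SD describes the spread of the observations themselves, whereas SEM is that spread divided by the square root of the number of replicates.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample SD, SEM) for a list of replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    sd = statistics.stdev(values)       # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(len(values))   # standard error of the mean
    return sd, sem


# Hypothetical replicate values (e.g. % of ZP penetrated), for illustration only.
replicates = [62.0, 70.0, 58.0, 66.0]
sd, sem = sd_and_sem(replicates)
print(f"mean = {statistics.mean(replicates):.1f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

      Because SEM shrinks as replicates are added while SD does not, reporting which of the two is plotted, together with the replicate number, lets readers judge the variability directly.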

    3. Reviewer #2 (Public review):

      In the manuscript entitled "Oviductin sets the species-specificity of the mammalian zona pellucida," the authors analyze the species specificity of sperm-egg recognition by looking at sperm binding and penetration of zonae pellucidae from different mammalian species and find a role for the oviductal protein OVGP1 in determining species specificity.

      Strengths:

      By combining sperm, oocytes, zona pellucida (ZP), and oviductal fluid from different mammalian species, they elucidate the essential role of OVGP1 in conferring species-specific fertilization.

      Weaknesses:

      The authors postulate a role for oviductal fluid in species-specific fertilization, but in my opinion, they cannot rule out hormonal effects or differences in the method of oocyte maturation employed.

      They also cannot unequivocally prove that OVGP1 is the oviductal protein involved in the effect. Additional experiments are necessary to rule out these alternative explanations.

      When performing the EZPT assay on mouse oocytes obtained either from the ovary or from the oviduct, the oocytes obtained from the ovary came from mice primed with eCG, whereas the ones collected from the oviduct were obtained from superovulated mice (eCG plus hCG). This difference in the hormonal environment may make a difference in the properties of the ZP. Additionally, the ones obtained from the ovary were in vitro matured, which is also different from the freshly ovulated eggs and, again, may change the properties of the ZP. I suggest doing this experiment superovulating both groups of mice but collecting the fully matured MII eggs from the ovary before they get ovulated. In that way, the hormonal environment will be the same in both groups, and in both groups the oocytes will be matured in vivo. Hence, the only difference will be the exposure to oviductal fluids.

      Mice with OVGP1 deletion are viable and fertile. It would be quite interesting to investigate the species-specificity of sperm-ZP binding in this model. That would indicate whether OVGP1 is the only glycoprotein involved in determining species-specificity. Alternatively, the authors could immunodeplete OVGP1 from oviductal fluid and then ascertain whether this depleted fluid retains the ability to impede cross-species fertilization.

      What is the concentration of OVGP1 in the oviduct? How did the authors decide what concentration of protein to use in the experiments where they exposed ZPs to purified OVGP1? Why did they use this experimental design to check the structure of the ZP by SEM? Why not do it on oocytes exposed to oviductal fluid, which would be more physiological?

      None of the figures show any statistical analysis. Please perform analysis for all the data presented, include p values, and indicate which statistical tests were performed. The Statistical analysis section in the Methods indicating that repeated measures ANOVA was used must refer to the tables. Was normality tested? I doubt all the data are normally distributed, in which case using ANOVA is not appropriate.

      Why was OVGP1 selected as the probable culprit of the species specificity? In the Results section entitled "Homology of bovine, human and murine OVGP1 proteins..." the authors delve into the possible role of this protein without any rationale for investigating it. What about other oviductal proteins?

    4. Reviewer #3 (Public review):

      Summary:

      The present study reports findings from a series of experiments suggesting that bovine oviductal fluid and species-specific oviductal glycoprotein (OVGP1 or oviductin) from bovine, murine, or human sources modulate the species specificity of bovine and murine oocytes.

      Strengths:

      The study reported in the manuscript deals with an important topic of interest in reproductive biology.

      Weaknesses:

      The manuscript began with a well-written introduction, but problems started to surface in the Results section, in the Discussion, as well as in the Materials and Methods. Major concerns include inconsistencies, misinterpretation of results, the lack of an up-to-date literature search, numerous errors found in the figure legends, misleading and incorrect information given in the Materials and Methods, missing information regarding statistical analysis, and inadequate discussion. These concerns raise questions regarding the authenticity of the study, reliability of the findings, and interpretation of the results. The manuscript does not provide solid and convincing findings to support the conclusion.

    5. Author response

      We appreciate the positive comments and constructive suggestions from the editors and reviewers, which will help us improve our manuscript. We will implement the changes as requested by the reviewers, focusing primarily on revising and clarifying the following aspects:

      First, we will clarify the use of biological and technical replicates in each experiment and provide more details about the statistical analyses conducted. Additionally, we plan to include a schematic representation of the experimental design.

      Second, we will explain the experiment conducted to rule out hormonal effects or differences in the oocyte maturation method used. We will also indicate the concentration of OVGP1 in the oviduct and explain why we selected OVGP1 as the probable cause of species specificity.

      Third, by addressing all of the reviewers' suggestions, we aim to resolve any concerns, inconsistencies, or minor errors identified by the reviewers.

      We are committed to addressing all the issues raised by the reviewers and believe that the manuscript will greatly benefit from the insightful suggestions and invaluable contributions of the editors and reviewers.

    1. eLife Assessment

      This useful study shows how genetic variation is associated with fecundity following a period of reproductive diapause in female Drosophila. The work identifies the olfactory system as central to successful diapause with associated changes in longevity and fecundity. While the methods used are solid, a limitation of the study, as of any other laboratory-based investigation, is the challenge of demonstrating how well measures of fitness related to diapause and its recovery correlate with realities encountered during development in the wild.

    2. Reviewer #1 (Public Review):

      Summary:

      The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including most surprisingly neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and they also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects.

      Strengths:

      Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is. A minor point is I cannot find how many DGRP lines are used.

      Weaknesses:

      None noted.

    3. Reviewer #2 (Public Review):

      Summary

      In this study, Easwaran and Montell investigated the molecular, cellular, and genetic basis of adult reproductive diapause in Drosophila using the Drosophila Genetic Reference Panel (DGRP). Their GWAS revealed genes associated with variation in post-diapause fecundity across the DGRP, and they performed RNAi screens on these candidate genes. They also analyzed the functional implications of these genes, highlighting the role of genes involved in neural and germline development. In addition, in conjunction with other GWAS results, they noted the importance of the olfactory system within the nervous system, which was supported by genetic experiments. Overall, their solid research uncovered new aspects of adult diapause regulation and provided a useful reference for future studies in this field.

      Strengths:

      The authors used whole-genome sequenced DGRP to identify genes and regulatory mechanisms involved in adult diapause. The first Drosophila GWAS of diapause successfully uncovered many QTL underlying post-diapause fecundity variations across DGRP lines. Gene network analysis and comparative GWAS led them to reveal a key role for the olfactory system in diapause lifespan extension and post-diapause fecundity.

      Comments on revised version:

      While the authors have addressed many of the minor concerns raised by the reviewers, they have not fully resolved some of the key criticisms. Notably, two reviewers highlighted significant concerns regarding the phenotype and assay of post-diapause fecundity, which are critical to the study. The authors acknowledged that this assay could be confounded by the 'cold temperature endurance phenotype,' potentially altering the interpretation of their results. However, they responded by stating that it is not obvious how to separate these effects experimentally. This leaves the analysis in this research ambiguous, as also noted by Reviewer #3.

      Additionally, I raised concerns about the validity of prioritizing genes with multiple associated variants. Although the authors agreed with this point, they did not revise the manuscript accordingly. The statement that 'Genes with multiple SNPs are good candidates for influencing diapause traits' is not a valid argument within the context of population and quantitative genetics.

      In summary, the authors have not fully utilized the peer-review process to address the critical weaknesses identified, which ultimately leaves the quality of their work in question.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper begins with phenotyping the DGRP for post-diapause fecundity, which is used to map genes and variants associated with fecundity. There are overlaps with genes mapped in other studies and also functional enrichment of pathways including most surprisingly neuronal pathways. This somewhat explains the strong overlap with traits such as olfactory behaviors and circadian rhythm. The authors then go on to test genes by knocking them down effectively at 10 degrees. Two genes, Dip-gamma and sbb, are identified as significantly associated with post-diapause fecundity, and they also find the effects to be specific to neurons. They further show that the neurons in the antenna but not the arista are required for the effects of Dip-gamma and sbb. They show that removing the antenna has a diapause-specific lifespan-extending effect, which is quite interesting. Finally, ionotropic receptor neurons are shown to be required for the diapause-associated effects.

      Strengths and Weaknesses:

      Overall I find the experiments rigorously done and interpretations sound. I have no further suggestions except an ANOVA to estimate the heritability of the post-diapause fecundity trait, which is routinely done in the DGRP and offers a global parameter regarding how reliable phenotyping is. A minor point is I cannot find how many DGRP lines are used.

      Thank you for the suggestions. We screened 193 lines and we will add that information to the methods. Additionally, we will add the heritability estimate of the post-diapause fecundity trait.
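
      A minimal sketch of how such a heritability estimate is commonly obtained from a balanced one-way ANOVA across inbred lines is given below. It assumes equal replicate numbers per line and estimates H² = σ²_L / (σ²_L + σ²_E), with σ²_L = (MS_among − MS_within) / n. The function name and the toy values are hypothetical and are not the authors' analysis or data.

```python
from statistics import mean


def broad_sense_h2(lines):
    """Estimate H^2 = var_L / (var_L + var_E) from a balanced one-way ANOVA.

    `lines` maps a line name to its replicate measurements; every line must
    have the same number of replicates (balanced design assumed).
    """
    groups = [list(g) for g in lines.values()]
    k = len(groups)
    n = len(groups[0]) if groups else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 lines with equal replicate counts (>= 2)")
    grand = mean([v for g in groups for v in g])
    # Mean squares among lines and within lines.
    ms_among = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (k * (n - 1))
    var_line = max((ms_among - ms_within) / n, 0.0)  # negative estimates set to 0
    var_env = ms_within
    total = var_line + var_env
    return var_line / total if total > 0 else 0.0


# Hypothetical post-diapause fecundity values for three lines, two replicates each.
toy = {"line_A": [10, 12], "line_B": [20, 22], "line_C": [30, 32]}
print(round(broad_sense_h2(toy), 3))
```

      In practice, replicate numbers are rarely perfectly balanced, in which case a mixed model with line as a random effect is the more usual choice; the balanced case is shown here only because it keeps the arithmetic transparent.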

      Reviewer #2 (Public Review):

      Summary

      In this study, Easwaran and Montell investigated the molecular, cellular, and genetic basis of adult reproductive diapause in Drosophila using the Drosophila Genetic Reference Panel (DGRP). Their GWAS revealed genes associated with variation in post-diapause fecundity across the DGRP, and they performed RNAi screens on these candidate genes. They also analyzed the functional implications of these genes, highlighting the role of genes involved in neural and germline development. In addition, in conjunction with other GWAS results, they noted the importance of the olfactory system within the nervous system, which was supported by genetic experiments. Overall, their solid research uncovered new aspects of adult diapause regulation and provided a useful reference for future studies in this field.

      Strengths:

      The authors used whole-genome sequenced DGRP to identify genes and regulatory mechanisms involved in adult diapause. The first Drosophila GWAS of diapause successfully uncovered many QTL underlying post-diapause fecundity variations across DGRP lines. Gene network analysis and comparative GWAS led them to reveal a key role for the olfactory system in diapause lifespan extension and post-diapause fecundity.

      Weaknesses:

      (1) I suspect that there may be variation in survivorship after long-term exposure to cold conditions (10 °C, 35 days), which could also be quantified and mapped using genome-wide association studies (GWAS). Since blocking Ir21a neuronal transmission prevented flies from exiting diapause, it is possible that natural genetic variation could have a similar effect, influencing the success rate of exiting diapause and post-diapause mortality. If there is variation in this trait, could it affect post-diapause fecundity? I am concerned that this could be a confounding factor in the analysis of post-diapause fecundity. However, I also believe that understanding phenotypic variation in this trait itself could be significant in regulating adult diapause.

      We agree that it is possible that the ability to endure cool temperatures per se may influence post-diapause fecundity. However, cool temperature is the essential diapause-inducing condition in Drosophila, so it is not obvious how to separate those effects experimentally, and we agree that phenotypic variation in the cool-sensitivity trait itself could be significant in regulating diapause.

      (2) On p.10, the authors conclude that "Dip-γ and sbb are required in neurons for successful diapause, consistent with the enrichment of this gene class in the diapause GWAS." While I acknowledge that the results support their neuronal functions, I remain unconvinced that these genes are required for "successful diapause". According to the RNAi scheme (Figure 4I), Dip-γ and sbb are downregulated only during the post-diapause period, but still show a significant effect, comparable to that seen in the nSyb Gal4 RNAi lines (Figure 4K).

      Our definition of successful diapause is the ability to produce viable adult progeny post-diapause, which requires that the flies enter, maintain, and exit diapause, alive and fertile. We will restate our conclusion to say that Dip-γ and sbb are required for post-diapause fecundity.

      In addition, two other RNAi lines (SH330386, 80461) that did not show lethality did not affect post-diapause fecundity.

      We interpret those results to mean that those RNAi lines were not effective since Dip-γ and sbb are known to be essential.

      Notably, RNAi (27049, KK104056) substantially reduced non-diapause fecundity, suggesting impairment of these genes affects fecundity in general regardless of diapause experience. Therefore, the reduced post-diapause fecundity observed may be a result of this broader effect on fecundity, particularly in a more "sensitized" state during the post-diapause period, rather than a direct regulation of adult diapause by these genes.

      Ubiquitous expression of RNAi lines #27049 or #KK104056 was lethal, so we included the tubGAL80ts repressor to prevent RNAi from taking effect during development. Flies had to be shifted to 30 °C to inactivate the repressor and thereby activate the RNAi. At 30 °C, fecundity of the controls (GFP RNAi lines #9331, KK60102) was also lower (average non-diapause fecundity = 12 and 19, respectively) and similar to #27049 or #KK104056. We also assessed the knockdown using Repo GAL4 and nSyb GAL4 and did not find a significant difference/decline in the non-diapause fecundity for #27049 and #KK104056 as compared to a nonspecific RNAi control (#54037).

      (3) The authors characterized 546 genetic variants and 291 genes associated with phenotypic variation across DGRP lines but did not prioritize them by significance. They did prioritize candidate genes with multiple associated variants (p.9 "Genes with multiple SNPs are good candidates for influencing diapause traits."), but this is not a valid argument, likely due to a misunderstanding of LD among variants in the same gene. A gene with one highly significantly associated variant may be more likely to be the causal gene in a QTL than a gene with many weakly associated variants in LD. I recommend taking significance into account in the analysis.

      We agree with the reviewer, and in Supplemental Table S3 we list top-associated SNPs in order from the lowest (most significant) p-value. Most of the top-associated genes from this analysis were uncharacterized CG numbers for which there were insufficient tools available for validation purposes. Nevertheless, there is overlap amongst the highly significant genes by p-value and those with multiple SNPs. Amongst the top 15 genes with multiple associated SNPs, CG18636 & CR15280 ranked 3rd by p-value, CG7759 ranked 4th, CG42732 ranked 10th, and Drip ranked 30th (all above the conservative Bonferroni threshold of 4.8e-8), while three sbb-associated SNPs also appear in Table 3 above the standard e-5 threshold.
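
      To make the two ranking schemes in this exchange concrete, the following is a minimal sketch, assuming a hypothetical list of (gene, p-value) SNP associations. The gene names, p-values, and number of tests are invented for illustration and are not taken from Table S3. The sketch contrasts ranking genes by their most significant SNP with ranking by SNP count, and computes a Bonferroni threshold as alpha divided by the number of tests.

```python
from collections import defaultdict


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be >= 1")
    return alpha / n_tests


def rank_genes(associations):
    """Return (genes ranked by best p-value, genes ranked by SNP count)."""
    by_gene = defaultdict(list)
    for gene, p in associations:
        by_gene[gene].append(p)
    # Most significant SNP per gene, lowest p-value first.
    by_p = sorted(by_gene, key=lambda g: min(by_gene[g]))
    # Number of associated SNPs, highest first; ties broken by best p-value.
    by_count = sorted(by_gene, key=lambda g: (-len(by_gene[g]), min(by_gene[g])))
    return by_p, by_count


# Hypothetical associations, for illustration only.
snps = [
    ("geneA", 2e-9),
    ("geneB", 3e-6), ("geneB", 5e-6), ("geneB", 8e-6),
    ("geneC", 4e-7),
]
by_p, by_count = rank_genes(snps)
threshold = bonferroni_threshold(0.05, 1_000_000)  # hypothetical number of tests
passing = [g for g, p in snps if p < threshold]
print(by_p, by_count, passing)
```

      In this toy example the gene with the most SNPs ranks last by significance, which is the situation the reviewer describes: several weakly associated variants in LD can inflate a gene's SNP count without strengthening the evidence that it is causal.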

      Reviewer #3 (Public Review):

      Summary:

      Drosophila melanogaster of North America overwinters in a state of reproductive diapause. The authors aimed to measure 'successful' D. melanogaster reproductive diapause and reveal loci that impact this quantitative trait. In practice, the authors quantified the number of eggs produced by a female after she exited 35 days of diapause. The authors claim that genes involved with olfaction in part contribute to some of the variation in this trait.

      Strengths:

      The work used the powerful platform of the fly DGRP/GWAS. The work tried to verify some of the candidate loci with targeted gene manipulations.

      Weaknesses:

      Some context is needed. Previous work from 2001 established that D. melanogaster reproductive diapause in the laboratory suspends adult aging but reduces post-diapause fecundity. The work from 2001 showed that the extent to which fecundity is reduced is proportional to diapause duration. As well, the 2001 data showed that short diapause periods such as those used in the current submission reduce fecundity only in the first days following diapause termination; after this time fecundity is greater in the post-diapause females than in the non-diapause controls.

      The 2001 paper by Tatar et al. reports the number of eggs laid after 3, 6, or 9 weeks in diapause conditions. Thus the diapause conditions used in this study (35 days or 5 weeks) are neither short nor long, rather intermediate. Does the reviewer have a specific concern?

      In this context, the submission fails to offer a meaningful concept for what constitutes 'successful diapause'. There is no biological rationale or relationship to the known patterns of post-diapause fecundity. The phenotype is biologically ambiguous.

      We have unambiguously defined successful diapause as the ability to produce viable adult progeny post-diapause. Other groups have measured % of flies that arrest ovarian development or % of post-diapause flies with mature eggs in the ovary, or # eggs laid post-diapause; however we suggest that # of viable adult progeny produced post-diapause is more meaningful than the other measurements from the point of view of perpetuating the species.

      I have a serious concern about the antenna-removal design. These flies were placed on cool/short days two weeks after surgery. Adults at this time will not enter diapause, which must be induced soon after eclosion. Two-week-old adults will respond to cool temperatures by 'slowing down', but they will continue to age on a time scale of day-degrees. This is why the control group shows age-dependent mortality, which would not be seen in truly diapaused adults. Loss of antennae increases the age-dependent mortality of these cold adults, but this result does not reflect an impact on diapause.

      We carried out the lifespan study under two different conditions. We either removed the antenna and moved the flies directly to 10 °C or we removed the antenna and allowed a “wound healing” period prior to moving the flies to 10 °C (out of concern that the flies might die quickly because wound healing may be impaired at 10 °C). In both cases, antenna removal shortened lifespan. Furthermore the lifespan extension at 10 °C was similar regardless of whether flies had experienced two weeks at 25 °C or not.

      • Appraisal of whether the authors achieved their aims, and whether the results support their conclusions.

      The work falls well short of its aim because the concept of 'successful diapause' is not biologically established. The paper studies post-diapause fecundity, and we don't know what that means. The loci identified in this analysis segregate for a minimally constructed phenotype. The results and conclusions are orthogonal.

      It is unclear to us why the reviewer has such a negative opinion of measuring post-diapause fecundity, specifically the ability to produce viable progeny post-diapause. The value of this measurement seems obvious from the point of view of perpetuating the species.

      • The likely impact of the work on the field, and the utility of the methods and data to the community.

      The work will have little likely impact. Its phenotype and operational methods are weakly developed. It lacks insight based on the primary literature on post-diapause. The community of insect diapause investigators is not likely to use the data or conclusions to understand beneficial or pest insects, or the impact of a changing climate on how they over-winter.

      The reviewer has not explained why his/her opinion is so negative.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Perform an ANOVA to estimate heritability.

      We will do this.

      (2) List the number of DGRP lines tested.

      193

      Reviewer #2 (Recommendations For The Authors):

      [Minor suggestions]

      (1) Check Drosophila italics

      We will do this.

      (2) It would be informative to include the number of DGRP lines used in this study in the Results and Methods section.

      We will include the information that we assessed 193 DGRP lines.

      (3) Figure 1C - several dots are missing at the top of the line.

      We will correct.

      (4) Figures 1E, F - Why use a discontinuous histogram for continuous distribution? Consider using a continuous histogram (e.g. Lafuente et al. (2018) Figure 1C).

      We will do this.

      (5) Figure 1F - Why have fewer bins than panel E?

      Figure 1F is normalized post-diapause fecundity. Individual post-diapause fecundity was normalized to the mean non-diapause fecundity. Then the normalized individual post-diapause fecundity was averaged to get the mean normalized post-diapause fecundity for the DGRP line. So the bins are different in panel E. Please refer to Supplemental Table S1.
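
      The two-step normalization described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name and the progeny counts are hypothetical, and Supplemental Table S1 remains the reference for the actual values.

```python
from statistics import mean


def mean_normalized_fecundity(post_diapause, non_diapause):
    """Normalize each post-diapause value to the mean non-diapause
    fecundity of the same line, then average the normalized values."""
    baseline = mean(non_diapause)
    if baseline == 0:
        raise ValueError("mean non-diapause fecundity must be non-zero")
    normalized = [value / baseline for value in post_diapause]
    return mean(normalized)


if __name__ == "__main__":
    # Hypothetical counts for a single DGRP line (not real data).
    post = [10, 20, 30]
    control = [40, 50, 60]
    print(mean_normalized_fecundity(post, control))
```

      Because every line is divided by its own non-diapause mean, the normalized values fall on a different scale from the raw counts, which is why the binning in panel F need not match panel E.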

      (6) Figure 2D - It would be informative to have fold enrichment stats.

      The following will be added in the methods section: The Gene Ontology (GO) categories and Q-values from the false discovery rate (FDR)-corrected hypergeometric test for enrichment are reported. Additionally, coverage ratios for the number of annotated genes in the displayed network versus the number of genes with that annotation in the genome are provided. GeneMANIA estimates Q-values using the Benjamini-Hochberg procedure.
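
      As background for the Q-values mentioned above, the Benjamini-Hochberg adjustment can be sketched in a few lines. This is a generic textbook version written for illustration; it is not GeneMANIA's implementation, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted Q-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that Q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    # Hypothetical enrichment p-values for four GO categories.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```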

      (7) Supplementary table (Table S5) or supplemental table (other supplementary tables)? Need consistency (to Supplementary?)

      We will change ‘Supplementary Table S5’ to ‘Supplemental Table S5’.

      (8) Figure 5D,E - unused ticks on the x-axis.

      The unused ticks on the x-axis will be removed from Figures 5D and E.

      Reviewer #3 (Recommendations For The Authors):

      • Suggestions for improved or additional experiments, data or analyses.

      The authors cannot redo the GWAS with an alternative trait that might better reflect 'successful diapause', and I am not even sure what such a trait would involve or mean. Given this limitation, the authors should consider how they can conduct additional experiments to better define, justify, and elaborate how post-diapause reproduction relates to the mechanisms, processes, depth, and 'success' of diapause.

      We agree that it is entirely unclear what trait would be a better measure of successful diapause. Other investigators might have chosen to measure something different but there is no reason why a different choice would be a better choice. We do not believe that this is a “limitation.” We believe that we have unambiguously defined and justified post-diapause reproduction as a measurement of successful diapause with respect to perpetuating the species through a stressful period.

      • Recommendations for improving the writing and presentation.

      The mechanics of the writing are fine, aside from some typos/grammar issues. But, the paper is conceptually superficial and tautological. It claims to provide a 'stringent criterion' for 'successful diapause', then measures an unjustified trait, then claims this demonstrates variation for 'successful diapause'.

      We respectfully disagree with this opinion.

      This story is conducted without reference to prior, primary literature or on the mechanisms of reproductive diapause. The presentation may be improved by considering the literature and precedent for what and how reproductive diapause is induced, maintained, and terminated ... in many insects as well as Drosophila.

      We will revisit our citations of the literature and apologize for any inadvertent omissions.

    1. eLife Assessment

      This study presents a useful investigation of the use of small, de novo-designed protein binding domains (mini-binders) against the Spike protein of SARS-CoV-2 and EGFR, as ligand binding domains on two classes of synthetic receptors, second-generation synNotch (SNIPR) and CAR. The methods and evidence supporting the focused claims are solid. This work will be of interest to synthetic biologists and cell engineers as a starting point to map out the rules for receptor engineering based on mini-binders and ultimately to advance them in biomedical applications.

    2. Reviewer #2 (Public review):

      Summary:

      Weinberg et al. show that spike LCB minibinders can be used as the extracellular domain for SynNotch, SNIPR, and CAR. They evaluated their designs against cells expressing the target proteins and live virus.

      Strengths:

      This is a good fundamental demonstration of alternative use of the minibinder. The results are unsurprising but robust and solid in most cases.

      Weaknesses:

      The manuscript can benefit from better descriptions of the study's novelty. Given that LCB previously worked in SynNotch, what unexpected finding was uncovered by this study? It is well known that the extracellular domain of CAR is amenable to different types of binding domains (e.g., scFv, nanobody, DARPin, natural ligands). So, it is not surprising that a minibinder also works with CAR. We don't know if the minibinders are more or less likely to be compatible with CAR or SNIPR.

      The demonstrations are all done using just 1 minibinder. It is hard to conclude that minibinders, as a unique class of protein binders, are generalizable in different contexts. All it can conclude is that this specific Spike minibinder can be used in synNotch, SNIPR, and CAR. The LCB3 minibinder seems to be much weaker.

      The sensing of live viruses is interesting, but the output is very weak. It is difficult to imagine a utility for such a weak response.

    3. Author response:

      The following is the authors’ response to the original reviews.

      In our initial submission, reviewers highlighted that the major limitations of our study were related to both the number of minibinders tested as well as the number of optimizations we evaluated for improving minibinder function. In this revision, we have focused on expanding the minibinders tested. To do so, we selected two previously published minibinders against the epidermal growth factor receptor (EGFR). Selection of EGFR as a target enabled us to evaluate two minibinders that bind at different sites, unlike the previously evaluated binders LCB1 and LCB3 which both bind the same interface on SARS-CoV-2 Spike. Further, using EGFR as a target enabled us to qualitatively compare the efficacy of minibinder-coupled chimeric antigen receptors against an existing anti-EGFR CAR. We believe the results here demonstrate broader generalizability of our approach across binding sites, targets, and minibinders. We hope this addition is sufficient to convince future would-be users of these tools to attempt synthetic receptor engineering using minibinders against their protein of choice.

      Reviewers made comments about the presentation of flow data and the use of statistics throughout the manuscript. We did not modify how flow data are presented as the density plots we used are common throughout the field. We have opted to not include statistics – we believe that in the case of most of the experiments we show, our findings are obvious. In cases where statistics would be helpful for discerning whether subtle effects are real – for example, comparing the linker-based optimizations or comparing the anti-EGFR CARs – we believe that other experimental factors, such as construct expression, are sufficient confounds that, even in the presence of statistically significant effects, we would be leading readers astray by making such claims about our data. As such, we have sought to limit the claims we make and hope that reviewers and audience agree we do not overinterpret our data without statistical support.

      On more minor points, both reviewers raised the differences between Figures 5A and 5C. As noted in our figure legend and in the previous response to reviews, these differences arise because the data originate from different time points of the same assay. Reviewer #2 believed we should be more measured in our comments about linker optimality, which we have addressed by changing the referenced line in the discussion. Otherwise, we have made no modifications to figures or text beyond the addition of new data.

    1. eLife Assessment

      The authors developed a method to allow a hypothermic agent, neurotensin, to cross the blood-brain barrier so it could potentially protect the brain from seizures and the adverse effects of seizures. The work is important because it is known that cooling the brain can protect it but developing a therapeutic approach based on that knowledge has not been done. The paper is well presented and the data are convincing.

    2. Reviewer #1 (Public review):

      In this manuscript, Ferhat and colleagues describe their study aimed at developing a blood brain barrier (BBB) penetrant agent that could induce hypothermia and provide neuroprotection from the sequelae of status epilepticus (SE) in mice. Hypothermia is used clinically in an attempt to reduce neurological sequelae of injury and disease. Hypothermia can be effective, but physical means used to reduce core body temperature is associated with untoward effects. Pharmacological means to induce hypothermia could be as effective with fewer untoward complications. Intracerebroventricularly applied neurotensin can cause hypothermia; however, neurotensin applied peripherally is degraded and does not cross the BBB. Here the authors develop and characterize a neurotensin conjugate that can reach the brain, induce hypothermia, and reduce seizures, cognitive changes, and inflammatory changes associated with status epilepticus.

      Strengths:

      (1) In general, the study is well reasoned, well designed, and seemingly well executed.

      (2) Strong dose-response assessment of multiple neurotensin conjugates in mice.

      (3) Solid assessment of binding affinity, in vitro stability in blood, and brain uptake of the conjugate.

      (4) Appropriate inclusion of controls for SE and for drug injections.

      (5) Multifaceted assessment of neurodegeneration, inflammation, and mossy fiber sprouting in the different groups.

      (6) Inclusion of behavioral assessments.

      (7) Evaluation of NTSR1 receptor distribution in multiple ways.

      (8) Demonstration that this conjugate can induce hypothermia and have positive effects on the sequelae of SE. This could have great impact on the application of pharmacologically-induced hypothermia as a neuroprotective measure in patients.

      Weaknesses:

      (1) The data suggest that the neurotensin conjugate causes hypothermia AND has favorable effects on the sequelae of SE. A limitation is that the authors do not definitively show that the hypothermia caused by the neurotensin conjugate is responsible for the effects they see. The authors recognize and discuss this limitation in the manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      The authors generated analogs consisting of modified neurotensin (NT) peptides capable of binding to low density lipoprotein (LDL) and NT receptors. Their lead analog was further evaluated for additional validation as a novel therapeutic. The putative mechanism of action for NT in its antiseizure activity is hypothermia, and as therapeutic hypothermia has been demonstrated in epilepsy, NT analogs may confer antiseizure activity and avoid the negative effects of induced hypothermia.

      Strengths:

      The authors demonstrate an innovative approach, i.e. using LDLR as a means of transport into the brain, that may extend to other compounds. They systematically validate their approach and its potential through binding, brain penetration, in vivo antiseizure efficacy, and neuroprotection studies.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      We addressed the issue of “tolerability” in our answers to Reviewer 2 and in the revised manuscript where we had added data concerning tolerability, see the paragraph in the Results Section, page 11:

      "Finally, tolerability studies were performed with the administration of up to 20 and 40 mg/kg eq. NT (i.e. 25.8 and 51.6 mg/kg of VH-N412) with n=3 for these doses. The rectal temperature of the animals did not fall below 32.5 to 33.2°C, similar to the temperature induced with the 4 mg/kg eq. NT dose. We observed no mortality or notable clinical signs other than those associated with the rapid HT effect such as a decrease in locomotor activity. We thus report a very interesting therapeutic index since the maximal tolerated dose (MTD) was > 40 mg/kg eq. NT, while the maximum effect is observed at a 10x lower dose of 4 mg/kg eq. NT and an ED50 established at 0.69 mg/kg as shown in Figure 1G.”

      We have slightly modified the paragraph above to emphasize that the tolerability studies were performed in “naïve mice”. 

      "Finally, tolerability studies were performed in naïve mice with the administration of up to 20 and 40 mg/kg eq. NT (i.e. 25.8 and 51.6 mg/kg of VH-N412) with n=3 for these doses. The rectal temperature of the animals did not fall below 32.5 to 33.2°C, similar to the temperature induced with the 4 mg/kg eq. NT dose. We observed no mortality or notable clinical signs other than those associated with the rapid HT effect such as a decrease in locomotor activity. We thus report a very interesting therapeutic index since the maximal tolerated dose (MTD) was > 40 mg/kg eq. NT, while the maximum effect is observed at a 10x lower dose of 4 mg/kg eq. NT and an ED50 established at 0.69 mg/kg as shown in Figure 1G.”
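
      The dose relationships quoted in this paragraph can be checked with simple arithmetic. The sketch below is illustrative: the constant and function names are hypothetical, and expressing the safety margin as MTD divided by ED50 is one common convention assumed here, not a definition given in the manuscript. Because the MTD is reported only as greater than 40 mg/kg eq. NT, the ratios are lower bounds.

```python
# Values quoted in the paragraph above (mg/kg eq. NT).
MTD_LOWER_BOUND = 40.0   # reported as "> 40", so ratios are lower bounds
MAX_EFFECT_DOSE = 4.0
ED50 = 0.69


def dose_ratio(upper_dose: float, lower_dose: float) -> float:
    """Ratio between two doses expressed in the same units."""
    if lower_dose <= 0:
        raise ValueError("lower_dose must be positive")
    return upper_dose / lower_dose


def conjugate_per_nt_equivalent(conjugate_dose: float, nt_eq_dose: float) -> float:
    """mg of conjugate administered per mg of NT equivalent."""
    if nt_eq_dose <= 0:
        raise ValueError("nt_eq_dose must be positive")
    return conjugate_dose / nt_eq_dose


if __name__ == "__main__":
    print(f"MTD / maximal-effect dose >= {dose_ratio(MTD_LOWER_BOUND, MAX_EFFECT_DOSE):.0f}")
    print(f"MTD / ED50 >= {dose_ratio(MTD_LOWER_BOUND, ED50):.1f}")
    # Both quoted dose pairs give the same conversion factor.
    print(conjugate_per_nt_equivalent(25.8, 20.0), conjugate_per_nt_equivalent(51.6, 40.0))
```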

      We propose to add a sentence in the Results section, page 11, relative to the fact that we can also induce severe hypothermia in rats using conjugates similar to VH-N412.

      We also added in the Discussion section (page 38) that we could induce hypothermia with different conjugates in mice, rats and pigs.

    1. eLife Assessment

      Fallah et al carefully dissect projections from substantia nigra pars reticulata (SNr) and the globus pallidus externa (GPe) – two key basal ganglia nuclei – to the pedunculopontine nucleus (PPN), a brainstem nucleus that has a central role in motor control. They consider inputs from these two areas onto 3 types of downstream PPN neurons – GABAergic, glutamatergic, and cholinergic neurons – and carefully map connectivity along the rostrocaudal axis of the PPN. Overall, this valuable study provided convincing data on PPN connectivity with two key input structures that will provide a basis for further understanding PPN function.

    2. Reviewer #1 (Public review):

      Summary:

      Fallah and colleagues characterize the connectivity between two basal ganglia output nuclei, the SNr and GPe, and the pedunculopontine nucleus, a brainstem nucleus that is part of the mesencephalic locomotor region. Through a series of systematic electrophysiological studies, they find that these regions target and inhibit different populations of neurons, with anatomical organization. Overall, SNr projects to PPN and inhibits all major cell types, while the GPe inhibits glutamatergic and GABAergic PPN neurons, and preferentially in the caudal part of the nucleus. Optogenetic manipulation of these inputs had opposing effects on behavior - SNr terminals in the PPN drove place aversion, while GPe terminals drove place preference.

      Strengths:

      This work is a thorough and systematic characterization of a set of relatively understudied circuits. They build on the classic notions of basal ganglia connectivity and suggest a number of interesting future directions to dissect motor control and valence processing in brainstem systems.

      Weaknesses:

      Characterization of the behavioral effects of manipulations of these PPN input circuits could be further parsed, for a better understanding of the functional consequences of the connections demonstrated in the ephys analyses.

      All the cell type recording studies showing subtle differences in the degree of inhibition and anatomical organization of that inhibition suggest a complex effect of general optogenetic manipulation of SNr or GPe terminals in the PPN. It will be important to determine if SNr or GPe inputs onto a particular cell type in PPN are more or less critical for how the locomotion and valence effects are demonstrated here.

    3. Reviewer #2 (Public review):

      Summary:

      Fallah et al carefully dissect projections from SNr and GPe - two key basal ganglia nuclei - to the PPN, an important brainstem nucleus for motor control. They consider inputs from these two areas onto 3 types of downstream PPN neurons: GABAergic, glutamatergic, and cholinergic neurons. They also carefully map connectivity along the rostrocaudal axis of the PPN.

      Strengths:

      The slice electrophysiology work is technically well done and provides useful information for further studies of PPN. The optogenetics and behavioral studies are thought-provoking, showing that SNr and GPe projections to PPN play distinct roles in behavior.

      Weaknesses:

      Although the optogenetics and behavioral studies are intriguing, they are somewhat difficult to fit together into a specific model of circuit function. Perhaps the authors can work to solidify the connection between these two arms of the work. Otherwise, there are a few questions whose answers could add context to the interpretation of these results:

      (1) Male and female mice are used, but the authors do not discuss any analysis of sex differences. If there are no sex differences, it is still useful to report data disaggregated by sex in addition to pooled data.

      (2) There is some lack of clarity in the current manuscript on the ages used - 2-5 months vs "at least 7 weeks." Is 7 weeks the time of virus injection surgery, then recordings 3 weeks later (at least 10 weeks)? Please clarify if these ages apply equally to electrophysiological and behavioral studies. If the age range used for the test is large, it may be useful to analyze and report if there are age-related effects.

      (3) Were any exclusion criteria applied, e.g. to account for missed injections?

      (4) 28-34 °C is a fairly wide range of temperatures for electrophysiological recording, which could affect kinetics.

      (5) It would be good to report the number of mice used for each condition in addition to n=cells. Statistically, it would be preferable not to assume that each cell from the same mouse is an independent measurement and to use a nested ANOVA.

    4. Reviewer #3 (Public review):

      Summary:

      The study by Fallah et al provides a thorough characterization of the effects of two basal ganglia output pathways on cholinergic, glutamatergic, and GABAergic neurons of the PPN. The authors first found that SNr projections spread over the entire PPN, whereas GPe projections are mostly concentrated in the caudal portion of the nucleus. Then the authors characterized the postsynaptic effects of optogenetically activating these basal ganglia inputs and identified the PPN's cell subtypes using genetically encoded fluorescent reporters. Activation of inputs from the SNr inhibited virtually all PPN neurons. Activation of inputs from the GPe predominantly inhibited glutamatergic neurons in the caudal PPN, and to a lesser extent GABAergic neurons. Finally, the authors tested the effects of activating these inputs on locomotor activity and place preference. SNr activation was found to increase locomotor activity and elicit avoidance of the optogenetic stimulation zone in a real-time place preference task. In contrast, GPe activation reduced locomotion and increased the time in the RTPP stimulation zone.

      Strengths:

      The evidence of functional connectivity of SNr and GPe neurons with cholinergic, glutamatergic, and GABAergic PPN neurons is solid and reveals a prominent influence of the SNr over the entire PPN output. In addition, the evidence of a GPe projection that preferentially innervates the caudal glutamatergic PPN is unexpected and highly relevant for basal ganglia function.

      Opposing effects of two basal ganglia outputs on locomotion and valence through their connectivity with the PPN.

      Overall, these results provide an unprecedented cell-type-specific characterization of the effects of basal ganglia inputs in the PPN and support the well-established notion of a close relationship between the PPN and the basal ganglia.

      Weaknesses:

      The behavioral experiments require further analysis as some motor effects could have been averaged out by analyzing long segments. Additional controls are needed to rule out a motor effect in the real-time place preference task. Importantly, the location of the stimulation is not reported even though this is critical to interpret the behavioral effects.

      There are some concerns about the possible recruitment of dopamine neurons in the SNr experiments.

    5. Author Response:

      Reviewer #1 (Public review):

      Summary:

      Fallah and colleagues characterize the connectivity between two basal ganglia output nuclei, the SNr and GPe, and the pedunculopontine nucleus, a brainstem nucleus that is part of the mesencephalic locomotor region. Through a series of systematic electrophysiological studies, they find that these regions target and inhibit different populations of neurons, with anatomical organization. Overall, SNr projects to PPN and inhibits all major cell types, while the GPe inhibits glutamatergic and GABAergic PPN neurons, and preferentially in the caudal part of the nucleus. Optogenetic manipulation of these inputs had opposing effects on behavior - SNr terminals in the PPN drove place aversion, while GPe terminals drove place preference.

      Strengths:

      This work is a thorough and systematic characterization of a set of relatively understudied circuits. They build on the classic notions of basal ganglia connectivity and suggest a number of interesting future directions to dissect motor control and valence processing in brainstem systems.

      We thank the reviewers for these positive comments.

      Weaknesses:

      Characterization of the behavioral effects of manipulations of these PPN input circuits could be further parsed, for a better understanding of the functional consequences of the connections demonstrated in the ephys analyses.

      We will further analyze our behavioral data to reveal more nuanced functional effects.

      All the cell type recording studies showing subtle differences in the degree of inhibition and anatomical organization of that inhibition suggest a complex effect of general optogenetic manipulation of SNr or GPe terminals in the PPN. It will be important to determine if SNr or GPe inputs onto a particular cell type in PPN are more or less critical for how the locomotion and valence effects are demonstrated here.

      This is a really interesting future direction and we will expand on these points in the discussion.

      Reviewer #2 (Public review):

      Summary:

      Fallah et al carefully dissect projections from SNr and GPe - two key basal ganglia nuclei - to the PPN, an important brainstem nucleus for motor control. They consider inputs from these two areas onto 3 types of downstream PPN neurons: GABAergic, glutamatergic, and cholinergic neurons. They also carefully map connectivity along the rostrocaudal axis of the PPN.

      Strengths:

      The slice electrophysiology work is technically well done and provides useful information for further studies of PPN. The optogenetics and behavioral studies are thought-provoking, showing that SNr and GPe projections to PPN play distinct roles in behavior.

      We appreciate the reviewer’s positive evaluation.

      Weaknesses:

      Although the optogenetics and behavioral studies are intriguing, they are somewhat difficult to fit together into a specific model of circuit function. Perhaps the authors can work to solidify the connection between these two arms of the work.

      We will expand on these topics in the discussion.

      (1) Male and female mice are used, but the authors do not discuss any analysis of sex differences. If there are no sex differences, it is still useful to report data disaggregated by sex in addition to pooled data.

      While we do not have sufficient n for a well-powered analysis of sex differences in behavior, we find that both male and female mice increase movement in response to SNr axon stimulation and decrease movement in response to GPe axon stimulation. We will expand on this further in the revised manuscript.

      (2) There is some lack of clarity in the current manuscript on the ages used - 2-5 months vs "at least 7 weeks." Is 7 weeks the time of virus injection surgery, then recordings 3 weeks later (at least 10 weeks)? Please clarify if these ages apply equally to electrophysiological and behavioral studies. If the age range used for the test is large, it may be useful to analyze and report if there are age-related effects.

      Seven weeks is the youngest age at which mice used for electrophysiology were injected, and all were used for electrophysiology between 2 and 5 months of age. For behavior, the youngest mice used were 11 weeks old at the time of behavior (8 weeks old at injection). Mice in the GPe-stimulated condition were 110 ± 7.4 days old and mice in the SNr-stimulated condition were 132 ± 23.4 days old (mean ± SEM). We will add these details to the revised manuscript.

      In addition, we have correlated distance traveled at baseline and during stimulation with age for both SNr and GPe stimulated conditions. Baseline distance traveled did not correlate with age, but there was a trend toward more movement during stimulation with older mice in the SNr axon stimulation group. We will discuss this in the revised manuscript.

      (3) Were any exclusion criteria applied, e.g. to account for missed injections?

      All injection sites and implant sites were within our range of acceptability, so we did not exclude any mice for missed injections.

      (4) 28-34 °C is a fairly wide range of temperatures for electrophysiological recording, which could affect kinetics.

      This is an important consideration. We have checked our main measurement of current amplitude in the condition where we found significant differences between rostral and caudal PPN (SNr to Vglut2 PPN neurons) against temperature and found no correlation (Pearson’s r value = -0.0076). Similarly, we found no correlation between baseline (pre-opto) firing frequency and temperature (r = -0.068).
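
      A check of this kind amounts to computing Pearson's r between recording temperature and the measurement of interest. The sketch below shows the calculation on made-up numbers; the function name and the example values are hypothetical and are not the recorded data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical bath temperatures (°C) and current amplitudes (pA).
    temperature = [28, 30, 32, 34]
    amplitude = [100, 120, 120, 100]
    print(pearson_r(temperature, amplitude))
```

      A value of r near zero indicates no linear dependence on temperature, although it does not by itself exclude a non-linear relationship.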

      (5) It would be good to report the number of mice used for each condition in addition to n=cells. Statistically, it would be preferable not to assume that each cell from the same mouse is an independent measurement and to use a nested ANOVA.

      For electrophysiology, the number of mice used in each experiment was 6 (3 male, 3 female). In the manuscript ‘N’ represents number of mice and ‘n’ represents number of cells. Because of the unpredictability of how many healthy cells can be recorded from one mouse, our data were planned to be collected with n=cells, and are underpowered for a nested ANOVA. However, rostral and caudal data were collected from the same mice. While we do not have sufficient paired data for each parameter, analyzing one of our main and most important findings with a paired comparison (with biological replicates being mice) shows a statistically significant difference in the inhibitory effect of SNr axon stimulation on firing rate between rostral and caudal glutamatergic neurons (p=0.031, Wilcoxon signed rank test).
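
      To make the paired comparison concrete, the sketch below computes an exact two-sided Wilcoxon signed-rank p-value by enumerating all sign patterns. It is an illustration under stated assumptions: the function name and the per-mouse differences are hypothetical, and the sketch assumes no zero or tied differences. With six pairs, the smallest attainable two-sided p-value is 2/64 = 0.03125, which is reached only when every pair differs in the same direction.

```python
from itertools import product


def exact_wilcoxon_signed_rank(differences):
    """Exact two-sided Wilcoxon signed-rank test on paired differences.

    Returns (statistic, p_value), where statistic is the smaller of the
    positive and negative rank sums. Assumes no zero or tied differences.
    """
    n = len(differences)
    if n == 0:
        raise ValueError("at least one paired difference is required")
    magnitudes = [abs(d) for d in differences]
    if 0 in magnitudes or len(set(magnitudes)) != n:
        raise ValueError("this sketch assumes no zero or tied differences")
    # Rank the absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: magnitudes[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total = n * (n + 1) // 2
    w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under the null hypothesis each rank is positive or negative with
    # equal probability, so enumerate all 2**n sign patterns.
    extreme = 0
    for signs in product((False, True), repeat=n):
        w = sum(rank for rank, positive in zip(range(1, n + 1), signs) if positive)
        if min(w, total - w) <= observed:
            extreme += 1
    return observed, extreme / 2 ** n


if __name__ == "__main__":
    # Hypothetical rostral-minus-caudal differences for six mice.
    print(exact_wilcoxon_signed_rank([1.2, 0.8, 2.5, 1.9, 0.4, 3.1]))
```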

      Reviewer #3 (Public review):

      Summary:

      The study by Fallah et al provides a thorough characterization of the effects of two basal ganglia output pathways on cholinergic, glutamatergic, and GABAergic neurons of the PPN. The authors first found that SNr projections spread over the entire PPN, whereas GPe projections are mostly concentrated in the caudal portion of the nucleus. Then the authors characterized the postsynaptic effects of optogenetically activating these basal ganglia inputs and identified the PPN's cell subtypes using genetically encoded fluorescent reporters. Activation of inputs from the SNr inhibited virtually all PPN neurons. Activation of inputs from the GPe predominantly inhibited glutamatergic neurons in the caudal PPN, and to a lesser extent GABAergic neurons. Finally, the authors tested the effects of activating these inputs on locomotor activity and place preference. SNr activation was found to increase locomotor activity and elicit avoidance of the optogenetic stimulation zone in a real-time place preference task. In contrast, GPe activation reduced locomotion and increased the time in the RTPP stimulation zone.

      Strengths:

      The evidence of functional connectivity of SNr and GPe neurons with cholinergic, glutamatergic, and GABAergic PPN neurons is solid and reveals a prominent influence of the SNr over the entire PPN output. In addition, the evidence of a GPe projection that preferentially innervates the caudal glutamatergic PPN is unexpected and highly relevant for basal ganglia function.

      Opposing effects of two basal ganglia outputs on locomotion and valence through their connectivity with the PPN.

      Overall, these results provide an unprecedented cell-type-specific characterization of the effects of basal ganglia inputs in the PPN and support the well-established notion of a close relationship between the PPN and the basal ganglia.

      We thank the reviewer for their positive comments.

      Weaknesses:

      The behavioral experiments require further analysis as some motor effects could have been averaged out by analyzing long segments.

      We will further analyze our motor effects in the revised manuscript.

      Additional controls are needed to rule out a motor effect in the real-time place preference task.

      This is an important point. Our use of unilateral stimulation in the RTPP task reduces potential motor effects, and our supplemental videos show that the mice can easily escape and enter the stimulated zone. However, we can't completely rule out a motor component. To delve into this further, we analyzed mouse speed in the RTPP task. We find that in both SNr and GPe stimulation conditions, the maximum speed of the mouse is not different in the stimulated vs unstimulated zone. We will further analyze mouse speed at the transition into and out of the stimulated zone to identify any acute motor effects in this experiment.

      Importantly, the location of the stimulation is not reported even though this is critical to interpret the behavioral effects.

      The implant locations were generally over the middle-to-rostral PPN and we will clarify this in the revised manuscript. These locations are shown in figure 7B.

      There are some concerns about the possible recruitment of dopamine neurons in the SNr experiments.

      We are very interested in this possibility and plan to discuss this with more clarity in a revised manuscript.

    1. eLife Assessment

      This useful manuscript reports on a new mouse model for LAMA2-MD, a rare but very severe congenital muscular dystrophy; the knockout mice were generated by removing exon3 in the Lama2 gene, which results in a frameshift in exon4 and a premature stop codon. These animals lack any laminin-alpha2 protein and confirm results from previous Lama2 knockout models. Additionally, this study includes transcriptomics data that might be a good resource for the field. However, the experimental evidence supporting the main claims of the manuscript is incomplete, citations of previous Lama2 null mice studies are lacking, and both data presentation and interpretation need improvement.

    2. Reviewer #1 (Public review):

      Strengths:

      This work adds another mouse model for LAMA2-MD that re-iterates the phenotype of previously published models, such as dy3K/dy3K, dy/dy, and dyW/dyW mice. The phenotype is fully consistent with the data from others.

      One of the major weaknesses of the manuscript initially submitted was the overinterpretation and the overstatements. The revised version is clearly improved as the authors toned-down their interpretation and now also cite the relevant literature of previous work.

      Weaknesses:

      Unfortunately, the data on RNA-seq and scRNA-seq are still rather weak. scRNA-seq was conducted with only one mouse, resulting in only 8000 nuclei. I am not convinced that the data allow us to interpret them to the extent the authors do. As in the first version, the authors infer function by examining expression. Although they are a bit more cautious, they still argue that the BBB is not functional in dyH/dyH mice without showing leakiness. Such experiments can be done using dyes, such as Evans blue or cadaverine. Hence, I would suggest that they formulate the text still more carefully.

      A similar lack of evidence is true for the suggested cobblestone-like lissencephaly of the mice. There is no strong evidence that this is indeed occurring in the mice (might also be a problem because mice die early). Hence, the conclusions need to be formulated in such a way that readers understand that these are interpretations and not facts.

      Finally, I am surprised that the only improvement in the main figures is the Western blot for laminin-alpha2. The histology of skeletal muscle still looks rather poor. I do not know what the problems are but suggest that the authors try to make sections from fresh-frozen tissue. I anticipate that the mice were eventually perfused with PFA before muscles were isolated. This often results in the big gaps in the sections.

      Overall, the work is improved but would still need additional experiments to make it a really important addition to the literature in the LAMA2-MD field.

    3. Reviewer #2 (Public review):

      Summary:

      This revised manuscript describes the production of a mouse model for LAMA2-Related Muscular Dystrophy. The authors investigate changes in transcripts within the brain and the blood-brain barrier. The authors also investigate changes in the transcriptome associated with the muscle cytoskeleton.

      Strengths:

      (1) The authors produced a mouse model of LAMA2-CMD using CRISPR-Cas9

      (2) The authors identify cellular changes that disrupted the blood-brain barrier.

      Weaknesses:

      (1) The authors throughout the manuscript overstate "discoveries" which have been previously described, published and not appropriately cited.

      (2) Alterations in the blood-brain barrier and in the muscle cell cytoskeleton in LAMA2-CMD have been extensively studied and published in the literature and are not cited appropriately.

      (3) The authors have increased the animal number to N=6, but based on power analysis this is still insufficient, which may result in statistical errors and incorrect conclusions.

      (4) The use of "novel mouse model" in the manuscript overstates the impact of the study.

      (5) All studies presented are descriptive and do not add more to the field except for producing yet another mouse model of LAMA2-CMD that is the same as all the others produced.

      (6) Grip strength measurements are considered error prone and do not give an accurate measurement of muscle strength, which is better achieved using ex vivo or in vivo muscle contractility studies.

      (7) A lack of blinded studies, as pointed out by the authors, is a concern for the scientific rigor of the study.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) Some of the figures are of rather poor quality. For example, the H&E and Sirius Red stainings in Figures 3 and 4 are quite poor so it is difficult to see what is going on in the muscles. The authors should take note of another publication on dy3K/dy3K mice of similar age (PMID: 31586140) where such images are of much higher quality. Similarly, the Western blot for laminin-alpha2 (Figure 4B) of the wild-type mouse needs improvement. If the single laminin-alpha2 protein is not detected, there is an issue with the denaturation buffer used to load the protein.

      Thank you for the valuable suggestions. We have read the study on dy3K/dy3K mice of similar age (PMID: 31586140), which showed dystrophic changes in dy3K/dy3K muscle throughout the disease course using images of the whole muscle and of representative muscle areas. We have generated new, higher-quality figures for the H&E and Sirius Red stainings, including the whole muscle and representative muscle areas. However, because the images are large, we have added them as the new Figure supplement 2 and Figure supplement 3. We have also changed the denaturation buffer used to load the protein and repeated the Western blot for laminin α2; the results for the wild-type mice (n = 3) and dyH/dyH mice (n = 3) are shown in Figure 4B.

      (2) My biggest concern is, however, the many overstatements in the manuscript and the over-interpretation of the data. This already starts with the first sentence in the abstract where the authors write: "Understanding the underlying pathogenesis of LAMA2- related muscular dystrophy (LAMA2-MD) have been hampered by lack of genuine mouse model." This is not correct as the dy3K/dy3K, generated in 1997 (PMID: 9326364), are also Lama2 knockout mice; there are also other strains (dyW/dyW mice) that are severely affected and there are the dy2J/dy2J mice that represent a milder form of LAMA2-MD. Similarly, the last two sentences of the abstract "This is the first reported genuine model simulating human LAMA2-MD. We can use it to study the molecular pathogenesis and develop effective therapies." are a clear overstatement. The mechanisms of the disease are well studied and the above-listed mouse models have been amply used to develop possible treatment options. The overinterpretation concerns the results from transcriptomics. The fact that Lama2 is expressed in particular cell types of the brain does not at all imply that Lama2 knockout mice have a defect in the blood-brain barrier as the authors state. If there are no functional data, this cannot be stated. Indications for a blood-brain barrier defect come from work in dy3K/dy3K mice (PMID: 25392494) and this needs to be written like this.

      Thank you for your comment and sorry for the overstatements in the manuscript. We have carefully considered our previous statements and corrected them accordingly. We have changed the first sentence in the abstract into "Our understanding of the molecular pathogenesis of LAMA2-related muscular dystrophy (LAMA2-MD) requires improving". Also, we have changed the last two sentences in the abstract with "In summary, this study provided useful information for understanding the molecular pathogenesis of LAMA2-MD".

      We also agree that "Lama2 is expressed in particular cell types of the brain does not at all imply that Lama2 knockout mice have a defect in the blood-brain barrier", and the indications for a blood-brain barrier defect come from work in dy3K/dy3K mice (PMID: 25392494). Therefore, we have corrected the overstatement according to the suggestion with "It was reported that the deficiency of laminin α2 in astrocytes and pericytes was associated with a defective blood-brain barrier (BBB) in the dy3K/dy3K mice (Menezes et al., 2014). The defective BBB presented with altered integrity and composition of the endothelial basal lamina, reduced pericyte coverage, and hypertrophic astrocytic endfeet lacking appropriately polarized aquaporin4 channels."

      (3) Finally, the bulk RNA-seq data also needs to be presented in a disease context. The authors, again, mix up changes in expression with functional impairment. All gene expression changes are interpreted as direct evidence of an involvement of the cytoskeleton. In fact, changes in the cytoskeleton are more likely a consequence of the severe muscle phenotype and the delay in muscle development. This is particularly possible as muscle samples from 14-day-old mice are compared; a stage at which muscle still develops and grows tremendously. Thus, all the data need to be interpreted with caution.

      Thank you for your comment. We have changed the over-interpretation of the bulk RNA-seq data, and have corrected the last sentence in the Result with "These observations important data for the impaired muscle cytoskeleton and abnormal muscle development which were associated with the muscle pathology consequence of severe dystrophic changes in the dyH/dyH mice.".

      (4) In summary, the authors need to improve data presentation and, most importantly, they need to tone down the interpretation and they must be fully aware that their work is not as novel as they present it.

      Thank you for your comments and valuable suggestions. We have changed the previous overstatements and interpretation of the results. We are sorry that we failed to clearly present our rationale for making this mouse model. Indeed, there were many existing mouse models, all of which were important to research in the field. One of the reasons why we wished to create dyH/dyH was to make a mouse model without any trace of engineering (e.g., inserted bacterial elements for knockout). By doing so, we were hoping to provide a novel model suited for gene-editing-based gene therapy development. To this end, dyH/dyH was created to reflect the mutation hotspot region in the Chinese population. Hopefully, you will agree with our points and see that we were not trying to belittle previous models but were simply trying to provide a different option. The overstatements were largely rooted in language barriers, and we have tried to make our statements more cautious and acceptable to readers.

      Reviewer #2 (Public Review):

      (1) The major weakness is the manuscript reads like this was the first-ever knockout mouse model generated for LAMA2-CMD. There are in fact many Lama2 knockout mice (dy, dy2J, dy3k, dyW, and more) which have all been extensively studied with publications. It is important for the authors to comment on these other published studies that have generated these well-studied mouse lines. Therefore, there is a lack of background information on these other Lama2 null mice.

      Thank you for your comment. We have added background information on these other Lama2 null mice with the sentences "The most common mouse models for LAMA2-MD are the dy/dy, dy3k/dy3k, dyw/dyw and dy2J/dy2J mice (Xu et al., 1994; Michelson et al., 1995; Miyagoe et al., 1997; Kuang et al., 1998; Sunada et al., 1995). Among them, the dy/dy, dy3k/dy3k, dyw/dyw mice present severe muscular dystrophy, and dy2J/dy2J mice show mild muscular dystrophy and peripheral neuropathy (Gawlik and Durbeej, 2020). The mutation of the dy/dy mice has been still unclear (Xu et al., 1994; Michelson et al., 1995). The dy3k/dy3k mice were generated by inserting a reverse Neo element in the 3' end of exon 4 of Lama2 gene in 1997 (Miyagoe et al., 1997), and the dyw/dyw mice were created with an insertion of lacZ-neo in the exon 1 of Lama2 gene in 1998 (Kuang et al., 1998). The dy2J/dy2J mice were generated in 1970 by a spontaneous splice donor site mutation which resulted in a predominant transcript with a 171 base in-frame deletion, leading to the expression of a truncated laminin α2 with a 57 amino acid deletion (residues 34-90) and a substitution of Gln91Glu (Sunada et al., 1995). They were established in the pre-gene therapy era, leaving trace of engineering, such as bacterial elements in the Lama2 gene locus, thus unsuitable for testing various gene therapy strategies. Moreover, insufficient transcriptomic data of the muscle and brain of LAMA2-CMD mouse models limits the understanding of disease hallmarks. Therefore, there is a need to create new appropriate mouse models for LAMA2-CMD based on human high frequently mutated region using the latest gene editing technology such as clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9."

      (2) The phenotypes of dyH/dyH are similar to, if not identical to dy/dy, dy2J/dy2J, dy3k/dy3k, dyW/dyW including muscle wasting, muscle weakness, compromised blood-brain barrier, and reduced life expectancy. This should be addressed, and a comparison made with Lama2 deficient mice in published literature.

      Thank you for your comment. We have added Table supplement 3 to compare dyH/dyH with other Lama2 deficient mice. We have also added the following statement to the Discussion: "Compared with other Lama2 deficient mice including dy/dy, dy2J/dy2J, dy3k/dy3k and dyW/dyW, the phenotype of the dyH/dyH mice presented with a very severe muscular dystrophy, which was similar to that of the dy3k/dy3k mice (Table supplement 3)."

      (3) Recent published studies (Chen et al., Development (2023), PMID 36960827) show loss of Itga7 causes disruption of the brain-vascular basal lamina leading to defects in the blood-brain barrier. This should be referenced in the manuscript since this integrin is a major Laminin-211/221 receptor in the brain and the mouse model appears to phenocopy the dyH/dyH mouse model.

      Thank you for your great suggestion. We have cited the published studies (Chen et al., Development (2023), PMID 36960827) and added statements in Discussion with "As reported, the aberrant BBB function was also associated with the adhesion defect of alpha7 integrin subunit in astrocytes to laminins in the Itga7-/- mice (Chen et al., 2023). In this study, loss of communications involving the laminins’ pathway between laminin α2 and integrins were predicted between vascular and leptomeningeal fibroblasts and astrocytes in the dyH/dyH brain, providing more evidence for the impaired BBB due to laminin α2 deficiency."

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Improve the data presentation (as mentioned above). Make a new picture of the histology; repeat the Western blots. Discuss the RNA-seq data with more caution and present it in a more attractive way. Tone down the wording.

      Thank you for your recommendations. We have revised the overstatements and improved the RNA-seq data interpretation as suggested. Also, we have made a new picture of the histology and repeated the Western blots.

      Reviewer #2 (Recommendations For The Authors)

      (1) There are many grammatical errors within the manuscript. The manuscript should be carefully proofread.

      Thank you for your recommendations. We have carefully corrected the grammatical errors within the manuscript.

      (2) Figure 2: The animal numbers used in this analysis were not indicated. Please include this number in the figure legend.

      Thank you for your recommendations. We have added animal numbers in the figure legends wherever applicable.

      (3) Figure 2: The forelimb grip strength is informative but has limitations. Ex vivo or in vivo muscle contractility is the gold standard for measuring muscle strength.

      Thank you for your recommendations. We agree that ex vivo or in vivo muscle contractility is the gold standard for measuring muscle strength, and we would have liked to perform this experiment. However, we have not completed it, for the following reasons: (1) Forelimb grip strength is a classic and still commonly used method for measuring mouse muscle strength in studies of different muscular dystrophies, such as LAMA2-MD (Amelioration of muscle and nerve pathology of Lama2-related dystrophy by AAV9-laminin-αLN linker protein. JCI Insight. 2022;7(13):e158397. PMID: 35639486), Duchenne muscular dystrophy (Investigating the role of dystrophin isoform deficiency in motor function in Duchenne muscular dystrophy. J Cachexia Sarcopenia Muscle. 2022;13(2):1360-1372. PMID: 35083887), and facioscapulohumeral muscular dystrophy (Systemic delivery of a DUX4-targeting antisense oligonucleotide to treat facioscapulohumeral muscular dystrophy. Mol Ther Nucleic Acids. 2021;26:813-827. PMID: 34729250). (2) Grip strength is also used to measure muscle strength in human studies (PMID: 32366821; PMID: 29313844; PMID: 34499663, etc.). For these reasons, we used forelimb grip strength to measure muscle strength and have not performed the supplementary ex vivo or in vivo muscle contractility experiment.

      (4) Figure 3: Muscle fibrosis should be measured with a hydroxyproline assay.

      Thank you for your recommendations. We agree that the hydroxyproline assay is one of the most classic methods for evaluating collagen content as a measure of muscle fibrosis. However, we used Sirius Red staining to measure muscle fibrosis for the following reasons: (1) Sirius Red staining allows fibrosis to be observed more directly, and other pathological features can be observed and compared in the same muscle sections. (2) Sirius Red staining is also a classic and still commonly used method for measuring muscle fibrosis, as previously reported in mouse studies of muscle disorders, such as PMID: 22522482 (Losartan, a therapeutic candidate in congenital muscular dystrophy: studies in the dy(2J) /dy(2J) mouse. Ann Neurol. 2012;71(5):699-708.), PMID: 34337906 (Aging-related hyperphosphatemia impairs myogenic differentiation and enhances fibrosis in skeletal muscle. J Cachexia Sarcopenia Muscle. 2021;12(5):1266-1279.), and PMID: 28798156 (Phosphodiesterase 4 inhibitor and phosphodiesterase 5 inhibitor combination therapy has antifibrotic and anti-inflammatory effects in mdx mice with Duchenne muscular dystrophy. FASEB J. 2017;31(12):5307-5320.). Therefore, we used Sirius Red staining to measure muscle fibrosis in this study.

      (5) Figure 8: The N=3 is very low which could result in type I or II statistical errors. A larger sample size will reduce the chance of statistical errors.

      Thank you for your recommendations. We have increased the number of animals to reduce the chance of statistical errors. We performed a supplementary experiment in which the number of animals in each group was increased to 6 (3 male and 3 female). The results were consistent with the previous data in Figure 8.

      (6) Power analysis to estimate experimental animal numbers should be reported in the manuscript.

      Thank you for your recommendations. According to a previous study (Power and sample size. Nature Methods. 2013;10:1139–1140), “The distributions show effect sizes d = 1, 1.5 and 2 for n = 3 and α = 0.05. Right, power as function of d at four different α values for n = 3”, and “If we average seven measurements (n = 7), we are able to detect a 10% increase in expression levels (μA = 11, d = 1) 84% of the time with α = 0.05.” On this basis, the estimated number of experimental animals was 3 to 7. Moreover, if the increased number of experimental animals could be available, we would retain data.
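
      The quoted 84% figure can be reproduced with a short calculation, sketched below under a stated assumption: a one-sided, one-sample z-test with known standard deviation, so that the effect size d is the mean shift in SD units. This test form is our assumption for illustration and is not stated in the response; the function name is hypothetical. The same function can be evaluated at other group sizes, such as the n = 3 and n = 6 mentioned in this exchange.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_power(effect_size, n, alpha=0.05):
    """Power of a one-sided one-sample z-test with known SD.

    effect_size is the true mean shift in units of the SD (Cohen's d).
    """
    std_normal = NormalDist()
    # Critical value: reject the null when the z statistic exceeds this.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is centred on d * sqrt(n).
    return 1 - std_normal.cdf(z_crit - effect_size * sqrt(n))


# d = 1 at alpha = 0.05 for several group sizes; n = 7 gives roughly 0.84.
for group_size in (3, 6, 7):
    print(group_size, round(one_sided_z_power(1.0, group_size), 2))
```

      Note that this calculation assumes an effect as large as d = 1; smaller true effects would require correspondingly more animals to reach the same power.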

      (7) It is unclear if the studies were performed with adequate rigor. Were those scoring outcome measures blinded to the treatment groups?

      Thank you for your recommendations. Those scoring the outcome measures were not blinded to the groups, which were defined by genotype. In practice, the dyH/dyH mice were easy to distinguish from the WT/Het mice because of their small body size.

      (8) Authors should appropriately cite previous studies that have generated Lama2 null mice.

      Thank you for your recommendations. We have cited previous studies that have generated Lama2 null mice with the sentence “The most common mouse models for LAMA2-MD are the dy/dy, dy3k/dy3k, dyw/dyw and dy2J/dy2J mice (Xu et al., 1994; Michelson et al., 1995; Miyagoe et al., 1997; Kuang et al., 1998; Sunada et al., 1995)”.

      (9) The number of animals should be increased to reduce the chance of statistical error.

      Thank you for your recommendations. We have performed a supplementary experiment in which the number of animals in each group was increased to reduce the chance of statistical error.

      (10) A power analysis should be performed to determine the number of experimental animals.

      Thank you for your recommendations. We have performed a power analysis to determine the number of experimental animals as mentioned above.

      (11) There are many grammatical errors within the manuscript. The manuscript should be carefully proofread.

      Thank you for your recommendations. We have carefully corrected the grammatical errors within the manuscript.

    1. eLife Assessment

      This study makes an important contribution by characterizing the role of the exocyst in secretory granule exocytosis in the Drosophila larval salivary gland. The results are solid and lead to the novel interpretation that the exocyst participates not only in exocytosis, but also in earlier steps of secretory granule biogenesis and maturation. However, the authors are urged to provide additional proof that the exocyst subunit knockdowns were effective and to acknowledge the possibility that inactivation of an essential exocytosis component could indirectly affect other parts of the secretory pathway.

    2. Reviewer #1 (Public review):

      Suarez-Freire et al. analyzed here the function of the exocyst complex in the secretion of the glue proteins by the salivary glands of the Drosophila larva. This is a widely used, genetically accessible system in which the formation, maturation and precisely timed exocytosis of the glue secretory granules can be beautifully imaged. Using RNAi, the authors show that all units of the exocyst complex are required for exocytosis. They show that not just granule fusion with the plasma membrane is affected (canonical role), but also, with different penetrance, that glue protein is retained in the ER, secretory granules fail to fuse homotypically or fail to acquire maturation features. The authors document these phenotypes and postulate specific roles for the exocyst in these additional processes to explain them: exocyst as a Golgi-Golgi, Golgi-granule or granule-granule tether.

      Compared to the initial submission, this revised version of the study presents strengthened evidence for these novel roles. In particular, authors show juxta-Golgi localization of exocyst components and disruption of the trans-Golgi compartment upon exocyst loss. Additionally, the revised study contains controls indicating that glue secretion defects prior to plasma membrane exocytosis are not due to polarity loss or unspecific poor health of cells.

    3. Reviewer #2 (Public review):

      The manuscript from the Wappner and Melani labs claims a novel role for the exocyst subunits in multiple aspects of secretory granule exocytosis. This is an intriguing paper, for it suggests multiple roles of the exocyst in granule maturation and fusion, with roles at the ER/Golgi interface, the TGN, and granule homotypic fusion.

      A key strength is the breadth of the assays and study of all 8 exocyst subunits in a powerful model system (fly larvae). But why do KD of different exocysts have different effects on presumed granule formation? Also it can be hard to disentangle direct vs. secondary effects, as much of the TGN seems to be altered in the KDs. The authors ascribe many of the results to the holocomplex, but there are major differences between the proteins -- this may be all related to the different levels of expression (as the authors propose), but only limited mRNA was examined.

      Unresolved Comments:

      (A) Explanation of the variability of exocyst KD effects on the appearance of MSG. What is remarkable is the highly variable effect of different subunit KDs on the percentage of cells with MLS (Fig. 4C). Controls = 100 %, Exo70=~75% (at 19 deg), Sec3 = ~30%, Sec10 = 0%, Exo84 = 100% ... This is interesting, for the functional exocyst is an octameric holocomplex; why, then, the huge subunit variability in the phenotypes? One explanation is that the levels of KD varied between the subunits. Another is that not all subunits have equivalent roles (as seen for instance in exocyst's roles in autophagy).

      This should be addressed by quantification of the KD of the 8 different exocyst proteins (and/or mRNA, as only 2 subunits were studied). If their data hold up, then the underlying mechanism here needs to be considered. (Note: there is some precedent from the autophagy field of differential exocyst effects).

      (B) Golgi: It is unclear from their model (Fig. 5) why after exocyst KD of Sec15 the cis-Golgi is more preserved than the TGN, which appears as large vacuoles.

      (C) Granule homotypic fusion. Over-expression of just one subunit, Sec15-GFP, made giant secretory granules (SG) that were over 8 microns big. Does it act like a seed to promote exocyst assembly as the authors propose? If so is there evidence that there is biochemically more holocomplex with expression of Sec15, but not other subunits?

      (D) The authors should better frame their interpretations of other studies of the exocyst that includes role in autophagy, Palade body trafficking and differential roles of the subunits.

      In summary, there clearly are striking new effects on secretory granule biogenesis by dysfunction of the exocyst, which are important and should inspire other studies of new roles of the exocyst, e.g. non-canonical roles. Secondly, the power of the system to partially deplete proteins (if further validated) suggests that one may need to consider protein expression as an important variable that can be used to unmask multiple phenotypes in granule maturation. Lastly, this paper implies new roles of the exocyst in homotypic fusion, which could be investigated in future work.

    4. Reviewer #3 (Public review):

      Freire and co-authors examine the role of the exocyst complex during the formation and secretion of mucins from secretory granules in the larval salivary gland of Drosophila melanogaster. Using transgenic lines with a tagged Sgs3 mucin, the authors KD expression of exocyst subunit members and observe a defect in secretory granules with a heterogeneity of phenotypes. By carefully controlling RNAi expression using a Gal4-based system, the authors can KD exocyst subunit expression to varying degrees. The authors find that the stronger the inhibition of expression of the exocyst is, the earlier the defect is in the secretory pathway. The manuscript is well written, the model system is physiological, and the techniques are innovative.

      In my initial review, my major concern was the pleiotropic effect of the loss of exocyst. The authors have responded to this point with clarity and have argued that the multiple localisations of exocyst during the Sgs3 synthesis programme indicate it is likely a direct phenotype. They also performed some analysis of PM lipids but did not detect a difference. I accept the arguments presented. However, I remain concerned that these are due to a pleiotropic effect. It is very hard to absolutely prove a direct effect, and due to the unusual claim and nature of the evidence (depletion levels), I think that there is still the possibility of this being an indirect effect. Perhaps it is just worth the authors writing a paragraph in the discussion, at least accepting the possibility that it is an indirect effect so future readers are aware of that.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1:

      (1) General comment: The evidence for these highly novel, potentially interesting roles (of the exocyst) would need to be more compelling to support direct involvement.

      We wish to thank the reviewer for his/her comments, and for considering that the proposed functions are highly novel and potentially interesting. To strengthen the evidence supporting the new roles of the exocyst, we have performed a number of additional experiments that are depicted in novel figures or figure panels of the new version of the manuscript. Particularly, we aimed at providing further support of the direct involvement of the exocyst in different steps of the regulated secretory pathway. Please see the details below.

      (2) For instance, the localization of exocyst to Golgi or to granule-granule contact sites does not seem substantial.

      We have performed quantitative colocalization studies, as suggested by the reviewer to further substantiate our initial findings. We have carefully analysed GFP-Sec15 distribution in relation to the Golgi complex and secretory Glue granules at relevant time points of salivary gland development. Overall, we found that GFP-Sec15 distribution is dynamic during salivary gland development. Before Glue synthesis (72 h AEL), Sec15 was observed in close association (defined as a distance equal to, or less than 0.6 µm) with the Golgi complex (please see below Author response image 1). This association was lost once Glue granules have begun to form (96 h AEL). Importantly, we do not see relevant association between GFP-Sec15 and the ER (please see Author response image 2). These observations support our conclusion that the exocyst plays a role at the Golgi complex. New images supporting these conclusions, as well as quantitative data, have been included in Figure 5 of the new version of the manuscript. In addition, real time imaging, as well as 3D reconstruction analyses, confirming the close association between Sec15 and Golgi cisternae are now included in the manuscript. Please see Supplementary Videos 1-3. These new data are described in the text lines 200-210 of the Results section and text lines 359-368 of the Discussion section.

      Interestingly, at the time when Sec15-Golgi association is lost (96 h AEL), Sec15 foci associate instead with newly formed secretory granules (< 1µm diameter). This association persists during secretory granule maturation (100-116 h AEL), when Sec15 foci localize specifically in between neighbouring, immature secretory granules. When maturation has ended and Glue granule exocytosis begins (116-120 h AEL), this localization between granules is lost. These observations are consistent with a role of the exocyst in homotypic fusion during SG maturation. We have included new images showing that association between Sec15 and secretory granules is dynamic and depends on the developmental stage. We have quantified this association both during maturation and at a stage when SGs are already mature. We have in addition performed a 3D reconstruction analysis of these images to confirm the close association between Sec15 and immature SGs. These new data are now depicted in Figure 7BC, Supplementary Videos 4-5, and described in text lines 216-221 of the Results section. In addition, a lower magnification image is provided below in this letter (Author response image 3), quantifying the proportion of Sec15 foci localized in between SGs (yellow arrows) relative to the total number of Sec15 foci (yellow arrows + green arrowheads).

      Author response image 1.

      Criteria utilized to define Sec15 foci that were “associated” or “not associated” with the trans-Golgi network in the experiments of Figure 5C-E of the manuscript. When the distance between maximal intensities of GFP-Sec15 and Golgi-RFP signals was equal to or less than 0.6 µm, the signals were considered “associated” (upper panels). When the distance was more than 0.6 µm, the signals were considered “not associated” (lower panels).

      Author response image 2.

      Criteria utilized to define Sec15 foci that were “associated” or “not associated” with the ER in the experiments of Figure 5A-B of the manuscript. When the distance between maximal intensities of GFP-Sec15 and KDEL-RFP signals was equal to or less than 0.6 µm, the signals were considered “associated”. When the distance was more than 0.6 µm, the signals were considered “not associated”.
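
      The association criterion used in Author response images 1 and 2 can be sketched as a simple distance test. This is only an illustrative sketch: the function names, the (x, y) coordinate format in µm, and the example coordinates are assumptions made here, not the authors' analysis pipeline. Only the 0.6 µm cutoff comes from the captions.

```python
import math

# Cutoff stated in the captions: peaks at most 0.6 µm apart count as "associated".
ASSOCIATION_THRESHOLD_UM = 0.6


def peak_distance_um(peak_a, peak_b):
    """Euclidean distance between two intensity maxima given as (x, y) in µm."""
    return math.dist(peak_a, peak_b)


def is_associated(sec15_peak, marker_peak, threshold_um=ASSOCIATION_THRESHOLD_UM):
    """True when the GFP-Sec15 peak lies within the cutoff of the marker peak."""
    return peak_distance_um(sec15_peak, marker_peak) <= threshold_um


def percent_associated(peak_pairs, threshold_um=ASSOCIATION_THRESHOLD_UM):
    """Percentage of (sec15_peak, marker_peak) pairs classified as associated."""
    if not peak_pairs:
        raise ValueError("at least one pair of peaks is required")
    hits = sum(is_associated(a, b, threshold_um) for a, b in peak_pairs)
    return 100.0 * hits / len(peak_pairs)


# Hypothetical coordinates (µm), chosen only to exercise the rule.
example_pairs = [
    ((0.0, 0.0), (0.3, 0.4)),  # 0.5 µm apart: associated
    ((1.0, 1.0), (1.2, 1.0)),  # 0.2 µm apart: associated
    ((2.0, 2.0), (3.0, 2.0)),  # 1.0 µm apart: not associated
    ((5.0, 5.0), (5.0, 5.1)),  # 0.1 µm apart: associated
]
```

      With these made-up coordinates, three of the four foci fall within the cutoff, so `percent_associated(example_pairs)` gives 75%.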

      Author response image 3.

      (A) GFP-Sec15 foci (cyan) and SGs (red) are shown in cells bearing immature SGs or (B) mature SGs. Yellow arrows indicate GFP-Sec15 foci localized in between SGs; green arrowheads indicate GFP-Sec15 foci that are not in between SGs. (C) Quantification of the percentage (%) of Sec15 foci localized in between SGs relative to the total number of Sec15 foci in cells filled with immature SGs (ISG) vs cells with mature SGs (MSG).

      It is interesting to mention that previous evidence from mammalian cultured cells (Yeaman et al, 2001) shows that the exocyst localizes both at the trans-Golgi network and at the plasma membrane, weighing in favour of our claim that the exocyst is required at various steps of the exocytic pathway. Thus, the exocyst may play multiple roles in the secretion pathway in other biological models as well. This concept has now been included in the Discussion section of the revised version of the manuscript (lines 359-368).

      To make the conclusions of our work clearer, in the revised version of the manuscript, we have now included a graphical abstract, summarizing the dynamic localization of the exocyst in relation to the processes of SG biogenesis, maturation and exocytosis reported in our work. 

      (3) Instead, it is possible that defects in Golgi traffic and granule homotypic fusion are not due to direct involvement of the exocyst in these processes, but secondary to a defect in canonical exocyst roles at the plasma membrane. A block in the last step of glue exocytosis could perhaps propagate backward in the secretory pathway to disrupt Golgi complexes or cause poor cellular health due to loss of cell polarity or autophagy.

      We thank the reviewer for these thoughtful comments. We have performed a number of additional experiments to assess “cellular health” or to identify possible defects in cell polarity after knock-down of exocyst subunits. These new data have been included in new supplementary figures 5 and 6 of the revised version of the manuscript (please see below). 

      In our view, the precise localization of GFP-Sec15 at the Golgi complex (Figure 5C-E), as well as in between immature secretory granules (Figure 7B-D), argues in favour of a direct involvement of the exocyst in SG biogenesis and homofusion respectively. 

      We truly appreciate the comment of the reviewer raising the possibility that the defects that we observe at early steps of the pathway (SG biogenesis and SG maturation) may actually stem from a backward effect of the role of the exocyst in SG-plasma membrane tethering. We wish to respectfully point out that the processes of biogenesis, maturation and plasma membrane tethering/fusion of SGs do not occur simultaneously in the Drosophila larval salivary gland in vivo, as they do in other secretory model systems (e.g. cell culture). In this regard, the experimental model is unique in terms of synchronization. In each cell of the salivary gland, the three processes (biogenesis, maturation and exocytosis) occur sequentially and are controlled by developmental cues. At the developmental stage when SGs fuse with the plasma membrane, SG biogenesis has already ceased many hours earlier: SG biogenesis occurs at 96-100 hours after egg lay (AEL), SG maturation takes place at 100-112 hours AEL, and SG-plasma membrane fusion happens only when all SGs have undergone maturation and are ready to fuse with the plasma membrane at 116-120 h AEL. Thus, in our view it is not conceivable that a defect in SG-plasma membrane tethering/fusion (116-120 h AEL) may affect backwards the processes of SG biogenesis or SG maturation, which have occurred earlier in development (96-112 h AEL).

      As suggested by the reviewer, we have analysed several markers of cellular health and cell polarity, comparing conditions of exocyst subunit silencing (exo70RNAi, sec3RNAi or exo84RNAi) with wild type controls (whiteRNAi). These new data are depicted in Supplementary Figures 5 and 6, and described in lines 172-179 of the Results section of the revised version of the manuscript. Of note, for these experiments we applied silencing conditions that block secretory granule maturation, bringing about mostly immature SGs. Our analyses included the subcellular distribution of 1) PI(4,5)P2, 2) the tetraspanin CD63, 3) Rab11, 4) filamentous actin, and 5) CD8. We have also compared 6) nuclear size and general nuclear morphology, 7) the number and distribution of mitochondria, and the morphology and subcellular distribution of 8) the cis- and 9) the trans-Golgi networks. Finally, 10) we have compared basal autophagy in salivary cells with or without knocking down exocyst subunits. The markers that we have analysed behaved similarly to those of control salivary glands, suggesting that the observed defects in regulated exocytosis indeed reflect different roles of the exocyst in the secretory pathway, rather than poor cellular health or impaired cell polarity.

      Our conclusions are in line with previous studies in which apico-basal polarity, Golgi complex morphology and distribution, as well as apical membrane trafficking were also evaluated in exocyst mutant backgrounds, finding no anomalies (Jafar-Nejad et al, 2005). 

      Conversely, in studies in which apical polarity was disturbed by interfering with Crumbs levels, SG biogenesis, maturation and exocytosis were not affected (Lattner et al, 2019), indicating that these processes do not necessarily interfere with one another.

      (4) Final recommendation: In the absence of stronger evidence for these other exocyst roles, I would suggest focusing the study on the canonical role (interesting, as it was previously reported that Drosophila exocyst had no function in the salivary gland and limited function elsewhere [DOI: 10.1034/j.1600-0854.2002.31206.x]), and leave the alternative roles for discussion and deeper study in the future.  

      We appreciate the reviewer's recommendation. However, we believe that the major strength of our work is the discovery of non-canonical roles of the exocyst complex, unrelated to its function as a tethering complex for vesicle-plasma membrane fusion. We believe that in the new version of our manuscript, we provide stronger evidence supporting the two novel roles of the exocyst:

      a) Its participation in maintaining the normal structure of the Golgi complex, and b) Its function in secretory granule maturation.

      Reviewer 2:

      (5) General comment: A key strength is the breadth of the assays and study of all 8 exocyst subunits in a powerful model system (fly larvae). Many of the assays are quantitated and roles of the exocyst in early phases of granule biogenesis have not been ascribed. 

      We are grateful that the reviewer appreciates the novelty of our contribution.

      (6) However, there are several weaknesses, both in terms of experimental controls, concrete statements about the granules (better resolution), and making a clear conceptual framework. Namely, why do KD of different exocysts have different effects on presumed granule formation?

      The reviewer has raised a point that is central to the interpretation of all our data throughout the manuscript. The short answer is that the extent of RNAi-dependent silencing of exocyst subunits determines the phenotype: 

      1) Maximum silencing affects Golgi complex morphology and prevents SG biogenesis. 2) Intermediate silencing blocks SG maturation, without affecting Golgi complex morphology and SG biogenesis. 3) Weak silencing blocks SG tethering and fusion with the plasma membrane, without affecting Golgi complex morphology, SG biogenesis or SG maturation. 

      In other words, 1) Low levels of exocyst subunits are sufficient for normal Golgi complex morphology and SG biogenesis. 2) Intermediate levels of exocyst subunits are sufficient for SG maturation (and also sufficient for SG biogenesis). 3) High levels of exocyst subunits are required for SG tethering and subsequent fusion with the plasma membrane. 
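
      The threshold logic described in the two paragraphs above can be written as a small sketch. The numeric thresholds below are purely hypothetical placeholders chosen for illustration (the letter gives no quantitative cutoffs), and the function name is ours. Only the ordering of the three requirements reflects the model as stated.

```python
# Hypothetical fractions of wild-type exocyst level needed for each process.
# Only their ordering (biogenesis < maturation < exocytosis) follows the model;
# the numbers themselves are illustrative assumptions.
BIOGENESIS_MIN = 0.2
MATURATION_MIN = 0.5
EXOCYTOSIS_MIN = 0.8


def predicted_phenotype(remaining_level):
    """Map the remaining exocyst level (0.0 to 1.0 of wild type) to the
    earliest step of the secretory pathway expected to fail."""
    if not 0.0 <= remaining_level <= 1.0:
        raise ValueError("remaining_level must be between 0 and 1")
    if remaining_level < BIOGENESIS_MIN:
        # Maximum silencing: Golgi morphology altered, no SGs formed.
        return "impaired SG biogenesis"
    if remaining_level < MATURATION_MIN:
        # Intermediate silencing: SGs form but remain immature.
        return "impaired SG maturation"
    if remaining_level < EXOCYTOSIS_MIN:
        # Weak silencing: mature SGs form but fail to tether/fuse.
        return "impaired SG tethering/fusion"
    return "wild-type secretion"
```

      Under these assumed cutoffs, a remaining level of 0.1 would predict impaired SG biogenesis, 0.3 impaired maturation, and 0.6 impaired tethering/fusion, mirroring the strong, intermediate and weak silencing conditions.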

      Based on the above notion, we have exploited the fact that temperature can fine-tune the level of Gal4/UAS-dependent transcription, thereby achieving different levels of silencing, as shown by Norbert Perrimon et al in their seminal paper “the level of RNAi knockdown can also be altered by using Gal4 lines of various strengths, rearing flies at different temperatures, or via coexpression of UAS-Dicer2” (Perkins et al, 2015). 

      We found in our system that indeed, by applying appropriate silencing conditions (RNAi line and temperature) to any of the eight subunits of the exocyst, we have been able to obtain one of the three alternative phenotypes: Impaired SG biogenesis, or impaired SG maturation, or impaired SG tethering/fusion with the plasma membrane.

      These concepts are summarized below in Author response image 4. Please see also at point 26, the general comment of Reviewer #3. 

      We have conducted qRT-PCR assays to provide experimental support to the notions summarized above in Author response image 4. We measured the remaining levels of mRNAs of some of the exocyst subunits, after inducing RNAi-mediated silencing at different temperatures, or with different RNAi transgenic lines. The remaining RNA levels after silencing correlate well with the observed phenotypes, following the predictions of Author response image 4 and summarized in Author response image 5. These new data are now shown in Supplementary Figure 2 of the revised version of the manuscript, and described in lines 153-159 at the Results section.

      (7) Why does just overexpression of a single subunit (Sec15) induce granule fusion?

      The reviewer raises a very important point. Based on available data from the literature, Sec15 behaves as a seed for assembly of the holocomplex and it also mediates the recruitment of the holocomplex to SGs through its interaction with Rab11 (Escrevente et al, 2021; Bhuin and Roy, 2019; Wu et al, 2005; Zhang et al, 2004; Guo et al, 1999). Thus, overexpression of Sec15 is expected to enhance exocyst assembly, thereby potentiating the activities carried out by the complex in the cell, including SG homofusion. In the revised version of the manuscript we have also performed the overexpression of Sec8, finding that, unlike Sec15, Sec8 fails to induce homotypic fusion. These results were expected, as they confirm that Sec8 does not behave as a seed for mounting the whole complex. These new data have been included in Figure 7E-H, and are described in text lines 221-229 of the Results section. 

      Author response image 4.

      Conceptual model of RNAi expression at different temperatures, remaining mRNA/protein levels, and the phenotypes obtained at each temperature.

      Author response image 5.

      qRT-PCR assays presented in Supplementary Figure 2 are shown in combination with the phenotypes observed at each of the conditions analyzed. Note the correlation between phenotypes and the extent of mRNA downregulation.

      (8) While the paper is fascinating, the major comments need to be addressed to really be able to make better sense of this work, which at present is hard to disentangle direct vs. secondary effects, especially as much of the TGN seems to be altered in the KDs.  

      We hope that our response to point 6) has helped to clarify this important point raised by the Reviewer. After applying silencing conditions where normal structure of the trans-Golgi network is impaired, SG biogenesis does not occur. Thus, since SGs do not form, it is not conceivable to detect defects in SG maturation or SG fusion with the plasma membrane in the same cell.

      (9) The authors conveniently ascribe many of the results to the holocomplex, but their own data (Fig. 4 and Fig. 6) are at odds with this.

      This is another central point of our work, so we thank the reviewer for his/her comment. In Figures 4A, 7A and 9A of the revised version of the manuscript, we show that, by inducing appropriate levels of silencing of any of the 8 subunits of the exocyst, each of the three alternative phenotypic manifestations can occur. In our opinion, this argues in favour of a function for the whole exocyst complex in each of the three specific activities proposed in our study: 1) SG biogenesis, 2) SG maturation, and 3) SG tethering/fusion with the plasma membrane. In the detailed characterizations of these three phenotypes performed throughout the study, we decided to induce silencing of just two or three of the subunits of the exocyst, assuming that the whole complex accounts for the mechanisms involved.

      Major comments

      (10) Resolution not sufficient. Identification of "mature secretory granules" (MSG) in Fig. 3 is based on low-resolution images in which the MSG are not clearly seen (see control in Fig. 3A) and rather appear as a diffuse haze, and not as clear granules. There may be granules here, but as shown it is not clear. Thus it would be helpful to acquire images at higher resolution (at the diffraction limit, or higher) to see and count the MSG.

      We thank the reviewer for raising this point, as it may not be straightforward for the reader to identify the SGs throughout the figures of our study. To make it clearer, in Figure 3A (magnified insets on the right), we have delineated individual SGs with a green dotted line, and included diagrams (far right), which we hope will help the identification of SGs. In Figure 3B, we show that after silencing Exo84, a mosaic phenotype was observed: in some cells SGs fail to undergo maturation and remain smaller than normal. In other cells of this mosaic phenotype, biogenesis of SGs was impaired and the fluorescent cargo remained trapped in a mesh-like structure (which we later show corresponds to the ER). The dotted line marks individual SGs, and the diagrams included on the right are intended to help the interpretation of the phenotype. The mesh-like structures where Sgs3-GFP was retained are also marked with a dotted line, and schematized on the right. These new schemes are described in the Figure 3 caption of the revised version of the manuscript.

      We wish to mention that all the confocal images depicted in this figure and throughout the manuscript have been captured at high resolution, with a theoretical resolution limit of 168-177 nm (d = λ/2NA). Given that secretory granules range from 0.8 to 7 µm in diameter, the resolution is more than sufficient to clearly resolve these structures.

      (11) Note: the authors are not clear on which objective was used (maybe the air objective, as the resolution appears poor).

      In this particular figure, we have utilized a Plan-Apochromat 63X/1.4NA oil objective of the inverted Carl Zeiss LSM 880 confocal microscope (mentioned in materials and methods).
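
      As a rough check of the resolution figure quoted in point 10, the diffraction-limit formula (resolution = wavelength / (2 × NA)) can be evaluated for the 1.4 NA objective. This is a sketch only: the 488 nm wavelength used in the example is an assumption made here (a common excitation line for GFP), not a value given in the letter, and the function names are ours.

```python
def lateral_resolution_nm(wavelength_nm, numerical_aperture):
    """Diffraction-limited lateral resolution, d = wavelength / (2 * NA)."""
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and NA must be positive")
    return wavelength_nm / (2.0 * numerical_aperture)


def wavelength_for_resolution_nm(resolution_nm, numerical_aperture):
    """Invert the same formula: wavelength = d * 2 * NA."""
    return resolution_nm * 2.0 * numerical_aperture


NA_OIL_63X = 1.4  # numerical aperture of the objective named above

# Assumed wavelength (not stated in the letter): 488 nm.
d_488 = lateral_resolution_nm(488.0, NA_OIL_63X)

# Wavelengths implied by the quoted 168-177 nm range at NA 1.4.
implied_range_nm = (
    wavelength_for_resolution_nm(168.0, NA_OIL_63X),
    wavelength_for_resolution_nm(177.0, NA_OIL_63X),
)
```

      At NA 1.4 the quoted 168-177 nm range corresponds to wavelengths of roughly 470-496 nm, and the assumed 488 nm line gives about 174 nm, which is several-fold smaller than the smallest granules (0.8 µm) mentioned in the response.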

      (12) They need to prove that the diffuse Sgs3-GFP haze is indeed due to MSG.  

      If we interpret the concern of the reviewer correctly, what he/she calls “diffuse haze” is actually the distribution of Sgs3-GFP within individual SGs, which, as previously reported by other authors, is not homogeneous at this stage (Syed et al. 2022). We hope that the diagrams that we have included in Figure 3A and B (point 10) will help readers interpret the images.

      (13) Related it is unclear what are the granule structures that correspond to Immature secretory granules (ISG) and cells with mesh-like structures (MLS)?

      We are confident that the diagrams now included in Figure 3A and B will help the interpretation, and particularly to identify immature granules and the mesh-like structure generated after silencing of exocyst subunits.

      (14) Similarly, Sgs3 images of KD of 8 exocyst subunits were interpreted to be identical, in Fig. 4, but the resolution is poor.

      We hope that the issue related to resolution of our images has been properly addressed in the response to point 10) of this letter. In Figure 4A, we show that after silencing of any of the 8 subunits (with the appropriate conditions), in all cases SG biogenesis was impaired, and Sgs3-GFP was instead retained in a mesh-like structure. Images obtained after silencing different exocyst subunits are of course not identical, but in all cases, a mesh-like structure has replaced the formation of SGs (Figure 4A). Hopefully, the diagrams now included in Figure 3A and B help the correct interpretation of the phenotypes throughout the study.

      To demonstrate that the structure in which Sgs3-GFP was retained upon exocyst complex knockdown corresponds to the ER, we performed a colocalization analysis between Sgs3-GFP and the ER markers GFP-KDEL or Bip-sfGFP-HDEL, after which we calculated Pearson's coefficient, which indicated substantial colocalization (Figure 4B-G and Supplementary Figures 7 and 8). These new data are described in lines 196-199 of the revised version of the manuscript. To facilitate the visualization of the results, in the revised version of the manuscript we have included magnified cropped areas of the images shown in Figure 4A.

      (15) What is remarkable is a highly variable effect of different subunit KD on the percentage of cells with MLS (Fig. 4C). Controls = 100 %, Exo70=~75% (at 19 deg), Sec3 = ~30%, Sec10 = 0%, Exo84 = 100% ... This is interesting, for the functional exocyst is an octameric holocomplex, thus why the huge subunit variability in the phenotypes? The trivial explanation is either: i) variable exocyst subunit KD (not shown) or ii) variability between experiments (no error bars are shown). Both should be addressed by quantification of the KD of different proteins and secondly by replicating the experiments.

      We agree with the reviewer's statement. We believe that both variability of KD efficiency (i) and variability between experiments (ii) contribute to the variable effect observed after knocking down the different subunits. As detailed in the response to point 6), we have performed qRT-PCR determinations to confirm that the severity of the phenotype depends on the efficiency of RNAi-mediated silencing. We chose to analyse in detail the effect on the subunits exo70 and sec3, which were those with the highest phenotypic differences between the three silencing temperatures utilized. We found that, as expected, the levels of silencing were temperature-dependent, being higher at 29°C and lower at 19°C. These data were included in Supplementary Figure 2, described in lines 153-159 of the Results section, and also summarized in Author response images 4 and 5 of this rebuttal letter.

      We thank the reviewer for his/her comment on the replication of experiments and statistics. We failed to include detailed numerical information in the original submission, such as the number of replicas and standard deviations of the data depicted in Figure 3C and Supplementary Figure 1, so we apologize for this omission. In the revised version of the manuscript, we have included a table (Supplementary Table 3) in which all the raw data of Figure 3C and Supplementary Figure 1, including standard deviations, are now depicted.

      (16) If their data holds up then the underlying mechanism here needs to be considered.

      (Note: there is some precedent from the autophagy field of differential exocyst effects)

      Our proposed mechanism is essentially that the holocomplex is required for multiple processes along the secretory pathway. Each of these actions (Golgi structure maintenance, SG maturation and SG tethering/fusion with the plasma membrane) requires a different amount of holocomplex activity, which is why each phenotype manifests at a different level of RNAi-mediated silencing (Author response image 4 of this letter). The model predicts that Golgi structure maintenance requires minimal levels of complex activity, and that is why strong knock-down of exocyst subunits is required to obtain this phenotype. In line with our results, it has been reported that other tethering complexes of the CATCHR family are also required for keeping Golgi cisternae stuck together (D'Souza et al, 2020; Khakurel and Lupashin, 2023; Liu et al, 2019). One possibility is that the exocyst may play a redundant role in the maintenance of the normal structure of the Golgi complex, along with other CATCHR complexes. This potential redundancy could explain why severe exocyst knock-down is required to observe structural anomalies at this organelle. On the other end of the spectrum, we propose that tethering/fusion with the plasma membrane is very susceptible to even a slight reduction of complex activity, so that mild RNAi-mediated silencing is sufficient to provoke defects in this process. This proposed model is depicted in Author response image 4 and discussed in lines 395-405 of the Discussion section.

      (17) In the salivary glands the authors state that the exocyst is needed for Sgs3-GFP exit from the ER. First, Pearson's coefficient should be shown so as to quantitate the degree of ER localizations of all KDs.

      We thank the reviewer for this comment, which helped us to strengthen the observation that when SG biogenesis is impaired, Sgs3-GFP remains trapped in the ER. In the revised version of the manuscript, we have calculated Pearson's coefficient to assess colocalization between ER markers (GFP-KDEL or Bip-sfGFP-HDEL) and Sgs3-GFP in salivary gland cells that express sec15RNAi. The Pearson's coefficient was around 0.6 for both ER markers, indicating that colocalization with Sgs3-GFP was substantial (Supplementary Figure 8, text lines 196-199 of the Results section).
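
      For readers unfamiliar with the measure, Pearson's coefficient for two image channels can be sketched as follows. This is a generic, dependency-free illustration of the standard formula applied pixel by pixel; it is not the software or settings the authors used, which are not specified here, and the tiny 2×2 "images" are made up.

```python
import math


def pearson_colocalization(channel_a, channel_b):
    """Pearson's correlation between two equally sized 2D intensity arrays
    (lists of rows), computed over all pixels."""
    a = [float(v) for row in channel_a for v in row]
    b = [float(v) for row in channel_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("channels must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance term and the two sums of squared deviations.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    if denom == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / denom


# Made-up 2x2 intensity grids, for illustration only.
cargo = [[1, 2], [3, 4]]
marker_same_pattern = [[2, 4], [6, 8]]      # proportional intensities
marker_opposite_pattern = [[4, 3], [2, 1]]  # inverted intensities
```

      Proportional intensity patterns give a coefficient of 1 and inverted patterns give -1, so a value around 0.6, as reported above, sits well on the positive side of the scale.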

      (18) Second, there should be some rescue performed (if possible) to support specificity. 

      As suggested by the reviewer, we have performed a rescue experiment of the phenotype provoked by the expression of sec15 RNAi, which consisted of the retention of Sgs3-GFP in the endoplasmic reticulum: expression of Sec15-GFP substantially reverted the ER retention phenotype, rescuing SG biogenesis and also SG maturation in most cells (over 60% of the cells). These new data are now shown in Supplementary Figure 4, and described in lines 168-171 of the Results section.

      (19) Third, importantly other proteins that should traffic to the PM need to be shown to traffic normally so as to rule out a non-specific effect.

      We have addressed this issue (also mentioned by Reviewer #1), by analyzing the localization of a number of polarization markers, finding that the overall polarization of the cell was not affected by loss of function of exocyst subunits. Please, see our response to the point 3) raised by Reviewer #1. The new data showing cell polarization markers are shown in Supplementary Figure 6 of the revised version of the manuscript, and described on text lines 172-179 of the Results section.

      (20) It is unclear from their model (Fig. 5) why after exocyst KD of Sec15 the cis-Golgi is more preserved than the TGN, which appears as large vacuoles. This is not quantitated and not shown for the 8 subunits.

      We thank the reviewer for this relevant comment. We agree that the phenotype of either sec15 or sec3 loss-of-function cells manifests differently with cis-Golgi and trans-Golgi markers. While the cis-Golgi marker looked fragmented and aggregated, the trans-Golgi marker adopted a swollen appearance. However, in our view, the different appearance of the two markers does not necessarily imply that one compartment is more preserved than the other. In the revised version of the manuscript, we have quantified the penetrance of the phenotypes provoked by sec15 or sec3 silencing, using both cis-Golgi and trans-Golgi markers. In both cases, the penetrance was high, although even higher with the trans-Golgi marker. These new data are now depicted in Supplementary Figure 9 of the revised version of the manuscript.

      It is interesting to mention that in HeLa cells, as well as in the retinal epithelial cell line hTERT, Golgi phenotypes similar to those we have described here have been reported after loss-of-function of other tethering complexes, which were shown to keep the Golgi cisternae stuck together, including the COG and GARP complexes (D'Souza et al, 2020; Khakurel and Lupashin, 2023; Liu et al, 2019). As throughout our work, not every aspect of the analysis included the silencing of all eight subunits. In this case, we chose to silence Sec3 and Sec15. Please note that we have modified the model depicted in Figure 6E-F, to highlight the cis- and trans-Golgi phenotypes upon exocyst knock-down, as well as the localization of the exocyst in cisternae of the Golgi complex.

      (21) Acute/Chronic control: It would be nice to acutely block the exocyst so as to better distinguish if the effects observed are primary or secondary effects (e.g. on a recycling pathway).

      We thank the reviewer for raising this important issue. To address this point, and to be able to induce silencing of exocyst subunits at specific time intervals of larval development, we utilized a strategy based on a thermosensitive variant of the Gal4 inhibitor Gal80 (Gal80ts) (Lee and Luo, 1999). We blocked Gal4 activity (and therefore RNAi expression) by maintaining the larvae at 18 °C during the 1st and 2nd instars (until 120 hours after egg lay), and then induced the activity of Gal4 specifically at the 3rd larval instar by raising the temperature to 29 °C, a condition in which Gal80ts becomes inactive. After silencing the expression of sec3 or sec15 at the 3rd larval instar only, the phenotype was very similar to that observed after chronic silencing of exocyst subunits (larvae maintained at 29 °C all throughout development, where Gal4 was never inhibited). These observations suggest that the defects observed in the secretory pathway after knock down of exocyst subunits reflect genuine functions of the exocyst in this pathway, rather than a secondary effect derived from impaired development of the salivary glands at early larval stages. These new results are now shown in Supplementary Figure 3, and described in manuscript lines 160-171 of the Results section.

      (22) Granule homotypic fusion. Strangely over-expression of just one subunit, Sec15-GFP, made giant secretory granules (SG) that were over 8 microns big! Why is that, especially if normally the exocyst is normally a holocomplex. Was this an effect that was specific to Sec15 or all exocyst subunits? Is the Sec15 level rate limiting in these cells? It may be that a subcomplex of Sec15/10 plays earlier roles, but in any case this needs to be addressed across all (or many) of the exocyst subcomplex members.

      Please, see our response to point 7) of this letter. Sec15 is believed to act as a seed for the formation of the whole complex.

      (23) In summary, there are clearly striking effects on secretory granule biogenesis by dysfunction of the exocyst, however right now it is hard to disentangle effects on ER-Golgi traffic, loss of the TGN, and a problem in maturation or fusion of granules.

      As discussed in detail in our response to point 3 raised by Reviewer #1, the secretory pathway is highly synchronized in each of the cells of the Drosophila salivary gland. SG biogenesis, SG maturation and SG fusion with the plasma membrane never occur simultaneously in the same cell. Thus, in a cell in which ER-Golgi traffic is impaired (and SG biogenesis does not occur), SGs do not exist, and therefore, they cannot exhibit defects in the process of maturation or fusion with the plasma membrane. In summary, we believe that our work has shown that in Drosophila larval salivary glands the exocyst holocomplex is required for (at least) three functions along the secretory pathway: 1) to maintain the appropriate Golgi complex architecture, thus enabling ER-Golgi transport; 2) for secretory granule maturation, including both homotypic fusion and acquisition of maturation factors; 3) for secretory granule exocytosis, through secretory granule tethering to enable subsequent fusion with the plasma membrane. As mentioned above (point 6 of this letter), these three functions require different amounts of the holocomplex, and therefore can be revealed by inducing different levels of silencing.

      (24) It is also confusing if the entire exocyst holocomplex or subcomplex plays a key role 

      The fact that, by silencing any of the subunits (with the appropriate conditions), it is possible to obtain any of the 3 phenotypes (impaired SG biogenesis, impaired SG maturation or impaired SG fusion with the plasma membrane) argues in favour of a function of the complex as a whole in each of these three processes.

      Reviewer 3:

      (25) General comment: Freire and co-authors examine the role of the exocyst complex during the formation and secretion of mucins from secretory granules in the larval salivary gland of Drosophila melanogaster. Using transgenic lines with a tagged Sgs3 mucin the authors KD expression of exocyst subunit members and observe a defect in secretory granules with a heterogeneity of phenotypes. By carefully controlling RNAi expression using a Gal4-based system the authors can KD exocyst subunit expression to varying degrees. The authors find that the stronger the inhibition of expression of exocyst the earlier in the secretory pathway the defect. The manuscript is well written, the model system is physiological, and the techniques are innovative.

      We appreciate the reviewer's assessment of our work.

      (26) My major concern is that the evidence underlying the fundamental claim of the manuscript that "the exocyst complex participates" in multiple secretory processes lacks direct evidence.

      We thank the reviewer for raising this important issue. We believe that the analysis of Sec15 subcellular localization during salivary gland development (Figures 5, 7B-D and 9E-F), in combination with the detailed analysis of the phenotypes provoked by loss-of-function of each of the exocyst subunits, provides evidence supporting multiple functions of the exocyst in the secretory pathway. We have also included 3D reconstructions and videos of GFP-Sec15 colocalization with Golgi and SG markers to support exocyst localization associated with these structures (Supplementary Videos 1-7; text lines 200-210, 216-221 and 303-305).

      (27) It is clear from multiple lines of evidence, which are discussed by the authors, that exocyst is essential for an array of exocytic events. The fundamental concern is that loss of homeostasis on the plasma membrane proteome and lipidome might have severe pleiotropic effects on the cell.

      We agree with the reviewer that this is an important point that needed to be addressed. As discussed in detail above at the response to point 3 raised by Reviewer #1, we have analysed several plasma membrane markers (including a PI(4,5)P2 lipid reporter), and found that overall, plasma membrane integrity and polarity were not substantially affected (Supplementary Figure 6). In addition, we have analyzed several markers of general cellular “health” that indicate that salivary gland cells do not seem to be distressed by the reduction of exocyst complex activity (Supplementary Figure 5). These new data are described in lines 172-179 of the Results section.

      (28) Perhaps the authors have more evidence that exocyst is important for homotypic fusion of the SGs, as supported by the localisation of Sec15 on the fusion sites.

      We believe that the fact that immature, smaller-than-normal granules were observed upon silencing any of the exocyst subunits (under the appropriate conditions) argues in favour of the exocyst as a whole participating in SG homofusion (Figure 7A). In addition, we have included more images, quantifications, 3D reconstructions and videos of GFP-Sec15 localized just at the contact sites between immature SGs. We have quantified and compared GFP-Sec15 localization at immature SGs vs its localization at mature SGs, finding that it localizes preferentially at immature SGs, supporting a role of the exocyst as a tethering complex during homotypic fusion (shown in Figure 7B-C and Supplementary Videos 4-6, and described in lines 216-221 of the Results section). Please see also our response to point 2 raised by Reviewer #1 in this rebuttal letter, and Author response image 3 above in this letter.

      (29) The second question that I think is important to address is, what exactly do the varying RNAi levels correspond to in terms of experiments, and have these been validated? Because the fundamental claim is that the severity of the phenotype correlates with the level of KD, I think validation of this model is absolutely essential.

      We thank the Reviewer for raising this important point, and agree it was lacking in the original version of our manuscript. As discussed in our response to the point 6) raised by Reviewer #2, we have performed qRT-PCR determinations for exo70 and sec3 mRNA levels after inducing silencing of these subunits at different temperatures, or with different RNAi transgenic lines. The remnant mRNA levels correlate well with the observed phenotypes. Please see Supplementary Figure 2 of the revised manuscript, and Author response image 5 of this rebuttal letter; described in lines 155-159 of the Results section. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      -  The authors assert in the discussion that exocyst involvement in constitutive secretion is well documented. This is based on a very recent study in mammalian culture cells. Therefore, I would not dismiss the issue as completely settled. Furthermore, a previous study of Drosophila sec10 reported no roles outside the ring gland (DOI: 10.1034/j.1600-0854.2002.31206.x).

      We have included these observations in the Discussion section. Lines 326-329.

      -  A salivary gland screening by Julie Brill's lab reported exocyst components as hits (DOI: 10.1083/jcb.201808017).

      We have referred to this paper in the Discussion section. Lines 326-329.

      -  It should be explained in more detail what is measured in graphs 7C, F, and others quantifying fluorescence around secretory granules. Looking at the images, the decrease in Rab1 and Rab11 seems less convincing.

      We have made a clearer description of how fluorescence intensity was measured in the Methods section lines 558-561. Also, we have uploaded a source data file in which the raw data of each experiment used for quantifications are disclosed. 

      Please note that the data indicates that Rab11 levels are higher in sec5 (Figure 8J-L) and sec3 (supplementary Figure 11M-R).

      Reviewer #2 (Recommendations For The Authors):

      No major issues.

      Writing - The authors should better frame their interpretations of other studies of the exocyst that include the role in autophagy, Palade body trafficking, and differential roles of the subunits.

      We have discussed these specific points in the Discussion section, lines 348-355 and 409-410.

      Minor - Fig. 6A: Why are variable temperatures (19-29 deg C used for the 8 KD experiments)?

      Please show it all at the same temperature (control too).

      The need for the usage of specific temperatures to obtain specific phenotypes with each of the RNAi lines used was explained in point 6 of this letter.

      Reviewer #3 (Recommendations For The Authors):

      In the abstract, the authors refer to the exocytic process and go on to describe secretory granule biogenesis and exocytosis. However, there are many exocytic processes aside from secretory granule biogenesis, and I think the authors should clarify this.

      Corrected in the Abstract. Lines 19-21

      Page 17 Thomas, 2021 reference, there is a glitch with the reference.

      Thanks for noticing. Fixed.

      References

      Bhuin T, Roy JK. Developmental expression, co-localization and genetic interaction of exocyst component Sec15 with Rab11 during Drosophila development. Exp Cell Res. 2019 Aug 1;381(1):94-104. doi: 10.1016/j.yexcr.2019.04.038. Epub 2019 May 7. PMID: 31071318.

      D'Souza Z, Taher FS, Lupashin VV. Golgi inCOGnito: From vesicle tethering to human disease. Biochim Biophys Acta Gen Subj. 2020 Nov;1864(11):129694. doi: 10.1016/j.bbagen.2020.129694. Epub 2020 Jul 27. PMID: 32730773; PMCID: PMC7384418.

      Escrevente C, Bento-Lopes L, Ramalho JS, Barral DC. Rab11 is required for lysosome exocytosis through the interaction with Rab3a, Sec15 and GRAB. J Cell Sci. 2021 Jun 1;134(11):jcs246694. doi: 10.1242/jcs.246694. Epub 2021 Jun 8. PMID: 34100549; PMCID: PMC8214760.

      Guo W, Roth D, Walch-Solimena C, Novick P. The exocyst is an effector for Sec4p, targeting secretory vesicles to sites of exocytosis. EMBO J. 1999 Feb 15;18(4):1071-80. doi: 10.1093/emboj/18.4.1071. PMID: 10022848; PMCID: PMC1171198.

      Jafar-Nejad H, Andrews HK, Acar M, Bayat V, Wirtz-Peitz F, Mehta SQ, Knoblich JA, Bellen HJ. Sec15, a component of the exocyst, promotes notch signaling during the asymmetric division of Drosophila sensory organ precursors. Dev Cell. 2005 Sep;9(3):351-63. doi: 10.1016/j.devcel.2005.06.010. PMID: 16137928.

      Khakurel A, Lupashin VV. Role of GARP Vesicle Tethering Complex in Golgi Physiology. Int J Mol Sci. 2023 Mar 23;24(7):6069. doi: 10.3390/ijms24076069. PMID: 37047041; PMCID: PMC10094427.

      Lattner J, Leng W, Knust E, Brankatschk M, Flores-Benitez D. Crumbs organizes the transport machinery by regulating apical levels of PI(4,5)P2 in Drosophila. Elife. 2019 Nov 7;8:e50900. doi: 10.7554/eLife.50900. PMID: 31697234; PMCID: PMC6881148.

      Lee T, Luo L. Mosaic analysis with a repressible cell marker for studies of gene function in neuronal morphogenesis. Neuron. 1999 Mar;22(3):451-61. doi: 10.1016/s0896-6273(00)80701-1. PMID: 10197526.

      Liu S, Majeed W, Grigaitis P, Betts MJ, Climer LK, Starkuviene V, Storrie B. Epistatic Analysis of the Contribution of Rabs and Kifs to CATCHR Family Dependent Golgi Organization. Front Cell Dev Biol. 2019 Aug 2;7:126. doi: 10.3389/fcell.2019.00126. PMID: 31428608; PMCID: PMC6687757.

      Perkins LA, Holderbaum L, Tao R, Hu Y, Sopko R, McCall K, Yang-Zhou D, Flockhart I, Binari R, Shim HS, Miller A, Housden A, Foos M, Randkelv S, Kelley C, Namgyal P, Villalta C, Liu LP, Jiang X, Huan-Huan Q, Wang X, Fujiyama A, Toyoda A, Ayers K, Blum A, Czech B, Neumuller R, Yan D, Cavallaro A, Hibbard K, Hall D, Cooley L, Hannon GJ, Lehmann R, Parks A, Mohr SE, Ueda R, Kondo S, Ni JQ, Perrimon N. The Transgenic RNAi Project at Harvard Medical School: Resources and Validation. Genetics. 2015 Nov;201(3):843-52. doi: 10.1534/genetics.115.180208. Epub 2015 Aug 28. PMID: 26320097; PMCID: PMC4649654.

      Wu S, Mehta SQ, Pichaud F, Bellen HJ, Quiocho FA. Sec15 interacts with Rab11 via a novel domain and affects Rab11 localization in vivo. Nat Struct Mol Biol. 2005 Oct;12(10):879-85. doi: 10.1038/nsmb987. Epub 2005 Sep 11. PMID: 16155582.

      Yeaman C, Grindstaff KK, Wright JR, Nelson WJ. Sec6/8 complexes on trans-Golgi network and plasma membrane regulate late stages of exocytosis in mammalian cells. J Cell Biol. 2001 Nov 12;155(4):593-604. doi: 10.1083/jcb.200107088. Epub 2001 Nov 5. PMID: 11696560; PMCID: PMC2198873.

      Zhang XM, Ellis S, Sriratana A, Mitchell CA, Rowe T. Sec15 is an effector for the Rab11 GTPase in mammalian cells. J Biol Chem. 2004 Oct 8;279(41):43027-34. doi: 10.1074/jbc.M402264200. Epub 2004 Jul 29. PMID: 15292201.

    1. eLife Assessment

      In this important study, the authors use zebrafish to examine protein absorption in the gut. Using a combination of imaging and single-cell RNA-seq, they characterize a population of lysosome-rich enterocytes that are essential for protein uptake. They find that the microbiome impacts the ability of these cells to uptake protein. The RNA-seq provides a rich dataset for future functional experiments, which makes a convincing case for the importance of these cells.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We summarized the main changes:

      (1) In the Introduction part, we give a general definition of habitat fragmentation to avoid confusion, as reviewers #1 and #2 suggested.

      (2) We clarify the two aspects of the observed “extinction”: “true dieback” and “emigration”, as reviewers #2 and #3 suggested.

      (3) In the Methods part, we 1) clarify the reason for testing the temporal trend in colonization/extinction dynamics and describe how to select islands as reviewer #1 suggested; 2) describe how to exclude birds from the analysis as reviewer #2 suggested.

      (4) In the Results part, we modified and rearranged Figures 4-6 as reviewers #1, #2 and #3 suggested.

      (5) In the Discussion part, we 1) discuss the multiple aspects of the metric of isolation for future research as reviewer #3 suggested; 2) provide concrete evidence about the relationship between habitat diversity or heterogeneity and island area and 3) provide a wider perspective about how our results can inform conservation practices in fragmented habitats as reviewer #2 suggested.

      eLife Assessment

      This important study enhances our understanding of how habitat fragmentation and climate change jointly influence bird community thermophilization in a fragmented island system. The evidence supporting some conclusions is incomplete, as while the overall trends are convincing, some methodological aspects, particularly the isolation metrics and interpretation of colonization/extinction rates, require further clarification. This work will be of broad interest to ecologists and conservation biologists, providing crucial insights into how ecosystems and communities react to climate change.

      We sincerely extend our gratitude to you and the esteemed reviewers for acknowledging the importance of our study and for raising these concerns. We have clarified the rationale behind our analysis of temporal trends in colonization and extinction dynamics, as well as the choice of distance to the mainland as the isolation metric. Additionally, we further discuss the multiple aspects of the metric of isolation for future research and provide concrete supporting evidence about the relationship between habitat diversity or heterogeneity and island area.

      Incorporating these valuable suggestions, we have thoroughly revised our manuscript, ensuring that it now presents a more comprehensive and nuanced account of our research. We are confident that these improvements will further enhance the impact and relevance of our work for ecologists and conservation biologists alike, offering vital insights into the resilience and adaptation strategies of communities facing the challenges of climate change.

      Reviewer #1 (Public Review):

      Summary:

      This study reports on the thermophilization of bird communities in a network of islands with varying areas and isolation in China. Using data from 10 years of transect surveys, the authors show that warm-adapted species tend to gradually replace cold-adapted species, both in terms of abundance and occurrence. The observed trends in colonisations and extinctions are related to the respective area and isolation of islands, showing an effect of fragmentation on the process of thermophilization.

      Strengths:

      Although thermophilization of bird communities has been already reported in different contexts, it is rare that this process can be related to habitat fragmentation, despite the fact that it has been hypothesized for a long time that it could play an important role. This is made possible thanks to a really nice study system in which the construction of a dam has created this incredible Thousand Islands lake. Here, authors do not simply take observed presence-absence as granted and instead develop an ambitious hierarchical dynamic multi-species occupancy model. Moreover, they carefully interpret their results in light of their knowledge of the ecology of the species involved.

      Response: We greatly appreciate your recognition of our study system and the comprehensive approach and careful interpretation of results. 

      Weaknesses:

      Despite the clarity of this paper on many aspects, I see a strong weakness in the authors' hypotheses, which obscures the interpretation of their results. Looking at Figure 1, and in many sentences of the text, a strong baseline hypothesis is that thermophilization occurs because of an increasing colonisation rate of warm-adapted species and extinction rate of cold-adapted species. However, there does not need to be a temporal trend! Any warm-adapted species that colonizes a site has a positive net effect on CTI; similarly, any cold-adapted species that goes extinct contributes to thermophilization.

      Thank you very much for these thoughtful comments. The understanding depends on the time frame of the study and specifically, whether the system is at equilibrium. We think your claim is based on this background: if the system is not at equilibrium, then CTI can shift simply by having differential colonization (or extinction) rates for warm-adapted versus cold-adapted species. We agree with you in this case.

      On the other hand, if a community is at equilibrium, then there will be no net change in CTI over time. Imagine we have an archipelago where the average colonization of warm-adapted species is larger than the average colonization of cold-adapted species, then over time the archipelago will reach an equilibrium with stable colonization/extinction dynamics where the average CTI is stable over time. Once it is stable, then if there is a temporal trend in colonization rates, the CTI will change until a new equilibrium is reached (if it is reached).

      For our system, the question then is whether we can assume that the system is or has ever been at equilibrium. If it is not at equilibrium, then CTI can shift simply by having differential colonization (or extinction) rates for warm-adapted versus cold-adapted species. If the system is at equilibrium (at the beginning of the study), then CTI will only shift if there is a temporal change or trend in colonization or extinction rates.
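
      The equilibrium argument above can be illustrated with a minimal sketch of a two-state colonization-extinction model. This is not the authors' occupancy model: the species thermal indices, the rates and the helper names (`occupancy_path`, `cti_series`) are hypothetical values and functions chosen only for illustration. Under constant rates, expected occupancy converges to c / (c + e), so an occupancy-weighted CTI stays flat once the system is at equilibrium and shifts only if the system starts away from equilibrium or if the rates themselves trend over time.

```python
def occupancy_path(psi0, col, ext, years, col_trend=0.0, ext_trend=0.0):
    """Expected occupancy over time: psi[t+1] = psi[t]*(1-e) + (1-psi[t])*c."""
    psi, path = psi0, [psi0]
    for t in range(years):
        # Rates may drift linearly with time; clamp them to valid probabilities.
        c = min(max(col + col_trend * t, 0.0), 1.0)
        e = min(max(ext + ext_trend * t, 0.0), 1.0)
        psi = psi * (1.0 - e) + (1.0 - psi) * c
        path.append(psi)
    return path


def equilibrium(col, ext):
    """Equilibrium occupancy under constant colonization and extinction rates."""
    return col / (col + ext)


# Hypothetical two-species community (illustrative numbers, not from the study).
STI = {"warm": 20.0, "cold": 15.0}   # species thermal index, degrees C
COL = {"warm": 0.30, "cold": 0.10}   # baseline colonization probability
EXT = {"warm": 0.10, "cold": 0.10}   # baseline extinction probability
NO_TREND = {"warm": 0.0, "cold": 0.0}


def cti_series(psi0, col_trend, ext_trend, years=10):
    """Occupancy-weighted community temperature index for each year."""
    paths = {
        sp: occupancy_path(psi0[sp], COL[sp], EXT[sp], years,
                           col_trend[sp], ext_trend[sp])
        for sp in STI
    }
    series = []
    for t in range(years + 1):
        total = sum(paths[sp][t] for sp in STI)
        series.append(sum(paths[sp][t] * STI[sp] for sp in STI) / total)
    return series


eq = {sp: equilibrium(COL[sp], EXT[sp]) for sp in STI}

# (a) At equilibrium with constant rates: CTI does not change.
cti_equilibrium = cti_series(eq, NO_TREND, NO_TREND)

# (b) Not at equilibrium, constant but differential rates: CTI shifts.
cti_transient = cti_series({"warm": 0.5, "cold": 0.5}, NO_TREND, NO_TREND)

# (c) At equilibrium, but rates trend over time: CTI shifts.
cti_trend = cti_series(eq,
                       col_trend={"warm": 0.02, "cold": 0.0},
                       ext_trend={"warm": 0.0, "cold": 0.02})

for label, series in [("equilibrium", cti_equilibrium),
                      ("transient", cti_transient),
                      ("rate trend", cti_trend)]:
    print(f"{label:12s} CTI start={series[0]:.3f} end={series[-1]:.3f}")
```

      In this sketch, scenarios (b) and (c) both raise CTI, which is why a temporal trend in the rates is the more conservative signal when the equilibrium status of the system is unknown.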

      Habitat fragmentation can affect biomes for decades after dam formation. The “relaxation effect” (Gonzalez, 2000) refers to the fact that the continent acts as a potential species pool for island communities. Under relaxation, some species will be filtered out over time, mainly through the selective extinction of species that are highly sensitive to fragmentation. Meanwhile, for a 100-hectare patch, it takes about ten years to lose 50% of bird species; the smaller the patch area, the shorter the time required (Ferraz et al., 2003; Haddad et al., 2015). This study was conducted 50 to 60 years after the formation of the TIL, so the system has a high probability of having reached “equilibrium” through the “relaxation effect” (Si et al., 2014). We have no way of knowing exactly whether the system is truly at “equilibrium”. Thus, testing for changing rates of colonization-extinction over time is actually a much stronger test of thermophilization, which makes our inference more robust.

      We add a note to the legend of Figure 1 on Lines 781-786:

      “CTI can also change simply due to differential colonization-extinction rates by thermal affinity if the system is not at equilibrium prior to the study. In our study system, we have no way of knowing whether our island system was at equilibrium at the onset of the study; thus, focusing on changing rates of colonization-extinction over time presents a much stronger test of thermophilization.”

      We hope this statement can make it clear. Thank you again for this meaningful question.

      Another potential weakness is that fragmentation is not clearly defined. Generally, fragmentation sensu lato involves both loss of habitat area and changes in the spatial structure of habitats (i.e. fragmentation per se). Here, both area and isolation are considered, which may be slightly confusing for the readers if not properly defined.

      Thank you for reminding us of that. Habitat fragmentation in this study involves both habitat loss and fragmentation per se. We have clarified the general definition in the Introduction on Lines 61-63:

      “Habitat fragmentation, usually defined as the shifts of continuous habitat into spatially isolated and small patches (Fahrig, 2003), in particular, has been hypothesized to have interactive effects with climate change on community dynamics.”

      Reviewer #2 (Public Review):

      Summary:

      This study addresses whether bird community reassembly in time is related to climate change by modelling a widely used metric, the community temperature index (CTI). The authors first computed the temperature index of 60 breeding bird species thanks to distribution atlases and climatic maps, thus obtaining a measure of the species realized thermal niche.

      These indices were aggregated at the community level, using 53 survey transects of 36 islands (repeated for 10 years) of the Thousand Islands Lake, eastern China. Any increment of this CTI (i.e. thermophilization) can thus be interpreted as a community reassembly caused by a change in climate conditions (given no confounding correlations).

      Using a mix of Bayesian and frequentist mixed-effect models, the authors show an increment of CTI at the island level, driven by both extinction (or emigration) of cold-adapted species and colonization of newly adapted warm-adapted species. Less isolated islands displayed higher colonization and extinction rates, confirming that dispersal constraints (created by habitat fragmentation per se) on colonization and emigration are the main determinants of thermophilization. The authors also had the opportunity to test for habitat amount (here island size). They show that the lack of microclimatic buffering resulting from less forest amount (a claim backed by understory temperature data) exacerbated the rates of cold-adapted species extinction while fostering the establishment of warm-adapted species.

      Overall these findings are important to range studies as they reveal the local change in affinity to the climate of species comprising communities while showing that the habitat fragmentation VS amount distinction is relevant when studying thermophilization. As is, the manuscript lacks a wider perspective about how these results can be fed into conservation biology, but would greatly benefit from it. Indeed, this study shows that in a fragmented reserve context, habitat amount is very important in explaining trends of loss of cold-adapted species, hinting that it may be strategic to prioritize large habitats to conserve such species. Areas of diverse size may act as stepping stones for species shifting range due to climate change, with small islands fostering the establishment of newly adapted warm-adapted species while large islands act as refugia for cold-adapted species. This study also shows that the removal of dispersal constraints with low isolation may help species relocate to the best suitable microclimate in a heterogenous reserve context.

      Thank you very much for your valuable feedback. We greatly appreciate your recognition of the scientific question, the extensive dataset and the diverse approach. In particular, you provided constructive suggestions and examples on how to extend the results to conservation guidance. This is something we can’t ignore in the manuscript. We have added a paragraph to the end of the Discussion, stating how our results can inform conservation, on Lines 339-347:

      ‘Overall, our findings have important implications for conservation practices. First, we confirmed the role of isolation in limiting range shifting. Better connected landscapes should be developed to remove dispersal constraints and facilitate species’ relocation to the best suitable microclimate. Second, small patches can foster the establishment of newly adapted warm-adapted species while large patches can act as refugia for cold-adapted species. Therefore, preserving patches of diverse sizes can act as stepping stones or shelters in a warming climate depending on the thermal affinity of species. These insights are an important supplement to the previous emphasis on the role of habitat diversity in fostering (Richard et al., 2021) or reducing (Gaüzère et al., 2017) community-level climate debt.’

      Strength:

      The strength of the study lies in its impressive dataset of bird resurveys, which covers 10 years of continued warming (as evidenced by weather data), 60 species in 36 islands of varying size and isolation, perfect for disentangling habitat fragmentation and habitat amount effects on communities. This distinction allows us to test very different processes mediating thermophilization; island area, linked to microclimatic buffering, explained rates for a variety of species. Dispersal constraints due to fragmentation were harder to detect but confirm that fragmentation does slow down thermophilization processes.

      This study is a very good example of how the expected range shift at the biome scale of the species materializes in small fragmented regions. Specifically, the regional dynamics the authors show are analogous to what processes are expected at the trailing and colonizing edge of a shifting range: warmer and more connected places display the fastest turnover rates of community reassembly. The authors also successfully estimated extinction and colonization rates, allowing a more mechanistic understanding of CTI increment, being the product of two processes.

      The authors showed that regional diversity and CTI computed only by occurrences do not respond in 10 years of warming, but that finer metrics (abundance-based, or individual islands considered) do respond. This highlights the need to consider a variety of case-specific metrics to address local or regional trends. Figure Appendix 2 is a much-appreciated visualization of the effect of different data sources on Species thermal Index (STI) calculation.
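
      The difference between occurrence-based and abundance-based CTI noted above can be made concrete with a minimal sketch. The STI values and counts below are invented for illustration and are not taken from the study; the function names are likewise hypothetical. With an unchanged species list, the occurrence-based index cannot move, whereas the abundance-weighted index responds to a shift in relative abundance towards warm-adapted species.

```python
def cti_occurrence(sti, counts):
    """Mean STI over the species that are present (count > 0)."""
    present = [s for s, n in zip(sti, counts) if n > 0]
    return sum(present) / len(present)


def cti_abundance(sti, counts):
    """Mean STI weighted by the number of individuals of each species."""
    return sum(s * n for s, n in zip(sti, counts)) / sum(counts)


# Hypothetical community of four species (STI in degrees C).
sti = [14.0, 16.0, 20.0, 22.0]
counts_early = [10, 10, 5, 5]   # cold-adapted species dominate
counts_late = [4, 6, 12, 8]     # same species list, warm-adapted species dominate

occ_early = cti_occurrence(sti, counts_early)
occ_late = cti_occurrence(sti, counts_late)
abu_early = cti_abundance(sti, counts_early)
abu_late = cti_abundance(sti, counts_late)

print(f"occurrence-based CTI: {occ_early:.2f} -> {occ_late:.2f}")
print(f"abundance-based CTI:  {abu_early:.2f} -> {abu_late:.2f}")
```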

      The methods are long and diverse, but they are documented enough so that an experienced user with the use of the provided R script can follow and reproduce them.

      Thank you very much for your profound Public Review. We greatly appreciate your recognition of the scientific question, the extensive dataset and the diverse approach. 

      Weaknesses:

      While the overall message of the paper is supported by data, the claims are not uniformly backed by the analysis. The trends of island-specific thermophilization are very credible (Figure 3); however, the variable nature of bird observations (partly compensated by an impressive number of resurveys) propagates a lot of error into the estimation of species-specific trends in occupancy, abundance change, and the extinction and colonization rates. This materializes into a weak relationship between STI and their respective occupancy and abundance change trends (Figure 4a, Figure 5, respectively), showing that species do not uniformly contribute to the trend observed in Figure 3. This is further shown by the results presented in Figure 6, which present in my opinion the topical finding of the study. While a lot of species rates response to island areas are significant, the isolation effect on colonization and extinction rates can only be interpreted as a trend as only a few species have a significant effect. The actual effect on the occupancy change rates of species is hard to grasp, and this trend has a potentially low magnitude (see below).

      Thank you very much for pointing out this shortcoming. The R² between STI and the species-specific occupancy trends is relatively small (R² = 0.035), whereas the R² between STI and the abundance change trends is comparatively large in the context of ecological research (R² = 0.123). The R² values between STI and the trends in colonization rate (R² = 0.083) and extinction rate (R² = 0.053) are also relatively small. A low R² indicates that the current model cannot be used for prediction and that factors other than STI may influence the species-specific occupancy trends. Nonetheless, it is important to note that the standardized coefficient estimates are not minor and the trends are significant, indicating that the species-specific responses are at least related to STI.

      The number of species that have significant interaction terms for isolation (Figure 6) is indeed low. Although there is uncertainty in the estimation of relationships, there are also consistent trends in response to habitat fragmentation of colonization of warm-adapted species and extinction of cold-adapted species. This is especially true for the effect of isolation, where on islands nearer to the mainland, warm-adapted species (15 out of 15 investigated species) increased their colonization probability at a higher rate over time, while most cold-adapted species (21 out of 23 species) increased their extinction probability at a higher rate. We now better highlight these results in the Results and Discussion.

      While being well documented, the myriad of statistical methods used by the authors hamper the interpretation of the figures, as the posterior means presented in Figure 4b and Figure 6 need to be transformed again by an inverse logit (logit⁻¹) and fed into the equation of the respective model to make sense of them. I suggest a rewording of the caption to limit its dependence on the method section for interpretation.

      Thank you for this suggestion. The value on the Y axis indicates the posterior mean of each variable (year, area, isolation and their interaction effects) extracted from the MSOM model, where the logit(extinction rate) or logit(colonization rate) was the response variable. All variables were standardized before analysis to make them comparable, so interpretation is actually quite straightforward: positive values indicate positive influence while negative values indicate negative influence. Because the goal of Figure 6 is to display the negative/positive effect, we didn’t back-transform them. Following your advice, we modified the caption of Figure 6 (now renumbered as Figure 5, following a comment from Reviewer #3, to move Figure 5 to Figure 4c). The modified title and legends of Figure 5 are on Lines 817-820:

      “Figure 5. Posterior estimates of logit-scale parameters related to cold-adapted species’ extinction rates and warm-adapted species’ colonization rates. Points are species-specific posterior means on the logit-scale, where parameters >0 indicate positive effects (on extinction [a] or colonization [b]) and parameters <0 indicate negative effects...”
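
      For readers who do want probabilities rather than logit-scale effects, the back-transformation is a one-line inverse logit. The sketch below assumes a simplified linear predictor with only an intercept and a year effect; the coefficient values and the function `yearly_rate` are hypothetical and do not come from the fitted MSOM. It shows that the sign of a logit-scale coefficient fixes the direction of change in the rate, which is all the figure is meant to convey.

```python
import math


def inv_logit(x):
    """Map a logit-scale value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def yearly_rate(intercept, beta_year, year_std):
    """Probability implied by a logit-linear predictor with a standardized year."""
    return inv_logit(intercept + beta_year * year_std)


# Hypothetical logit-scale values (not estimates from the study).
intercept = -1.0
standardized_years = [-1.5, -0.5, 0.5, 1.5]

# A positive year coefficient gives a rate that rises over time;
# a negative coefficient gives a rate that falls.
rising = [yearly_rate(intercept, 0.5, y) for y in standardized_years]
falling = [yearly_rate(intercept, -0.5, y) for y in standardized_years]

print("beta_year = +0.5:", [round(p, 3) for p in rising])
print("beta_year = -0.5:", [round(p, 3) for p in falling])
```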

      By using a broad estimate of the realized thermal niche, a common weakness of thermophilization studies is the inability to capture local adaptation in species' physiological or behavioral response to a rise in temperature. The authors however acknowledge this limitation and provide specific examples of how species ought to evade high temperatures in this study region.

      We appreciate your recognition. This is a common problem in STI studies. We hope in future studies, researchers can take more details about microclimate of species’ true habitat across regions into consideration when calculating STI. Although challenging, focusing on a smaller portion of its distribution range may facilitate achievement.

      Reviewer #3 (Public Review):

      Summary:

      Juan Liu et al. investigated the interplay between habitat fragmentation and climate-driven thermophilization in birds in an island system in China. They used extensive bird monitoring data (9 surveys per year per island) across 36 islands of varying size and isolation from the mainland covering 10 years. The authors use extensive modeling frameworks to test for a general increase in the occurrence and abundance of warm-dwelling species and vice versa for cold-dwelling species using the widely used Community Temperature Index (CTI), as well as the relationship between island fragmentation in terms of island area and isolation from the mainland on extinction and colonization rates of cold- and warm-adapted species. They found that indeed there was thermophilization happening during the last 10 years, which was more pronounced for the CTI based on abundances and less clear for the occurrence-based metric. Generally, the authors show that this is driven by an increased colonization rate of warm-dwelling and an increased extinction rate of cold-dwelling species. Interestingly, they unravel some of the mechanisms behind this dynamic by showing that warm-adapted species increased while cold-dwelling decreased more strongly on smaller islands, which is, according to the authors, due to lowered thermal buffering on smaller islands (which was supported by air temperature monitoring done during the study period on small and large islands). They argue that the increased extinction rate of cold-adapted species could also be due to lowered habitat heterogeneity on smaller islands. With regard to island isolation, they also show that both thermophilization processes (increase of warm and decrease of cold-adapted species) were stronger on islands closer to the mainland, due to closer source populations of either group on the mainland, as compared to limited dispersal (i.e. range shift potential) on more isolated islands.

      The conclusions drawn in this study are sound, and mostly well supported by the results. Only a few aspects leave open questions and could quite likely be further supported by the authors themselves thanks to their apparent extensive understanding of the study system.

      Strengths:

      The study questions and hypotheses are very well aligned with the methods used, ranging from field surveys to extensive modeling frameworks, as well as with the conclusions drawn from the results. The study addresses a complex question on the interplay between habitat fragmentation and climate-driven thermophilization which can naturally be affected by a multitude of additional factors than the ones included here. Nevertheless, the authors use a well-balanced method of simplifying this to the most important factors in question (CTI change, extinction, and colonization, together with habitat fragmentation metrics of isolation and island area). The interpretation of the results presents interesting mechanisms without being too bold on their findings and by providing important links to the existing literature as well as to additional data and analyses presented in the appendix.

      We very much appreciate your positive and constructive comments and suggestions. Thank you for your recognition of the scientific question, the modeling approach and the conclusions.

      Weaknesses:

      The metric of island isolation based on the distance to the mainland seems a bit too oversimplified as in real life the study system rather represents an island network where the islands of different sizes are in varying distances to each other, such that smaller islands can potentially draw from the species pools from near-by larger islands too - rather than just from the mainland. Thus a more holistic network metric of isolation could have been applied or at least discussed for future research. The fact, that the authors did find a signal of island isolation does support their method, but the variation in responses to this metric could hint at a more complex pattern going on in real-life than was assumed for this study.

      Thank you for this meaningful question. Isolation can be measured in different ways in the study region. We chose distance to the mainland as the measure of isolation based on the results of a previous study in our system, which provided evidence that the colonization and extinction rates of breeding bird species were best fitted by distance to the nearest mainland rather than by other distance-based measures (distance to the nearest landmass, distance to the nearest bigger landmass) (Si et al., 2014). In addition, their results showed almost identical patterns in the relationship between isolation and colonization/extinction rate (Si et al., 2014). That’s why we only selected “Distance to the mainland” in our current analysis, and we do find some consistent patterns, as expected. The plants on all islands were cleared about 60 years ago due to dam construction, with all bird species coming from the mainland as the original species pool through a process called “relaxation”. This could be the reason why distance to the nearest mainland is the best predictor.

      We agree with you that it’s still necessary to consider more aspects of “isolation” at least in discussion for future research. In our Discussion, we address these on Lines 292-299:

      “As a caveat, we only consider the distance to the nearest mainland as a measure of fragmentation, consistent with previous work in this system (Si et al., 2014), but we acknowledge that other distance-based metrics of isolation that incorporate inter-island connections could reveal additional insights on fragmentation effects. The spatial arrangement of islands, like the arrangement of habitat, can influence niche tracking of species (Fourcade et al., 2021). Future studies should take these metrics into account to thoroughly understand the influence of isolation and spatial arrangement of patches in mediating the effect of climate warming on species.”

      Further, the link between larger areas and higher habitat diversity or heterogeneity could be presented by providing evidence for this relationship. The authors do make a reference to a paper done in the same study system, but a more thorough presentation of it would strengthen this assumption further.

      Thank you very much for this question. We now add more details about the relationship between habitat diversity and heterogeneity based on a related study in the same system. The observed number of species significantly increased with increasing island area (slope = 4.42, R2 = 0.70, p < .001), as did the rarefied species richness per island (slope = 1.03, R2 = 0.43, p < .001), species density (slope = 0.80, R2 = 0.33, p = .001) and the rarefied species richness per unit area (slope = 0.321, R2 = 0.32, p = .001). We added this supporting evidence on Lines 317-321:

      “We thus suppose that habitat heterogeneity could also mitigate the loss of these relatively cold-adapted species as expected. Habitat diversity, including the observed number of species, the rarefied species richness per island, species density and the rarefied species richness per unit area, all increased significantly with island area instead of isolation in our system (Liu et al., 2020)”

      Despite the general clear patterns found in the paper, there were some idiosyncratic responses. Those could be due to a multitude of factors which could be discussed a bit better to inform future research using a similar study design.

      Thank you for these suggestions. We added a summary statement about the reasons for idiosyncratic responses on Lines 334-338:

      “Overall, these idiosyncratic responses reveal several possible mechanisms in regulating species' climate responses, including resource demands and biological interactions like competition and predation. Future studies are needed to take these factors into account to understand the complex mechanisms by which habitat loss mediates species range shifts.”

      Reviewer #1 (Recommendations For The Authors):

      (1) Figure 1: I disagree that there should be a temporal trend in colonisation/extinction dynamics.

      Thank you again for these thoughtful comments. We have explained this in detail in the response to the Public Review.

      (2) L 485-487: As explained before I disagree. I don't see why there needs to be a temporal trend in colonization and extinction.

      Thank you again for these thoughtful comments. Because we can’t guarantee that the study system has reached equilibrium, testing for changes in colonization and extinction rates over time is actually a much stronger test of thermophilization. A more detailed explanation is given in the response to the Public Review.

      (3) L 141: which species' ecological traits?

      Sorry for the confusion. The traits included continuous variables (dispersal ability, body size, body mass and clutch size) and categorical variables (diet, active layer, residence type). Specifically, we tested the correlation between STI and dispersal ability, body size, body mass and clutch size using Pearson correlation tests. We also tested the difference in STI between trait groups using the Wilcoxon signed-rank test for the three categorical variables: diet (carnivorous/omnivorous/herbivorous), active layer (canopy/mid/low), and residence type (resident species/summer visitor). There was no significant difference between any two groups for any of the three categorical variables (p > 0.2). We added these details on Lines 141-145:

      “No significant correlation was found between STI and species’ ecological traits; specifically, the continuous variables of dispersal ability, body size, body mass and clutch size (Pearson correlations for each, |r| < 0.22), and the categorical variables of diet (carnivorous/omnivorous/herbivorous), active layer (canopy/mid/low), and residence type (resident species/summer visitor)”
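The Pearson correlation used for the continuous traits can be illustrated with a minimal sketch. The function name and the trait values below are hypothetical and are not data from the study; the sketch only shows how a coefficient such as |r| < 0.22 would be obtained.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not data from the study).
sti = [9.3, 15.2, 19.9, 20.6, 24.1, 27.2]
body_mass = [110.0, 18.0, 35.0, 22.0, 60.0, 7.0]
r = pearson_r(sti, body_mass)
print(f"Pearson r between STI and body mass: {r:.2f}")
```

A coefficient close to zero would indicate that the trait carries little linear information about a species' thermal affinity.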

      (4) L 143: CTIoccur and CTIabun were not defined before.

      Because CTIoccur and CTIabun are first defined in the Methods (section 4.4), we changed the sentence to a more general statement here on Lines 147-150:

      “At the landscape scale, considering species detected across the study area, occurrence-based CTI (CTIoccur; see section 4.4) showed no trend (posterior mean temporal trend = 0.414; 95% CrI: -12.751, 13.554) but abundance-based CTI (CTIabun; see section 4.4) showed a significant increasing trend.”

      (5) Figure 4: what is the dashed vertical line? I assume the mean STI across species?

      Sorry for the unclear description. The vertical dashed line indicates the median STI of the 60 species, which separates warm-adapted from cold-adapted species. We have added these details on Lines 807-809:

      “The dotted vertical line indicates the median of STI values. Cold-adapted species are plotted in blue and warm-adapted species are plotted in orange.”
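The median split described above can be sketched as follows. The species labels and STI values are hypothetical, and assigning species that fall exactly on the median to the warm-adapted group is an assumption of this sketch, not something stated in the response.

```python
from statistics import median


def classify_by_median(sti_by_species):
    """Label each species cold- or warm-adapted relative to the median STI."""
    cutoff = median(sti_by_species.values())
    labels = {}
    for species, sti in sti_by_species.items():
        # Assumption: a species exactly at the median counts as warm-adapted.
        labels[species] = "cold-adapted" if sti < cutoff else "warm-adapted"
    return cutoff, labels


# Hypothetical STI values (degrees Celsius), for illustration only.
example = {"sp_a": 9.3, "sp_b": 18.0, "sp_c": 21.5, "sp_d": 27.2}
cutoff, labels = classify_by_median(example)
print(cutoff, labels)
```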

      (6) Figure 6: in the legend, replace 'points in blue' with 'points in blue/orange' or 'solid dots' or something similar.

      Thank you for this suggestion. We changed it to “points in blue/orange” on Line 823.

      (7) L 176-176: unclear why the interaction parameters are particularly important for explaining the thermophilization mechanism: if e.g. colonization rate of warm-adapted species is constantly higher in less isolated islands, (and always higher than the extinction rate of the same species), it means that thermophilization is increased in less isolated islands, right?

      Thank you for this question. It is related to the earlier question of why we use temporal trends in colonization/extinction rates to test for thermophilization mechanisms. Changes in colonization and extinction rates over time are actually a much stronger test of thermophilization (for more details, see the responses to the Public Review and to Recommendations 1 and 2).

      Based on this, the two main driving processes of thermophilization mechanism include the increasing colonization rate of warm-adapted species and the increasing extinction rate of cold-adapted species with year. The interaction effect between island area (or isolation) and year on colonization rate (or extinction rate) can tell us how habitat fragmentation mediates the year effect. For example, if the interaction term between year and isolation is negative for a warm-adapted species that increased in colonization rate with year, it indicates that the colonization rate increased faster on less isolated islands. This is a signal of a faster thermophilization rate on less-isolated islands.
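The role of the interaction term can be made concrete with a small sketch. It assumes a linear predictor on the logit scale of the form b_year * year + b_iso * isolation + b_int * year * isolation with standardized predictors; the coefficient values are hypothetical and the full MSOM is not reproduced here.

```python
def year_slope(beta_year, beta_interaction, isolation_z):
    """Logit-scale change in colonization per unit of standardized year
    at a given standardized isolation value."""
    return beta_year + beta_interaction * isolation_z


# Hypothetical coefficients for one warm-adapted species.
beta_year = 0.5          # colonization increases with year
beta_interaction = -0.3  # negative year x isolation interaction

near = year_slope(beta_year, beta_interaction, isolation_z=-1.0)  # less isolated
far = year_slope(beta_year, beta_interaction, isolation_z=1.0)    # more isolated
print(f"year slope, less isolated island: {near:.2f}")
print(f"year slope, more isolated island: {far:.2f}")
```

With a negative interaction, the year slope is larger on the less isolated island, which is the pattern interpreted above as faster thermophilization near the mainland.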

      (8) L201-203: this is only little supported by the results that actually show that there is NO significant interaction for most species.

      Thank you for this comment. Although most species showed a non-significant interaction effect, the overall trend is relatively consistent, especially for the effect of isolation. To emphasize the “trend” rather than a “significant effect”, we reworded this sentence more rigorously on Lines 205-208:

      “We further found that habitat fragmentation influences two processes of thermophilization: colonization rates of most warm-adapted species tended to increase faster on smaller and less isolated islands, while the loss rates of most cold-adapted species tended to be exacerbated on less isolated islands.”

      (9) Section 2.3: can't you have a population-level estimate? I struggled a bit to understand all the parameters of the MSOM (because of my lack of statistical/mathematical proficiency) so I cannot provide more advice here.

      Thank you for this suggestion. We think you are referring to the overall estimate across all species for each variable. From the MSOM, we obtain a standardized estimate of every variable (year, area, isolation, interaction) for each species separately. Because we are interested in the divergent or consistent responses among species, we did not go further and calculate a population-level estimate.

      (10) L 291: a dot is missing.

      Done. Thank you for your correction.

      (11) L 305, 315: a space is missing

      Done

      (12) L 332: how were these islands selected?

      Thank you for this question. The 36 islands were selected along a gradient of island area and isolation, spreading across the whole lake region. The selection ensured that there was no significant correlation between island area and isolation (Pearson correlation coefficient r = -0.21, p = 0.21). The 7 largest of the 36 islands are also the only islands larger than 30 ha in the whole lake region. We have modified this in the Methods on Lines 360-363:

      “We selected 36 islands according to a gradient of island area and isolation with a guarantee of no significant correlation between island area and isolation (Pearson r = -0.21, p = 0.21). For each island, we calculated island area and isolation (measured in the nearest Euclidean distance to the mainland) to represent the degree of habitat fragmentation.”

      (13) L 334: "Distance to the mainland" was used as a metric of isolation, but elsewhere in the text you argue that the observed thermophilization is due to interisland movements. It sounds contradictory. Why not include the average or shortest distance to the other islands?

      Thank you very much for raising this comment. Yes, “Distance to the mainland” was the only metric we used for isolation. We carefully checked the manuscript to find where “inter-island movement” appears and causes the misunderstanding. It must come from Discussion 3.1 (on Lines 217-221): “Notably, when tested on the landscape scale (versus on individual island communities), only the abundance-based thermophilization trend was significant, indicating thermophilization of bird communities was mostly due to inter-island occurrence dynamics, rather than exogenous community turnover.”

      Sorry, the word “inter-island” is not exactly what we want to express here, we wanted to express that “the thermophilization was mostly due to occurrence dynamics within the region, rather than exogenous community turnover outside the region”. We have changed the sentence in Discussion part on Lines 217-221:

      “Notably, when tested on the landscape scale (versus on individual island communities), only the abundance-based thermophilization trend was significant, indicating thermophilization of bird communities was mostly due to occurrence dynamics within the region, rather than exogenous community turnover outside the region.”

      Besides, we would like to explain why we use distance to the mainland. We chose distance to the mainland as the measure of isolation based on the results of a previous study in our system, which provided evidence that the colonization and extinction rates of breeding bird species were best fitted by distance to the nearest mainland rather than by other distance-based measures (distance to the nearest landmass, distance to the nearest bigger landmass) (Si et al., 2014). In addition, their results showed almost identical patterns in the relationship between isolation and colonization/extinction rate (Si et al., 2014). That’s why we only selected “Distance to the mainland” in our current analysis, and we do find some consistent patterns, as expected. The plants on all islands were cleared about 60 years ago due to dam construction, with all bird species coming from the mainland as the original species pool through a process called “relaxation”. This may be the reason why distance to the nearest mainland is the best predictor.

      In the Discussion, we added the following text on other measures of isolation on Lines 292-299:

      “As a caveat, we only consider the distance to the nearest mainland as a measure of fragmentation, consistent with previous work in this system (Si et al., 2014), but we acknowledge that other distance-based metrics of isolation that incorporate inter-island connections could reveal additional insights on fragmentation effects. The spatial arrangement of islands, like the arrangement of habitat, can influence niche tracking of species (Fourcade et al., 2021). Future studies should take these metrics into account to thoroughly understand the influence of isolation and spatial arrangement of patches in mediating the effect of climate warming on species.”

      (14) L 347: you write 'relative' abundance but this measure is not relative to anything. Better write something like "we based our abundance estimate on the maximum number of individuals recorded across the nine annual surveys".

      Thank you for this suggestion, we have changed the sentence on Lines 377-379:

      “We based our abundance estimate on the maximum number of individuals recorded across the nine annual surveys.”
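As a minimal sketch of this abundance estimate, the maximum count across a year's surveys can be taken per species and island. The function name and the counts are hypothetical.

```python
def abundance_estimate(counts_per_survey):
    """Maximum number of individuals of one species recorded on one island
    across the surveys of a single year."""
    if not counts_per_survey:
        raise ValueError("at least one survey count is required")
    return max(counts_per_survey)


# Hypothetical counts from nine annual surveys of one species on one island.
counts = [0, 2, 5, 3, 1, 0, 4, 5, 2]
estimate = abundance_estimate(counts)
print(estimate)
```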

      (15) L 378: shouldn't the formula for CTIoccur be (equation in latex format):

      CTI_{occur, j, t} = \frac{\sum_{i=1}^{N_{j,t}} STI_{i}}{N_{j,t}}

      Where Nj,t is the total number of species surveyed in the community j in year t

      Thank you very much for this careful check, we have revised it on Lines 415, 417:

      “where Nj,t is the total number of species surveyed in the community j in year t.”
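The occurrence-based formula above can be sketched in a few lines. The abundance-based version is shown as an abundance-weighted mean of STI, which is the usual definition but is an assumption here because section 4.4 is not quoted in this response; species labels and values are hypothetical.

```python
def cti_occurrence(sti, present_species):
    """Occurrence-based CTI: mean STI over the N_{j,t} species recorded
    in community j in year t."""
    values = [sti[s] for s in present_species]
    if not values:
        raise ValueError("community has no recorded species")
    return sum(values) / len(values)


def cti_abundance(sti, abundance):
    """Abundance-based CTI as an abundance-weighted mean STI
    (assumed weighting; see the lead-in)."""
    total = sum(abundance.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return sum(sti[s] * n for s, n in abundance.items()) / total


# Hypothetical community: STI in degrees Celsius, abundance in individuals.
sti = {"sp_a": 10.0, "sp_b": 20.0, "sp_c": 24.0}
abundance = {"sp_a": 1, "sp_b": 3, "sp_c": 6}

cti_occ = cti_occurrence(sti, set(abundance))
cti_abu = cti_abundance(sti, abundance)
print(f"CTIoccur = {cti_occ:.1f}, CTIabun = {cti_abu:.1f}")
```

In this toy community the abundance-based value is higher than the occurrence-based one because the warmest species is also the most abundant, which shows how the two metrics can diverge.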

      Reviewer #2 (Recommendations For The Authors):

      (1) Line 76: "weakly"

      Done. Thank you for your correction.

      (2) Line 98: I suggest a change to this sentence: "For example, habitat fragmentation renders habitats to be too isolated to be colonized, causing sedentary butterflies to lag more behind climate warming in Britain than mobile ones"

      Thank you for this modification, we have changed it on Lines 99-101.

      (3) Line 101: remove either "higher" or "increasing"

      Done, we have removed “higher”. Thank you for this advice.

      (4) Line 102: "benefiting from near source of"

      Done.

      (5) Line 104: "emigrate"

      Done.

      (6) Introduction: I suggest making it more explicit what process you describe under the word "extinction". At first read, I thought you were only referring to the dieback of individuals, but you also included emigration as an extinction process. It also needs to be reworded in Fig 1 caption.

      Thank you for this suggestion. Yes, we can’t distinguish in our system between local extinction and emigration. The observed “extinction” of cold-adapted species over 10 years may involve two processes that usually occur in order: first “emigration” and then, if individuals can neither emigrate nor withstand the conditions, “real local dieback”. As you suggested, this should also be stated in the legend of Figure 1. We have modified the legend on Lines 780-781:

      “Note that extinction here may include both the emigration of species and then the local extinction of species.”

      There is also one part in the Discussion that mentions this on Lines 287-291: “While we cannot truly distinguish in our system between local extinction and emigration, we suspect that given two islands equal except in isolation, and if both lose suitability due to climate change, individuals can easily emigrate from the island nearer to the mainland, while individuals on the more isolated island would be more likely to be trapped in place until the species went locally extinct due to a lack of rescue”.

      (7) I also suggest differentiating habitat fragmentation (distances between islands) and habitat amount (area) as explained in Fahrig 2013 (Rethinking patch size and isolation effects: the habitat amount hypothesis) and her latter paper. This will help the reader what lies behind the general trend of fragmentation: fragmentation per se and habitat amount reduction.

      Thank you for this suggestion! Habitat fragmentation in this study involves both habitat loss and fragmentation per se. We now give a general definition of habitat fragmentation on Lines 61-63:

      “Habitat fragmentation, usually defined as the shifts of continuous habitat into spatially isolated and small patches (Fahrig, 2003), in particular, has been hypothesized to have interactive effects with climate change on community dynamics.”

      (8) Line 136: is the "+-" refers to the standard deviation or confidence interval, I suggest being explicit about it once at the start of the results.

      Thank you for the reminder. The "+-" refers to the standard deviation (SD). The modified sentence is now on Lines 135-139:

      “The number of species detected in surveys on each island across the study period averaged 13.37 ± 6.26 (mean ± SD) species, ranging from 2 to 40 species, with an observed gamma diversity of 60 species. The STI of all 60 birds averaged 19.94 ± 3.58 ℃ (mean ± SD) and ranged from 9.30 ℃ (Cuculus canorus) to 27.20 ℃ (Prinia inornata), with a median of 20.63 ℃ (Appendix 1—figure 2; Appendix 1—figure 3).”

      (9) Line 143: please specify the unit of thermophilization.

      The unit of thermophilization rate is the change in degrees per unit year, because in all analyses predictor variables were z-transformed to make their effects comparable. We have added this on Line 151:

      “When measuring CTI trends for individual islands (expressed as °/ unit year)”

      (10) Line 289: check if no word is missing from the sentence.

      The sentence is: “In our study, a large proportion (11 out of 15) of warm-adapted species increasing in colonization rate and half (12 out of 23) of cold-adapted species increasing in extinction rate were changing more rapidly on smaller islands.”

      Because we have already defined, in both the Methods and the Results, the species included in testing the third prediction (15 warm-adapted species that increased in colonization rate and 23 cold-adapted species that increased in extinction rate), we removed this redundant information and rewrote the sentence as follows on Lines 300-302:

      “In our study, the colonization rate of a large proportion of warm-adapted species (11 out of 15) and the extinction rate of half of cold-adapted species (12 out of 23) were increasing more rapidly on smaller islands.”

      (11) Line 319: I really miss a concluding statement of your discussion, your results are truly interesting and deserve to be summarized in two or three sentences, and maybe a perspective about how it can inform conservation practices in fragmented settings.

      Thank you for this profound suggestion both in Public Review and here. We have added a paragraph to the end of the Discussion, stating how our results can inform conservation, on Lines 339-347:

      “Overall, our findings have important implications for conservation practices. First, we confirmed the role of isolation in limiting range shifting. Better connected landscapes should be developed to remove dispersal constraints and facilitate species’ relocation to the most suitable microclimate. Second, small patches can foster the establishment of newly adapted warm-adapted species while large patches can act as refugia for cold-adapted species. Therefore, preserving patches of diverse sizes can act as stepping stones or shelters in a warming climate depending on the thermal affinity of species. These insights are an important supplement to the previous emphasis on the role of habitat diversity in fostering (Richard et al., 2021) or reducing (Gaüzère et al., 2017) community-level climate debt.”

      (12) Line 335: I suggest " ... the islands has been protected by forbidding logging, ..."

      Thanks for this wonderful suggestion. Done. The new sentence is now on Lines 365-366:

      “Since lake formation, the islands have been protected by forbidding logging, allowing natural succession pathways to occur.”

      (13) Line 345: this speed is unusually high for walking, check the speed.

      Sorry for the carelessness, it should be 2.0 km/h. It has been corrected on Lines 375-376:

      “In each survey, observers walked along each transect at a constant speed (2.0 km/h) and recorded all the birds seen or heard on the survey islands.”

      (14) Line 351: you could add a sentence explaining why that choice of species exclusion was made. Was made from the start of the monitoring program or did you exclude species afterward?

      We excluded them afterward. We excluded non-breeding species, nocturnal and crepuscular species, high-flying species passing over the islands (e.g., raptors, swallows) and strongly water-associated birds (e.g., cormorants). These species were recorded during monitoring: some were on the shore of an island or flying high above it, and some nocturnal species were spotted only by accident.

      We described more details about how to exclude species on Lines 379-387:

      “We excluded non-breeding species, nocturnal and crepuscular species, high-flying species passing over the islands (e.g., raptors, swallows) and strongly water-associated birds (e.g., cormorants) from our record. First, our surveys were conducted during the day, so some nocturnal and crepuscular species, such as the owls and nightjars were excluded for inadequate survey design. Second, wagtail, kingfisher, and water birds such as ducks and herons were excluded because we were only interested in forest birds. Third, birds like swallows, and eagles who were usually flying or soaring in the air rather than staying on islands, were also excluded as it was difficult to determine their definite belonging islands. Following these operations, 60 species were finally retained.”

      (15) Line 370: I suggest adding the range and median of STI.

      Thanks for this good suggestion. The range and mean ± SD of STI were already given in the Results, and we have now added the median there as well. The new sentence is in the Results on Lines 137-139:

      “The STI of all 60 birds averaged 19.94 ± 3.58 ℃ (mean ± SD) and ranged from 9.30 ℃ (Cuculus canorus) to 27.20 ℃ (Prinia inornata), with a median of 20.63 ℃ (Appendix 1—figure 2; Appendix 1—figure 3).”

      (16) Figure 4.b: Is it possible to be more explicit about what that trend is? the coefficient of the regression Logit(ext/col) ~ year + ...... ?

      Thank you for this advice. Your understanding is right: we can interpret it as the coefficient of the ‘year’ effect in the model. More specifically, the ‘year’ effect or temporal trend here is the ‘posterior mean’ of the posterior distribution of ‘year’ in the MSOM (Multi-species Occupancy Model), in the context of the Bayesian framework. We modified this sentence on Lines 811-813:

      “Each point in (b) represents the posterior mean estimate of year in colonization, extinction or occupancy rate for each species.”

      (17) Figure 6: is it possible to provide an easily understandable meaning of the prior presented in the Y axis? E.g. "2 corresponds to a 90% probability for a species to go extinct at T+1", if not, please specify that it is the logit of a probability.

      Thank you for this question both in Public Review and here. The value on the Y axis indicates the posterior mean of each variable (year, area, isolation and their interaction effects) extracted from the MSOM model, where the logit(extinction rate) or logit(colonization rate) was the response variable. All variables were standardized before analysis to make them comparable. So, positive values indicate positive influence while negative values indicate negative influence. Because the goal of Figure 6 is to display the negative/positive effect, we didn’t back-transform them. Following your advice, we thus modified the caption of Figure 6 (now renumbered as Figure 5, following a comment from Reviewer #3, to move Figure 5 to Figure 4c). The modified title and legends of Figure 5 are on Lines 817-820:

      “Figure 5. Posterior estimates of logit-scale parameters related to cold-adapted species’ extinction rates and warm-adapted species’ colonization rates. Points are species-specific posterior means on the logit-scale, where parameters >0 indicate positive effects (on extinction [a] or colonization [b]) and parameters <0 indicate negative effects.”
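To show why a logit-scale coefficient does not map onto a single probability, the following sketch back-transforms an effect at different baselines. The function names, baseline probabilities and effect size are hypothetical and are not estimates from the MSOM.

```python
import math


def inv_logit(x):
    """Inverse logit: map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def probability_after_shift(baseline_probability, logit_effect):
    """Probability obtained when a logit-scale effect is added to a baseline."""
    baseline_logit = math.log(baseline_probability / (1.0 - baseline_probability))
    return inv_logit(baseline_logit + logit_effect)


# The same hypothetical logit-scale effect applied to two baseline extinction
# probabilities gives different changes on the probability scale.
effect = 1.0
for baseline in (0.1, 0.5):
    shifted = probability_after_shift(baseline, effect)
    print(f"baseline {baseline:.2f} -> {shifted:.2f}")
```

Because the change in probability depends on the baseline, reporting only the sign and size of the logit-scale parameter, as in the revised caption, avoids implying one fixed probability per value.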

      (18) Line 773: points in blue only are significant? I suggest "points in color".

      Thank you for your reminder. Points in blue and orange are all significant. We have revised the sentence on Line 823:

      “Points in blue/orange indicate significant effects.”

      These are all small suggestions that may help you improve the readability of the final manuscript. I warmly thank you for the opportunity to review this impressive study.

      We appreciate your careful review and profound suggestions. We believe these modifications will improve the final manuscript.

      Reviewer #3 (Recommendations For The Authors):

      I have a few minor suggestions for paper revision for your otherwise excellent manuscript. I wish to emphasize that it was a pleasure to read the manuscript and that I especially enjoyed a very nice flow throughout the ms from a nicely rounded introduction that led well into the research questions and hypotheses all the way to a good and solid discussion.

      Thank you very much for your review and recognition. We have carefully checked all recommendations and addressed them in the manuscript.

      (1) L 63: space before the bracket missing and I suggest moving the reference to the end of the sentence (directly after habitat fragmentation does not seem to make sense).

      Thank you very much for this suggestion. The missed space was added, and the reference has been moved to the end of the sentence. We also add a general definition of habitat fragmentation. The new sentence is on Lines 61-64:

      “Habitat fragmentation, usually defined as the shifts of continuous habitat into spatially isolated and small patches (Fahrig, 2003), in particular, has been hypothesized to have interactive effects with climate change on community dynamics.”

      (2) L 102: I suggest to write "benefitting ..." instead.

      Done.

      (3) L 103: higher extinction rates (add "s").

      Done.

      (4) L 104: this should probably say "emigrate" and "climate warming".

      Done.

      (5) L 130-133: this is true for emigration (more isolated islands show slower emigration). But what about increased local extinction, especially for small and isolated islands? Especially since you mentioned later in the manuscript that often emigration and extinction are difficult to identify or differentiate. Might be worth a thought here or somewhere in the discussion?

      Thank you for this good question. I would like to answer it in two aspects:

      Yes, we can’t distinguish between true local extinction and emigration. The observed local “extinction” of cold-adapted species over 10 years may involve two processes that usually occur in order: first “emigration” and then, if individuals can neither emigrate nor withstand the conditions, “real local dieback”. Over 10 years, cold-adapted species on remote islands would have to tolerate the conditions before going truly extinct because of dispersal limitation, whereas on less isolated islands the same species could easily emigrate and find more suitable habitat. Consequently, it is harder to observe “extinction” on more isolated islands and easier to observe apparent extinction, caused by emigration, on less isolated islands. As a result, the observed extinction rate is expected to increase more sharply on less remote islands and more moderately for the same species on remote islands.

      We have modified the legend of Figure 1 on Lines 780-781:

      “Note that extinction here may include both the emigration of species and then the local extinction of species.”

      There is also one part in the Discussion that mentions this on Lines 287-291: “While we cannot truly distinguish in our system between local extinction and emigration, we suspect that given two islands equal except in isolation, if both lose suitability due to climate change, individuals can easily emigrate from the island nearer to the mainland, while individuals on the more isolated island would be more likely to be trapped in place until the species went locally extinct due to a lack of rescue”.

      Besides, you asked “But what about increased local extinction, especially for small and isolated islands?” I think you are referring to the “high extinction rate per se on remote islands”. We want to test the “trend” of extinction rate on a temporal scale, rather than the extinction rate per se on a spatial scale. Even if species have a high extinction rate on remote islands, that rate can still change more slowly over time.

      I hope these answers solve the problem.

      (6) L 245: I think this is the first time the acronym appears in the ms (as the methods come after the discussion), so please write the full name here too.

      Thank you for pointing this out. I realized “Thousand Island Lake” appears for the first time in the last paragraph of the Introduction. So we added “TIL” there on Lines 108-109:

      “Here, we use 10 years of bird community data in a subtropical land-bridge island system (Thousand Island Lake, TIL, China, Figure 2) during a period of consistent climatic warming.”

      (7) L 319: this section could end with a summary statement on idiosyncratic responses (i.e. some variation in the responses you found among the species) and the potential reasons for this, such as e.g. the role of other species traits or interactions, as well as other ways to measure habitat fragmentation (see main comments in public review).

      Thank you for this suggestion both in Public Review and here. We added a summary statement about the reasons for idiosyncratic responses on Lines 334-338:

      “Overall, these idiosyncratic responses reveal several possible mechanisms in regulating species' climate responses, including resource demands and biological interactions like competition and predation. Future studies are needed to take these factors into account to understand the complex mechanisms by which habitat loss mediates species range shifts.”

      We emphasize only “habitat loss” here, because idiosyncratic responses mainly come from the mediating effect of habitat loss. For the mediating effect of isolation, the response is relatively consistent (see Page 8, Lines 183-188): “In particular, the effect of isolation on temporal dynamics of thermophilization was relatively consistent across cold- and warm-adapted species (Figure 5a, b); specifically, on islands nearer to the mainland, warm-adapted species (15 out of 15 investigated species) increased their colonization probability at a higher rate over time, while most cold-adapted species (21 out of 23 species) increased their extinction probability at a higher rate”.

      (8) L 333: what about the distance to other islands? it's more of a network than a island-mainland directional system (Figure 2). You could address this aspect in the discussion.

      Thank you for this good question again. Isolation can be measured in different ways in the study region. We chose distance to the mainland because it was the best predictor of colonization and extinction rate of breeding birds in the study region, and produced results similar to the other distance-based measures, including distance to the nearest landmass and distance to the nearest larger landmass (Si et al., 2014). We still agree with you that it is necessary to consider more aspects of “isolation”, at least in the Discussion, for future research. In the Discussion, we addressed these on Lines 292-299. For more details, refer to the response to Public Review.

      (9) Figure 2: Is B1 one of the sampled islands? It is clearly much larger than most other islands and I think it could thus serve as an important population source for many of the adjacent smaller islands? Thus, the nearest neighbor distance to B1 could be as important in addition to the distance to the mainland?

      Yes, B1 is one of the sampled islands and is also the biggest island. In previous research in our study system, we tried distance to the nearest landmass, to the nearest larger landmass, and to the nearest mainland; they produced similar results (for more details, refer to the response to Public Review). We agree with you that the nearest neighbor distance to B1 could be a potentially important measure, but this needs further research. In our Discussion, we address these on Lines 292-299:

      “As a caveat, we only consider the distance to the nearest mainland as a measure of fragmentation, consistent with previous work in this system (Si et al., 2014), but we acknowledge that other distance-based metrics of isolation that incorporate inter-island connections could reveal additional insights on fragmentation effects. The spatial arrangement of islands, like the arrangement of habitat, can influence niche tracking of species (Fourcade et al., 2021). Future studies should take these metrics into account to thoroughly understand the influence of isolation and spatial arrangement of patches in mediating the effect of climate warming on species.”

      (10) L 345: 20km/h walking seems impressively fast? I assume this is a typo.

      Sorry for the carelessness; it should be 2.0 km/h. It has been corrected on Lines 375-376:

      “In each survey, observers walked along each transect at a constant speed (2.0 km/h) and recorded all the birds seen or heard on the survey islands.”

      (11) L 485: I had difficulties fully understanding the models that were fitted here and could not find them in the codes you provided (which were otherwise very well documented!). Could you explain this modeling step in a bit more detail?

      Thank you for your recognition! According to Line 485 in the online PDF version (Methods part 4.6.3), it says: “An increasing colonization trend of warm-adapted species and increasing extinction trend of cold-adapted species are two main expected processes that cause thermophilization (Fourcade et al., 2021). To test our third prediction about the mediating effect of habitat fragmentation, we selected warm-adapted species that had an increasing trend in colonization rate (positive year effect in colonization rate) and cold-adapted species that had an increasing extinction rate (positive year effect in extinction rate)…..”

      We carefully checked the code at the Figshare link and found that the MOSM JAGS code had not been uploaded. We apologize for that. It can now be found in the document [MOSM.R] at https://figshare.com/s/7a16974114262d280ef7. We hope the code, together with the modeling process described in section 4.5 of the Methods, helps clarify the whole modeling process. In addition, we would like to explain how we determined the temporal trend in colonization or extinction of each species, as referred to on Line 485. Let’s take the model of species-specific extinction rate as an example:

      In this model, “Island” was a random effect and “Year” was added as a random slope, thus allowing the “year effect” (that is, the temporal trend) of the extinction rate of a species to vary with “island”. Further, the interaction effect between island variables (isolation, area) was added to test whether the “year effect” was related to island area or isolation.
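
      A minimal sketch of this structure, assuming a logit link, and using illustrative symbols that are not taken from the MOSM.R code, might look like the following for the extinction probability ε of one species on island i in year t. The main-effect terms and the reading of the interaction as Year × Area and Year × Isolation are likewise assumptions:

      ```latex
      \begin{align}
      \operatorname{logit}(\varepsilon_{i,t})
        &= \beta_0 + u_i + (\beta_1 + v_i)\,\mathrm{Year}_t
           + \beta_2\,\mathrm{Area}_i + \beta_3\,\mathrm{Isolation}_i \\
        &\quad + \beta_4\,\mathrm{Year}_t \times \mathrm{Area}_i
           + \beta_5\,\mathrm{Year}_t \times \mathrm{Isolation}_i, \\
      u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
      v_i \sim \mathcal{N}(0, \sigma_v^2).
      \end{align}
      ```

      Under this reading, u_i is the random island intercept, v_i is the island-level random slope on year, a positive β1 would indicate an increasing extinction trend over time, and β4 and β5 would capture whether that trend varies with island area or isolation.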

      Because we are only interested in warm-adapted species that have a positive temporal trend in colonization and cold-adapted species that have a positive temporal trend in extinction, which are the two main processes underlying thermophilization, we chose warm-adapted species that have a positive year effect in colonization and cold-adapted species that have a positive year effect in extinction. We hope this explanation and the JAGS code help if you are confused about this part.

      Hope these explanations can make it clearer.

      (12) Figure 1: to me, it would be more intuitive to put the landscape configuration in the titles of the panels b, c, and d instead of "only" the mechanisms. E.g. they could be: a) fragmented islands with low climate buffering; b) small islands with low habitat heterogeneity; c) isolated islands with dispersal limitations?

      It is also slightly confusing that the bird communities are above "island" in the middle of the three fragmented habitats - which all look a bit different in terms of tree species and structure which makes the reader first think that it has something to do with the "new" species community. so maybe worth rethinking how to illustrate the three fragmented islands?

      We would like to thank you for your nice proposition. Firstly, it’s a good idea to put the landscape configuration in the title of the panels b, c, d. The new title (a) is “Fragmented islands with low climate buffering”, title (b) is “Small islands with low habitat heterogeneity”, and title (c) is “Isolated patches with dispersal limitations”.

      Second, we realized that putting the “bird community” above “island” in the middle of the three patches is a bit confusing. Actually, we wanted to show bird communities only on that one island in the middle. The other two patches are only there to represent a fragmented background. To avoid misunderstanding, we added a sentence in the legend of Figure 1 on Lines 778-780:

      “The three distinct patches signify a fragmented background and the community in the middle of the three patches was selected to exhibit colonization-extinction dynamics in fragmented habitats.”

      (13) Figure 4: please add the description of the color code for panel a.

      Sorry for the unclear description. The vertical dashed line indicates the median value of STI for 60 species, as a separation of warm-adapted species and cold-adapted species. We have added these details on Lines 807-809:

      “The dotted vertical line indicates the median of STI values. Cold-adapted species are plotted in blue and warm-adapted species are plotted in orange.”

      (14) Figure 5: You could consider adding this as panel c to Figure 4 as it depicts the same thing as in 4a but for CTI-abundance.

      Thank you for this advice. We have moved the original Figure 5 to Figure 4c. The previous Figure 6 has thus become Figure 5. All corresponding citations in the main text were checked to match the new numbering. The new figure is now on Lines 801-815.

      References

      Ferraz, G., Russell, G. J., Stouffer, P. C., Bierregaard Jr, R. O., Pimm, S. L., & Lovejoy, T. E. (2003). Rates of species loss from Amazonian forest fragments. Proceedings of the National Academy of Sciences, 100(24), 14069-14073. doi:10.1073/pnas.2336195100

      Fourcade, Y., WallisDeVries, M. F., Kuussaari, M., van Swaay, C. A., Heliölä, J., & Öckinger, E. (2021). Habitat amount and distribution modify community dynamics under climate change. Ecology Letters, 24(5), 950-957. doi:10.1111/ele.13691

      Gaüzère, P., Princé, K., & Devictor, V. (2017). Where do they go? The effects of topography and habitat diversity on reducing climatic debt in birds. Global Change Biology, 23(6), 2218-2229. doi:10.1111/gcb.13500

      Gonzalez, A. (2000). Community relaxation in fragmented landscapes: the relation between species richness, area and age. Ecology Letters, 3(5), 441-448. doi:10.1046/j.1461-0248.2000.00171.x

      Haddad, N. M., Brudvig, L. A., Clobert, J., Davies, K. F., Gonzalez, A., Holt, R. D., . . . Collins, C. D. (2015). Habitat fragmentation and its lasting impact on Earth’s ecosystems. Science Advances, 1(2), e1500052. doi:10.1126/sciadv.1500052

      Richard, B., Dupouey, J. L., Corcket, E., Alard, D., Archaux, F., Aubert, M., . . . Macé, S. (2021). The climatic debt is growing in the understorey of temperate forests: Stand characteristics matter. Global Ecology and Biogeography, 30(7), 1474-1487. doi:10.1111/geb.13312

      Si, X., Pimm, S. L., Russell, G. J., & Ding, P. (2014). Turnover of breeding bird communities on islands in an inundated lake. Journal of Biogeography, 41(12), 2283-2292. doi:10.1111/jbi.12379

    2. eLife Assessment

      This fundamental study substantially advances our understanding of how habitat fragmentation and climate change jointly influence bird community thermophilization in a fragmented island system. The authors provide convincing evidence using appropriate and validated methodologies to examine how island area and isolation affect the colonization of warm-adapted species and the extinction of cold-adapted species. While minor clarifications regarding the definition of fragmentation could further enhance the presentation, the study is of high interest to ecologists and conservation biologists, as it provides insight into how ecosystems and communities respond to climate change.

    3. Reviewer #3 (Public review):

      Summary:

      Juan Liu et al. investigated the interplay between habitat fragmentation and climate-driven thermophilization in birds in an island system in China. They used extensive bird monitoring data (9 surveys per year per island) across 36 islands of varying size and isolation from the mainland, covering 10 years. The authors use extensive modeling frameworks to test for a general increase in the occurrence and abundance of warm-dwelling species, and the reverse for cold-dwelling species, using the widely used Community Temperature Index (CTI), as well as the effect of island fragmentation, in terms of island area and isolation from the mainland, on extinction and colonization rates of cold- and warm-adapted species. They found that thermophilization did indeed occur during the last 10 years, and that it was more pronounced for the CTI based on abundances and less clear for the occurrence-based metric. Generally, the authors show that this is driven by an increased colonization rate of warm-dwelling species and an increased extinction rate of cold-dwelling species. Interestingly, they unravel some of the mechanisms behind this dynamic by showing that warm-adapted species increased while cold-dwelling species decreased more strongly on smaller islands, which is, according to the authors, due to lowered thermal buffering on smaller islands (which was supported by air temperature monitoring done during the study period on small and large islands). They argue that the increased extinction rate of cold-adapted species could also be due to lowered habitat heterogeneity on smaller islands. With regard to island isolation, they show that both thermophilization processes (increase of warm-adapted and decrease of cold-adapted species) were stronger on islands closer to the mainland, due to closer source populations of either group on the mainland, as compared to limited dispersal (i.e. range shift potential) on more isolated islands.

      The conclusions drawn in this study are sound, and mostly well supported by the results. Only few aspects leave open questions and could quite likely be further supported by the authors themselves thanks to their apparent extensive understanding of the study system.

      Strengths:

      The study questions and hypotheses are very well aligned with the methods used, ranging from field surveys to extensive modeling frameworks, as well as with the conclusions drawn from the results. The study addresses a complex question on the interplay between habitat fragmentation and climate-driven thermophilization which can naturally be affected by a multitude of additional factors than the ones included here. Nevertheless, the authors use a well balanced method of simplifying this to the most important factors in question (CTI change, extinction, colonization, together with habitat fragmentation metrics of isolation and island area). The interpretation of the results presents interesting mechanisms without being too bold on their findings and by providing important links to the existing literature as well as to additional data and analyses presented in the appendix.

      Weaknesses:

      The metric of island isolation based on distance to the mainland seems somewhat oversimplified, as in real life the study system rather represents an island network where islands of different sizes lie at varying distances from each other, such that smaller islands can potentially draw from the species pools of nearby larger islands too, rather than just from the mainland. Although the authors do explain the reason for this metric, backed up by earlier research, a network approach could be worth exploring in future research done in this system. The fact that the authors did find a signal of island isolation does support their method, but the variation in responses to this metric could hint at a more complex pattern in real life than was assumed for this study.

    1. eLife assessment

      This proof-of-concept study focuses on an A->G DNA base editing strategy that converts CAG repeats to CAA repeats in the human HTT gene, which causes Huntington's disease (HD). These studies are conducted in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats. The findings of this study are valuable for the HD field, applying state-of-the-art techniques. However, the key experiments have yet to be performed in neuronal systems or brains of these mice: actual disease-rectifying effects relevant to patients have yet to be observed, leaving the work incomplete.

    1. eLife assessment

      This study presents a useful examination of the prevalence of interactions between amino acids from different periods of Earth's history and coenzymes. While the premise of this work is well founded, the data lend themselves to alternative interpretations, suggesting that the main conclusions might be incompletely supported by the findings. The work would benefit from the inclusion of additional supplementary data and further analysis. This manuscript would be of interest to evolutionary biologists and biophysicists.

    1. eLife assessment

      This is a valuable study of the mechanisms of microtubule organization in pancreatic islet beta cells that enable optimal insulin secretion. Using a combination of live imaging and photo-kinetic assays in an in vitro culture system, the authors provide solid evidence to demonstrate that kinesin-1-mediated microtubule sliding, which has previously been known from neurons and embryos, is essential for establishing the sub-membranous microtubule band in response to glucose levels in beta cells. The inclusion of an animal model or primary cells, as well as data on the physiological relevance of the finding, would have strengthened the study. The work will be of interest to cell biologists studying cytoskeletal dynamics and organelle trafficking and to translational biologists working on diabetes.

    1. eLife Assessment

      In this valuable paper, the authors created a reporter mouse line in which the Axon Initial Segment (AIS) is intrinsically labeled by an ankyrin-G-GFP fusion protein activated by Cre recombinase, tagging the native Ank3 gene. Using confocal, superresolution, and two-photon microscopy as well as whole-cell patch-clamp recordings in vitro, ex vivo, and in vivo, the authors convincingly document that the subcellular scaffold of the AIS and electrophysiological parameters of labeled cells remain unchanged. They further uncover rapid AIS remodeling following increased network activity in this model system, as well as highly reproducible in vivo labeling of AIS over weeks.

    1. eLife Assessment

      This study, valuable in several parts, confirms the roles of Dact1 and Dact2, two factors involved in Wnt signaling, during zebrafish gastrulation and demonstrates their genetic interactions with other Wnt components to modulate craniofacial morphologies. Unfortunately, there are several limitations associated with the study, making it challenging to distinguish the primary and secondary effects of each factor, and their roles in craniofacial morphogenesis. The findings of a new potential target of dact1/2-mediated Wnt signaling are potentially of value; however, experimental evidence supporting their functional significance remains incomplete due to inconsistent results and limitations inherent to the overexpression approach.

    2. Reviewer #2 (Public review):

      Summary:

      Non-canonical Wnt signaling plays an important role in morphogenesis, but how different components of the pathway are required to regulate different developmental events remains an open question. This paper focuses on elucidating the overlapping and distinct functions of dact1 and dact2, two Dishevelled-binding scaffold proteins, during zebrafish axis elongation and craniofacial development. By combining genetic studies, detailed phenotypic analysis, lineage tracing, and single cell RNA-sequencing, the authors aimed to understand (1) the relative function of dact1/2 in promoting axis elongation, (2) their ability to modulate phenotypes caused by mutations in other non-canonical wnt components, and (3) pathways downstream of dact1/2.

      Corroborating previous findings, this paper showed that dact1/2 is required for convergent extension during gastrulation and body axis elongation. Qualitative evidence was also provided to support dact1/2's role in genetically modulating non-canonical wnt signaling to regulate body axis elongation and the morphology of the ethmoid plate (EP). However, the spatiotemporal function of dact1/2 remains unknown. The use of scRNA-seq identified novel pathways and targets downstream of dact1/2. Calpain 8 is one such example, and its overexpression in some of the dact1/2+/- embryos was able to phenocopy the dact1/2-/- mutant EP morphology, pointing to its sufficiency in driving the EP phenotype in a few embryos. However, the same effect was not observed in dact1-/-; dact2+/- embryos, leading to the question of how significant calpain 8 really is in this context. The requirement of calpain 8 in mediating the phenotype is unclear as well. This is the most novel aspect of the paper, but some weaknesses remain in convincingly demonstrating the importance of calpain 8.

      Strengths:

      (1) The generation of dact1/2 germline mutants and the use of genetic approaches to dissect their genetic interactions with wnt11f2 and gpc4 provide unambiguous and consistent results that inform the relative functions of dact1 and dact2, as well as their combined effects.

      (2) Because the ethmoid plate exhibits a spectrum of phenotypes in different wnt genetic mutants, it is a useful system for studying how tissue morphology can be modulated by different components of the wnt pathway.

      (3) The authors leveraged lineage tracing by photoconversion to dissect how dact1/2 differentially impacts the ability of different cranial neural crest populations to contribute to the ethmoid plate. This revealed that distinct mechanisms via dact1/2 and shh can lead to similar phenotypes.

      (4) The use of scRNA-seq was a powerful approach and identified potential novel pathways and targets downstream of dact1/2.

      Weaknesses:

      (1) Connecting the expression of dact1/2 and wnt11f2 to their mutant phenotypes: Given that dact1/2 and wnt11f2 expression are quite distinct, at least in the stages examined, the claim that dact1/2 function downstream of wnt11f2 is not well supported. That conclusion was based on shared craniofacial phenotypes between dact1/2-/-, wnt11f2-/-, and dact1/2-/-;wnt11f2-/- mutants. However, because the craniofacial phenotype is likely a secondary effect of dact1/2 deletion, using it to interpret the signaling axis between dact1/2 and wnt11f2 is not appropriate.

      (2) Spatiotemporal function of dact1/2: Germline mutations limit the authors' ability to study a gene's spatiotemporal functional requirement. They, therefore, cannot concretely attribute nor separate early-stage phenotypes (during gastrulation) to/from late stage phenotypes (EP morphological changes), which the authors postulated to result from secondary defects in floor plate and eye field morphometry. As a result, whether dact1/2 are directly involved in craniofacial development is not addressed, and the mechanisms resulting in the craniofacial phenotypes are also unclear.

      (3) The functional significance of calpain 8: Because calpain 8 was upregulated in many dact1/2-/- mutant cell populations (although not in the neural crest) during gastrulation, the authors tested its function by overexpressing capn8 mRNA in embryos. While only 1 out of 142 calpain 8-overexpressing wild type animals phenocopied dact1/2 mutants, 7.5% of dact1/2+/- embryos overexpressing capn8 exhibited dact1/2-like phenotypes. However, the same effect was not observed in dact1-/-; dact2+/- embryos. Given the expression pattern of calpain 8 and results from the overexpression study, the function of capn8 remains inconclusive. The requirement of calpain 8 in driving the phenotype remains unclear. The authors stated these limitations in their study.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript the authors explore the roles of dact1 and dact2 during zebrafish gastrulation and craniofacial development. Previous studies used morpholino (MO) knockdowns to show that these scaffolding proteins, which interact with Dishevelled (Dsh), are expressed during zebrafish gastrulation and suggested that dact1 promotes canonical Wnt/β-catenin signaling, while dact2 promotes non-canonical Wnt/PCP-dependent convergent-extension (Waxman et al 2004). This study goes beyond this work by creating loss-of-function mutant alleles for each gene and, unlike the MO studies, finds little (dact2) to no (dact1) phenotypic defects in the homozygous mutants. Interestingly, dact1/2 double mutants have a more severe phenotype, which resembles those reported with MOs as well as homozygous wnt11/silberblick (wnt11/slb) mutants that disrupt non-canonical Wnt signaling (Heisenberg et al., 1997; 2000). Further analyses in this paper try to connect gastrulation and craniofacial defects in dact1/2 mutants with wnt11/slb and other wnt-pathway mutants. scRNAseq conducted in mutants identifies calpain 8 as a potential new target of dact1/2 and Wnt signaling.

      Previous comments:

      Strengths:

      When considered separately the new mutants are an improvement over the MOs and the paper contains a lot of new data.

      Weaknesses:

      However, the hypotheses are very poorly defined and misinterpret key previous findings surrounding the roles of wnt11 and gpc4, which results in a very confusing manuscript. Many of the results are not novel and focus on secondary defects. The most novel result, overexpressing calpain 8 in dact1/2 mutants, is preliminary and not convincing.

      The authors addressed some of our comments, but not our main criticisms, which we reiterate here:

      (1) The authors argue that morpholino studies are unreliable and here they made new mutants to solve this uncertainty for dact1/2. However, creating stable mutant lines to largely confirm previous results obtained by using morpholino knock-down phenotypes does not justify publication in eLife.

      (2) The authors argue that since it has not been shown conclusively that craniofacial defects in wnt11 and dact1/2 mutants are secondary to gastrulation defects, there is no solid evidence preventing them from investigating these craniofacial defects. However, since it is extremely likely that the rod-like ethmoid plates of wnt11f2 and dact1/2 mutants focused on here are secondary to gastrulation defects previously described by others (Heisenberg and Nüsslein-Volhard 1997; Waxman et al., 2004), the burden of proof is on the authors to provide much stronger evidence against this interpretation.

      (3) The data for calpain overexpression remains too preliminary.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors): 

      This is not a recommendation. While reading old literature, I found some interesting facts. The shape of the neurocranium in monotremes, birds, and mammals, at least in early stages, resembles the phenotype of dact1/2, wnt11f2, or syu mutants. For more details, see de Beer's 'The Development of the Vertebrate Skull' (1937), Plate 137.

      Thank you for pointing this out. It is indeed interesting.

      Minor Comments: 

      • Lines 64, 66, and 69: same citation without interruption: Heisenberg, Brand et al. 1996

      Revised line 76. 

      • Lines 101 and 102: same citation without interruption: Li, Florez et al. 2013 

      Revised line 118.

      • Lines 144, 515, 527, and 1147: should be wnt11f2 instead of wntllf2 - if not, then explain 

      Revised lines 185, 625, 640,1300.

      • Lines 169 and 171: incorrect figure citation: Fig 1D - correct to Fig 1F 

      Revised lines 217, 219.

      • Line 173: delete (Fig. S1) 

      Revised line 221.

      • Line 207: indicate that both dact1 and dact2 mRNA levels increased, noting a 40% higher level of dact2 mRNA after deletion of 7 bp in the dact2 gene 

      Revised line 265.

      • Line 215: Fig 1F instead of Fig 1D 

      Revised line 217.

      • Line 248: unify naming of compound mutants to either dact1/2 or dact1/dact2 compound mutants 

      Revised to dact1/2 throughout.

      • Line 259: incorrect figure citation: Fig S1 - correct to Fig S2D/E 

      Revised line 324.

      • Line 302: correct abbreviation position: neural crest (NCC) cell - change to neural crest cell (NCC) population 

      Revised line 380.

      • Line 349: repeating kny mut definition from line 70 may be unnecessary 

      Revised line 434.

      • Line 351: clarify distinction between Fig S1 and Fig S2 in the supplementary section 

      Revised line 324.

      • Line 436: refer to the correct figure for pathways associated with proteolysis (Fig 7B) 

      Revised line 530.

      • Line 446-447: complete the sentence and clarify the relevance of smad1 expression, and correct the use of "also" in relation to capn8 

      Revised line 567.

      • Line 462: clarify that this phenotype was never observed in wildtype larvae, and correct figure reference to exclude dact1+/- dact2+/- 

      Revised line 563, 568.

      • Line 463: explain the injection procedure into embryos from dact1/2+/- interbreeding 

      Revised line 565.

      • Lines 488 and 491: same citation without interruption: Waxman, Hocking et al. 2004 

      Revised line 591.

      • Line 502: maintain consistency in referring to TGF-beta signaling throughout the article 

      Revised throughout.

      • Line 523: define CNCC; previously used only NCC 

      Revised to cranial NCC throughout.

      • Line 1105: reconsider citing another work in the figure legend 

      Revised line 1249.

      • Line 1143: consider using "mutant" instead of "mu" 

      Revised line 1295.

      • Fig 2A/B: indicate the number of animals used ("n") 

      N is noted on line 1274.

      • Fig 2C, D, E: ensure uniform terminology for control groups ("wt" vs. "wildtype") 

      Revised in figure.

      • Fig 7C: clarify analysis of dact1/2-/- mutant in lateral plate mesoderm vs. ectoderm 

      Revised line 1356.

      • Fig 8A: label the figure to indicate it shows capn8, not just in the legend 

      Revised.

      • Fig 8D: explain the black/white portions and simplify to highlight important data 

      Revised.

      • Fig S2: add the title "Figure S2" 

      Revised.

      • Consider omitting the sentence: "As with most studies, this work has contributed some new knowledge but generated more questions than answers." 

      Revised line 720.

      Reviewer #2 (Recommendations For The Authors): 

      Major comments: 

      (1) The authors have addressed many of the questions I had, including making the biological sample numbers more transparent. It might be more informative to use n = n/n, e.g. n = 3/3, rather than just n = 3. Alternatively, that information can be given in the figure legend or in the form of penetrance %. 

      The compound heterozygote breeding and phenotyping analyses were not carried out in such a way that we can comment on the precise % penetrance of the ANC phenotype, as we did not dissect every ANC and genotype every individual that resulted from the triple-heterozygote incrosses. We collected phenotype/genotype data until we obtained at least three replicates.

      We did genotype every individual resulting from dact1/2 dHet crosses to correlate genotype with the embryonic convergent extension phenotype and the narrowed ethmoid plate (Fig. 2A, Fig. 3), which demonstrated full penetrance.

      (2) The description of the expression of dact1/2 and wnt11f2 is not consistent with what the images are showing. In the revised figure 1 legend, the author says "dact2 and wnt11f2 transcripts are detected in the anterior neural plate" (line 1099), but it's hard to see wnt11f2 expression in the anterior neural plate in 1B. The authors then again said "wnt11f2 is also expressed in these cells", referring to the anterior neural plate and polster (P), notochord (N), paraxial and presomitic mesoderm (PM) and tailbud (TB). However, other than the notochord expression, other expression is actually quite dissimilar between dact2 and wnt11f2 in 1C. The authors should describe their expression more accurately and take that into account when considering their function in the same pathway. 

      We have revised these sections to more carefully describe the expression patterns. We have added references to previous descriptions of wnt11 expression domains.

      (3) Similar to (2), while the Daniocell was useful in demonstrating that expression of dact1 and dact2 are more similar to expression of gpc4 and wnt11f2, the text description of the data is quite confusing. The authors stated "dact2 was more highly expressed in anterior structures including cephalic mesoderm and neural ectoderm while dact1 was more highly expressed in mesenchyme and muscle" (lines 174-176). However, the Daniocell seems to show more dact1 expression in the neural tissues than dact2, which would contradict the in situ data as well. I think the problem is in part because the dataset contains cells from many different stages, and it might be helpful to include a plot of the cells at different stages, as well as the cell types, both of which are available from the Daniocell website. 

      We have revised the text to focus the Daniocell analysis on the overall and general expression patterns. Line 220.

      (4) The authors used the term "morphological movements" (line 337) to describe the cause of dact1/2 phenotypes. Please clarify what this means. Is it cell movement? Or is it the shape of the tissues? What does "morphological movements" really mean and how does that affect the formation of the EP by the second stream of NCCs? 

      We have revised this sentence to improve clarity. Line 416.

      (5) In the first submission, only 1 out of 142 calpain-overexpressing animals phenocopied dact1/2 mutants and that was a major concern regarding the functional significance of calpain 8 in this context. In the revised manuscript, the authors demonstrated that more embryos developed the phenotype when they are heterozygous for both dact1/2. While this is encouraging, it is interesting that the same phenomenon was not observed in the dact1-/-; dact2+/- embryos (Fig. 6D). The authors did not discuss this and should provide some explanation. The authors should also discuss sufficiency vs requirement tested in this experiment. However, given that this is the most novel aspect of the paper, performing experiments to demonstrate requirements would be important. 

      We have added a statement regarding the non-effect in dact1-/-;dact2+/- embryos. Line 568-570. We have also added discussion of sufficiency vs necessity/requirement testing. Line 676-679.

      (6) Related to (5), the authors cited figure 8c when mentioning 0/192 gfp-injected embryos developed EP phenotypes. However, figure 8c is dact1/2 +/- embryos. The numbers also don't match those in Figure 8d. Please add relevant/correct figures. 

      The text has been revised to distinguish between our overexpression experiment in wildtype embryos (data not shown) versus overexpression in dact1/2 double heterozygote incross embryos (Fig 8).

      Minor comments: 

      (1) Fig 1 legend line 1106 "the midbrain (MP)" should be MB 

      Revised line 1250.

      (2) Wntllf2, instead of wnt11f2, (i.e. the letter "l" rather than the number "1") was used in 4 instances, line 144, 515, 527, 1147 

      Revised lines 185, 625, 640, 1300.

      (3) The authors replaced ANC with EP in many instances, but ANC is left unchanged in some places and it's not defined in the text. It's first mentioned in line 170.

      Revised line 218.

    1. eLife Assessment

      This important work presents a consolidated overview of the NeuroML2 open community standard and provides convincing evidence for its central role within a broader software ecosystem for the development of neuronal models that are open, shareable, reproducible, and interoperable. A major strength of the work is the continued development over more than two decades to establish, maintain, and adapt this standard to meet the evolving needs of the field. This work is of broad interest to the sub-cellular, cellular, computational, and systems neuroscience communities undertaking studies involving theory, modeling, and simulation.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript gives a broad overview of how to write NeuroML, a brief description of how to use it with different simulators and for different purposes - cells to networks, simulation, optimization and analysis. From this perspective it can be an extremely useful document to introduce new users to NeuroML.

      Strengths:

      The modularity of NeuroML is indeed a great advantage. For example, the ability to specify the channel file allows different channels to be used with different morphologies without redundancy. The hierarchical nature of NeuroML also is commendable, and well illustrated.

      The number of tools available to work with NeuroML is impressive.

      Having a python API and providing examples using this API is fantastic. Exporting to NeuroML from python is also a great feature.

      The tutorials should assist additional scientists in adopting NeuroML.

      Weaknesses:

      None noted.

    3. Reviewer #2 (Public review):

      Summary:

      Developing neuronal models that are shareable, reproducible, and interoperable allows the neuroscience community to make better use of published models and to collaborate more effectively. In this manuscript, the authors present a consolidated overview of the NeuroML model description system along with its associated tools and workflows. They describe where different components of this ecosystem lie along the model development pathway and highlight resources, including documentation and tutorials, to help users employ this system.

      Strengths:

      The manuscript is well-organized and clearly written. It effectively uses the delineated model development life cycle steps, presented in Figure 1, to organize its descriptions of the different components and tools relating to NeuroML. It uses this framework to cover the breadth of the software ecosystem and categorize its various elements. The NeuroML format is clearly described, and the authors outline the different benefits to its particular construction. As primarily a means of describing models, NeuroML also depends on many other software components to be of high utility to computational neuroscientists; these include simulators (ones that both pre-date NeuroML and those developed afterwards), visualization tools, and model databases.

      Overall, the rationale for the approach NeuroML has taken is convincing and well-described. The pointers to existing documentation, guides, and the example usages presented within the manuscript are useful starting points for potential new users. This manuscript can also serve to inform potential users of features or aspects of the ecosystem that they may have been unaware of, which could lower obstacles to adoption. While much of what is presented is not new to this manuscript, it still serves as a useful resource for the community looking for information about an established, but perhaps daunting, set of computational tools.

      Weaknesses:

      The manuscript in large part catalogs the different tools and functionalities that have been produced through the long development cycle of NeuroML. Overall, the interoperability of NeuroML is a benefit, but it does increase the complexity of choices facing users entering into the ecosystem.

      In many respects this is an intractable fact of the current environment, but the authors do try to mitigate the issue with user guides (e.g., Table 1) and example code (e.g. Box 1) which address a range of target user audiences, from those learning about the ecosystem for the first time to those looking to implement specific model features. They also categorize different simulator options (Figure 5) and provide feature comparisons (Table 3), which could assist with the most daunting choice faced by new users.

      Comments on revised version:

      The authors have addressed my major concerns with the original manuscript. The discussion of simulators in particular is much clearer now, and the manuscript has been restructured so that specific details pertinent to a much more focused audience have been rewritten or shifted to more appropriate locations.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript gives a broad overview of how to write NeuroML, and a brief description of how to use it with different simulators and for different purposes - cells to networks, simulation, optimization, and analysis. From this perspective, it can be an extremely useful document to introduce new users to NeuroML.

      We are glad the reviewer found our manuscript useful.

      However, the manuscript itself seems to lose sight of this goal in many places, and instead, the description at times seems to target software developers. For example, there is a long paragraph on the board and user community. The discussion on simulator tools seems more for developers, not users. All the information presented at the level of a developer is likely to be distracting to eLife readership.

      To make the paper less developer focussed and more accessible to the end user we have shortened the long paragraphs on the board and user community (and moved some of this text to the Methods section; lines: 524-572 in the document with highlighted changes). We have also made the discussion on simulator tools more focussed on the user (lines 334-406). However, we believe some information on the development and oversight of NeuroML and its community base are relevant to the end user, so we have not removed these completely from the main text.

      Strengths:

      The modularity of NeuroML is indeed a great advantage. For example, the ability to specify the channel file allows different channels to be used with different morphologies without redundancy. The hierarchical nature of NeuroML also is commendable, and well illustrated in Figures 2a through c.

      The number of tools available to work with NeuroML is impressive.

      The abstract, beginning, and end of the manuscript present and discuss incorporating NeuroML into research workflows to support FAIR principles.

      Having a Python API and providing examples using this API is fantastic. Exporting to NeuroML from Python is also a great feature.

      We are glad the reviewer appreciated the design of NeuroML and its support for FAIR principles.

      Weaknesses:

      Though modularity is a strength, it is unclear to me why the cell morphology isn't also treated similarly, i.e., specify the morphology of a multi-compartmental model in a separate file, and then allow the cell file to specify not only the files containing channels, but also the file containing the multi-compartmental morphology, and then specify the conductance for different segment groups. Also, after pynml_write_neuroml2_file, you would not have a super long neuroML file for each variation of conductances, since there would be no need to rewrite the multi-compartmental morphology for each conductance variation.

      We thank the reviewer for highlighting this shortcoming in NeuroML2. We have now added the ability to reference externally defined (e.g. in another file) <morphology> and <biophysicalProperties> elements from <cells>. This has enabled the morphologies and/or specification of ionic conductances to be separated out and enables more streamlined analysis of cells with different properties, as requested. Simulators NEURON, NetPyNE and EDEN already support this new form. Information on this feature has been added to https://docs.neuroml.org/Userdocs/ImportingMorphologyFiles.html#neuroml2 and also mentioned in the text (lines 188-190).

      This would be especially important for optimizations, if each trial optimization wrote out the neuroML file, then including the full morphology of a realistic cell would take up excessive disk space, as opposed to just writing out the conductance densities. As long as cell morphology must be included in every cell file, then NeuroML is not sufficiently modular, and the authors should moderate their claim of modularity (line 419) and building blocks (551).

      We believe the new functionality outlined above addresses this issue, as a single file containing the <morphology> element could be referenced, while a much smaller file, containing the channel distributions in a <biophysicalProperties> element would be generated and saved on each iteration of the optimisation.
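
      The workflow described above can be sketched with the Python standard library alone. This is a minimal illustration rather than the authors' implementation: the `morphology` and `biophysicalProperties` attribute names on `<cell>` are assumed from the description in this response, the file name `morph0.morph.nml` and all ids are hypothetical, and the output is not validated against the NeuroML schema.

```python
import xml.etree.ElementTree as ET

NS = "http://www.neuroml.org/schema/neuroml2"  # NeuroML 2 namespace


def morphology_doc() -> str:
    """Standalone morphology, written once; one segment stands in for a full reconstruction."""
    root = ET.Element("neuroml", {"xmlns": NS, "id": "morph_doc"})
    morph = ET.SubElement(root, "morphology", {"id": "morph0"})
    seg = ET.SubElement(morph, "segment", {"id": "0", "name": "soma"})
    ET.SubElement(seg, "proximal", {"x": "0", "y": "0", "z": "0", "diameter": "10"})
    ET.SubElement(seg, "distal", {"x": "10", "y": "0", "z": "0", "diameter": "10"})
    return ET.tostring(root, encoding="unicode")


def trial_doc(trial: int, na_density: float) -> str:
    """Small per-trial document: conductances only, morphology referenced by id."""
    root = ET.Element("neuroml", {"xmlns": NS, "id": f"trial_{trial}"})
    # Hypothetical file name holding the shared morphology.
    ET.SubElement(root, "include", {"href": "morph0.morph.nml"})
    bio = ET.SubElement(root, "biophysicalProperties", {"id": f"biophys_{trial}"})
    memb = ET.SubElement(bio, "membraneProperties")
    ET.SubElement(memb, "channelDensity", {
        "id": "na_all", "ionChannel": "na", "ion": "na",
        "condDensity": f"{na_density} mS_per_cm2", "erev": "50 mV",
    })
    # Assumed attribute names: the cell points at externally defined elements.
    ET.SubElement(root, "cell", {
        "id": f"cell_{trial}",
        "morphology": "morph0",
        "biophysicalProperties": f"biophys_{trial}",
    })
    return ET.tostring(root, encoding="unicode")


shared_morphology = morphology_doc()
per_trial = [trial_doc(i, g) for i, g in enumerate((100.0, 120.0, 140.0))]
```

      Only the per-trial documents change between optimisation iterations, so their size is independent of how many segments the reconstruction contains.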

      In addition, this is very important for downloading NeuroML-compliant reconstructions from NeuroMorpho.org. If the cell morphology cannot be imported, then the user has to edit the file downloaded from NeuroMorpho.org, and provenance can be lost.

      While the NeuroMorpho.Org website does support converting reconstructed morphologies in SWC format to NeuroML, this export feature is no longer supported on most modern browsers because it is based on Java Applet technologies. However, a desktop version of this application, CVApp, is actively maintained (https://github.com/NeuroML/Cvapp-NeuroMorpho.org), and we have updated it to support export of SWC to the standalone <morphology> element form of NeuroML discussed above. Additionally, a new Python application for conversion of SWC to NeuroML is in development and will be incorporated into PyNeuroML (Google Summer of Code 2024). Our documentation has been updated with the recommended use of SWC in NeuroML based modelling here: https://docs.neuroml.org/Userdocs/Software/Tools/SWC.html

      We have also included URLs to the tool and the documentation in the paper (lines: 473-474).

      SWC files, however, cannot be used “as is” for modelling since they only include information on the points that make up the cell (often incomplete; for example, a single point may represent a soma in SWC files), but not on the sections/segments/cables that these points form. Therefore, NeuroML and other simulation tools, including NEURON, must convert these into formats suitable for simulation. The suggested pipeline for use of NeuroMorpho SWC files would therefore be to convert them to NeuroML, check that they represent the intended compartmentalisation of the neuron and then use them in models.
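
      The gap between SWC points and simulation-ready segments can be illustrated with a short sketch. It assumes the standard seven-column SWC row layout (id, type, x, y, z, radius, parent, with parent -1 marking a root); the toy data and function names are hypothetical and this is not the converter used by CVApp or PyNeuroML.

```python
from typing import Dict, List, NamedTuple, Tuple


class SWCPoint(NamedTuple):
    id: int
    type: int
    x: float
    y: float
    z: float
    radius: float
    parent: int  # -1 marks a root point


def parse_swc(text: str) -> Dict[int, SWCPoint]:
    """Read SWC rows into points, skipping blank lines and '#' comments."""
    points = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        i, t, x, y, z, r, p = line.split()[:7]
        points[int(i)] = SWCPoint(int(i), int(t), float(x), float(y),
                                  float(z), float(r), int(p))
    return points


def to_segments(points: Dict[int, SWCPoint]) -> List[Tuple[int, int]]:
    """Pair each non-root point with its parent to form (parent_id, child_id) segments."""
    return [(p.parent, p.id) for p in points.values() if p.parent != -1]


# Toy morphology: a single-point soma (id 1) followed by two dendritic points.
SWC_TEXT = """# id type x y z radius parent
1 1 0 0 0 5 -1
2 3 0 10 0 1 1
3 3 0 20 0 1 2
"""

points = parse_swc(SWC_TEXT)
segments = to_segments(points)
root_ids = [p.id for p in points.values() if p.parent == -1]
```

      Here the soma point appears only as the parent end of a dendritic segment and never as a segment of its own, which is why a converter has to decide how to represent it and why the result should be checked before use.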

      To ensure that provenance is maintained in all NeuroML models (including conversions from other formats), NeuroML supports the addition of RDF annotations using the COMBINE annotation specifications in model files: https://docs.neuroml.org/Userdocs/Provenance.html. We have added this information to the paper (lines: 464-465).

      Also, Figure 2d loses the hierarchical nature by showing ion channels, synapses, and networks as separate main branches of NeuroML.

      While an instance of an ion channel is on a segment, in a cell, in a population (and hence there is a hierarchy between them), in terms of layout in a NeuroML file the ion channel is defined at the “top level” so that it can be referenced and used by multiple cells; cell definitions are likewise defined at the top level and used in multiple populations, and so on. There are multiple ways to depict these relationships between entities, and we believe Fig 2d complements Fig 2a-c (which is more hierarchical), by emphasising the different categories of entities present in NeuroML files. We have modified the caption of Figure 2d to clarify that it shows the main categories of elements included in the NeuroML standard in their respective hierarchies.

      In Figure 5, the difference between the core and native simulator is unclear.

      We have modified the figure and text (lines: 341) to clarify this. We now say “reference” simulators instead of “core”. This emphasises that jNeuroML and pyLEMS are intended as reference implementations in each of their languages of how to interpret NeuroML models, as opposed to high performance simulators for research use. We have also updated the categorization of the backends in the text accordingly.

      What is involved in helper scripts?

      Simulators such as NetPyNE can import NeuroML into their own internal format, but require some boilerplate code to do this (e.g. the NetPyNE script calls the importNeuroML2SimulateAnalyze() method with appropriate parameters). The NeuroML tools generate short scripts that use this boilerplate code. We have renamed “helper scripts” to “import scripts” for clarity (Figure 5 and its caption).

      I thought NEURON could read NeuroML? If so, why do you need to export simulator-specific scripts?

      The NEURON simulator does have some NeuroML functionality (it can export cells, though not the full network, to NeuroML 2 through its ModelView menu), but does not natively support reading/importing of NeuroML in its current version. This is not a problem, however, as jNeuroML/PyNeuroML translates the NeuroML model description into NEURON’s formats (Python scripts/HOC/NMODL), which NEURON then executes.

      As NEURON is the simulator which allows simulation of the widest range of NeuroML elements, we have (in agreement with the NEURON developers) concentrated on incorporating the best support for NeuroML import/export in the latest (easy to install/update) releases of PyNeuroML, rather than adding this to the NEURON source code. NEURON’s core features have been very stable for years and many versions of the simulator are used by modellers - installing the latest PyNeuroML gives them the latest NEURON support without having to reinstall the latter.

      In addition, it seems strange to call something the "core" simulation engine, when it cannot support multi-compartmental models. It is unclear why "other simulators" that natively support NeuroML cannot be called the core.

      We agree that this terminology was confusing. As mentioned above, we have changed “core simulator” to “reference simulator”, to emphasise the roles of these simulation engine options.

      It might be more helpful to replace this sort of classification with a user-targeted description. The authors already state which simulators support NeuroML and which ones need code to be exported. In contrast, lines 369-370 mention that not all NeuroML models are supported by each simulator. I recommend expanding this to explain which features are supported in each simulator. Then, the unhelpful separation between core and native could be eliminated.

      As suggested, we have grouped the simulators in terms of function and removed the core/ non-core distinction. We have also added a table (Table 3) in the appendices that lists what features each simulation engine supports and updated the text to be more user focussed (lines: 348-394).

      The body of the manuscript has so much other detail that I lose sight of how NeuroML supports FAIR. It is also unclear who is the intended audience. When I get to lines 336-344, it seems that this description is too much detail for the eLife audience. The paragraph beginning on line 691 is a great example of being unclear about who is the audience. Does someone wanting to develop NeuroML models need to understand XSD schema? If so, the explanation is not clear. XSD schema is not defined and instead explains NeuroML-specific aspects of XSD. Lines 734-735 are another example of explaining to code developers (not model developers).

      We have modified these sentences to be more suitable for the general eLife audience: we have moved the explanation of how the different simulator backends are supported to the more technically detailed Methods section (lines 882-942).

      While the results sections focus on documenting what users can do with NeuroML, the Methods sections include information on “how” the NeuroML and software ecosystem function. While the information in the methods sections may not be required by users who want to use the standard NeuroML model elements, those users looking to extend NeuroML with their own model entities and/or contribute these for inclusion in the NeuroML standard will require some understanding of how the schema and component types work.

      We have tried to limit this information to the bare minimum, pointing to online documentation where appropriate. XSD schemas are, for example, briefly introduced at the beginning of the section “The NeuroML XML Schema”. We have also included a link to the W3C documentation on XSD schemas as a footnote (line 724).

      Reviewer #2 (Public Review):

      Summary:

      Developing neuronal models that are shareable, reproducible, and interoperable allows the neuroscience community to make better use of published models and to collaborate more effectively. In this manuscript, the authors present a consolidated overview of the NeuroML model description system along with its associated tools and workflows. They describe where different components of this ecosystem lie along the model development pathway and highlight resources, including documentation and tutorials, to help users employ this system.

      Strengths:

      The manuscript is well-organized and clearly written. It effectively uses the delineated model development life cycle steps, presented in Figure 1, to organize its descriptions of the different components and tools relating to NeuroML. It uses this framework to cover the breadth of the software ecosystem and categorize its various elements. The NeuroML format is clearly described, and the authors outline the different benefits of its particular construction. As primarily a means of describing models, NeuroML also depends on many other software components to be of high utility to computational neuroscientists; these include simulators (ones that both pre-date NeuroML and those developed afterwards), visualization tools, and model databases.

      Overall, the rationale for the approach NeuroML has taken is convincing and well-described. The pointers to existing documentation, guides, and the example usages presented within the manuscript are useful starting points for potential new users. This manuscript can also serve to inform potential users of features or aspects of the ecosystem that they may have been unaware of, which could lower obstacles to adoption. While much of what is presented is not new to this manuscript, it still serves as a useful resource for the community looking for information about an established, but perhaps daunting, set of computational tools.

      We are glad the reviewer appreciated the utility of the manuscript.

      Weaknesses:

      The manuscript in large part catalogs the different tools and functionalities that have been produced through the long development cycle of NeuroML. As discussed above, this is quite useful, but it can still be somewhat overwhelming for a potential new user of these tools. There are new user guides (e.g., Table 1) and example code (e.g. Box 1), but it is not clear if those resources employ elements of the ecosystem chosen primarily for their didactic advantages, rather than general-purpose utility. I feel like the manuscript would be strengthened by the addition of clearer recommendations for users (or a range of recommendations for users in different scenarios).

      To make Table 1 more accessible to users and provide recommendations we have added the following new categories: Introductory guides aimed at teaching the fundamental NeuroML concepts; Advanced guides illustrating specific modelling workflows; and Walkthrough guides discussing the steps required for converting models to NeuroML. Box 1 has also been improved to clearly mark API and command line examples.

      For example, is the intention that most users should primarily use the core NeuroML tools and expand into the wider ecosystem only under particular circumstances? What are the criteria to keep in mind when making that decision to use alternative tools (scale/complexity of model, prior familiarity with other tools, etc.)? The place where it seems most ambiguous is in the choice of simulator (in part because there seem to be the most options there) - are there particular scenarios where the authors may recommend using simulators other than the core jNeuroML software?

      The interoperability of NeuroML is a major strength, but it does increase the complexity of choices facing users entering into the ecosystem. Some clearer guidance in this manuscript could enable computational neuroscientists with particular goals in mind to make better strategic decisions about which tools to employ at the outset of their work.

      As mentioned in the response to Reviewer 1, the term “core simulator” for jNeuroML was confusing, as it suggested that this is a recommended simulation tool. We have changed the description of jNeuroML to a “reference simulator” to clarify this (Figure 5 and lines 341, 353).

      In terms of giving specific guidance on which simulator to use, we have focussed on their functionality and limitations rather than recommending a specific tool (as simulator independent standards developers we are not in a position to favour particular simulators). While NEURON is the most widely used simulator currently, other simulation options (e.g. EDEN) have emerged recently which provide quite comprehensive NeuroML support and similar performance. Our approach is to document and promote all supported tools, while encouraging innovation and new developments. The new Table 3 in the Appendix gives a guide to assist users in choosing which simulator may best suit their needs and we have updated the text to include a brief description (lines 348-394).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I do not understand what the $comments mean in Box 1. It isn't until I get further in the text that I realize that those are command line equivalents to the Python commands.

      We thank the reviewer for highlighting this confusion. We’ve now explicitly marked the API usage and command line usage example columns to make this clearer. We have also used “>” instead of “$” now to indicate the command line.

      In Figure 9 Caption "Examples of analysis functions ..", the word analysis seems a misnomer, as these graphs all illustrate the simulation output and graphing of existing variables. I think analysis typically refers to the transformation of variables, such as spike counts and widths.

      To clarify this we have changed the caption to “Examples of visualizing biophysical properties of a NeuroML model neuron”.

      Figure 10: Why is the pulse generator part of a model? Isn't that the input to a model?

      Whether the input to the model is described separately from the NeuroML biophysical description or combined with it is a choice for the researcher. This is possible because in NeuroML any entity which has time varying states can be a NeuroML element, including the current pulse generator. In this simple example the input is contained within the same file (and therefore <neuroml> element) as the cell. However, this does not need to be the case. The cell could be fully specified in its own NeuroML file and then this can be included in other files which add different inputs to facilitate different simulation scenarios. The Python scripting interface facilitates these types of workflows.

      In the interest of modularity, can stim information be stored in a separate file and "included"?

      Yes, as mentioned above, the stimulus could be stored in a separate file.
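
      One way to picture this separation is a small stimulus document that includes the cell file and adds a pulse generator, generated once per simulation scenario. The sketch below uses only the Python standard library; the file name `cell.nml`, the ids, and the timing values are hypothetical, and the output is not validated against the NeuroML schema.

```python
import xml.etree.ElementTree as ET

NS = "http://www.neuroml.org/schema/neuroml2"  # NeuroML 2 namespace


def stimulus_doc(cell_file: str, amplitude: str) -> str:
    """Build a document that includes an existing cell file and defines one current pulse."""
    root = ET.Element("neuroml", {"xmlns": NS, "id": "stimulus_doc"})
    # The cell itself lives in its own file and is only referenced here.
    ET.SubElement(root, "include", {"href": cell_file})
    ET.SubElement(root, "pulseGenerator", {
        "id": "pulse0",
        "delay": "100ms",
        "duration": "500ms",
        "amplitude": amplitude,
    })
    return ET.tostring(root, encoding="unicode")


# One stimulus document per scenario; the cell file is never rewritten.
scenarios = {amp: stimulus_doc("cell.nml", amp) for amp in ("0.1nA", "0.2nA")}
```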

      I find it strange to use a cell with mostly dimensionless numbers as an example. I think it would be more helpful to use a model that was more physiological.

      In choosing an example model type to illustrate the use of LEMS (Fig 12), NeuroML (Fig 10), XML Schema (Fig 11), the Python API (Fig 13) and online documentation (Fig 15), we needed an example which showed a sufficiently broad range of concepts (dimensional parameters, state variables, time derivatives), but which is sufficiently compact to allow a concise depiction of the key elements in figures that fit on a single page (e.g. Fig 12). We felt that the Hindmarsh Rose model, while not very physiological, was well suited for this purpose (explaining the underlying technologies behind the NeuroML specification). The simplicity of the Hindmarsh Rose model is counterbalanced in the manuscript by the detailed models of neurons and circuits in Figures 7 & 9. The latter shows a morphologically and biophysically detailed cortical L5b pyramidal cell model.
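
      For readers unfamiliar with the model, the compactness referred to here can be seen in a plain-Python sketch of the textbook dimensionless Hindmarsh-Rose equations: three state variables, three time derivatives, and a handful of parameters. The parameter values, initial state, and forward-Euler integration are assumptions made for illustration; this is not the NeuroML component type definition, which uses its own parameterisation.

```python
def hindmarsh_rose_step(state, dt, I=3.25, a=1.0, b=3.0, c=1.0, d=5.0,
                        r=0.006, s=4.0, x_rest=-1.6):
    """Advance the three state variables (x, y, z) by one forward-Euler step."""
    x, y, z = state
    dx = y - a * x**3 + b * x**2 - z + I   # membrane-potential-like variable
    dy = c - d * x**2 - y                  # fast recovery variable
    dz = r * (s * (x - x_rest) - z)        # slow adaptation variable
    return (x + dt * dx, y + dt * dy, z + dt * dz)


def simulate(n_steps=20000, dt=0.01, state=(-1.6, -10.0, 2.0)):
    """Return the trace of x over n_steps Euler steps from the given initial state."""
    trace = []
    for _ in range(n_steps):
        state = hindmarsh_rose_step(state, dt)
        trace.append(state[0])
    return trace


x_trace = simulate()
```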

      In lines 710-714, it is unclear what is being validated. That all parameters are defined? Using the units (or lack thereof) defined in the schema?

      Validation against the schema is “level 1” validation where the model structure, parameters, parameter values and their units, cardinality, and element positioning in the model hierarchy are checked. We have updated the paragraph to include this information and to also point to Figure 6 where different levels of validation are explained.

      Lines 740 to 746 are confusing. If 1-1 between XSD and LEMS (1st sentence) then how can component types be defined in LEMS and NOT added to the standard? Which is it? 1-1 or not 1-1?

      For the curated model elements included in the NeuroML standard, there will be a 1-1 correspondence between their component type definitions in LEMS and type definitions in the XSD schema. New user defined component types (e.g. a new abstract cell model) can be specified in LEMS as required, and these do not need to be included in the XSD schema to be loaded/simulated. However, since they are not present in the schema definition of the core/curated elements, they cannot be validated against it (level 1 validation). We have modified the text to make this clearer (line: 778).

      Nonetheless, if the new type is useful for the wider community, it can be accepted by the Editorial Board, and at that stage it will be incorporated into the core types, and added to the Schema, to be part of “valid NeuroML”.

      Figure 12. select="synapses[*]/i" is not explained. Does /i mean that iSyn is divided by i, which is current (according to the sentence 3 lines after 766) or perhaps synapse number?

      We thank the reviewer for highlighting this confusion. We have now explained the construct in the text (lines 810-812). It denotes “select the i (current) values from all Attachments which have the id ‘synapses’”. These multiple values should be reduced down to a single value through addition, as specified by the attribute reduce="add".
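
      The semantics of this select-and-reduce construct can be mimicked in a few lines of Python. The sketch is purely illustrative: the `Attachment` class and `select_and_reduce` function are hypothetical names, not part of LEMS or any NeuroML library, and only the "add" reduction mentioned above is handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attachment:
    i: float  # current contributed by one attached synapse (arbitrary units)


def select_and_reduce(attachments: List[Attachment], field: str, reduce: str) -> float:
    """Collect one field from every attachment, then collapse the values to a single number."""
    values = [getattr(a, field) for a in attachments]
    if reduce == "add":
        return sum(values)
    raise ValueError(f"unsupported reduce operation: {reduce}")


# Three synapses attached to the same cell; the total plays the role of iSyn.
synapses = [Attachment(i=0.5), Attachment(i=-0.2), Attachment(i=0.1)]
i_syn = select_and_reduce(synapses, "i", "add")
```

      Note that "/i" selects a field rather than dividing by it, and that an empty list of attachments reduces to zero current.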

      The line after 766 says that "DerivedVariables, variables whose values depend on other variables". You should add "and that are not derivatives, which are handled separately" because by your definition derivatives are derived variables.

      Thank you. We have updated the text with your suggestion.

      Reviewer #2 (Recommendations For The Authors):

      - Figure 9: I found it somewhat confusing to have the header from the screenshot at the top ("Layer 5 Burst Accommodating Double Bouquet Cell (5)") not match the morphology shown at the bottom. It's not visually clear that the different panels in Figure 9 may refer to unrelated cells/models.

      Thank you for pointing this out. We have replaced the NeuroML-DB screenshot with one of the same Layer 5b pyramidal cells shown in the panels below it.

      Additional change:

      Figure 7c (showing the NetPyNE-UI interface) has been replaced. Previously, this displayed a 3D model which had been created in NetPyNE itself, but now shows a model which has been created in NeuroML and imported for display/simulation in NetPyNE-UI, and therefore better illustrates NeuroML functionality.