1,295 Matching Annotations
  1. Mar 2021
    1. Reviewer #1 (Public Review):

      This manuscript, which follows on from a recent eLife paper documenting the relevance of the multi-basic cleavage site (MBCS) in the spike (S) protein of SARS-CoV-2, shows that growing SARS-CoV-2 on relevant epithelial cell lines or differentiated stem cell-derived culture systems prevents the emergence of MBCS mutations that impact on properties of S that contribute to cell tropism and the viral entry mechanism.

      The paper builds on the authors' previous work and that of others, and in some respects the results are not surprising. Nevertheless, the paper sets out a number of important findings: 1) that SARS-CoV-2 grown in Vero cells rapidly acquires MBCS mutations, whereas virus grown in airway epithelial cells or Vero-TMPRSS2 cells does not; 2) that deep sequencing is necessary to see mutations that are not apparent in consensus sequence reads; 3) that factors such as the addition of fetal calf serum can influence the selection of mutant phenotypes; and 4) that cultures derived from differentiated stem cells can provide reproducible systems for virus culture. Together, the work sets out clear guidelines for the production of SARS-CoV-2, and potentially other viruses, avoiding the pitfalls that can arise from growing viruses in permissive transformed cell lines.

      The data and manuscript are clearly presented, and my concerns are minimal. Overall, the paper will make a useful addition to the SARS-CoV-2 literature and will be of value to researchers working not just on SARS-CoV-2 but on many other viruses.

    2. Evaluation Summary:

      This manuscript follows up on work documenting the relevance of the multi-basic cleavage site (MBCS) in the spike (S) protein of SARS-CoV-2 for determining cell tropism and mode of cell entry. The paper describes a number of important findings: 1) that SARS-CoV-2 grown in Vero cells rapidly acquires MBCS mutations, whereas virus grown in airway epithelial cells or Vero-TMPRSS2 cells does not; 2) that deep sequencing is necessary to see mutations emerging that are not apparent in consensus sequence reads; 3) that factors such as fetal calf serum can influence the selection of mutant phenotypes; and 4) that cultures derived from differentiated stem cells can provide reproducible systems for virus culture. Together, the work sets out clear guidelines for the propagation of SARS-CoV-2 to avoid adaptations to laboratory cell-lines/conditions and maintain the authenticity of clinical isolates. The work has relevance to other viruses and the use of permissive transformed cell lines.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

    1. Reviewer #3 (Public Review):

      In some species, supporting cells (SCs) of the cochlea can replace hair cells and thus restore hearing. In the mouse, neonatal SCs can also produce hair cells; however, this property is lost during early postnatal life. This study sought to test whether forced expression of two transcription factors normally associated with OHC development, Atoh1 and Ikzf2, can induce adult mammalian supporting cells to take on OHC-like properties. Using Cre-dependent expression in mice, the authors showed that co-expression of Atoh1 and Ikzf2 could induce a small number of adult SCs to express the OHC-specific gene, Prestin. This conversion was significantly enhanced when existing OHCs were ablated, in this case using a Prestin-DTR mouse model generated by the authors. A detailed phenotypic analysis combined with single cell RNA-sequencing (scRNA-seq) supports the idea that Atoh1/Ikzf2 can partially convert adult SCs into OHC-like cells. However, the conversion is not complete, with immature bundles and a gene signature that resembles P1 OHCs (and sometimes E16 OHCs) more than P7/P30 OHCs or P60 SCs. Accordingly, the new OHCs are not sufficient to restore hearing in the Prestin-DTR mouse model. Together, these data encourage optimism that adult SCs can be steered along the OHC path, though clearly more manipulations will be needed to produce mature, functional OHCs.

      The main weakness of the study is the scRNA-seq analysis, which depends on very small sample sizes. Suggestions to improve upon the analysis are listed under Specific Recommendations.

    2. Reviewer #2 (Public Review):

      The goal of this study is to devise a means of promoting adult mouse auditory sensory cell development from supporting cells (SCs), as occurs naturally in birds and fish following sensory cell death. Previous studies indicated that activating Atoh1, an early acting transcription factor that specifies sensory cell fate during embryogenesis, was not sufficient for such regeneration. The authors hypothesized that adding a second transcription factor, Ikzf2, which maintains outer hair cell (OHC) fate, would synergize with Atoh1 and push adult SCs to differentiate as OHCs. They tested this hypothesis by over-expressing both Atoh1 and Ikzf2 in supporting cells after killing the endogenous OHCs in adult cochleae. The authors showed that the induced cells first express the general HC marker, Myo6, and only later become Prestin-positive, much as occurs during normal development. Unfortunately, these induced OHC-like cells had abnormal stereocilia and did not restore auditory (ABR) thresholds. Moreover, there was a loss of IHCs (the primary auditory receptors) suggesting that much more is needed to induce a real OHC and to protect IHCs than simply inducing the two selected transcription factors. Single-cell RNAseq (scRNA-seq) results showed that the induced OHC-like cells are enriched for HC genes and depleted for SC genes, but overall are most similar to neonatal HCs as defined in published scRNA-seq data from other groups. Overall, the scRNA-seq data did not offer a clear path forward, other than to identify and test additional transcription factors that might push the induced cells to the next stage. Nevertheless, the extent of SC transformation is impressive and has not been seen in previous approaches. This is an important contribution to our understanding of the control of OHC gene expression and differentiation contributed by two important transcription factors.

    3. Reviewer #1 (Public Review):

      Mature mammalian hair cells in the cochlea do not regenerate after damage. The outer hair cells of the cochlea, which function to amplify sound, are particularly susceptible to damage. Ectopic activation of two key transcription factors for outer hair cell formation, Atoh1 and Ikzf2, in damaged adult cochlea is sufficient to convert supporting cells into hair cells expressing Prestin, which is an essential protein mediating outer hair cell functions. Although there is no functional recovery in these transgenic mice based on auditory brainstem response, this study paves the way for future design of models for hearing recovery. The main concern is the identity of the OHC-like cells drawn from the small sample size in the scRNA-seq experiments.

    4. Evaluation Summary:

      This manuscript demonstrated the effectiveness of combined activation of Atoh1 and Ikzf2 in converting adult supporting cells to outer hair cell (OHC)-like cells in a mouse model, in which the OHCs were selectively ablated with diphtheria toxin. The authors showed that while the number of regenerated hair cells was low and there was no functional recovery based on ABR, these OHC-like cells do express Prestin and exhibit a genetic profile that resembles nascent hair cells. This paper will be of great interest to researchers interested in hearing restoration, as well as regenerative biology.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

    1. Author Response:

      Evaluation Summary:

      This paper will be of considerable interest to anybody focusing on highly sensitive T cell antigen recognition. It uses an extended experimental protocol and analytical methods to assess very low T cell receptor binding affinities, and to determine how T cells discriminate between self- and non-self antigens. The main conclusions are well supported by the presented analysis and provide a novel view on a previously considered concept.

      Reviewer #1 (Public Review):

      The presented manuscript takes a comprehensive and detailed look at how T cell receptors (TCR) discriminate between self and non-self antigens. By extending a previous experimental protocol for measuring T cell receptor binding affinities against peptide MHC complexes (pMHC), the authors are able to determine very low TCR-pMHC binding affinities and, thereby, show that the discriminatory power of the TCR seems to be imperfect. Instead of a previously considered sharp threshold in discriminating between self and non-self antigen, the TCR can respond to very low binding affinities leading to a more transient affinity threshold. However, the analysis still indicates an improved discrimination ability for TCR compared to other cell surface receptors. These findings could impact how T cell mediated autoimmunity is studied.

      The authors follow a comprehensive and elaborate approach, combining in vitro experiments with analytical methods to estimate binding affinities. They also show that the general concept of kinetic proofreading fits their data, providing estimates of the number of proofreading steps and the corresponding rates. The statistical and analytical methods are well explained and outlined in detail within the Supplemental Material. The source of all data, and especially how the data to analyze other cell surface receptor binding affinities was extracted, are given in detail as well. Besides being able to quantify TCR-pMHC interactions for very low binding affinities, their findings will improve the ability to assess how autoimmune reactions are potentially triggered, and how potent anti-tumour T cell therapies can be generated.

      In summary, the study represents an elaborate and concise analysis of TCR-pMHC affinities and the ability of TCR to discriminate between self and non-self antigens. All conclusions are well supported by the presented data and analyses without major caveats.

      Reviewer #2 (Public Review):

      The paper revisits the question of ligand discrimination ability of TCRs of T cells. The authors find that the commonly held notion of very sharp discrimination between strongly and weakly binding peptides does not hold when the affinities of the weak peptides are re-measured more accurately, using their own new method of calibration of SPR measurements. They are able to phenomenologically fit their results with a ~2 step Kinetic Proofreading model.

      It is a very carefully researched and thorough paper. The conclusions seem to be supported by the data and fundamental for our understanding of the T cell immune response with potentially very high impact in many scientific and applied fields. The calibration method could be of potential use in other cases where low affinities are an issue.

      As a non-expert in the details of experimental technique, it is somewhat difficult to understand in detail the Ab calibration of the SPR curve, which is a central piece of the paper. The main question is: what are the grounds (theoretical and/or empirical) to expect that the B_max of the TCR dose response curve will continue to be proportional to the plateau level of the Ab? Figure 1D does suggest that, but it would be hard to predict what proportionality shape the curve will take for lower affinity peptides. Given that essentially all the paper's claims rest on this assumption, this should be explained/reasoned/supported more clearly.

      We have revised the relevant Results and Methods sections to provide additional information. This information should clarify the expected relationship between Bmax and W6/32 binding. We emphasise that we have only interpolated within the curve and therefore, have not relied on any assumptions about the relationship between these two values outside of the empirical curve that we have generated.
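
      The interpolation described in this response can be sketched as follows. This is only an illustrative sketch, not the authors' analysis code: the calibration values, the function names `interpolate_bmax` and `fit_kd_fixed_bmax`, and the one-site binding model with a linearised least-squares fit are all assumptions made here for illustration. In practice a nonlinear fit with proper error weighting would likely be preferred.

```python
import numpy as np

# Hypothetical calibration curve: antibody (W6/32) plateau response versus
# the empirically measured TCR Bmax. All numbers are invented for illustration.
w632_plateau = np.array([200.0, 400.0, 600.0, 800.0, 1000.0])
tcr_bmax = np.array([55.0, 118.0, 170.0, 236.0, 290.0])


def interpolate_bmax(w632_level):
    """Interpolate Bmax inside the calibration range; refuse to extrapolate."""
    if not (w632_plateau.min() <= w632_level <= w632_plateau.max()):
        raise ValueError("W6/32 level lies outside the empirical curve")
    return float(np.interp(w632_level, w632_plateau, tcr_bmax))


def fit_kd_fixed_bmax(conc, response, bmax):
    """Fit KD of a one-site model R = Bmax*C/(KD + C) with Bmax held fixed.

    Rearranging gives R/(Bmax - R) = C/KD, a line through the origin,
    so 1/KD is the least-squares slope of the transformed response.
    """
    conc = np.asarray(conc, dtype=float)
    response = np.asarray(response, dtype=float)
    y = response / (bmax - response)
    slope = np.sum(conc * y) / np.sum(conc * conc)
    return 1.0 / slope


# Usage with noise-free synthetic data for a weak binder (assumed KD = 1500).
bmax = interpolate_bmax(500.0)
conc = np.array([10.0, 20.0, 50.0, 100.0, 200.0])
response = bmax * conc / (1500.0 + conc)
kd_estimate = fit_kd_fixed_bmax(conc, response, bmax)
```

      Holding Bmax fixed is what makes a weak binder tractable here: none of the synthetic concentrations approaches the assumed KD, so the curve never nears saturation and Bmax could not be estimated reliably from these points alone.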

      On the theoretical side, I think the scaling α ≈ 2 in Figure 2 is indeed consistent with a two-step KPR amplification. However, there are some questions regarding the fitting of the full model to the P_15 of the CD69 response. As explained in the Supplementary Material the authors use 3 global and 2 local parameters resulting in 37 (or 27) parameters for 32 data points. To a naive reader this might look excessive and prone to overfitting. On the other hand, looking at Figure S8 shows the value ranges of lambda and k_p are quite tight. This is in contrast to gamma and delta that look completely unconstrained.

      We have revised the relevant Results section to explicitly indicate that the number of data points exceeds the number of free parameters, which together with the ABC-SMC results, should provide additional confidence that we are not over-fitting.
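
      For readers unfamiliar with why a scaling exponent near 2 points to about two proofreading steps, the textbook kinetic proofreading argument can be sketched numerically. This is a minimal sketch of the basic model only, assuming each step proceeds at rate `k_p` and competes with unbinding at rate `k_off`; it is not the full model fitted in the manuscript, and the function names are hypothetical.

```python
import numpy as np


def kp_signalling_fraction(k_off, k_p, n_steps):
    """Fraction of binding events that complete n_steps proofreading steps
    before the ligand unbinds (basic kinetic proofreading)."""
    return (k_p / (k_p + k_off)) ** n_steps


def local_log_slope(k_off, k_p, n_steps, eps=1e-4):
    """Numerical d log(fraction) / d log(k_off) around the given k_off."""
    lo = kp_signalling_fraction(k_off * (1.0 - eps), k_p, n_steps)
    hi = kp_signalling_fraction(k_off * (1.0 + eps), k_p, n_steps)
    return (np.log(hi) - np.log(lo)) / (np.log(1.0 + eps) - np.log(1.0 - eps))


# Fast-unbinding (low-affinity) regime: the slope approaches -n_steps.
slope_weak = local_log_slope(k_off=1e4, k_p=1.0, n_steps=2)
# Slow-unbinding (high-affinity) regime: the slope flattens towards 0.
slope_strong = local_log_slope(k_off=1e-4, k_p=1.0, n_steps=2)
```

      In the fast-unbinding regime the signalling fraction falls off as the inverse of `k_off` raised to the number of steps, which is why a measured exponent close to 2 is compatible with roughly two steps.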

      Finally, one of the stated advantages of the adaptive proof-reading model is that it is capable of explaining antagonism. It is hard to see how a "vanilla" KPR model is capable of explaining antagonism.

      We have added a discussion paragraph to discuss antagonism, which cannot be explained by the basic KP model that we found is sufficient to explain our data on antigen discrimination in the presence of self pMHCs on autologous APCs. We describe how the methods we have employed can be used to study antagonism.

      Reviewer #3 (Public Review):

      Pettmann et al. aimed at significantly improving the accuracy of SPR-based measurements of low affinity TCR-pMHC interactions by including a 100% binding control (injection of a conformation-specific HLA-antibody) in the surface plasmon resonance protocol. Interpolating with the information of saturated pMHC binding on the chip, the authors arrive at KDs for low affinity binders that are significantly higher than the previously reported constants. If correct, this has considerable ramifications for the interpretations of the results obtained from functional assays measuring the T cell response towards pMHCs featured in a titrated fashion. Unlike what was put forward by earlier reports, the authors conclude that the discriminatory power of TCRs is far from perfect, as T cells still respond to low affinity pMHC-ligands without a sharp affinity threshold. This is also because they managed to detect T cells responding to even ultra-low affinity ligands if provided in sufficient numbers.

      The body of work convinces in several regards:

      (i) It is exceedingly well thought out and introduces a quality of analytical strength that is absent in most of the literature published thus far on this topic.

      (ii) At the same time theoretical arguments are bolstered by a large body of experimental "wet" work, which combines a synthetic approach with cellular immunology and which appears overall well executed.

      (iii) The data lead to hypotheses in the field of T cell antigen recognition in general and in the theatre of autoimmunity, cancer and infectious diseases.

      There are a few aspects that may limit the impact of the study. I have listed them below:

      (i) The study does not provide kinetic data for the low affinity ligand-TCR binding but rather argues from the position of affinities as determined via Bmax. This limits somewhat the robustness of the statements made with regard to kinetic proofreading.

      We agree with this statement and are hoping to directly measure off-rates in the future. We note that in the published literature, including our own work, point mutations to the peptide generally modify the off-rate with only minor impact on the on-rate. An example of this can be found in Lever et al (2016) PNAS where point mutations led to 100,000-fold change in the off-rate but only a 10-fold change in the on-rate. This likely explains why antigen potency is often well-correlated with affinity when using point mutations to the peptide.

      (ii) Thresholds for readouts were arbitrarily chosen (e.g. 15% activation). It appears such choices were based on system behavior (with the largest differences observed among the groups) but may have implications for the drawn conclusions.

      We have chosen 15% in order to capture the ultra-low affinity pMHCs in our potency plots and have now added a sentence for why we have chosen this particular threshold. We did explore different thresholds but found that they produced similar values of α. The precise threshold could change the estimate of α if the shape of dose-response curves was dependent on antigen affinity but we did not find any evidence for this within our data.
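
      The argument about threshold choice can be illustrated with a short sketch. It assumes Hill-type dose-response curves and a power-law relation between potency and KD with exponent α; the function names, the Hill form, and all numbers are assumptions for illustration, not the authors' fitting procedure. When the curve shape does not depend on affinity, the threshold only shifts every potency value by the same factor, so the fitted slope is unchanged.

```python
import numpy as np


def potency_at_threshold(ec50, hill, threshold, e_max=100.0):
    """Dose giving `threshold` % activation on a Hill curve
    E(d) = e_max * d**h / (ec50**h + d**h), solved for d."""
    return ec50 * (threshold / (e_max - threshold)) ** (1.0 / hill)


def fit_alpha(kd, potency):
    """Slope of log10(potency) against log10(KD)."""
    slope, _intercept = np.polyfit(np.log10(kd), np.log10(potency), 1)
    return slope


kd = np.array([1.0, 10.0, 100.0, 1000.0])  # hypothetical KD values
ec50 = 1e-3 * kd ** 2.0                     # assumed power law with alpha = 2

# Same curve shape for every ligand: alpha does not depend on the threshold.
alphas_same_shape = [
    fit_alpha(kd, potency_at_threshold(ec50, hill=1.0, threshold=t))
    for t in (5.0, 15.0, 50.0)
]

# Affinity-dependent shape (Hill coefficient varies with KD): alpha shifts.
varying_hill = np.array([1.0, 1.5, 2.0, 3.0])
alpha_varying_shape = fit_alpha(
    kd, potency_at_threshold(ec50, hill=varying_hill, threshold=15.0)
)
```

      Comparing `alphas_same_shape` across thresholds with `alpha_varying_shape` mirrors the check described above: the estimate of α is only sensitive to the threshold if curve shape changes with affinity.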

      In summary, the work presented contributes to demystifying the link between TCR-engagement and (membrane proximal) signaling. It also provides a fresh perspective on the potential of TCR cross-reactivity.

    2. Reviewer #3 (Public Review):

      Pettmann et al. aimed at significantly improving the accuracy of SPR-based measurements of low affinity TCR-pMHC interactions by including a 100% binding control (injection of a conformation-specific HLA-antibody) in the surface plasmon resonance protocol. Interpolating with the information of saturated pMHC binding on the chip, the authors arrive at KDs for low affinity binders that are significantly higher than the previously reported constants. If correct, this has considerable ramifications for the interpretations of the results obtained from functional assays measuring the T cell response towards pMHCs featured in a titrated fashion. Unlike what was put forward by earlier reports, the authors conclude that the discriminatory power of TCRs is far from perfect, as T cells still respond to low affinity pMHC-ligands without a sharp affinity threshold. This is also because they managed to detect T cells responding to even ultra-low affinity ligands if provided in sufficient numbers.

      The body of work convinces in several regards:

      (i) It is exceedingly well thought out and introduces a quality of analytical strength that is absent in most of the literature published thus far on this topic.

      (ii) At the same time theoretical arguments are bolstered by a large body of experimental "wet" work, which combines a synthetic approach with cellular immunology and which appears overall well executed.

      (iii) The data lead to hypotheses in the field of T cell antigen recognition in general and in the theatre of autoimmunity, cancer and infectious diseases.

      There are a few aspects that may limit the impact of the study. I have listed them below:

      (i) The study does not provide kinetic data for the low affinity ligand-TCR binding but rather argues from the position of affinities as determined via Bmax. This limits somewhat the robustness of the statements made with regard to kinetic proofreading.

      (ii) Thresholds for readouts were arbitrarily chosen (e.g. 15% activation). It appears such choices were based on system behavior (with the largest differences observed among the groups) but may have implications for the drawn conclusions.

      In summary, the work presented contributes to demystifying the link between TCR-engagement and (membrane proximal) signaling. It also provides a fresh perspective on the potential of TCR cross-reactivity.

    3. Reviewer #2 (Public Review):

      The paper revisits the question of ligand discrimination ability of TCRs of T cells. The authors find that the commonly held notion of very sharp discrimination between strongly and weakly binding peptides does not hold when the affinities of the weak peptides are re-measured more accurately, using their own new method of calibration of SPR measurements. They are able to phenomenologically fit their results with a ~2 step Kinetic Proofreading model.

      It is a very carefully researched and thorough paper. The conclusions seem to be supported by the data and fundamental for our understanding of the T cell immune response with potentially very high impact in many scientific and applied fields. The calibration method could be of potential use in other cases where low affinities are an issue.

      As a non-expert in the details of experimental technique, it is somewhat difficult to understand in detail the Ab calibration of the SPR curve, which is a central piece of the paper. The main question is: what are the grounds (theoretical and/or empirical) to expect that the B_max of the TCR dose response curve will continue to be proportional to the plateau level of the Ab? Figure 1D does suggest that, but it would be hard to predict what proportionality shape the curve will take for lower affinity peptides. Given that essentially all the paper's claims rest on this assumption, this should be explained/reasoned/supported more clearly.

      On the theoretical side, I think the scaling α ≈ 2 in Figure 2 is indeed consistent with a two-step KPR amplification. However, there are some questions regarding the fitting of the full model to the P_15 of the CD69 response. As explained in the Supplementary Material the authors use 3 global and 2 local parameters resulting in 37 (or 27) parameters for 32 data points. To a naive reader this might look excessive and prone to overfitting. On the other hand, looking at Figure S8 shows the value ranges of lambda and k_p are quite tight. This is in contrast to gamma and delta that look completely unconstrained.

      Finally, one of the stated advantages of the adaptive proof-reading model is that it is capable of explaining antagonism. It is hard to see how a "vanilla" KPR model is capable of explaining antagonism.

    4. Reviewer #1 (Public Review):

      The presented manuscript takes a comprehensive and detailed look at how T cell receptors (TCR) discriminate between self and non-self antigens. By extending a previous experimental protocol for measuring T cell receptor binding affinities against peptide MHC complexes (pMHC), the authors are able to determine very low TCR-pMHC binding affinities and, thereby, show that the discriminatory power of the TCR seems to be imperfect. Instead of a previously considered sharp threshold in discriminating between self and non-self antigen, the TCR can respond to very low binding affinities leading to a more transient affinity threshold. However, the analysis still indicates an improved discrimination ability for TCR compared to other cell surface receptors. These findings could impact how T cell mediated autoimmunity is studied.

      The authors follow a comprehensive and elaborate approach, combining in vitro experiments with analytical methods to estimate binding affinities. They also show that the general concept of kinetic proofreading fits their data, providing estimates of the number of proofreading steps and the corresponding rates. The statistical and analytical methods are well explained and outlined in detail within the Supplemental Material. The source of all data, and especially how the data to analyze other cell surface receptor binding affinities was extracted, are given in detail as well. Besides being able to quantify TCR-pMHC interactions for very low binding affinities, their findings will improve the ability to assess how autoimmune reactions are potentially triggered, and how potent anti-tumour T cell therapies can be generated.

      In summary, the study represents an elaborate and concise analysis of TCR-pMHC affinities and the ability of TCR to discriminate between self and non-self antigens. All conclusions are well supported by the presented data and analyses without major caveats.

    5. Evaluation Summary:

      This paper will be of considerable interest to anybody focusing on highly sensitive T cell antigen recognition. It uses an extended experimental protocol and analytical methods to assess very low T cell receptor binding affinities, and to determine how T cells discriminate between self- and non-self antigens. The main conclusions are well supported by the presented analysis and provide a novel view on a previously considered concept.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors)

    1. Reviewer #3 (Public Review):

      Sun et al have assembled, modified, and applied a series of existing gene editing tools to tissue-derived human fetal lung organoids in a workflow they have termed "Organoid Easytag". Using approaches that have previously been applied in iPSCs and other cell models in some cases including organoids, the authors demonstrate: 1) endogenous loci can be targeted with fluorochromes to generate reporter lines; 2) the same approach can be applied to genes not expressed at baseline in combination with an excisable, constitutively active promoter to simplify identification of targeted clones; 3) that a gene of interest could be knocked-out by replacing the coding sequence with a fluorescent reporter; 4) that knockdown or overexpression can be achieved via inducible CRISPR interference (CRISPRi) or activation (CRISPRa). In the case of CRISPRi, the authors alter existing technology to lessen unwanted leaky expression of dCas9-KRAB. While these tools have previously been applied in other models, their assembly and demonstrated application to tissue-derived organoids here could facilitate their use in tissue-derived organoids by other groups.

      Limitations of the study include:

      1) the demonstrated application of these technologies to only a limited set of gene targets;

      2) a lack of detail demonstrating the efficiency and/or kinetics of the approaches demonstrated.

      While access to human fetal lung organoids is likely not available to many or most researchers, it is probable that the principles applied here could carry over to other organoid models.

    2. Reviewer #2 (Public Review):

      There is now a considerable body of knowledge about the genetic and cellular mechanisms driving the growth, morphogenesis and differentiation of organs in experimental organisms such as mouse and zebrafish. However, much less is known about the corresponding processes in developing human organ systems. One powerful strategy to achieve this important goal is to use organoids derived from self-renewing, bona fide progenitor cells present in the fetal organ. The Rawlins lab has pioneered the long-term culture of organoids derived from multipotent epithelial progenitors located in the distal tips of the early human lung. They have shown that clonal cell "lines" can be derived from the organoids and that they are capable of not only long-term self-renewal but also limited differentiation in vitro or after grafting under the kidney capsule of mice. Here, they now report a strategy to efficiently test the function of genes in the embryonic human lung, regardless of whether the genes are actively transcribed in the progenitor cells. The strengths of the paper are that the authors describe a number of different protocols (work-flows), based on CRISPR/Cas9 and homology directed repair, for making fluorescent reporter alleles (suitable for cell selection) and for inducible over-expression or knockout of specific genes. The so-called "Easytag" protocols and results are carefully described, with controls. The work will be of significant interest to scientists using organoids as models of many human organ systems, not just the lung. The weaknesses are that the authors do not show that their lines can undergo differentiation after genetic manipulation, and therefore do not provide proof of principle that they can determine the function in human lung development of genes known to control mouse lung epithelial differentiation. It would also be of general interest to know whether their methods based on homologous recombination are more accurate (fewer incorrect targeting events or off target effects) than methods recently described for organoid gene targeting using non homologous repair.

    3. Reviewer #1 (Public Review):

      The authors demonstrate applications including fluorescent marking of membranes with GFP or monomeric RFP, reporter alleles for convenient assessment of differentiation status based on fluorescence, and targeted gene knockout. They also demonstrate conditional gene knockdown and induction with tight control achieved by engineering a protein destabilizing domain. The design of the constructs is clever and imparts the ability to leverage iterative FACS to enrich successfully targeted cells, particularly useful when targeting alleles that are not actively expressed by the progenitors. The work is well done and clearly presented.

    4. Evaluation Summary:

      In this paper Sun and colleagues aimed to demonstrate the feasibility of using CRISPR-based gene editing techniques applied to tissue-derived human fetal lung organoids. While previous studies have used CRISPR-Cas9 to perform knock-in or knock-out studies in organoids (such as intestinal, hepatic or tumor organoids), this is the first report to apply these tools to a tissue-derived lung organoid model. A major strength of this report is the additional use of CRISPRi and CRISPRa technologies. The work is well done, clearly presented and makes an important contribution to the literature.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their name with the authors.)

    1. Author Response:

      Reviewer #1:

      This manuscript reports the results of two timing experiments. The experimental paradigm asks participants to judge the time of target items in an unfilled interval between two landmark stimuli. In experiment 1, there is one item that must be judged. In experiment 2, there are two items to be judged. The basic empirical result is that relative order judgments in experiment 2 are more accurate than one might expect from the absolute timing judgments of experiment 1. A model is presented.

      My overall reaction is that this paper does not present a sufficiently noteworthy empirical result. I can't imagine that there is a cognitive psychologist studying memory who would be surprised by the finding that relative order judgments in the second experiment are more accurate than one might expect from the absolute judgments in experiment 1. On the encoding side, in these really short lists (with no secondary task), there is nothing preventing the participant from noting and encoding the order as the items are presented (not unlike the recursive reminding). On the retrieval side, we've known for a very long time that judgments of serial position use temporal landmarks (see for instance a series of remarkable studies by Hintzman and colleagues circa 1970).

      Methodologically, this paper falls short of the standards one would expect for a cognitive psychology paper. There are basically no statistics or description of the distribution of the effect across participants. Although I'm pretty well convinced of the basic finding (distributions in experiment 2 are different from experiment 1), I could not begin to guess at an effect size. The model is not seriously evaluated. The bimodal distributions are a large qualitative discrepancy that is not really discussed.

      Although the title of the paper invites us to understand these results as telling us something about episodic memory, the empirical burden of this claim is not carried. Amnesia patients (and animals with hippocampal lesions) show relatively subtle differences in timing tasks. There is no evidence presented here, nor literature review, to convince the reader of this point.

      Reviewer #1

      We regret that the reviewer did not focus on the main results of the paper, and limited their remarks to just one analysis comparing the relative order precision to the one predicted from the naive assumption of independent absolute time judgements for each item. This analysis was done to confirm that relative order is quite precisely remembered for short lists, which is indeed not surprising, but we still did it in order to get a quantitative estimate of ordering mistakes that we needed for our Bayesian experiments. Another purpose was to filter out the participants that don’t pay attention to the task (a common problem when performing experiments over the internet).

      Regarding the title of the paper, we are not aware of experiments similar to ours done with amnesic patients. However, we take the reviewer's point that the relation of our experiments to episodic memory as usually understood is not direct, so we removed the word 'episodic' from the title in the revised version. We also added statistical analysis of the results.

      Reviewer #2:

      In this manuscript, the authors set out to measure participants' decisions about when an item occurred in a short list of 3 or 4 items, where the first and last items were always at the beginning and end, respectively. They report two behavioral studies that examine time judgments to items in the intermediate positions. They show that time judgments (when did you see X item using a continuous line scale) are always a little off but, more importantly, they tend to be anchored to other items presented. The results are interesting and add to our knowledge of the representation of time in the brain mainly by introducing a new paradigm with which to study time. Within the broader context of research on timing capacities, it should not be surprising that participants do not have a continuous representation of time that lasts beyond traditional time interval training of a few hundred milliseconds to a few seconds. Furthermore, research has also shown that 'events' that require attentional resources do morph our perception and memory for time. So while the paradigm is worth expanding on, the behavioral results are not surprising given this past literature. I do feel however that this work is an important first step in developing a more firm model of memory for time.

      Reviewer #2

      Indeed, as mentioned above in response to Reviewer #1, we are not surprised that subjects do not remember the absolute presentation time well, especially when several items are involved. Exactly what they remember is the main point of this study, and the model is quite crucial in understanding what we believe is our novel result about how ordinal and absolute time representations interact in memory. The reviewer did not seem to appreciate this; rather, they re-formulated our results as time judgments (when did you see X item using a continuous line scale) being 'anchored' to other items presented. We are not sure exactly what this means; probably that, on average, the difference between the reported times of different items stayed almost constant for each presentation condition. However, our study not only presented this result but also showed how it follows from the Bayesian theory.

    1. Reviewer #3:

      The authors hypothesized lower GABA levels in older adults would influence cortico-cortical phase relationships more than cortico-muscular phase relationships during performance of a bimanual motor task. To this end, they evaluated the mediating role of endogenous bilateral sensorimotor cortex GABA content in relation to behavioral performance and patterns of interhemispheric and cortico-muscular electrophysiological phase coherence during a bimanual motor control task. The central finding was that the mediating influence of right M1 GABA on the relationship between cortico-cortical electrophysiology and behavior diverged between the younger and older groups, with lower endogenous GABA concentrations potentially benefitting bimanual motor performance in young adults and hindering performance in older adults. The result was specific to right M1 GABA, raising questions about hemispheric asymmetry, and behavioral performance differed substantially between groups, possibly influencing the sensitivity of the analyses of the electrophysiological phase relationships. Moreover, several earlier studies suggest endogenous M1 GABA content relates to cortico-muscular excitability measurements, other than phase synchrony, and it is unclear what distinguishes phase synchrony from these other measurements. The behavioral, MRS, and electrophysiological methods employed are fairly well-established and are combined in a novel manner. The Bayesian moderated mediation analysis represents a new approach to evaluating relationships between these measures under the moderating influence of age. The central questions concerning the roles of cortical endogenous GABA in bimanual control, and in age-related changes in motor control more generally, are important for determining the neural computations underlying flexible and precise behavior.

      1) The total number of finger taps within the 2000 ms transition epoch likely differed between groups and could influence the ISPC measures. It would be helpful to rule out this possibility by examining relationships between ISPC measures and the total number of taps.

      2) The differences between right and left M1 are somewhat surprising and merit further attention, particularly given the cortico-cortical ISPC results. The interpretation provided in the discussion (lines 607-618) is not particularly satisfying since this asymmetry is a critical feature of a key result. Can the authors leverage their own data to provide further insight into why RM1 GABA+ may be more likely to exhibit a relationship than LM1 GABA+? Would analyzing the behavioral data separately for the left and right hands provide further insight? Does the non-dominant hand lag behind the dominant hand, and/or is it more susceptible to errors?

      3) There were some general issues concerning the GABA+ data:

      a. Figure 2a suggests an interaction in the pattern of variance in the GABA+ data between the Young and Older groups for the LM1 and RM1 voxels. Is this interaction in variance significant, and if so, what might this mean for the M1 GABA+ results? Specifically, Young show greater variance for LM1, and Older show greater variance for RM1. Also, Young appear to show considerably lower variance for RM1 than LM1. However, the data in Figure 2 supplement 2 suggest that variance in the Young is similar between LM1 and RM1. Do these numbers accurately reflect the data depicted in Figure 2a?

      b. It would be helpful to show the difference spectra in Figure 2 supplement 1b with separate plots for Young and Older.

      c. Figure 2, supplement 1a: Was the LM1 voxel more dorsal and medial than the RM1 voxel?

      4) The authors interpret the decrease in failure and increase in error rate across the task in the Older group as an indication of a loss of precision over time. Alternatively, might this pattern also arise because these participants are becoming faster at correcting their errors (i.e. within 2000 ms), avoiding trials from being categorized as a failure? More generally speaking, it would be helpful if the authors provided additional information about the cumulative error rate trials and what behavior looked like on these trials.

      5) The authors should provide further justification for the assignment of age as the moderator and GABA+ as the mediator in their statistical model. Conceptually, it seems these factors could be reversed.

      6) Several studies have established relationships between transcranial magnetic stimulation measures of cortico-muscular excitability and endogenous GABA+ content in the dominant M1. The manuscript would benefit from further discussion of the relationship of the phase connectivity measurements used here in comparison to these other previous studies.

      7) It is not clear that data or analysis code are available.

    2. Reviewer #2:

      I like this type of multimodal study, and I think that the rationale for the study is good. I am not, however, convinced about the results/conclusions provided. Here are my main points:

      I don't agree with your conclusion that the mediating role of GABA changes in aging. This requires longitudinal data; the cross-sectional approach in this study can only support conclusions about differences between groups, since only one time point is available.

      There is no age interaction; this is surprising to me, since there are age differences.

      Compensatory explanation: Is there a correlation with performance? If there isn't, the proposal of compensatory mechanisms is unclear, since it is then not obvious what the compensation is for.

    3. Reviewer #1:

      The authors have acquired a substantial multimodal dataset and have used careful statistical approaches throughout. The data are acquired and analysed using appropriate methods.

      Overall, this is an impressive body of work that aims to answer an interesting question. However, a number of questions over the methods and interpretation make the authors' conclusions difficult to justify.

      When comparing between older and younger adults it would also be helpful to know the amount of grey matter in the voxels of interest. It might be expected that older adults have more atrophy, and therefore lower GABA+, than younger adults, and this should be controlled for in the statistical models. The authors have put assumptions into their quantification, which are reasonable but are still assumptions. It would be helpful to directly test for a difference in grey matter fraction in the voxel between the two groups, and include this in the model if necessary.

      The authors then look at behaviour, where they use a previously described task which consists of bimanual tapping, with switching between two patterns. The results are complex as there are a number of behavioural metrics, and no clear pattern emerges. While older adults produced more errors in continuation, they also produced more fully correct switching transitions. Older subjects were slower than younger adults in all trials. While this task produces a very rich dataset, which is helpful for analysing complex behaviour, it is not clear how each metric should be interpreted in terms of the underlying neural mechanisms or how the metrics can be usefully combined; further explanation could be given.

      In terms of connectivity, the authors found no significant group X task difference between in-phase and anti-phase conditions. They therefore look at the groups and tasks separately. They show different changes in connectivity between age groups in different frequency bands, for example between left and right M1 in the alpha/mu and beta, between EMG and left M1 in the theta band. I am not sure that describing EEG-EMG connectivity as cortico-spinal is strictly accurate, as there may be a number of other factors involved; corticomuscular would seem to be more precise. The frequency bands used are not typical, and it would be helpful to have an a priori explanation of which are being tested and why, as well as details about correction for multiple comparisons across these bands.

      Finally, the authors bring their GABA, behaviour and connectivity metrics together in a number of mediation analyses. They demonstrate a relationship between cortico-cortical connectivity and behaviour, which is mediated by age.

      The authors describe their finding of higher GABA+ in the occipital cortex as a posterior-anterior gradient, which I think is not justified by the results - there could be a number of other reasons for this, for example that different functional networks have different GABA+ levels, which is not related to their anatomical position. With only three voxels it is difficult to make a general claim such as this, and this should probably be reworded.

      The authors state that higher GABA+ indicated neural system integrity and better functioning in the older group. This seems to be rather over-interpreting their results - there are many other metrics of integrity and functioning that have not been assessed here. I would suggest rewording.

    1. Author Response:

      Reviewer #1 (Public Review):

      This paper presents the exciting statement that increasing viral loads within a community can be used as an epidemiological early-warning indicator preceding increased positivity. To support this claim, it would be interesting to present both Ct and positivity on the same graph to demonstrate that declining Ct can indeed be used as an early marker of a COVID-19 epidemic wave. The percentage of positive tests should not only include those obtained in the present study but should also be compared with "national data", as the present study design includes a bias in patient selection that might not reflect the "true" situation at the time. Only with this comparison could we claim that the present study design could predict COVID-19 epidemic waves. A correlation of Ct with clinical evidence to rank the confidence of positive results is also included and further supports the high specificity of the RT-PCR for detecting SARS-CoV-2 (99.995%).

      In a serological investigation, it was observed that some of these RT-PCR-positive cases do not appear to seroconvert and that possible re-infections might occur despite the presence of anti-spike antibodies. Although reported in only a few individuals, and therefore to be taken with extreme caution, this adds information about the still poorly understood serological response of COVID-19 patients and would be of utmost importance in the context of the current vaccination campaign.

      We do not agree that this study is biased in terms of patient selection – it invites randomly selected households to join the survey and is in fact the major source of unbiased surveillance data in the 4 nations of the United Kingdom.

    2. Reviewer #1 (Public Review):

      This paper presents the exciting statement that increasing viral loads within a community can be used as an epidemiological early-warning indicator preceding increased positivity. To support this claim, it would be interesting to present both Ct and positivity on the same graph to demonstrate that declining Ct can indeed be used as an early marker of a COVID-19 epidemic wave. The percentage of positive tests should not only include those obtained in the present study but should also be compared with "national data", as the present study design includes a bias in patient selection that might not reflect the "true" situation at the time. Only with this comparison could we claim that the present study design could predict COVID-19 epidemic waves. A correlation of Ct with clinical evidence to rank the confidence of positive results is also included and further supports the high specificity of the RT-PCR for detecting SARS-CoV-2 (99.995%).

      In a serological investigation, it was observed that some of these RT-PCR-positive cases do not appear to seroconvert and that possible re-infections might occur despite the presence of anti-spike antibodies. Although reported in only a few individuals, and therefore to be taken with extreme caution, this adds information about the still poorly understood serological response of COVID-19 patients and would be of utmost importance in the context of the current vaccination campaign.

    3. Evaluation Summary:

      The authors present a systematic and complete study of Ct (cycle threshold) values in RT-PCR tests and gene-specific positivity for the UK ONS infection survey. There are very few datasets like this for any viral pathogen, regardless of pandemics. The patterns are fascinating and thought-provoking.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #3 (Public Review):

      In this revised manuscript (Oon and Prehoda), the authors performed additional live-imaging experiments and recorded aPKC and actin dynamics simultaneously in larval neuroblasts. They also provide evidence that aPKC polarization is lost upon F-actin disruption by Latrunculin A treatment. These are great improvements. The pulsatile dynamics of actin and myosin II shown in the manuscript are compelling. The images presented in this manuscript are of high quality and impressive.

      However, the pulsatile apical myosin network in delaminating neuroblasts in Drosophila embryos was reported previously (An Y. et al., Development, 2017). This important and relevant paper should be cited in the introduction of the current manuscript. Therefore, the finding on the pulsatile actomyosin in larval brain neuroblasts reported in this manuscript is not a totally novel discovery. Another major concern is that Lat-A did not specifically disrupt actomyosin pulsatile movements, as it generally disrupts the F-actin network. So these experiments only strengthened the link between the F-actin network and Par polarity (which was already demonstrated in Kono et al., 2019; Oon and Prehoda, 2019). Low doses of Cytochalasin D are known to disrupt myosin pulses while still allowing the assembly of the actomyosin network (Mason et al., Nature Cell Biology 2014). The authors should treat neuroblasts with low doses of CytoD to disrupt only actomyosin pulses, not the entire F-actin network, and examine the effect on Par polarity. It is also worthwhile to knock down sqh to disrupt apical pulsatile actin dynamics. Besides, most of the concerns previously raised by the reviewer were not addressed in the revised manuscript.

    2. Reviewer #2 (Public Review):

      Previously, Oon and Prehoda showed apically directed movement of aPKC clusters during polarization of the neuroblast prior to asymmetric cell division. They found that these movements required F-actin, but the distribution of F-actin has only been reported for later stages of neuroblast polarization and division. Here, the authors report pulses of cortical F-actin during interphase, followed by an apically directed flow at the onset of mitosis, a strong apical accumulation of F-actin at metaphase and anaphase, followed by fragmentation and basally directed flow of the fragments. aPKC clusters are shown to colocalize with the F-actin networks as they flow apically. The F-actin networks are also shown to have partial colocalization with non-muscle myosin II, suggesting a possible mechanism for their movement. Finally, the authors solidify the results of actin inhibitor studies from their 2019 study by showing that reported effects on aPKC localization are preceded by F-actin loss, as would be expected but was not previously shown. Overall, the Research Advance extends the past study by more directly showing the involvement of F-actin and myosin in the apical localization mechanism of aPKC, and by describing F-actin and myosin dynamics prior to this transition. The following concerns should be addressed.

      1) The pulsatile nature of broad F-actin networks is evident during interphase, but these pulsations substantially subside upon entry into mitosis, and at this stage an apically directed flow of F-actin is the main behavior evident. This transition from pulses to flow is evident in both the movies and the kymographs of the F-actin probe. However, the authors state that the pulsations continue at the onset of mitosis and as the apical cap of aPKC matures. It is unclear whether the apical flow of aPKC and F-actin is associated with small-scale defined F-actin pulses, or small-scale random fluctuations of F-actin. The F-actin flow alone is an informative finding. The authors should consider revising their descriptions of these data (including in the manuscript title), or provide clearer examples of defined F-actin pulsations during the stage when aPKC polarizes.

      2) I checked the main text, methods, figures and figure legends, but could not find listings of sample sizes. Thus, the reproducibility of the findings has not been reported.

    3. Reviewer #1 (Public Review):

      Oon and Prehoda report pulsatile contraction of apical membrane in the process of Par protein polarization in Drosophila neuroblasts. This explains how/why actin filament was required to localize/polarize Par complex. Specifically, using spinning disc confocal microscopy with high temporal resolution, they found directed actin movement toward the apical pole, which nicely correlates with the concentration of aPKC. They also show that myosin II is involved in this pulsatile movement of actin filament. This very much resembles the observation in C. elegans embryos, and nicely unifies observations across systems. Although descriptive in nature, I think this is an important observation and indicates a universal mechanism by which cells are polarized. I think this is a well-executed study.

    4. Evaluation Summary:

      Oon and Prehoda report pulsatile contraction of apical membrane in the process of Par protein polarization in Drosophila neuroblasts. This explains how/why actin filament was required to localize/polarize Par complex. This very much resembles the observation in C. elegans embryos, and nicely unifies observations across systems.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

    1. Reviewer #2:

      This study reports a new cell line model for Dyskeratosis congenita, generated by introducing a disease-causing mutation, DKC1 A386T, into human iPS-derived type II alveolar epithelial cells (iAT2). The authors found that the mutant cells failed to form organoids after serial passaging and displayed hallmarks of cellular senescence and telomere shortening. Transcriptomics analysis of the mutant cells unveiled defects in Wnt signaling and down-regulation of the downstream shelterin complex components. Finally, treating the mutant cells with a Wnt agonist, the GSK3 inhibitor CHIR99021, can rescue these defects and enhance telomerase activity. Overall, the study is well designed and executed. Data presented are generally clear and convincing. The new model presented here can be of great interest to the field for studying the effects of DC disease-causing mutants in diverse cell types.

    2. Reviewer #1:

      In this manuscript, Fernandez et al examine the impact of defective telomere length maintenance on type II alveolar epithelial cells, which are thought to be central to the pathogenesis of pulmonary fibrosis in dyskeratosis congenita (DC) and related telomere biology disorders. Murine models have been used to address how telomere dysfunction in AT2 cells drives pulmonary fibrosis; however, these models have limitations. Therefore, the investigators' study of human AT2 cells/organoids derived from induced pluripotent stem cells (iAT2 cells) in the presence and absence of a known DC pathogenic variant provides an exceptional model. In addition, the investigators use expression profiling to uncover decreased canonical WNT signaling in iAT2 cells with telomere dysfunction and then demonstrate rescue of telomere dysfunction and iAT2 cell growth with the addition of a GSK3 inhibitor, a canonical WNT agonist. The data appear to be of high quality, the approaches and interpretation appropriate, with some noted exceptions below. Given the importance of the problem (dysfunctional telomere-induced pulmonary fibrosis) and the apparent benefit of GSK3 inhibition of iAT2 cell growth and telomere dysfunction, which extends the work published by this group previously on intestinal organoids (might enhanced canonical WNT signaling more broadly affect other tissues with telomere-induced senescence?), this work is significant.

      A few aspects of the studies dampen the ability to draw certain conclusions. For example, the authors use iPSCs that are 5 vs 25 passages after introduction (or not) of the DKC1 A386T mutation for the generation of iAT2 cells. They then show iAT2 DKC1 mutant organoids generated from the later passage iPSCs have an apparent growth defect as early as Day 50 but that those generated from the earlier passage iPSCs do not at Day 70 [with the caveats that the images are of different quality (comparing Fig. 1B and Fig. S3D) and quantitative data (similar to Fig. 1C) are lacking for the iAT2 organoids generated from the early passage iPSCs]. They argue that progressive telomere shortening is the cause of the growth defects. If this is the case, then the iAT2 cells generated from the earlier passages should eventually show growth defects with progressive telomere shortening, which was not shown.

      The telomere lengths of the iAT2 cells at Day 50 and Day 70 are not markedly different, and neither the % p21+ nor the % TIF+ cells is shown for Day 50. Therefore, the conclusion that it is the accumulation of short uncapped telomeres in the DKC1 mutant iAT2 cells that alters gene expression and induces senescence at Day 70 ignores the extent of these changes at Day 50.

      The statement that CHIR99021 (when present in the medium from Day 49-70) rescued the growth defect seems generous; the effect is partial and the assay is for organoid formation efficiency only. Moreover, it is most likely prohibiting the further accumulation of senescent cells rather than rescuing cells that were not previously growing.

      It is striking that prolonged CHIR99021 treatment (ie, through to Day 70) resulted in increased telomerase activity, and more so in mutant compared to wild type cells. First, how reproducible was this effect? I appreciate that the authors have not explored this for this manuscript, however, TERT expression does not rescue DKC1 mutants but TERC does. Were TERC levels increased? Also, given this robust increase, it is striking that no difference is detected in TeSLA assays given the proportion of very short detected telomeres that would presumably be substrates for telomerase. It is noteworthy that, in the protocol to derive iAT2 cells, CHIR99021 is present in the media prior to Day 28. This raises the question of whether there is rescue of telomerase in the cells exposed to CHIR99021 in the interval of iAT2 specification?

    3. Summary: The investigators' study of human AT2 cells derived from induced pluripotent stem cells (iAT2 cells) in the presence and absence of a known dyskeratosis congenita (DC) pathogenic variant provides an exceptional model for understanding the pathogenesis of pulmonary fibrosis in DC and related telomere biology disorders (TBDs). They provide convincing data demonstrating altered WNT signaling in iAT2 cells with short, dysfunctional telomeres and improved growth of iAT2 cells by GSK3 inhibition but fall short of convincingly showing the latter is due to restored telomere end protection. The work should be of interest to those in the fields of telomere biology and the TBDs, lung physiology, WNT signaling and stem cell biology.

    1. Reviewer #3 (Public Review):

      The authors sought to show how the segments of influenza viruses co-evolve in different lineages. They use phylogenetic analysis of a subset of the complete genomes of H3N2 or the two H1N1 lineages (pre and post 2009), and use a method - Robinson-Foulds distance analysis - to determine the relationships between the evolutionary patterns of each segment, and find some that are non-random.

      1) The phylogenetic analysis used leaves out sequences that do not resolve well in the phylogenetic analysis, with the goal of achieving higher bootstrap values. It is difficult to understand how that gives the most accurate picture of the associations: those sequences represent real evolutionary intermediates, and their inclusion should not alter the relationships between the more distantly related sequences. It seems that this creates an incomplete picture that artificially emphasizes differences among the clades for each segment analyzed?

      2) It is not clear what the significance is of finding sequences that share branching patterns in the phylogeny, and how that informs our understanding of the likelihood of genetic segments having some functional connection. What mechanism is being suggested: is this a proxy for the gene segments having been present in the same viruses, thereby revealing the favored gene segment combinations? Is there some association suggested between the RNA sequences of the different segments? The frequently invoked HA:NA associations may not be a directly relevant model, as those are thought to relate to the balance of sialic acid binding and cleavage associated with mutations focused around the receptor binding site and active site, length of NA stalk, and the HA stalk. Does that show up in the overall phylogeny of the HA and NA segments? Is there co-evolution of the polymerase gene segments, or has that been revealed in previous studies, as is suggested?

      The mechanisms underlying the genomic segment associations described here are not clear. By definition they would be related to the evolution of the entire RNA segment sequence, since that is being analyzed. (1) Is this because of a shared function (seems unlikely but perhaps pointing to a new activity), or (2) because of some RNA sequence-associated function (inter-segment hybridization, common association of RNA with some cellular or viral protein)? (3) Is it related to specific functions in RNA packaging? Please tell us whether the current RNA packaging models inform about a possible process. Is there a known packaging assembly process based on RNA sequences, where the association leads to co-transport and packaging? In that case, the co-evolution should be more strongly seen in the region involved in that function and not elsewhere. The apparent increased association of the subset of genes examined for the single virus appears mainly in the cytoplasm close to the nucleus, suggesting function (2) and/or (3)?

      It is difficult to figure out how the data found correlates with the known data on reassortment efficiency or mechanisms of systems for RNA segment selection for packaging or transport - if that is not obvious, maybe you can suggest processes that might be involved.

    2. Reviewer #2 (Public Review):

      The influenza A genome is made up of eight viral RNAs. Despite being segmented, many of these RNAs are known to evolve in parallel, presumably due to similar selection pressures, and influence each other's evolution. The viral protein-protein interactions have been found to be the mechanism driving the genomic evolution. Employing a range of phylogenetic and molecular methods, Jones et al. investigated the evolution of the seasonal Influenza A virus genomic segments. They found that the evolutionary relationships between different RNAs varied between two subtypes, namely H1N1 and H3N2. The evolutionary relationships in the case of H1N1 were also temporally more diverse than in H3N2. They also reported molecular evidence that indicated the presence of RNA-RNA interactions driving the genomic coevolution, in addition to the protein interactions. These results not only provide additional support for the presence of parallel evolution and genetic interactions in the Influenza A genome but also advance current knowledge in the field by providing novel evidence in support of RNA-RNA interactions as a driver of the genomic evolution. This work is an excellent example of hypothesis-driven scientific investigation.

      The communication of the science could be improved, particularly for viral evolutionary biologists who study emergent evolutionary patterns but do not specialise in the underlying molecular mechanisms. The improvement can be easily achieved by explaining jargon (e.g., deconvolution) and methodological logic that is not immediately clear to a non-specialist.

      The introduction section could be better structured. The crux of this study is the parallel molecular evolution in influenza genome segments and interactions (epistasis). The authors spent the majority of the introduction section leading to those two topics and then treated them summarily. This structure, in my opinion, is diluting the story. Instead, introducing the two topics in detail at the beginning (right after introducing the system) then discussing their links to reassortments, viral emergence etc. could be a more informative, easily understandable and focused structure. The authors also failed to clearly state all the hypotheses and predictions (e.g., regarding intracellular colocalisation) near the end of the introduction.

      The authors used the Robinson-Foulds (RF) metric to quantify topological distance between phylogenetic trees, a key variable of the study. But they did not justify using the metric despite its well-known drawbacks, including lack of biological rationale and lack of robustness, particularly when more robust measures, such as generalised RF, are available.
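
      For readers unfamiliar with the metric, a minimal Python sketch of the Robinson-Foulds idea follows. It is an illustration only, not the authors' pipeline: it uses the rooted, clade-based variant (the unrooted version counts bipartitions instead), and the nested-tuple tree encoding, the leaf names A to F, and the helper names `clades` and `rf_distance` are all assumptions made for the example. The distance is the number of clades found in one tree but not in the other.

```python
def clades(tree):
    """Return the non-trivial clades (frozensets of leaf names) of a rooted tree.

    A tree is a nested tuple; a leaf is a string (hypothetical encoding).
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    # The root clade holds every leaf, so it is shared by all trees on the
    # same leaf set and carries no topological information.
    found.discard(all_leaves)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: size of the symmetric difference of clades."""
    return len(clades(tree_a) ^ clades(tree_b))


# Three toy trees on the same six leaves.
reference = ((((("A", "B"), "C"), "D"), "E"), "F")
swap_b_c = ((((("A", "C"), "B"), "D"), "E"), "F")   # B and C exchanged
move_a = ((((("B", "C"), "D"), "E"), "F"), "A")     # only leaf A relocated

print(rf_distance(reference, reference))
print(rf_distance(reference, swap_b_c))
print(rf_distance(reference, move_a))
```

      In this toy case, relocating the single leaf A from one end of the tree to the other removes every shared clade, so the distance reaches its maximum for six leaves even though the relationships among B to F are unchanged. This is the kind of sensitivity that the robustness concern refers to.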

      Figure 1 of the paper is extremely helpful to understand the large number of methods and links between them. But it could be more useful if the authors could clearly state the goal of each step and also included the molecular methods in it. That would have connected all the hypotheses in the introduction to all the results neatly. I found a good example of such a schematic in a paper that the authors have cited (Fig. 1 of Escalera-Zamudio et al. 2020, Nature Communications). Also, this methodological scheme needs to be cited in the methods section.

      Finally, I found the methods section to be difficult to navigate, not because it lacked any detail. The authors have been excellent in providing a considerable amount of methodological details. The difficulty arose due to the lack of a chronological structure. Ideally, the methods should be grouped under research aims (for example, Data mining and subsampling, analysis of phylogenetic concordance between genomic segments, identifying RNA-RNA interactions etc.), which will clearly link methods to specific results on the one hand and to the hypotheses on the other. This structure would make the article more accessible, for a general audience in particular. The results section appeared to achieve this goal and thus often repeats or explains methodological detail, which ideally should have been restricted to the methods section.

    3. Reviewer #1 (Public Review):

      In this paper, the authors did a fine job of combining phylogenetics and molecular methods to demonstrate the parallel evolution across vRNA segments in two seasonal influenza A virus subtypes. They first estimated phylogenetic relationships between vRNA segments using Robinson-Foulds distance and identified the possibility of parallel evolution of RNA-RNA interactions driving the genomic assembly. This is indeed an interesting mechanism in addition to the traditional role for proteins for the same. Subsequently, they used molecular biology to validate such RNA-RNA driven interaction by demonstrating co-localization of vRNA segments in infected cells. They also showed that the parallel evolution between vRNA segments might vary across subtypes and virus lineages isolated from distinct host origins. Overall, I find this to be excellent work with major implications for genome evolution of infectious viruses, including the emergence of new strains with altered genome combinations.

      Comments:

      I am wondering if leaving out sequences (not resolving well) in the phylogenetic analysis interferes with the true picture of the proposed associations. What if they reflect the evolutionary intermediates, with important implications for the pathogen evolution which is lost in the analyses?

      Lines 50-51: Can you please elaborate? I think this might be useful for the reader to better understand the context. Also, a brief description of functional association between different known fragments might instigate curiosity among the readers from the very beginning. At present, it largely caters to people already familiar with the biology of influenza virus.

      Lines 95-96: Were these strains all swine-origin? More details on these lineages will be useful for the readers.

      Lines 128-132: I think it will be nice to talk about these hypotheses well in advance, maybe in the Introduction, with more functional details of viral segments.

      Lines 134-136: Please rephrase this sentence to make it more direct and explain the why. E.g. "... parallel evolution between PB1 and HA is likely to be weaker than that of PB1 and PA".

      Lines 222-223: Please include a set of hypotheses to explain your results. Please also add a perspective in the discussion on how this might contribute to the pandemic potential of H1N1.

      Lines 287-288: I am wondering how likely is this to be true for H1N1.

    4. Evaluation Summary:

      The manuscript reports phylogenetic and molecular evidence of novel RNA-RNA interactions driving the genomic coevolution of Influenza virus subtypes, in addition to protein interactions. With a few minor changes, this study could reveal how the likelihood of certain genetic combinations might lead to new viral variants emerging with the possibility of new antigenic properties and implications in disease spread.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

    1. Reviewer #3 (Public Review):

      This is a well-executed study with interesting and novel findings. The main strength is the combined use of well-executed flow cytometry studies in human patients with MI and in vitro experiments to suggest a role for immature neutrophils in infarction. The main weakness is the descriptive/associative nature of the data. What is lacking is in vivo experimentation documenting the proposed pro-inflammatory role of immature neutrophils. This limits the conclusions. The following specific concerns are raised:

      Major:

      1. In some cases, conclusions are not supported by robust data. For example, the authors conclude that CD14+HLA-DRneg/lo monocytes play a crucial role in post-infarction inflammation based exclusively on in vitro experiments. Moreover, conclusions regarding the pro-inflammatory role of immature neutrophils are based on in vitro data and associative studies.

      2. Immature neutrophils have a short lifespan. Information on the fate of immature neutrophils in the infarct is lacking. The in vivo mouse model may be ideal to address whether immature neutrophils undergo apoptosis or mature within the infarct environment.

      3. The rationale for selective assessment of specific genes and for the specific neutrophil-lymphocyte co-culture system is unclear. In neutrophils, the basis for selective assessment of some specific genes (MMP9, IL1R1, IL1R2, STAT3 etc), vs. other inflammatory genes known to be expressed at high levels by neutrophils is not explained. Similarly, the rationale for the experiment examining interactions of CD10neg neutrophils with T cells is not clear. Considering the effects of neutrophils on macrophage phenotype and on cardiomyocytes, study of interactions with other cell types may have made more sense.

      4. The concept of CMV seropositivity is suddenly introduced without a clear rationale. The data show infiltration of the infarcted heart with immature neutrophils and CD14+HLA-DRneg monocytes. One would have anticipated more experiments investigating the (proposed) role of these cells in the post-infarction inflammatory response, rather than comparison of CMV+ vs negative patients.

    2. Reviewer #2 (Public Review):

      In this study, Fraccarollo and colleagues describe the existence and higher prevalence of subpopulations of immature monocytes and neutrophils with pro-inflammatory responses in patients with acute myocardial infarction. CD14+HLA-DRneg/low monocytes and CD16+CD66b+CD10neg neutrophils correlate with markers of systemic inflammation and parameters of cardiac damage. In particular in patients positive for cytomegalovirus and elevated levels of CD4+CD28null T cells, the expansion of immature neutrophils associates with increased levels of circulating IFNg. Mechanistically, immature neutrophils regulate T-cell responses by inducing IFN release through IL-12 production in a contact-independent manner. Besides, CD14+HLA-DRneg/low monocytes differentiate into macrophages with a potent pro-inflammatory phenotype characterized by the release of pro-inflammatory cytokines upon IFNg stimulation.

      This very interesting study provides new insights into the diversity and complexity of myeloid populations and responses in the context of cardiac ischemia. It is technically well performed and the results sufficiently support the conclusions of the study.

      Strengths

      The authors provide a detailed analysis of the phenotype and function of two subpopulations of CD14+HLA-DRneg/low monocytes and CD16+CD66b+CD10neg neutrophils in the context of acute myocardial infarction (AMI). Extensive phenotyping of these immune populations at different time-points after the onset of the disease provides strong correlations with multiple parameters of inflammation and severity of the disease. Hence, these subpopulations emerge as biomarkers of heart ischemic diseases with predictive potential. Using in vitro approaches, the authors support these correlations with mechanistic analyses of the inflammatory and immunomodulatory function of these populations. Finally, the authors use mouse models of ischemia-reperfusion injury to mimic the conditions observed in the AMI patients and to support the pro-inflammatory role of immature neutrophils in this disease.

      Weaknesses

      The associations between immature neutrophils, IFNg, and CD4+CD28null T cells found in AMI patients positive for cytomegalovirus are not well supported by the mechanistic findings observed in vitro. Here, the induction of IFNg production by immature neutrophils is restricted to CD4+CD28+ T cells but not CD4+CD28null T cells.

      The experimental data obtained from mouse models of AMI to support their findings in humans would require a more extensive study. Causality between the expansion of these immature populations and the course of the disease is missing. Also, although expected, substantial differences are found between equivalent subpopulations in mice and humans thus limiting the relevance of the mouse data.

    3. Reviewer #1 (Public Review):

      In this paper, the authors tried to investigate complex roles of immune cells during acute myocardial infarction (AMI) by examining immune cells in blood samples from acute coronary syndrome (ACS) patients. They found an increase in the circulating levels of CD14+HLA-DRneg/low monocytes and CD16+CD66b+CD10neg neutrophils in the blood of ACS patients compared to healthy people, all of which were correlated with elevated levels of inflammatory markers in serum. Those findings were then further explored at a mechanistic level by using in vitro and in vivo experiments. Interestingly, the researchers also found that high cytomegalovirus (CMV) antibody titers could affect the immunoregulatory mechanisms in AMI patients. Taken together, the findings of the researchers could potentially contribute to the development of a more effective strategy to prevent cardiac deterioration and cardiovascular adverse events after AMI.

      Strengths:

      This paper contains novel insight regarding the role of neutrophil and monocyte subsets in the pathophysiology of AMI. Although the increased level of CD10neg subsets of neutrophils in AMI patients has recently been reported (Marechal, P., et al. 2020. Neutrophil phenotypes in coronary artery disease. Journal of Clinical Medicine), the current paper aptly complements the previous findings by using in vitro and in vivo mouse models. This study also has robust methods to support its conclusions.

      Weakness:

      To further strengthen the conclusions, additional experiments investigating the immunoregulatory function of the immature neutrophil and HLA-DRneg/low monocyte subsets would be advisable.

    4. Evaluation Summary:

      This paper will be of broad interest to cardiologists and scientists studying acute myocardial infarction (AMI), especially to those focussing on the immune responses during AMI. Using a combination of in vivo and in vitro models, as well as tissue from patients, the authors reveal new insights regarding the immune mechanisms during AMI, highlighting the importance of neutrophils and monocytes during the early days of its process. The findings in this paper add to the understanding of how immune mechanisms may contribute to subsequent adverse events after AMI.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #3 (Public Review):

      The differences in signaling and responses in the three different T cell receptor transgenics are shown by several different means. These include Nur77 and CD5 expression as markers for the strength of signaling, the frequency of calcium fluxes and length of signaling-induced pauses in movement, using 2 photon microscopy of thymic slices (comparing selecting and non-selecting thymus), time course of induction of markers of positive selection signaling, the time course of "arrival" of CD8 single positive cells and CCR7+ cells in the post-natal thymus, and a time course of development of SP thymocytes after injection of EdU. Each of these methods is fairly convincing on its own, but added up, they are very convincing.

      The only points I could take issue with concern how we define self-reactivity. Because it is not feasible to measure the affinities for self peptides on MHC (due to low affinity and the fact that we mostly don't know what they are), the authors have to rely on surrogate markers, the upregulation of CD69 and of Nur77. These are widely accepted in the field, so they are as good a surrogate as is possible at this time.

      Similarly, 3 transgenic strains are taken as examples of high, medium and low self-reactivity. Two of the strains are positively selected on H2Kb, one on Db, one on Ld. Therefore, the experiments cannot be genetically controlled in the same manner. On balance, I accept that there aren't too many other ways to do the experiment, and that all the main points are supported by other types of experiment.

      The most interesting aspect of the work consists of analysis of gene expression by RNASeq from cells from each of the three TCR transgenic mice from early positive selection, late positive selection, and mature CD8 SP. Perhaps unsurprisingly, the more strongly self-reactive cells showed increased expression of genes involved in protein translation, RNA processing, etc. However, genes associated with lower self-reactivity were enriched for lots of different ion channels. These included calcium, potassium, sodium and chloride channels. One of these was Scn4b, part of a voltage gated sodium channel previously shown by Paul Allen's lab to be involved in positive selection. These types of genes were associated with the stage of development before selection, and were retained through selection in the weakly self-reactive thymocytes. Other ion channel genes that typically came on at the end of selection were also upregulated earlier in the lower self-reactivity cells, and may be involved in allowing long-term signaling for these cells to undergo the whole positive selection program.

    2. Reviewer #2 (Public Review):

      In their study, Lutes et al examine the fate of thymocytes expressing T cell receptors (TCR) with distinct strengths of self-reactivity, tracking them from the pre-selection double positive (DP) stage until they become mature single positive (SP) CD8+ T cells. Their data suggest that self-reactivity is an important variable in the time it takes to complete positive selection, and they propose that it thus accounts for differences in timescales among distinct TCR-bearing thymocytes to reach maturity. They make use of three MHC-I restricted T cell receptor transgenics, TG6, F5, and OT1, and follow their thymic development using in vitro and in vivo approaches, combining measures at the individual cell-level (calcium flux and migratory behaviour) with population-level positive selection outcomes in neonates and adults. By RNA-sequencing of the 3 TCR transgenics during thymic development, Lutes et al make the additional observation that cells with low self-reactivity have greater expression of ion channel genes, which also vary through stages of thymic maturation, raising the possibility that ion channels may play a role in TCR signal strength tuning.

      This is a well-written manuscript that describes a set of elegant experiments. However, in some instances there are concerns with how analyses are done (especially in the summaries of individual cell data in Fig 2 and 3), how the data is interpreted, and the conclusions from the RNA-seq with regard to the ion channel gene patterns are overstated given the absence of any functional data on their role in T cell TCR tuning. As such the abstract is currently not an accurate reflection of the study, and the discussion also focuses disproportionately on the data in the final figure, which forms the most speculative part of this paper.

      (1) As the authors themselves point out (discussion), one of the strengths of this study is the tracking of individual cells, their migratory behaviour and calcium flux frequency and duration over time. However, the single-cell experiments presented (Figure 2 and 3) do not make use of the availability of single-cell read-outs, but focus instead on averaging across populations. For instance, Figure 3a,b provides only 2 sets of examples, but there is no summary of the data providing a comparison between the two transgenics across all events imaged. In Figure 3c, the question that is being asked, which is to test for between-transgenic differences is ultimately not the question that is being answered: the comparison that is made is between signaling and non-signaling events within transgenics. However, this latter question is less interesting as it was already shown previously that thymocytes pause in their motion during Ca flux events (as do mature T cells). Moreover, the average speed of tracks is probably not the best measure here in reading out self-reactivity differences between TCR transgenic groups.

      (2) The authors conclude from their data that the self-reactivity of thymocytes correlates with the time to complete positive selection. However, the definition of what this includes is blurry. It could be that an individual cell takes the same amount of time to complete positive selection (ie, the duration from the upregulation of CD69 until transition to the SP stage is the same), but the initial 'search' phase for sufficient signaling events differs (eg. because of lower availability of selecting ligands for TG6 than for OT1), in which case at the population level positive selection would appear to take longer. Given that from Fig 2/3 it appears that both the frequency of events and their duration differ along the self-reactivity spectrum, this needs to be clarified. Moreover, whether the positive selection rate and positive selection efficiency can be considered independently is not explained. It appears that the F5 transgenic in particular has very low positive selection efficiency (substantially lower %CD69+ and %CXCR4-CCR7+ cells than the OT1 and TG6), and how this relates to the duration of positive selection, or whether it is a function of ligand availability, is unclear.

      (3) While the question of time to appearance of SP thymocytes of distinct self-reactivities during neonatal development presented (Figure 5) is interesting, it is difficult to understand the stark contrast in time-scales seen here compared with their in vitro thymic slice (Figure 4) and in vivo EdU-labelling data (Figure 6), where differences in positive selection time were estimated to be ~1-2 days between TCR transgenics of high versus low affinity. This would suggest that there may be other important changes in the development of neonates to adults not being considered, such as the availability of the selecting self-antigens.

      (4) The conclusion that "ion channel activity may be an important component of T cell tuning during both early and late stages of T cell development" is not supported by any data provided. The authors have shown an interesting association between levels of expression of ion channels, their self-affinity and the thymus selection stage. However, some functional data on their expression playing a role in either the strength of TCR signaling or progression through the thymus (for instance using thymic slices and the level of CD69 expression over time), would be needed to make this assertion. Moreover, from how the data is presented it is difficult to follow the conclusion that a 'preselection signature' is retained by the low but not the high self-reactivity thymocytes.

    3. Reviewer #1 (Public Review):

      The work by Lutes et al. addresses how thymocytes undergo positive selection during their differentiation into mature T cells. The authors make use of several in vitro and in vivo model systems to test whether developing thymocytes at the critical preselection CD4+CD8+ stage, expressing T cell receptors (TCRs) with different levels of putative self-reactivity, undergo different or similar differentiation events, in terms of migration, thymic epithelial cell engagement and temporal kinetics, and gene expression changes.

      The authors selected three TCR-transgenes, which have increasing levels of self-reactivity, TG6, F5 and OT1, respectively, to test their hypothesis, that TCR signals during positive selection are not only sensed differently but lead to different outcomes that then define the functional status of mature T cells. The authors' conclusion that thymocytes with low self-reactivity differentiate with distinct kinetics (migration, engagement and temporal) and express a different suite of genes than thymocytes that experience high self-reactivity is well supported by several elegant approaches and convincing findings.

      The authors clearly established that low to high TCR signaling outcomes affect the timing of positive selection, which is beautifully illustrated in Figures 3-6, and extend that work to non-TCR transgenic mice as well. Lastly, their findings from RNA-seq analyses shed light on the different genetic programs experienced by high-reactivity fast differentiating CD8 T cells as compared to low-reactivity slower differentiating cells, which appear to retain the expression of a unique set of ion channels during later stages of their differentiation process.

      However, what the expression of these ion channels means in terms of either supporting the slow progression or perhaps responsible for the slow progression is not directly addressed, and likely beyond the scope. Nevertheless, the authors posit as to the potential role(s) for the differently expressed gene subsets. Overall, the work is crisply executed, and the findings reveal new aspects as to how positive selection can be achieved by thymocytes expressing very different TCR reactivities.

    4. Evaluation Summary:

      This study is of interest to immunologists as it fills a key knowledge gap in understanding factors involved in determining the duration of intrathymic positive selection of T cells. The findings come from a series of both in vitro and in vivo experiments implicating the self-reactivity of thymocytes in the time to completion of positive selection. An RNA-sequencing analysis suggests that gene expression differences from the pre-selection to the single-positive thymocyte stage are self-reactivity dependent, correlating in particular the level of ion channel expression with positive selection completion rates.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their names with the authors.)

    1. Reviewer #3 (Public Review):

      Miskolci et al have investigated if it is possible to measure the natural fluorescence of two important co-enzymes (NADH/NADPH and FAD) in living cells to determine their metabolic status. This tests the hypothesis that changes to the relative ratio of NADH/NADPH to FAD+ reflect a shift between glycolytic and oxidative phosphorylation in living macrophages. To investigate this they have used 2-photon FLIM to measure intensity and fluorescence lifetime of NADH/NADPH and FAD+ in mouse macrophages in vitro and zebrafish macrophages in vivo in a tail injury model. By comparing their measures of NAD(P)H and FAD+ from macrophages responding to different injury or infection cues and comparing this to a marker of inflammation (TNF-alpha) they argue that there is a reduced redox state indicative of glycolytic metabolism in pro-inflammatory macrophages.

      The adoption of label free imaging techniques to measure metabolic processes in cells in vivo is a valuable and important development that, although not novel to this work, will help researchers to probe cell biology in situ. FLIM using time correlated single photon counting (TCSPC) allows an accurate and robust measure of the relative state of a molecule that shows changes in its fluorescent lifetime as a consequence of changing chemical state. Although Stringari et al (doi.org/10.1038/s41598-017-03359-8) were the first to describe the utility of wavelength mixing FLIM for measuring NAD(P)H and FAD+ levels in zebrafish, they did not focus on macrophages which is the focus of this work.

      The results from this work are interesting, as they argue that it is possible to determine cell metabolism in cells within living animals without a need to use a genetically encoded sensor and they argue that pro-inflammatory macrophages in zebrafish appear to have a lower redox state, which may reflect a more glycolytic metabolism. This assumption is not tested but rather inferred based on the measures of fluorescence intensity and lifetime of endogenous NADH/NADPH and FAD coupled with a small metabolic sampling of injured tissue. This lack of corroboration for the supposed difference in metabolism between pro-inflammatory and non-inflammatory macrophages is a weakness of the paper and makes it hard to accept the conclusion that the redox state may reflect different metabolic profiles. A biosensor for NADH/NADPH (iNap) has been demonstrated to be a sensitive tool for measuring NADPH concentration in vivo in zebrafish during the injury response (Tao et al., doi: 10.1038/nmeth.4306), and it would be intriguing to know how similar the response of this biosensor is to the label free measurements described using FLIM. This is additionally relevant as the authors also note that in mouse macrophages cultured in vitro, they observe an opposite redox response which is well supported by the literature and a variety of different methods. Why the zebrafish macrophages should show a different redox state to mouse macrophages is not clear and an alternative explanation is that the measures used do not directly reflect the metabolic profile of the cells. One further caveat to the chosen method of using fluorescence lifetime to measure the redox state of NADH/NADPH is that lifetime of NADH is affected by which proteins it is bound to. This is not accounted for in the method used for calculating the redox ratio used for defining the redox state and could potentially alter the interpretations of relative NADH/NADPH levels in a cell. The authors acknowledge this, but do not consider whether this would affect the conclusions they arrive at from their measures of NAD(P)H intensity and fluorescence lifetime in macrophages.

    2. Reviewer #2 (Public Review):

      • The aim of this paper was to demonstrate whether FLIM-based imaging of optical redox ratio can be used to monitor metabolic states of immune cells in vivo during the course of inflammatory responses.

      • The study is rigorous and well-presented and the findings are interesting and novel. The main strength is in the in vivo data, where the authors used a variety of inflammatory challenges and perturbations and were able to detect previously unreported trends in metabolic states of macrophages.

      • The authors have demonstrated the potential of the technique to be used in vivo. Their initial findings are intriguing and can be followed up by more mechanistic studies.

      • The work is timely, because of growing interest in the role of metabolism in immune cell signalling and functions. Relevant microscopy-based assays in vivo are limited, so this innovation is important and can form the basis of further technology developments.

    3. Reviewer #1 (Public Review):

      The zebrafish has a rich history of enabling innovative microscopy techniques, and is also a well established system to model inflammation and infection by human pathogens. Consistent with this, Miskolci et al use zebrafish to test a novel imaging approach that has great potential to significantly impact the field of immunometabolism. Fluorescence lifetime is a label-free, non-invasive imaging approach to detect metabolic changes in situ, at the level of the single cell. In this report, Miskolci et al use fluorescence lifetime imaging of NAD(P)H and FAD to detect metabolic changes in zebrafish macrophages (with temporal and spatial resolution) in response to inflammatory and infectious cues.

      Miskolci et al (eLife 2019) have previously characterized inflammatory and wound healing responses to distinct caudal fin injuries (tail wound, infection and tail wound, thermal injury). In this report, authors use these different injury models to show that fluorescence lifetime imaging can detect variations in macrophage metabolism. Although many interesting results are presented and future directions are proposed, the study in its current state is descriptive and lacks validation across different modalities. As a result, the reliability of fluorescence lifetime imaging in zebrafish macrophages is not yet convincing. Moreover, any metabolomic changes in macrophages are not functionally linked to zebrafish phenotypes (eg inflammation, bacterial burden, caudal fin regeneration).

    4. Evaluation Summary:

      Immunometabolism is an emerging field, and to understand immune cell metabolism during inflammation and infection is of great interest. In this report, cutting edge (label free) microscopy techniques and innovative zebrafish models are used to characterize the metabolism of macrophages in situ. In the future, fluorescence microscopy approaches pioneered using zebrafish may illuminate strategies to therapeutically manipulate metabolism in human immune cells.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #3 agreed to share their name with the authors.)

    1. Reviewer #3 (Public Review):

      This analysis is enormous in scope. That said, approximately half the glomeruli were either truncated or had very fragmented ALRNs. The authors may wish to reserve use of the term "full" in the title ("....a full olfactory connectome") until a subsequent paper.

      ALRN-ALRN connectivity seems very interesting. It would be helpful to provide more information about this in the text (line 148 or so). The information in Fig. 3D is hard for non-specialists to interpret. Does the connectivity show any patterns? Is it stereotyped? Do the connections make functional sense?

      One intriguing finding is the "shortcuts" between the olfactory and motor systems that could be used for behaviors that are hard-wired or require fast responses. These may be particularly relevant to thermosensory and hygrosensory input, but can the authors say anything about what kind of olfactory information flows through these shortcuts? For example, the ALRNs that respond to wasp odorants have been identified. Please note that most readers do not know what kind of odorants project to individual glomeruli, e.g. "DC4".

      Fig. 8C It's hard to know how confident to be of the neurotransmitter assignments here. It would be helpful to provide in the text a statement about what assumptions these assignments are based on. In the same vein, line 380 refers to "a neurotransmitter prediction pipeline". Some kind of reference should be provided here.

      line 522 "This suggests that thermo/hygrosensation might employ labeled lines whereas olfaction uses population coding to affect motor output." This brings up the question of whether very narrowly tuned ORNs such as the one signaling geosmin show any differences in connectivity from broadly tuned ORNs.

      lines 94-96 Graph traversal model. Some more discussion of this model and its underlying assumptions would be helpful. Are the results influenced by the lack of some of the glomeruli from the dataset?

      Fig. 7D Can the authors provide more discussion of the possible functional significance of the two uPN types?

    2. Reviewer #2 (Public Review):

      Here are three notable examples (among a long list of new discoveries). (1) The authors provided a comprehensive description of the antennal lobe local interneuron (LN) network for the first time, providing "final" counts of the number and types of LNs as well as the preference for the input and output partners of each LN type. (2) They introduced "layer" as a quantitative parameter to describe how many synapses away on average a particular neuron or neuron type is from the sensory world. A few interesting new discoveries from this analysis include that on average, multi-glomerular antennal lobe projection neurons (PNs) are further away from the sensory world than uniglomerular PNs; inhibitory lateral horn neurons are closer to the sensory world than excitatory lateral horn neurons. (3) By leveraging previous analyses they performed on another EM volume (FAFB) and comparing n = 3 (bilateral FAFB, unilateral hemibrain) samples, they analyzed stereotypy and variability of neurons and connections, something that is rarely done in serial EM reconstruction studies but is very important.
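
      To make the "layer" idea in point (2) concrete, here is a minimal sketch that is not the authors' method. It assumes, purely for illustration, that a neuron's layer is the minimum number of synaptic hops from any sensory neuron, and then averages that value per neuron type. The toy wiring diagram, the neuron names, and the helper functions `hop_layers` and `mean_layer_by_type` are all hypothetical.

```python
from collections import deque
from statistics import mean


def hop_layers(edges, sensory):
    """Minimum number of synaptic hops from any sensory neuron (multi-source BFS).

    Assumption: unweighted edges; synapse counts are ignored in this sketch.
    """
    downstream = {}
    for pre, post in edges:
        downstream.setdefault(pre, []).append(post)

    layers = {neuron: 0 for neuron in sensory}
    queue = deque(sensory)
    while queue:
        pre = queue.popleft()
        for post in downstream.get(pre, []):
            if post not in layers:  # first visit is the shortest path in BFS
                layers[post] = layers[pre] + 1
                queue.append(post)
    return layers


def mean_layer_by_type(layers, neuron_type):
    """Average the per-neuron layer within each neuron type."""
    by_type = {}
    for neuron, layer in layers.items():
        by_type.setdefault(neuron_type[neuron], []).append(layer)
    return {kind: mean(values) for kind, values in by_type.items()}


# Hypothetical toy circuit: one receptor neuron, two projection neurons,
# one local interneuron and one lateral horn neuron.
edges = [
    ("ORN1", "uPN1"),
    ("ORN1", "LN1"),
    ("LN1", "mPN1"),
    ("uPN1", "LHN1"),
    ("mPN1", "LHN1"),
]
neuron_type = {"ORN1": "ORN", "uPN1": "uPN", "LN1": "LN", "mPN1": "mPN", "LHN1": "LHN"}

layers = hop_layers(edges, ["ORN1"])
type_layers = mean_layer_by_type(layers, neuron_type)
```

      In this toy circuit the multi-glomerular PN receives its input only through the local interneuron, so it ends up one hop further from the sensory input than the uniglomerular PN. A measure that averages over many probabilistic traversals, or that weights edges by synapse count, would give non-integer layers; the hop count here is only a simplified stand-in.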

      Overall, the text is clearly written, the figures are well illustrated, and the quantitative analysis is expertly performed. I have no doubt that this work will have long-lasting value for anyone who studies the fly olfactory system, and for the connectomics field in general.

    3. Reviewer #1 (Public Review):

      The manuscript presents a very nice and very detailed approach to illustrate the anatomical hierarchies and also some differences of signal transmission in the olfactory vs. thermosensory-/hygrosensory systems.

      The authors first provide a complete description of the Drosophila olfactory system, from first, second and third-order neurons in the lateral horn. Using generally applicable analysis methods, they extract information flow and layered organisation between olfactory input and descending interneurons. Among the results is the interesting finding that downstream of the mushroom body and lateral horn, output neurons converge to presumably regulate behavior. In an additional set of analyses, Schlegel et al. describe and quantify inter- vs. intraindividual stereotypy of neurons and motifs. They actually compare neurons from three hemispheres of two brains and show an astounding degree of similarity across brains. This is somewhat reassuring and helpful to the field of Drosophila connectomics.

      While the many details and data make the manuscript a somewhat strenuous read, and the sheer flood of data could be a bit overwhelming, the data and findings are impressive and important.

      1) The work is very complementary to the data presented by Li et al. on the mushroom body.

      2) The structure and the step-by-step approach to showing increasingly complex circuitry and by defining different layers of the circuitry is very helpful for the reader to get an impression of the complexity of this brain.

      3) Of significant importance and of use for the community are, in addition to the data, the described methods and tools for data analysis.

      4) Using this type of analysis, the authors test hypotheses and prevailing assumptions in the field. For instance, they find that in early layers of the olfactory system neurons tend to connect to the next higher layer, whereas neurons in higher layers interconnect or even connect back to earlier layers. This is a very interesting finding that might have important implications regarding top-down feedback and recurrent loops in olfactory processing.

      5) Analysis of connectivity in the antennal lobe suggests that the system is highly lateralized. This finding also has important implications and helps to explain why flies might be able to discern left from right odor sources.

      6) The manuscript shows many examples of what other scientists/readers of the manuscript could extract from the raw anatomical data. This will be very useful for the community beyond the data that is actually already shown in the manuscript.

      7) The authors also compare their findings to the connectome/motifs identified for the larval olfactory system. There are many similarities as expected.

    4. Evaluation Summary:

      This study is a tour-de-force that makes a major contribution to the field. It provides a wealth of information about connectivity in the Drosophila olfactory system, identifying a variety of novel features of its neural organization. The study provides a careful analysis of the practically important and biologically interesting question of stereotypy among animals, which previous connectomic studies of the fly brain lacked. A variety of interesting hypotheses are generated. Finally, it establishes a paradigm for the analysis of other neural systems.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

    1. Reviewer #2 (Public Review):

      Open source software for data rendering in neuroanatomy is either too specific to be generically useful (for example, designed for only one specific brain atlas, or brain atlases of a single species), or too general, and thus not integrated with atlases or other relevant software. Additionally, despite the growing popularity of the Python programming language in science, 3D rendering tools in Python are still very limited. Claudi et al have sought to narrow both of these gaps with brainrender. Biologists can use their software to display co-registered data on any atlas available through their AtlasAPI, explore the data in 3D, and create publication quality screenshots and animations.

      The authors should be commended for the level of modularity they have achieved in the design of their software. Brainrender depends on atlasAPI (Claudi et al, 2020), which means that compatibility for new atlases can be added in that package and brainrender will support them automatically. Similarly, by supporting standard data storage formats across the board, brainrender lets users import data registered with brainreg (Tyson et al, 2020), but does not depend on brainreg for its functionality.

      Like all software, brainrender still has limitations. For example, it's unclear from the paper exactly what input and output formats are supported, particularly from the GUI. Additionally, at publication, using the software still requires a Python installation, with all the complexity that currently entails. However, thanks to the rich and growing scientific Python ecosystem, including application packaging tools, I am confident that the authors, perhaps in collaboration with some readers, will be able to address these issues as the software matures.

    2. Reviewer #1 (Public Review):

      Claudi et al. present a new tool for visualizing brain maps. In the era of new technologies to clear and analyze brains of model organisms, new tools are becoming increasingly important for researchers to interact with this data. Here, the authors report on a new tool for just this: exploring, visualizing, and rendering this high dimensional (and large) data. This tool will be of great interest to researchers who need to visualize multiple brains within several key model organisms.

      The authors provide a nice overview of the tool, and the reader can quickly see its utility. What I would like to ask the authors to add is more information about computational resources and computing time for rendering; i.e. in the paper, they state "Brainrender uses vedo as the rendering engine (Musy et al., 2019), a state-of-the-art tool that enables fast, high quality rendering with minimal hardware requirements (e.g.: no dedicated GPU is needed)" - but would performance be improved with a GPU, runtimes, etc?

      I would also be happy to see the limitations and directions expanded. For example, napari is a powerful n-dimensional viewer; how does performance compare (i.e. any plans for a napari plug in, or ImageJ plug in, or is this not compatible with this software's vision)? How does brainrender compare (run time, computing wise) to Blender, for example, or another rendering tool standard in fields outside of neuroscience?

      The methods are short (maybe check that all open source code citations are included, as needed), but they have excellent docs elsewhere; it would be nice to have minimal code examples in the methods though, i.e. "it's as easy as pip install brainrender" ... or such.

      Lastly, I congratulate the authors on a clear paper, excellent documentation (https://docs.brainrender.info/), and I believe this is a very nice contribution to the community.

    3. Evaluation Summary:

      This paper by Claudi et al. will be of interest to any scientist working in neuroanatomy and related fields. Dissemination of scientific results is one of the key products of science, and the software presented here will help scientists achieve that task more easily than ever before.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

    1. Author Response

      Summary

      This manuscript examines how N-linked glycosylation regulates the binding of polysaccharide hyaluronan (HA) to cell surface receptor CD44, to conclude that multiple sites exist but are controlled by the nature of the glycosylation. The reviewers appreciated many aspects of the work, but they have raised serious concerns about the experimental and simulation design. The reviewers suggested that the proposed alternative binding site may not be biologically relevant, as the relevant CD44-HA interactions are multivalent and cannot be supported by that site. They also suggested that the findings are not well supported by the NMR experiments, which could have been extended to allow comparisons of the glycosylation patterns hypothesised. Moreover, the MD simulations, despite being considerable in size, were limited in sampling different possibilities without bias from the initial HA placement, and there is not enough data to convince the readers of thorough sampling and reproducibility.

      We understand the concerns raised in the review process. However, these concerns can be readily addressed, as we explain in detail below and briefly introduce here.

      • Our data are compatible with the currently accepted multivalent interaction of hyaluronan with several CD44 receptors. The argument that our data goes against it stems from an unfortunate figure provided in the first version of the manuscript. This figure suggested that a bound hyaluronan would not be able to span the length of the protein in the upright binding mode. That is not true. We now show another, more relevant snapshot where the bound hyaluronan indeed spans the length of HABD. Hence, we show that multivalent interaction is not precluded by the upright binding mode.

      • We also clarify how our extensive simulation data were designed to avoid any bias. We admit that this was not obvious in the phrasing of our previous version.

      • Many of the raised issues stem from the lack of certain critical simulations. We have now added these simulations into the revision.

      Below we summarize the main issues raised by the reviewers, accompanied by our responses on how we have fixed them in the revised version of the manuscript.

      Reviewer #1

      The authors use MD simulations and NMR to study the cell surface adhesion receptor CD44 with the purpose of understanding the binding of carbohydrate polymer, hyaluronan (HA). In particular, this study focuses on the effects of N-glycosylation of the CD44 glycoprotein on potential HA binding. The authors previously proposed two lower affinity HA binding modes as alternatives to the primary mode seen in the crystal structure of the HA binding domain of CD44, driven by different arginine interactions, but overlapping with glycosylation sites that will affect HA binding. This study suggests that, because the canonical site appears blocked by glycans attached to the surface, HA would instead likely bind to an alternate parallel site with lower affinity, thus changing receptor affinity. The authors do not study HA binding to the glycosylated form directly, but undertake simulations of bound glycans to draw their conclusion. They do, however, place HA near the non-glycosylated CD44 in simulations, although it is not clear that MD sampling has been designed to provide unbiased observations of HA binding, or how the simulations help explain the NMR experiments.

      To better highlight the message, we left out a significant portion of our total simulation data from the initial version of the manuscript. We have now added e.g. simulations of HA binding to the glycosylated form into our revised manuscript. Furthermore, we are confident that our design of the simulation systems allows unbiased sampling of the binding surface. That is, the hyaluronan hexamers were initially placed several nanometres away from the protein surface. After this, they were allowed to spontaneously sample the space and find their respective binding sites during the course of the simulations. They were not placed into the binding sites manually. However, there was one system with two HA hexamers, one of which was placed into the canonical binding groove. This was done to test where the freely floating hexamer would bind when the primary binding site is taken. These points are illustrated more clearly in the new version of the manuscript. Finally, all our simulation data is publicly available (see the DOIs provided in the paper).

      The data rely on libraries of MD simulation, which are substantial, with several replicas of a microsecond each. But what have these simulations really proved with reliability? Figure 2a shows that, while glycans stay roughly where they started, they are dynamic and cover much of the canonical HA binding site, which may be the case. From this the authors imply that the crystallographic site is significantly obstructed, the lower-affinity upright mode remains most accessible, and that the level of occlusion of the main site depends on the degree of glycosylation and size of the oligosaccharides. However, a full simulation of HA binding to this glycosylated surface was not attempted. It would have been good to see the glycans actually block unbiased simulation of canonical binding to the crystallographic site on long timescales (not being dislodged), but allow alternative binding to the parallel site, without initial placement there.

      Commenting on both points 1.1 and 1.2, we cropped a large portion of our simulation data from the initial version of the manuscript in order to better highlight the current message. However, we do have extensive simulation data of hyaluronan binding spontaneously to CD44 with different glycosylation patterns. For example, see Figure A below where HA is bound to glycosylated CD44-HABD. These data have been carefully analysed and incorporated into the revised manuscript.

      Figure A. A representative binding pose between HA oligomer (dark red) and glycosylated (light blue, yellow, green, pink and purple) CD44-HABD (pale surface) extracted from our simulations.

      HA was, however, added to the non-glycosylated CD44-HABD surface in simulations, but no clear data is shown to illustrate the extent of sampling, convergence and reproducibility, beyond some statistical analysis of contacts. It seems a total of 30 microseconds of the non-glycosylated protein with 2 or 3 nearby HA placed was run, leading to contacts. But how well did these 30 simulations sample HA movement and relative binding to sites, if at all? Figure 4 suggests that the HA stay where they have been put. As the MD is the dominant source of data for the paper, the extent of sampling and how the outcomes depend on the initial placement of molecules requires proof. Was any sampling of HA movement, such as between canonical and alternative parallel conformations seen in MD?

      It is important to note that, in the non-glycosylated systems, the hyaluronan hexamers were initially placed several nanometres away from the protein surface. After this, they were allowed to spontaneously sample the space and find their respective binding sites during the course of the simulations. That is, they were not manually placed into the binding sites. We have changed the manuscript to better illustrate this key point.

      We have also made the simulation data publicly available (see the DOIs provided in the paper). After inspection of the simulations, we are confident that the reviewers will agree that the results are reliable and do not suffer from convergence problems that could compromise the message we provide.

      Moreover, we have even more simulation replicas ready with slightly different initial conditions that provide the same qualitative picture, see Figure B below (compare with Figure 4c in the original submission where one of the hyaluronan hexamers was initially placed in the crystallographic binding site). In these simulations, the hexamers have enhanced contacts with the crystallographic and upright mode residues despite being initially placed far from these binding sites. These simulations were already part of the manuscript.

      Figure B. Hyaluronate-perturbed residues in the simulations. The colored surface displays the probability of a given residue to be in contact with HA6 in our additional simulations, where three hyaluronan hexamers were placed in solution far from the binding site.
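
      As an illustrative aside, a per-residue contact probability of the kind shown in Figure B can be estimated as the fraction of trajectory frames in which a residue lies within a distance cutoff of any ligand atom. The sketch below is not the analysis code behind the figure. The 0.35 nm cutoff, the use of a single representative point per residue, the neglect of periodic boundary conditions, and the function name `contact_probability` are all assumptions made for this example.

```python
import numpy as np


def contact_probability(residue_xyz, ligand_xyz, cutoff=0.35):
    """Fraction of frames in which each residue is in contact with the ligand.

    residue_xyz: array (frames, residues, 3), one point per residue (assumption).
    ligand_xyz:  array (frames, ligand_atoms, 3).
    cutoff:      contact distance in nm (hypothetical value).
    Periodic boundary conditions are ignored in this sketch.
    """
    # Pairwise displacement vectors: (frames, residues, ligand_atoms, 3)
    diff = residue_xyz[:, :, None, :] - ligand_xyz[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # A residue is in contact in a frame if its closest ligand atom is within the cutoff.
    in_contact = dist.min(axis=2) < cutoff
    return in_contact.mean(axis=0)


# Toy trajectory: 4 frames, 2 residues, 1 ligand atom.
residues = np.zeros((4, 2, 3))
residues[:, 1, 0] = 5.0          # residue 1 sits 5 nm away along x

ligand = np.zeros((4, 1, 3))
ligand[:2, 0, 0] = 0.2           # frames 0-1: ligand 0.2 nm from residue 0
ligand[2:, 0, 0] = 3.0           # frames 2-3: ligand far from both residues

prob = contact_probability(residues, ligand)
```

      In the toy trajectory the ligand sits next to residue 0 for half of the frames and never approaches residue 1, so the two residues receive contact probabilities of 0.5 and 0. With real trajectories the same calculation would be run over all heavy atoms of each residue and averaged over replicas.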

      The NMR is suggested to show that a short HA hexamer can bind to non-glycosylated CD44-HABD simultaneously in several modes at distinct binding sites, and that MD "correlates" with this. But is this MD biased by initial choices of where and how many HAs are placed, given HA movement is likely not well sampled?

      The hyaluronan hexamers were initially placed several nanometers away from the binding sites. They were not placed into these binding sites manually. During the simulations the hexamers displayed several binding and unbinding events as they were spontaneously sampling the space and finding their respective binding sites during the course of the simulations.

      While we saw multiple binding events to the proposed binding sites, the short size of the hyaluronan fragments was likely not enough for stable binding, as the fragments often dissociated within a few hundred nanoseconds. These points are now more clearly presented in the revised manuscript.

      No MD seems to have been used to examine the blocking or lack thereof by antibody MEM-85 in glycosylated or non-glycosylated CD44.

      This is not feasible using MD simulations, since the structure of the antibody is not available. Fortunately, there is no need for it, as we have direct and reliable experimental evidence using NMR as provided in the manuscript and in our previous work (Skerlova et al. 2015; doi: 10.1016/j.jsb.2015.06.005). We therefore know where the antibody binds in CD44.

      Reviewer #2

      This manuscript is focused on understanding how N-linked glycosylation regulates the binding of the (very large) polysaccharide hyaluronan (HA) to its major cell surface receptor CD44, a question relevant, for example to the role of CD44 in mediating leukocyte migration in inflammation. The paper concludes that multiple binding sites for HA exist and that their occupancy is determined by the nature of the glycosylation, a suggestion first made by Teriete et al. (2004). The work is based on atomistic simulations with different glycan compositions and NMR spectroscopy on a non-glycosylated CD44 HA-binding domain (HABD) expressed in E. coli. While the question being researched is interesting and of biological relevance, there are flaws in the work.

      The relevance also stems from the increasing applicability of HA in many biomedical devices and treatment strategies, such as tissue scaffolds and HA-coated nanoparticles for targeted drug delivery. However, we respectfully disagree with the proposed flaws. We address these suggested issues point-by-point in sections 2.2–2.5.

      The paper describes how the well-established HA-binding site on CD44 (determined by a co-crystal structure; Banerji et al., 2007) is blocked by N-linked glycosylation (principally at N25 with a contribution from glycans at N100 and N110) and how certain glycans favour binding at a completely distinct binding site that lies perpendicular to the canonical 'crystallographic' binding site. This alternative 'upright' binding site, which has been proposed previously by the authors (Vuorio et al., 2017), needs further supporting experimental data.

      Indeed, a characterization of the upright mode can be found in (Vuorio et al., 2017. PloS CB. 13:7). This characterization is based on microseconds of unbiased MD simulation data as well as extensive free energy calculations. We for example analysed the most important interactions, orientations of the sugar rings, and binding affinities. These data indicate that while the upright binding mode is weaker than the canonical binding mode (Banerji et al., 2007), it has good shape complementarity with the protein, with e.g. most of the sugar rings lying flat on the surface of the protein, indicating that it might have biological relevance.

      The supporting experimental data is presented in the current publication. It has been improved and clarified for the revised version of the manuscript.

      Firstly, unlike the 'crystallographic' binding site that forms an open-ended shallow groove on the surface of the protein allowing polymeric HA to bind (and multivalent interactions to take place), the 'upright' binding site is closed at one end and can thus only accommodate the reducing end of the polysaccharide (as apparent from Appendix 1 Figure 1). Its configuration means that it would be impossible for this mode of binding to allow multivalent interactions with polymeric HA. This is a major problem since biologically relevant CD44-HA interactions are multivalent where a single HA polymer interacts with a large number of CD44 molecules (e.g. see Wolny et al., 2010 J. Biol. Chem. 285, 30170-30180). So even if this binding site existed, an interaction between a single CD44 molecule on the cell surface with the reducing terminus of an HA polymer would be exceptionally weak.

      We have data to show that our proposed secondary binding mode does not preclude multivalent CD44-hyaluronan interactions. This multivalent interaction, where a long hyaluronan binds simultaneously to several CD44 moieties, is important, and our secondary mode is compatible with it, see the new Figure C below. We acknowledge that our Figure 1 in the Appendix 1 was not sufficiently clear on this matter. That figure illustrated a structure of one possible CD44-hyaluronan complex obtained from just one of our simulations. However, we have a number of related CD44-hyaluronan complexes from other simulations where the bound ligand spans the full length of the protein, showing that the binding site can accommodate more than just the reducing end of the polysaccharide, and this is highlighted in the attached Figure C. Therefore, multivalent binding is not precluded by the upright binding mode. Unfortunately, the figure depicted in the SI of the original manuscript was misleading. To avoid this issue, it has been replaced in the revised manuscript.

      Figure C. The secondary CD44-hyaluronan binding mode.

      Secondly the NMR experiments performed in this study, purporting to provide evidence for multiple modes of binding, are problematic. Why weren't differentially glycosylated proteins used, i.e. where individual sites were mutated (e.g. +/- N25); this would have allowed comparisons of the glycosylation patterns hypothesised (based on the computer simulations) to favour the 'crystallographic' versus 'upright' modes.

      Indeed, NMR experiments with glycosylated material would be ideal, but obtaining the required quantities of isotopically labelled protein with a homogeneous glycosylation pattern is not possible even using the state-of-the-art technology. In addition, the substantially increased molecular weight of the glycosylated protein would be out of the experimental window accessible by NMR spectroscopy. We strongly believe that the message of the paper is already sustained by a combination of our observations based on NMR experiments and MD simulation techniques together with the available literature data as detailed in Appendix A (see below).

      While being aware of the difficulties of dealing with glycosylated CD44 using NMR, we designed a way to bypass this issue by combining multiple data from different experimental and simulation setups. All the data support the claims and conclusions made in our paper, see appendix A of this rebuttal. The existence of a weaker binding mode promoted upon glycosylation due to the primary binding site being covered is compatible with all available experimental and simulation data.

      Furthermore, previous NMR studies have shown that the binding of HA to CD44 causes a considerable number of chemical shift changes due to the induction of a large conformational change in the protein (Teriete et al., 2004; Banerji et al., 2007), making it very difficult to identify amino acids directly involved in HA binding based on the NMR data. Moreover, this conformational change has been fully characterised for mouse CD44 with structures available in the absence and presence of HA (Banerji et al., 2007); this information should have been used to inform the interpretation of the shift mapping. In fact, the way in which the shift mapping data are interpreted is simplistic and doesn't fully take account of the reasons that NMR spectra can exhibit different exchange regimes.

      We interpreted the NMR data very carefully. We are aware of the extent of conformational changes induced by HA binding in CD44-HABD; in fact, we identified them as a molecular mechanism underlying the mode of action for the MEM-85 antibody (Skerlova et al. 2015; doi: 10.1016/j.jsb.2015.06.005). Therefore, we focused on the differential changes in the NMR signal positions of surface exposed residues upon titration with HA and MEM-85. We also observed different exchange regimes that allowed us to discriminate between different HA binding sites. We emphasized these points in the revised manuscript.

      Reviewer #3

      Vuorio and colleagues combine atomic resolution molecular dynamics simulations and NMR experiments to probe how glycosylation can bias binding of hyaluronan to one of several binding sites/modes on the CD44 hyaluronan binding domain. The results are of interest specifically to the field of CD44 biophysics and more generally to the broad field of glycosylation-dependent protein-ligand binding. The manuscript is clearly written, and the combination of data from computational and experimental methodologies is convincing. I especially commend the authors on the thorough molecular dynamics work, wherein they ran multiple simulations at microsecond timescale and tried different force fields to minimize the likelihood of their findings being an artifact of a particular force field.

      The use of multiple force fields was indeed meant to alleviate potential force field specific issues. Likewise, the use of multiple simulation repeats with different starting positions and randomized atom velocities was meant to provide comprehensive statistics, minimizing the chances of over-interpreting any isolated phenomena.

      Appendix A: Summary of the logic of the research procedure together with the experimental, simulation and literature results supporting each step.

      1) Non-glycosylated CD44 binds HA (NMR experiments)

      2) Non-glycosylated CD44 also binds HA in the presence of MEM-85 (NMR experiments)

      3) Glycosylated CD44s that bind HA do not bind HA in the presence of MEM-85 (from literature [J. Bajorath, B. Greenfield, S. B. Munro, A. J. Day, A. Aruffo, Journal of Biological Chemistry 273, 338 (1998).]).

      4) We show the MEM-85 binding site in non-glycosylated CD44 to be far from the canonical crystallographic binding region (NMR experiments). This MEM-85 binding site region is mostly inaccessible to typical N-glycans found in CD44 (MD simulation). Therefore, we expect that MEM-85 binds glycosylated CD44 in the same region. (Our working hypothesis)

      5) Taken together, the above points indicate that MEM-85 covers at least partially the relevant HA binding mode in glycosylated CD44, which has zero overlap with the crystallographic mode. This supports the idea of an alternative binding mode to the crystallographic mode which must be readily available for glycosylated CD44. (Our finding)

      6) Furthermore, heavily glycosylated CD44 variants cover a significant fraction of the crystallographic mode binding region (MD simulation), potentially making it unavailable for HA binding. This explains why non-glycosylated CD44 binds HA in the presence of MEM-85 (i.e., crystallographic mode is free), while glycosylated CD44 does not (i.e., crystallographic mode is covered with N-glycans). The upright region, on the other hand, experiences only minor coverage by the N-glycans in the glycosylated CD44 and is thus free to bind the ligand (MD simulations).

      7) Non-glycosylated CD44 binds HA simultaneously with the crystallographic mode and the upright mode when exposed to high concentrations of small hyaluronan hexamers (NMR titration and MD simulations).

      8) Pinpointing the position of the residues that experience the largest chemical shift during the titration experiments using non-glycosylated CD44 clearly shows the fingerprint of the canonical crystallographic mode but also a region compatible with our proposed upright mode (NMR titration experiments). These results are compatible with our simulations of several hyaluronan hexamers (MD simulation).

      9) The upright binding mode is accessible to hyaluronan binding in the glycosylated CD44 (MD simulations shown in this letter that could be included in the paper if deemed necessary).

      Glycosylation, and glycoscience in general, is one of the most challenging topics to understand in life sciences. We believe that our paper makes a very significant contribution to this area of research in the context of a central research problem and is exceptionally able to provide an atomic-level description of the HA-CD44 interaction under unambiguously known conditions.

    1. Reviewer #3 (Public Review):

      This manuscript is well written and presents several new mouse models including animals with brown fat specific deletion of multiple genes of interest to assess whether they may function in a common pathway. The authors draw on their existing expertise in mitochondrial biology to provide new information regarding the role of OPA1 and mitochondrial dynamics in brown fat function. Weaknesses of this study include a relative lack of mechanistic insights and incomplete characterization of whole-body energy expenditure data from the multiple models reported here.

    2. Reviewer #2 (Public Review):

      Understanding the mechanisms by which thermogenic brown adipocytes become activated in response to adrenergic signaling remains a high priority for the field of adipose tissue biology. The authors of this study investigate the importance of mitochondrial fusion protein optic atrophy 1 (OPA1) in brown adipocytes, which is highly regulated at the transcriptional and post-transcriptional level upon cold exposure and obesogenic conditions. Using a genetic loss-of-function mouse model, the authors demonstrate that BAT-specific knockout of OPA1 results in brown adipocyte mitochondrial dysfunction; however, knockout animals have improved thermoregulation due to the activation of compensatory mechanisms. Part of this compensatory mechanism involves the activation of an ATF4-mediated stress response leading to the induction of FGF21 from brown adipose tissue. These data highlight the presence of homeostatic mechanisms that can ensure thermoregulation in mammals.

      Overall, the manuscript is very well-written and the data is nicely presented. The use of multiple genetic mouse models is elegant, rigorous, and yields convincing results. The authors acknowledge the strengths and limitations of the work in a nicely written discussion. This should be a valuable addition to the field, including those interested in mitochondrial biology, brown adipose tissue biology, and FGF21 function. There are minor issues that require attention and one important issue regarding the variability in FGF21 levels observed in the knockout model.

    3. Evaluation Summary:

      The new work utilizes several elegant genetic mouse models to evaluate the importance of the mitochondrial fusion protein OPA1 in thermogenic brown adipocytes. This well-written and rigorous study sheds insight into the importance of OPA1 in brown adipocytes and also uncovers an unexpected compensatory mechanism that ensures thermoregulation in mice.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #2 (Public Review):

      This work evaluates the role for GAGA factor (GAF) as a pioneer factor during the zygotic genome activation (ZGA) of early Drosophila embryogenesis. GAF has previously been shown to regulate chromatin accessibility and higher order genome organization in a variety of biological contexts. However, it has historically been difficult to evaluate the role of GAF specifically during early embryogenesis through standard genetic approaches. This paper solves this problem by employing a combination of gene editing and targeted degradation strategies to specifically knock down GAF in early embryos. Through a combination of imaging and genomic approaches, this paper demonstrates a population of genomic loci that depend on GAF to gain chromatin accessibility and to be expressed during the maternal to zygotic transition. This work identifies an additional pioneer factor activity operating at ZGA and furthermore evaluates the potential interdependency of GAF and another pioneer, Zelda.

    2. Evaluation Summary:

      This paper will be of interest to a broad audience of developmental biologists and molecular biologists in the field of transcriptional control and epigenetics. It evaluates the pioneer factor activity associated with GAGA-Factor during the process of zygotic genome activation. The experiments are rigorously performed and the data analysis supports the conclusions.

      This manuscript is in revision at eLife.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #2:

      This work combines an interesting experimental approach to measure temporal expansion/compression with EEG recordings. The authors find consistent evidence that a visual reference is judged as shorter/longer dependent on a previous adaptation. They report several EEG analyses suggesting the early visual activity is correlated with such temporal distortions.

      Strengths:

      The paper uses an interesting design to try to isolate temporal compression/expansion. The behavioral results are consistent and they show several different EEG analyses. The main result, of beta power being correlated with temporal processing, is consistent with previous reports.

      Weaknesses:

      1) The paper would strongly benefit from more details on some of the methodologies and results. At several points, the authors show measures that are subtracted or normalized based on other conditions. Although these normalizations can sometimes help to illustrate effects, they also make it harder to understand the data in a more general sense. For example, in their behavioral results, the authors present an Adaptation Effect to quantify temporal compression/expansion. It would also help if the authors presented the raw estimates of Points of Subjective Equality across all conditions (including the unadapted condition) so that the reader can have a better understanding of the effects. It would be even better if the average proportion of responses for each duration was shown so that readers can see differences in PSE, JND, and guess/lapse rates.

      2) Further details about the EEG analysis would also help the readers. For example, it is not totally clear how the FFT analysis was performed. It would be important to add information about whether data was analyzed using moving windows, the size of the windows, whether there was an overlap between windows, whether there was a baseline correction and what was the baseline.

      3) Several of the conclusions of the authors are based on linear mixed effect (LME) regressions in which the PSE or the behavioral effect is the dependent variable and an EEG measure is used as one of the fixed effects. However, in some of the analyses, it is not really clear how this was performed (for example, whether this was done at the single-trial level or on the averaged data). Critically, it would help the reader if more output (both tables and graphs) were shown for these analyses so that what is being analyzed and concluded is made clearer.
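
      The parameters requested in point 2 (window size, overlap between windows, baseline) can be made concrete with a minimal sketch of a sliding-window FFT. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the Hann taper, and the decibel baseline normalization are all hypothetical choices.

```python
import numpy as np


def sliding_fft_power(signal, fs, win_len, overlap):
    """Power spectra of overlapping, Hann-tapered windows (illustrative only).

    signal  : 1-D array holding one channel of data
    fs      : sampling rate in Hz
    win_len : window length in samples
    overlap : fraction of each window shared with the next, in [0, 1)
    """
    step = max(1, int(round(win_len * (1.0 - overlap))))
    starts = np.arange(0, len(signal) - win_len + 1, step)
    taper = np.hanning(win_len)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    power = np.empty((len(starts), len(freqs)))
    for i, start in enumerate(starts):
        segment = signal[start:start + win_len] * taper
        power[i] = np.abs(np.fft.rfft(segment)) ** 2
    # Each spectrum is assigned to the centre of its window.
    times = (starts + win_len / 2) / fs
    return times, freqs, power


def baseline_correct_db(power, times, t_start, t_end):
    """Express power in dB relative to the mean spectrum of a baseline interval."""
    in_baseline = (times >= t_start) & (times < t_end)
    baseline = power[in_baseline].mean(axis=0)
    return 10.0 * np.log10(power / baseline)
```

      Reporting `win_len`, `overlap`, the taper, and the baseline interval in this explicit form would let readers judge the time and frequency resolution of the resulting plots.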

    2. Reviewer #1:

      The question is interesting, and the paradigm in principle well suited to answer it. Unfortunately, a number of shortcomings hinder a clear interpretation of the results. I think that the paper, notably the EEG analyses, needs to be revised substantially, which might affect the results. Therefore I will just list the main points which need to be addressed and not go into more detail.

      The behavioral effect of adaptation on duration perception appears very unspecific, namely it occurs in all but the spatially neutral condition. The authors conclude that the inversely directed motion did not have an effect because it did not survive the Bonferroni correction, yet they report a p-value of 0.02 and Cohen's d of 0.58, suggesting a medium effect. In order to prove the absence of an effect, I suggest reporting Bayes factors, and only interpreting the effect as absent if the Bayes factor is conclusive towards the H0.

      In my view, if there was an effect of inversely directed motion, this poses a question as to the successful demonstration of specific adaptation effects in the behavior, which needs to be taken into account in the interpretation.
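
      As a rough illustration of the Bayes factor suggestion above, the sketch below uses the BIC approximation for a one-sample (or paired) t-test. This is a simplification swapped in for brevity and is not necessarily the Bayes factor the reviewer has in mind; default Bayes factors with a Cauchy prior on effect size will give different values. The function name is hypothetical, and the t statistic and sample size must come from the manuscript, since neither is given in this review.

```python
import math


def bf01_bic_approx(t_stat, n):
    """Approximate Bayes factor in favour of H0 for a one-sample or paired t-test.

    Uses the BIC (unit-information prior) approximation:
        BF01 ~ sqrt(n) * (1 + t^2 / (n - 1)) ** (-n / 2)
    Values above 1 favour H0; values below 1 favour H1.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    df = n - 1
    return math.sqrt(n) * (1.0 + t_stat ** 2 / df) ** (-n / 2.0)
```

      Under this approximation, a non-significant test supports H0 only when the resulting value is clearly above 1; a value near 1 means the data are inconclusive rather than evidence of absence.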

      The EEG analyses and displayed results show some important shortcomings, which hinder a clear interpretation at this stage. Just to list a few main points:

      -As apparent from Figures 3-5, the time-frequency plots show a lot of stripes and pixels, when one would expect rather smooth transitions over frequency and time. This suggests that the parameters for the time-frequency transformation might not be appropriate.

      -The analyses compare time windows that differ in many respects, for instance the 15 s long adaptation phase versus short-lived stimulus-evoked activity at reference onset. Interpreting these differences as specific to the duration distortion effects does not seem justified, due to the diverging inputs presented during those time windows.

      -Important aspects of the paradigm are not taken into account in the EEG analyses, for instance the fact that participants perform a saccade between the offset of adaptation and the onset of the reference. The saccade-related signatures in the EEG have to be accounted for or controlled for, especially for effects occurring after adaptation offset.

      -Some of the effects (for instance the decoding analysis, or the linear mixed models testing for additive but not interactive effects) show differences in EEG activity related to visual processing of the stimuli, but might not specifically relate to the duration distortions. In my view, more trivial differences in processing the visual inputs should be accounted for (see also the point above), and clearly separated from specific timing effects.

    3. Summary: This work uses electro-encephalographic (EEG) recordings combined with an interesting experimental approach to measure temporal expansion/compression. Specifically, the question addressed here is whether adaptation to visual motion affects perceived duration, and if so, how spatially confined these effects are with respect to the processing of the stimulus in early visual areas. The authors find consistent evidence that a visual reference is judged as shorter/longer depending on a previous adaptation. They report several EEG analyses suggesting the early visual activity is correlated with such temporal distortions. This manuscript is of potential interest to cognitive neuroscientists specifically interested in temporal aspects of visual processing and time perception. Although the paradigm is well suited to assess the authors' question, the behavioral data as well as the electrophysiological analyses show important shortcomings currently hindering the interpretation of the results, and necessitating substantial revisions to the current work. Additionally, further methodological details are required to strengthen the manuscript.

    1. Author Response:

      We thank the reviewers for their efforts reviewing the manuscript and greatly appreciate the comments and recommendations. We are pleased that the reviewers were in agreement with the main conclusions of the manuscript based on the experimental evidence presented. We are also grateful for the complimentary comments and are encouraged that the reviewers recognized the potential impact of the findings.

      We are thankful for the opportunity to submit a revised manuscript and appreciate the recommendation to include currently missing controls. We agree with the reviewers; our mouse colonies were affected due to long pandemic-related shutdowns, which prevented measurements in all cohorts in a timely fashion. These experiments are now underway, for planned inclusion in the revised manuscript.

    2. Reviewer #3 (Public Review):

      The main findings are that loss of the Piezo1 protein in keratinocytes accelerates migration and wound healing, while genetic and pharmacological manipulations known to increase currents carried by Piezo1 slow migration and wound healing. The channels are shown to accumulate and cluster at the trailing edge of single migrating cells and at the wound margin during in vitro studies of wound healing. These findings demonstrate that Piezo1 mechanosensitive channels are not required for keratinocyte migration or wound healing, but rather function as essential regulators of the speed of both migration and wound healing. Further, the findings suggest that increased flux through Piezo1 channels slows migration and wound healing. These channels are found to cluster in migrating cells and at wound margins. The conclusions are well-supported by the presented data and the authors' composition does an outstanding job of recognizing the limits of what has been learned and what remains uncertain.

    3. Reviewer #2 (Public Review):

      The manuscript "Spatiotemporal dynamics of PIEZO1 localization controls keratinocyte migration during wound healing" by Holt and colleagues demonstrates that loss of function of PIEZO1 speeds up keratinocyte migration and wound closure, whereas enhancing PIEZO1 function, with a PIEZO1 gain-of-function mutant or by chemical means, slows down both processes. The topic of this manuscript is timely and relevant. The experimental design followed by the authors is straightforward and elegant and the vast majority of the conclusions are fully supported by their results. Overall, this manuscript provides solid evidence that normal (wild type) function of PIEZO1 slows down skin wound healing in vitro and in vivo.

    4. Reviewer #1 (Public Review):

      In this manuscript, Holt and colleagues investigate how the mechanoreceptor PIEZO1 mediates keratinocyte cell migration and re-epithelialization during wound healing. The authors utilized epidermal-specific Piezo1 knockout mice (Piezo1cKO) and epidermal-specific Piezo1 gain of function mice (Piezo1GoF) to investigate the contribution of keratinocyte Piezo1 to wound healing in vivo. Piezo1cKO mice exhibited faster wound closure, whereas Piezo1GoF mice exhibited slower wound closure compared to controls, suggesting that the presence of epidermal Piezo1 affects the speed of wound healing. To determine if these effects observed in vivo were due to changes in keratinocyte re-epithelialization, the authors utilized an in vitro model of wound healing by inducing scratches to mimic "wounds" in keratinocyte monolayers. Similar to the in vivo findings, Piezo1cKO keratinocytes exhibited enhanced wound closure compared to controls. In a separate line of experiments, the authors found that enrichment of Piezo1 at the wound edge induces localized cellular retraction that slows keratinocyte re-epithelialization and wound closure. Overall, major strengths are that the topic is of significant interest, Piezo channels and their function are of broad topical interest, and the manuscript is well written. Wound healing is a major health concern and understanding the mechanisms underlying how wounds heal could generate improved therapeutics for faster healing. The key weaknesses are that there are missing controls and missing cohorts (Piezo1GoF or Piezo1cKO) in several of the experimental data sets, and there is a concern about the wide variation in controls for some experiments.

    5. Evaluation Summary:

      The manuscript links a critical physiological function of the skin, wound healing, to the ability of skin cells to migrate and the modification of migration by the mechanosensitive ion channel Piezo1. The topic of the manuscript is timely, relevant and would be of interest to a broad audience. The experimental design followed by the authors is straightforward and elegant, and the majority of the conclusions are well supported by the results.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #3 (Public Review):

      Slavetinsky and colleagues investigated the capability of monoclonal antibodies (mAb) against MprF, a critical protein of S. aureus, to act as re-sensitizing factors towards resistant strains and as supporting factors for S. aureus killing by human polymorphonuclear leukocytes.

      They created 8 mAbs against four different loops of MprF and showed that they were able to bind MprF-expressing S. aureus strains. Two of the mAbs led to a significant reduction of S. aureus survival upon exposure to nisin (i.e., a cationic antimicrobial against which MprF normally confers resistance). The authors focused on the mAb against loop 7 and showed that it also reduced survival against two other antimicrobials and, most importantly, restored daptomycin killing of a resistant strain. Moreover, although this mAb did not increase phagocytosis by leukocytes, it decreased the survival of the phagocytized S. aureus cells, most likely by rendering them sensitive towards the cationic antimicrobial peptides.

      In parallel, the authors used this mAb to revise the ambiguous location of loop 7 of MprF. They employed two different experimental settings and concluded that this loop might have some degree of mobility in the membrane, which also explains the ambiguity of its location in previous studies. By showing that the mAb against loop 7 acts by inhibiting the flippase activity of MprF while leaving the synthase activity intact, they speculated that the mobility of loop 7 might play an important role in the LysPG translocation process.

      The data support the conclusion of the manuscript and show how promising monoclonal antibodies are against staphylococcal infections.

    2. Reviewer #2 (Public Review):

      MprF is a lipid flippase involved in determining bacterial tolerance to cationic peptides of the innate immune system and to antibiotics such as daptomycin. Using Staphylococcus aureus as their model organism, the authors assessed the suitability of MprF as a target for anti-virulence treatments. For this purpose, a series of monoclonal antibodies directed against the extracellular loops of MprF were generated. The antibodies were tested for their ability to bind and inhibit the function of MprF, to sensitize S. aureus towards cationic peptides, and to promote phagocyte killing of S. aureus. Moreover, the antibodies were used to investigate the orientation of one specific loop of the MprF protein.

      Strengths:

      The manuscript is well-written and the introduction provides a very good overview of the challenges associated with antibiotic resistance, anti-virulence strategies and the MprF protein. The Figures and the Figure legends are easy to follow. The described approach is innovative, and state of the art methods are used throughout the manuscript.

      Weaknesses:

      There is a discrepancy between the anti-virulence scope as indicated by the title and the introduction, and the actual content of the result section: here, the anti-virulence strategy is only preliminarily addressed, and a lot of effort is instead put into determining the orientation of one specific loop of the MprF protein. This needs to be better aligned, and more compelling data are needed to support that MprF has potential for an anti-virulence strategy. The conclusions of this paper are mostly well supported; however, additional controls are needed to fully support that the observed effects of the antibodies are mediated via specific binding to MprF.

    3. Reviewer #1 (Public Review):

      Slavetinsky et al., describe the development of monoclonal antibodies targeting the S. aureus MprF lipid flippase, which is responsible for membrane incorporation of the phospholipid lysyl-phosphatidylglycerol (LysPG). Incorporation renders the cell more positively charged and has been associated with increased virulence and resistance of MRSA to antibiotics and host antimicrobial peptides. MprF is a bifunctional protein; the N-terminal region translocates lipids (flippase), and the C-terminal region synthesizes LysPG. Overall, this is an interesting approach with significant potential.

      Strengths:

      Several epitopes on MprF (three outer loops) were targeted through the synthesis of peptides, which provided a number of antibodies that inhibit the flippase function. The authors identified one specific antibody (M-C7.1) that was shown to target a loop whose previous location was debatable; thus, these findings indicate the loop can be accessible from the outside of the cell. Antibody binding sensitized MRSA to host peptides and antibiotics (e.g., daptomycin). The antibody was shown to inhibit flippase function and also decreased bacterial survival in phagocytes. Overall, the antibody could be used as an anti-virulence agent, diminishing the severity of S. aureus-associated disease. The emergence of antibiotic resistance and difficult-to-treat S. aureus infections requires orthogonal therapeutic approaches; as such, the findings of this study could have significant impact.

      Weaknesses:

      A major emphasis of the study is that the antibody sensitizes S. aureus to host defenses. This reviewer would like to see dose-responses/titrations of the antibody vs the different CAMPs, using standard susceptibility testing methodology. In addition, during the preliminary ELISAs, have the authors established whether the mprF mutant has lower surface adhesion to maxisorp immuno plates? This would be an important control. When studying M-C7.1 mechanism of action, it is unclear why the data is being normalized to L-1 and why unbound cytochrome C is being quantified. It could be more intuitive to assess bound cytochrome C; can the raw data be included rather than normalized data? A control with delta-mprF alone would also be useful for these experiments. When assessing survival in phagocytes, Figure 5 would benefit from a delta-mprF control to compare M-C7.1 efficacy. This figure also requires statistical analysis. Overall, the conclusions of the study could be further strengthened by additional pre-clinical assessment of the antibody.

    4. Evaluation Summary:

      This study is of interest to readers in the field of Microbiology and the control of microbial infectious diseases. The authors address the challenge of antibiotic resistant bacteria with an innovative anti-virulence approach using monoclonal antibodies against a Staphylococcus aureus lipid flippase involved in tolerance to cationic peptides. The work indicates that this approach could re-sensitize antibiotic resistant S. aureus and diminish the severity of infections.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 and Reviewer #3 agreed to share their names with the authors.)

    1. Reviewer #3 (Public Review):

      In the present study, the authors have shown that Nkx2-1 depleted BRAFV600E driven mouse tumors show higher p-ERK activation. MAPK inhibition in these tumors leads to a cellular shift towards the gastric stem and progenitor lineage. The authors have provided detailed mechanistic insights on how MAPK inhibition influences lineage specifiers and oncogenic signaling pathways to form invasive mucinous adenocarcinoma. All experiments are carefully performed and entail advanced research methodologies such as organoid culture systems, novel genetically engineered mouse models and single cell RNA seq. The manuscript is well written, and the research findings are logically interpreted and presented. Taken together, all major scientific claims are well supported by the data and offer major technical advancements for the development of precision medicine.

    2. Reviewer #2 (Public Review):

      In this very extensive and somewhat lengthy manuscript, Zewdu et al. characterize an oncogenic Braf-driven model of invasive mucinous lung adenocarcinoma. They show an effect of co-incident and sequential Nkx2-1 inactivation on cancer cell state and therapy responses. They show that BP and BPN tumors have distinct responses to RAF/MEK inhibition. Furthermore, they uncover potentially important cross talk between the MAPK and WNT pathways in invasive mucinous adenocarcinoma (IMA). Overall, this is an excellent manuscript that uncovers many interesting new aspects of IMA. The strengths of this manuscript include the sophisticated in vivo cancer models, detailed cellular analyses, and potential importance of these findings to therapy responses. Their claims are well supported by their data.

    3. Reviewer #1 (Public Review):

      This manuscript from Eric Snyder's laboratory details cell lineage states that are controlled by NKX2-1 and oncogenic MAPK signaling in BRAFV600E-driven lung cancers. The work builds on previous works from Snyder's group that showed NKX2-1 suppresses a latent gastric differentiation program in KRASG12D-driven lung cancers. Switching the model from KRAS to BRAF, the Snyder laboratory now demonstrates multiple similarities between the oncogenic drivers and details key differences that have significant impact on our understanding of lung cancer etiology and possibly treatment. The depth of data analysis and breadth of methodology used represent a real tour de force in cancer modeling. The insights highlight the complex interplay between mitogenic signaling and developmentally-related pathways during cancer progression. The insights gleaned from the study have some potential to influence treatment strategies. As such, this study will appeal to a broad audience. The stated conclusions from the work are entirely sound and wholly supported by the data presented.

      The authors demonstrate that: Simultaneous activation of BRAFV600E expression and deletion of NKX2-1 suppresses the efficiency of tumor initiation (tumor number goes down). In contrast, genetic deletion of NKX2-1 after tumors have established does not impact tumor maintenance but instead is compatible with tumor progression. Modeling the effects of MAPK pathway inhibition (BRAFi+MEKi), the authors demonstrate that BRAF/p53 (BP) tumors enter a state of quiescence. However, BP tumors with NKX2-1 deletion (BPN) fail to enter the quiescent state. Mechanistically, this is due to WNT-dependent activation of CyclinD2 that acts with CDK4/6 to suppress RB. Further treatment with CDK4/6 inhibitors can drive cells into quiescence but does not lead to durable tumor growth inhibition as tumors rebound after treatment cessation. Consistent with their previous work in KRAS-driven lung cancers, deletion of NKX2-1 reveals a latent gastric cell differentiation program driven by relocalization of FOX factors toward gastric specific genes. Interestingly, MAPKi in BPN tumors further drives these cells toward a chief-like or tuft-like cell state that is also due to WNT-dependent signaling, and FOXA1/2-dependent effects at specific genes normally restricted to tuft and chief cells.

    4. Evaluation Summary:

      This manuscript greatly expands our understanding of an aggressive subtype of lung cancer. The authors use in vivo cancer models and extensive analysis of the cancer cell states to uncover aspects of differentiation, drug responses and pathway activation. Findings of the study will help in the development of lineage-specific targeted therapies against cancers.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

  2. Feb 2021
    1. Reviewer #2:

      The authors address the vortex formation of bacteria in circular confinements with a particular focus on the difference of swarming vs. swimming (planktonic) motility of individuals. In the field of active matter, this critical distinction has rarely been studied so far and is oftentimes ignored in modeling studies. Chen et al. show that qualitatively different patterns emerge for swarming and swimming bacteria. I do therefore believe that the work could have substantial influence on future studies devoted to bacterial pattern formation.

      I have two main concerns detailed in the following.

      1) A central finding of the present study is that the number of vortices/swirls as a function of the well diameter differs for swarming vs. swimming bacteria. The authors argue and show experimentally (Fig. 2) that the behavior is identical for small and large diameters. For intermediate values, however, they report that a single swirl is observed for swarming bacteria whereas swimming bacteria show multiple swirls.

      The fact that the behavior is identical for large wells suggests that the bulk behavior is identical. This is also confirmed by Fig. 2E which shows that the spatial correlation function of the velocity is identical in large wells. To me, that suggests that the boundary conditions play a central role for understanding how the observed phenomenology emerges. [Indeed, it was shown in the past that the interaction of bacteria with boundaries crucially determines the formation of swirls in confinement (Lushi, Wieland & Goldstein, PNAS 111, 9733 (2014)). The authors of this work assume reflecting boundary conditions, which -- to my knowledge -- contradicts the finding of Lushi et al.]. The authors, however, explain the difference of the observed patterns within their modeling study in a different way, namely by a different strength of the (anti-)alignment interactions. Changing the interaction at the level of individual cells will, however, change the bulk behavior too. Accordingly, the numerically observed bulk behavior (Fig. 5B) is very different in both cases (at a qualitative level). It is difficult to judge the difference in detail because the correlation function was not calculated for the simulations.

      In short:

      The model (Fig. 5A) reproduces the experimental results partially (Fig. 2C), but the modeling analogue to Fig. 2E is missing. The line of argument seems to me not to be entirely consistent.

      2) Inferring the interactions of active particles from observations of the emergent patterns is a highly non-trivial task. In view of this I am not entirely convinced by the arguments put forward by the authors that "more substantial cell-cell cohesive interaction[s]" are the reason why the swirling patterns formed by swarming/swimming bacteria differ. In this context, I want to draw the attention of the authors to Ref. [Peruani, Deutsch & Bär: Phys. Rev. E 74 030904(R) 2006]. In this work, a clustering transition of self-propelled rods was described. "Rafts", referred to as clusters by Peruani et al., are observed as the aspect ratio of rods is increased. Notably, a kinetic transition towards clustering can emerge even in the absence of any attractive interactions. In short, the observation that cells move in parallel (polar clusters) next to each other does not allow one to conclude that cohesive interactions are present. The movies S3 and S4 provided by the authors show that the particle shape of swarming and swimming particles is clearly different. In particular, the elongated swarming bacteria show pronounced clusters (Movie S3) whereas the shorter planktonic cells (Movie S4) do not. The difference in aspect ratio does indeed suggest that swarming and swimming bacteria differ in their alignment interaction. However, this contradicts the observation that spatial correlations in large wells are indistinguishable (see comment 1 and Fig. 2E). Side remark: in the main text, the authors argue that changes of the aspect ratio are not the reason for an increased alignment interaction; however, in the discussion section cell morphology changes (e.g. cell elongation and hyper-flagellation) are mentioned as an indicator that swarming is a different phenotype from swimming.

      In summary, I believe that the connection between experimental observations and modeling is not entirely convincing.

    2. Reviewer #1:

      In this paper, the authors propose a new approach: mounting PDMS microwells of specific sizes on an agar surface to confine swarming and planktonic SM3 cells. They found that swarming bacteria exhibit a "single-swirl" motion pattern and concentrated planktonic bacteria exhibit a "multi-swirls" motion pattern in the diameter range of 31-90 μm. The phase diagram shows that in smaller wells concentrated planktonic SM3 forms a single vortex and in larger wells swarming SM3 also breaks into mesoscale vortices.

      After that, they conducted systematic experiments to explore parameters defining the divergence of motion patterns in confinement including cell density, cell length, cell speed and surfactant. They concluded that the single swirl pattern depends on cohesive cell-cell interaction mediated by biochemical factors removable through matrix dilution.

      This paper gives a new method to discern swarmers from planktonic bacteria and carefully studies the factors that influence the formation of bacterial vortices under restriction. However, major revisions are required to improve the quality of this paper.

      Major questions and comments:

      1) When the authors mount the PDMS chip on the edge of the swarming colony, is the chip completely attached to the agar or suspended in a bacterial solution? The distance between the PDMS chip and the agar surface should be quantified. A schematic diagram of the experimental device would be helpful.

      2) Are the bacteria still expanding outward after a PDMS chip is mounted on the agar surface? The effect of PDMS chips on the expansion of bacteria on the agar surface needs to be discussed.

      3) "Diluted swarming SM3 show unique dynamic clustering patterns". In the diluted bacteria experiment, the authors found that the diluted swarming bacteria can form bacterial rafts and the concentrated planktonic SM3 disperse uniformly and move randomly. Hence, when bacteria expand and gradually fill up new empty microwells, is there a process of transition from raft to single vortex state?

      4) In the experiment altering the conditions of swarming SM3, the authors diluted the swarming cells in Lysogenic Broth (LB) by 20-fold, re-concentrated the cells by centrifugation and removed extra LB to recover the initial cell density. After these operations, they found that the previous single swirl turned into multiple swirls and concluded that matrix dilution can affect single swirl patterns. The authors think centrifugation may wash away some surrounding matrix or polymers on the surface of bacteria. Therefore, the steps of centrifugation need to be presented and the effect of centrifugation on the physiological behavior of bacteria should be discussed.

      5) In this article, the PDMS chip is placed directly on the agar surface, and the authors find that swarming and planktonic bacteria have different spatial correlation scales in the restricted microwells. The authors have done a lot of experiments to prove the difference between clusters and planktonic bacteria and to explain the reason for the single vortex. However, the conclusion is not clear. Therefore, the authors should focus more on the analysis of this new experimental phenomenon, such as the critical length and the vortex phase diagram, rather than just describing the experiments they did.

      6) The authors mentioned that the critical length for swarming SM3 is ~ 49 μm, whereas, for concentrated planktonic SM3, it is ~ 17 μm. Do these quoted values match what the authors obtain from their experimental method? I do not see any follow-up discussion or evidence.

      7) As shown in Figure 1 and Movie_S1_mp4, the direction of the single vortex motion of bacteria is clockwise. However, the article simply ignores that the single vortices of bacteria all present the same direction, and there is no analysis or reasonable explanation of the vortex direction. As shown in Movie_S5_mp4 on the numerical simulations of circularly confined SM3, simulated bacteria swirl counterclockwise, in the opposite direction. The influence of the microwell boundary on the direction of the vortex should be clearly explained at the level of bacterial movement, preferably with theoretical simulation.

      8) Swarming and concentrated planktonic Bacillus subtilis 3610 show the same motion pattern across different confinement sizes. However, the authors did not give definitive conclusions and evidence. As shown in Figure S1, Bacillus subtilis 3610 shows completely different cluster behavior. Therefore, the discussion of 3610 WT may confuse readers of the article. It may be better to put it in the supporting material.

      Minor questions and comments

      9) Figures 1C, 1D, 6A and 6B would benefit from a scale bar.

    3. Summary: In this paper, the authors propose a new approach: mounting PDMS microwells of specific sizes on an agar surface to confine swarming and planktonic SM3 cells. They found that swarming bacteria exhibit a "single-swirl" motion pattern and concentrated planktonic bacteria exhibit a "multi-swirls" motion pattern in the diameter range of 31-90 μm. The phase diagram shows that in smaller wells concentrated planktonic SM3 forms a single vortex and in larger wells swarming SM3 also breaks into mesoscale vortices.

      In addition, they conducted systematic experiments to explore parameters defining the divergence of motion patterns in confinement including cell density, cell length, cell speed and surfactant. They concluded that the single swirl pattern depends on cohesive cell-cell interaction mediated by biochemical factors removable through matrix dilution.

      This paper gives a new method to discern swarmers from planktonic bacteria and carefully studies the factors that influence the formation of bacterial vortices under restriction.

    1. Reviewer #2:

      In this study, the authors perform an impressive field phenotyping experiment on three grafted grapevines all with a common scion cultivar 'Chambourcin' alongside an ungrafted control to assess the associations between rootstock and leaf traits. The traits collected include ionomics, metabolomics, transcriptomics, leaf morphology and physiology. In addition, the authors collect these samples at three phenological stages to incorporate seasonal variation. The authors apply a combination of classification and machine learning methods to test whether features within each phenotypic measurement are predictive of genotype. In some cases, such as the ionomics data, certain ions are predictive of rootstock genotype but only at certain seasonal time points. The datasets presented here are extensive and will be of value to the horticulture field since grafting is such a common technique used in cultivating many crops. Considering the scale of this experiment, the manuscript is at times disconnected, in large part because each dataset is analyzed independently without any integration across phenotypes. The results presented do highlight more of an effect of phenology rather than rootstock on the phenotypes measured.

      Major comments:

      1) It would be very helpful to have a diagram with the layout in the field and the sampling strategy or a more detailed explanation. This would help to associate which phenotypic data was collected at the same time and on the same plants. For example, it would expand on what is mentioned on line 348 "row 8 sampled early in the day". It would help to know what time of day the samples in each row were collected. Additionally, how do the different irrigation treatments factor into the sampling? A better introduction of the experimental design is needed at the start of the results section along with a description of the genotypes and why they were selected.

      2) I understand why running a PCA before the LDA can help reduce the dimensionality of the space to be able to invert the covariance matrix (if that was the motivation?) but is this because there were issues with running LDA alone? I wonder if you've lost important discriminating information between the classes by doing this. Was the LDA run on the datasets first prior to the PCA? This may uncover additional classification that was eliminated by the PCA.

      3) For the Random Forest analysis, the authors might consider using k-fold cross validation rather than partitioning the dataset. This is especially beneficial when working with smaller datasets and might improve the predictions. Could all the importance scores be reported rather than just the couple mentioned in the text (line 296)?

      4) In reference to Figure 1B and C, it would be helpful to indicate on the plots which comparisons are significant based on their model tests. The full test results are presumably in the excel spreadsheet referred to in the reporting form although it was not found with the manuscript materials.

      5) Throughout the text there is very little mention of the various grafted genotypes and what is known about the lines. The authors should consider introducing these genotypes and why they were selected for the grafting experiment. What is different among these lines? There is very little discussion of the comparisons between genotypes and what phenotypes are significantly different between the lines and what the implications are for the plant as a whole.

      6) Line 287 refers to a post-hoc analysis of the ions, do the ions showing significant variation explained by rootstock and phenology match the ions identified in the ML as important classifiers?

      7) For such a large metabolomic dataset, it is surprising that the authors do not present any identification of the metabolites highlighted. The identification of the metabolite features that were found to influence the rootstock main effect would be of interest and might reveal interesting biology. How did these metabolites differ between genotypes? On line 501 in the discussion there is mention of flavanols and stilbenes yet these weren't highlighted in the results section.

      8) What is the reasoning for not simply applying a linear modeling approach such as limma on the gene expression data first instead of only applying it to the PCs in order to identify differentially expressed genes between the genotypes? If phenological stage is the strongest effect, what if you run the analysis within each stage to look specifically at the differential responses between grafted lines at each stage? The analysis of the gene expression data, similar to the metabolomics data, seems to be missing an opportunity to uncover underlying biological mechanisms contributing to any genotype effects of grafting, a stated goal of the study. What genes are differentially expressed and do they relate to the metabolomic or ionomic data?

      9) In the methods, there are three irrigation treatments described, yet this is not mentioned in the results section. While it seems as though rainfall mitigated much of the irrigation effect, there do appear to be differences in water availability to the vines, as described in the provided github page. Were various irrigation treatment sets sampled for all phenotypes? Or were the ionomics, metabolomics and transcriptome analyses done on the same irrigation treatments? If not, was this effect considered in the analysis? This is yet another variable that would greatly influence the response and should be considered when assessing the effects of grafting. Further detail about the sampling and conditions is needed to clarify.

      10) In figure 1 there is information about leaf age. For the metabolomics a mature leaf was sampled, for the transcriptomics the youngest leaf, and for the physiology it is not specified. Could you clarify which leaves were sampled and how they relate across phenotypes? This is an important point to mention given the differences observed for the ionomics data.

      11) In reference to the vine physiology, were these all collected from the same irrigation treatment? Was the sampling of each genotype spread out over the 3h window to account for time of day variation? It would be helpful to have the significant comparisons indicated in the figure. What are the letters referring to on lines 402-403 with the p-values? This section would be greatly improved by additional clarity in the text.

      12) Given the focus on grafting, the analysis presented in Figure 6 does not seem to contribute to this objective. Could this be expanded on to look within and across genotypes to see if different phenotypes covary and to compare the dimensions of variation across genotypes rather than combining them all together? This would complement the previous analyses and hopefully reveal the differences that were highlighted in the earlier sections.

      13) The results section is very disjointed and the datasets are presented almost as completely separate studies. To improve clarity in the results section, the authors might consider expanding on the findings of the LDA and ML analysis for each phenotype and connecting them together.
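
      Comment 3 above recommends k-fold cross validation over a single train/test partition. A minimal sketch of the idea follows, under stated assumptions: the data are synthetic, the class labels "A" and "B" are placeholders rather than the rootstock genotypes, and a toy nearest-centroid classifier stands in for the Random Forest used in the manuscript. All function names are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_centroid_predict(train_x, train_y, x):
    """Toy stand-in classifier (not a Random Forest): pick the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))


def cross_validated_accuracy(xs, ys, k=5, seed=0):
    """Return one held-out accuracy per fold; every sample is tested exactly once."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        correct = sum(
            nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return scores


# Synthetic, well-separated two-class data (illustrative only).
xs = [[0.1 * i, 0.0] for i in range(10)] + [[10 + 0.1 * i, 0.0] for i in range(10)]
ys = ["A"] * 10 + ["B"] * 10

scores = cross_validated_accuracy(xs, ys, k=5)
print("per-fold accuracy:", scores, "mean:", mean(scores))
```

      Because each sample is held out exactly once, the mean of the per-fold scores uses the whole dataset for evaluation, which is the benefit for smaller datasets that the comment points to.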

    2. Reviewer #1:

      In this manuscript, the authors look at the influence of root stock genotype on a single scion genotype in Vitis. This includes a lovely highly replicated design including differential water availability. While the experimental design is very elegant, I'm less sure that using general PCs or ML is the best approach to grab the signal of interest.

      Is there evidence that the top 20 PCs of the metabolome or the top 100 PCs are an end point of gaining new information about the system? For example, if the top 20 PCs are all different descriptions of the water availability, then PC 21 might start to grab more information about the root-scion relationship. For example, in this dataset, PC2-10 were largely about temporal block (line 314-316). Large genomic datasets like this have an immense amount of variation, such that r2 is not a meaningful way to capture what is in a PC. I can understand the desire to minimize the statistical analysis, but if the goal is to fully interrogate the dataset, the authors should provide an empirical reason for stopping at pre-ordained PCs. Or possibly better would be to grab the lsmeans for the main factors in the model to exclude factors of blocking and then run the PCs, as that is the underlying interest in the experiment.

      The focus on PCs or using ML on the full dataset also hinders the ability to get at the underlying root/scion and water availability connection. Given that phenology and blocking are the main sources of variance, using these approaches rather than a direct GLM or PC on lsmeans/BLUPs weakens the authors' ability to use the power in their experimental design. PC and ML can only capture the largest components of variance, while GLMs that account for these larger sources of variance can begin to dive into the underlying questions. There is a possibility that the authors did attempt these directed GLMs with no luck, but that was not stated.

      The use of PCs is perhaps my biggest concern with the manuscript, as the section on lines 409-430, which is the capstone of the paper, ends up being correlations of faceless PCs. Unfortunately this leaves the reader with the idea that phenology is simply too strong to obtain any information about the root/scion connection or the water availability connection.
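
      The suggestion to remove blocking effects before running the PCs can be illustrated with a small sketch. It rests on several assumptions: the data are synthetic, the design is balanced and additive, and simple within-block mean centering is used as a stand-in for the lsmeans/BLUPs the review mentions, not as an equivalent. The leading principal axis is found by power iteration; all names are hypothetical.

```python
from statistics import mean


def center_within_blocks(rows, blocks):
    """Subtract each block's per-feature mean (removes additive block effects)."""
    block_means = {}
    for b in set(blocks):
        members = [r for r, blk in zip(rows, blocks) if blk == b]
        block_means[b] = [mean(col) for col in zip(*members)]
    return [
        [v - m for v, m in zip(r, block_means[b])]
        for r, b in zip(rows, blocks)
    ]


def first_pc(rows, n_iter=200):
    """Leading principal axis via power iteration on the sample covariance."""
    n, p = len(rows), len(rows[0])
    col_means = [mean(col) for col in zip(*rows)]
    centered = [[v - m for v, m in zip(r, col_means)] for r in rows]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    vec = [1.0] * p
    for _ in range(n_iter):
        vec = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec


# Synthetic balanced design: 2 blocks x 2 treatments x 2 replicates.
# Feature 0 carries a large block effect, feature 1 a smaller treatment effect.
rows, blocks = [], []
for block in (0, 1):
    for treatment in (0, 1):
        for rep in (0, 1):
            rows.append([10.0 * block + 0.1 * rep, 1.0 * treatment + 0.1 * rep])
            blocks.append(block)

print("PC1 on raw data:        ", first_pc(rows))
print("PC1 after block removal:", first_pc(center_within_blocks(rows, blocks)))
```

      In this toy case the first PC of the raw data points along the block-dominated feature, while after within-block centering it points along the treatment-carrying feature, which is the behavior the review hopes to recover in the real dataset.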

    3. Summary: Experimentally, this is a very solid and nicely replicated experimental design that provides a strong ability to interrogate the questions at hand. Both reviewers had a concern that the use of PCs was underpowering the analysis to test the key questions that were the goal of the experiment. The manuscript could also be improved by working to interleave the different omics datasets to develop a deeper insight.

    1. Reviewer #3 (Public Review):

      In this work Farber and colleagues describe the generation of Fus(EGFP-plin2) and Fus(plin3-RFP), two knock-in zebrafish lines that allow the study of perilipins and lipid droplet biology in vivo at the whole-animal level. These lines could be important tools to understand how lipid droplet dynamics are affected by different genetic and physiological manipulations.

      The article is well written, the work is carried out with a good methodological approach, and the results support the conclusions. The weakness is the lack of originality, since it does not really go beyond the current knowledge in the field. Most of the data are a detailed description of zebrafish lines, but I doubt that they would be of interest to a broad audience.

      It also lacks novelty since the work does not add anything compared to what is already known regarding perilipin 2 and 3. I think this manuscript should be submitted to a more specialized journal on lipid metabolism or to a technical "zebrafish" journal.

    2. Reviewer #2 (Public Review):

      In this manuscript, the authors generated transgenic zebrafish reporter lines that allow observation of cytoplasmic lipid droplets in vivo. They knocked in GFP or RFP in the endogenous loci of perilipin 2 and 3, and showed that the reporter genes exhibited similar temporal and spatial expression in the intestine in response to acute high-fat feeding as the endogenous perilipin 2 and 3 transcripts. They also characterized the reporter gene expression in the liver, adipocytes, and around neuromasts. These tools open up new opportunities to study lipid droplet dynamics in live zebrafish in ways that are not feasible in mouse models. Overall the manuscript is well written. The authors have discussed in detail the strengths and caveats of these reporters. The weakness is the descriptive nature of the study: many interesting observations but no mechanistic study. I have the following additional comments:

      1) It is curious that in plin2 and plin3 reporter fish, the fluorescent tags were inserted at the 5' and 3' of the open reading frame, respectively. The authors did not provide any explanation. Does the location where the fluorescent tag is inserted affect the expression of the reporter genes?

      2) GFP and TagRFP-T are not fast folding fluorescent proteins and are very stable, which may not be the best options for studying the formation and degradation of lipid droplets. How the fluorescent tags affect the stability and clearance of the protein should be carefully characterized.

      3) Were any indels introduced by the TALENs in these knock-in fish? Are there off-target effects of the TALENs?

      4) The authors also generated transgenic fish overexpressing human PLIN2 and PLIN3 fluorescent fusion proteins. Is the subcellular localization of these fusion proteins similar to that in the zebrafish knock-ins under unfed and fed conditions? In other words, do human PLIN2 and PLIN3 proteins behave similarly to the zebrafish orthologs?

    3. Reviewer #1 (Public Review):

      The authors find that the plin2 transcript is induced in the intestine of 6 dpf zebrafish larvae following a single feeding, while the plin3 transcript is expressed in the intestine in both the fasted and fed states. They use TALENs to knock in EGFP and TagRFPt into the plin2 and plin3 loci, with the encoded gene products being the fusion proteins EGFP-plin2 and Plin3-TagRFPt. The EGFP-plin2 protein shows greater induction of fluorescence following a meal. The overall aim of these initial expression characterizations and development of lipid droplet reporter knock-ins is to be able to monitor the life cycle of these organelles in a living whole organism.

      Higher resolution photomicrographs of lipid droplets in these knock-in lines, concurrently stained with the fluorescent lipid dyes BODIPY 558/568 C12 and BODIPY FL-C12, are presented as a time series following feeding in the intestine; additional cell types beyond enterocytes (i.e., hepatocytes, adipocytes, and cells surrounding lateral line structures) are also presented.

      The authors have provided a technical advance to the field of lipid droplet biology. With the tractable revisions set out below, their tools set the stage for chemical and genetic screens for factors and compounds that modulate the normal life cycle of these dynamic organelles.

    4. Evaluation Summary:

      This manuscript has generated novel and useful tools to mark cytoplasmic lipid droplets and monitor their dynamics in various tissues in live animals. It will be of interest to researchers studying lipid metabolism and related human diseases.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 agreed to share their name with the authors.)

    1. Reviewer #3 (Public Review):

      Mutations in Naa10 are known to be causative in Ogden syndrome, a genetic disorder associated with infantile death. The paper by Kweon et al describes a series of experiments using mouse models of Naa10, an X-linked gene that functions as a major acetyltransferase in a complex accounting for 40-50% of acetylation of all proteins. The lack of complete embryonic lethality in the Naa10 hemizygous mice led the authors to discover a paralogous mouse gene, Naa12. The authors further demonstrate that Naa12 can compensate for Naa10 loss of function and that null mutations in both genes lead to complete embryonic lethality.

      Genetic experiments described in this paper involve 2 distinct knockouts of Naa10 in mice. The resulting hemizygous male mice displayed a variety of developmental defects, and while hemizygous males were underrepresented at birth, some surviving mice experienced early neonatal lethality while a proportion of the hemizygous mice survived to adulthood. Severely affected animals exhibited a variety of developmental abnormalities but, importantly, no major reductions in the acetylation patterns were observed. A similar spectrum of phenotypes was reported in 2017 in a separate paper by Lee et al. The lack of complete embryonic lethality in Naa10 hemizygous males led to the hypothesis that a compensatory gene in mice may exist. The authors then identified the autosomal Naa12 gene in mice. This is a major finding of the paper. Naa12 and Naa10 share 80% sequence identity. The authors went on to generate a Naa12 knockout mouse that, in combination with the Naa10 knockout mice, demonstrates complete embryonic lethality, supporting the hypothesis that Naa12 is a functional homolog of Naa10 in mice. This is strong evidence supporting the functional compensation by Naa12. The authors provided a thorough account of the variety of developmental abnormalities in the Naa10 hemizygous mice at all stages of development, noting changes in bodyweight, hydrocephaly and significant cardiac defects, as well as pigmentation, skeletal and reproductive abnormalities. The variation and heterogeneity ranged from severe embryonic abnormalities through to milder phenotypes in surviving adults. Importantly, the authors identified several phenotypes in the mice that, upon further analysis, were also noted in the patients, with an assumption of incomplete penetrance.

      This reviewer finds that this paper reports an important finding worthy of publication. The experiments were well powered and the genetic crosses thoroughly examined. The discussion was thoughtful and considered mechanisms of compensation between Naa10 and Naa12 based on the observed experiments.

    2. Reviewer #2 (Public Review):

      This manuscript shows the functional relevance of the mNatA catalytic subunit, mNAA10, in mammalian development. Moreover, the authors have found a new NatA catalytic subunit in mice, mNAA12, that can compensate for mNAA10 inactivation in mice. Interestingly, inactivation of mNAA10 in mice induces some developmental defects similar to those observed in Ogden syndrome (OS) patients, including lethality in infants. This study provides several lines of evidence and explains some of the defects observed in OS patients, like supernumerary vertebrae and hydrocephaly, supporting the relevance of hNAA10 mutations in the development of OS. Moreover, the authors have observed in mice some developmental deficiencies not previously observed in OS patients, like supernumerary ribs, that after patient re-examination have been observed in humans too. Curiously, the results presented in this article show that inactivation of the mNatA catalytic subunit does not dramatically affect protein N-terminal acetylation, probably as a consequence of the mNAA12 paralog functioning as the mNatA catalytic subunit when mNAA10 is not present. Interestingly, gene inactivation supports the biological significance of NAA10 as the main NatA catalytic subunit, as mNAA12 inactivation is not associated with any clear phenotype. In spite of being one of the most frequent protein modifications, protein N-terminal acetylation has not attracted proper attention; therefore this paper can draw more attention to this important protein modification.

    3. Reviewer #1 (Public Review):

      In this paper, the authors investigated the role of the N-terminal acetyltransferase Naa10 in mouse development. In addition, they identified a new paralog, Naa12, and demonstrated that it has a redundant role with Naa10 in controlling mouse embryonic development. The results are very clear and should be of interest to those working on development and N-terminal acetylation.

      I have several comments for the authors to consider:

      1) It is important to show that N-terminal acetylation is lost in the double knockouts. Only with that can the authors conclude that they have identified "the complete machinery for the process of amino-terminal acetylation of proteins in mouse development."

      2) Naa12 is new, so if not done yet, the sequence needs to be deposited into GenBank.

      3) The presentation needs to be polished.

      i) The title "Naa12 rescues embryonic lethality in Naa10-Deficient 1 Mice in the amino-terminal acetylation pathway" is misleading. When I saw the title, I got the impression that Naa10-deficient 1 mice show embryonic lethality. I would suggest changing it to indicate that Naa10 and Naa12 have redundant roles in embryonic development. Also, "Naa10-Deficient 1 Mice" needs to be changed to "Naa10-deficient mice."

      ii) In the impact statement "Mice doubly deficient for Naa10 and Naa12 display embryonic lethality...", the word "doubly" is unnecessary.

      iii) There are too many acronyms, which makes the reading a bit difficult. The terms NTA and Nt-acetylation could be avoided.

      iv) At the end of page 9, please cite the sequence alignment in Fig. S6.

      v) On page 12, "Naa12 may rescue loss of Naa10 in mice" could be more assertive.

      vi) Overall, I feel that the authors could polish the manuscript so that the salient points could be conveyed more easily to readers.

    4. Evaluation Summary:

      This manuscript describes the identification of an animal model that reproduces several features presented in Ogden syndrome patients and reveals the roles of two N-terminal acetyltransferases in mouse development. It will be of interest to the readers in the field of protein acetylation and modification, and also to the scientific community involved in rare diseases and syndrome studies.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

    1. Reviewer #3 (Public Review):

      Bridget A. Matikainen-Ankney et al. discuss the newest generation of their open-source Feeding Experimentation Device (FED3) platform, capable of detailed tracking of food pellet intake and dual nose-poke operant behavioral testing. This platform provides a complete solution for these types of studies and includes all necessary open-source hardware, firmware, visualization code, and Arduino and Python libraries for user customization of experiments and analysis. FED3 has a rechargeable battery life of around one week and can operate without any external wires, logging data onto an on-board SD card and allowing for flexible placement in a rodent's home-cage. The platform also includes an on-board display for showing current experimental parameters/data and a variable voltage digital output for synchronizing the system with other external devices such as an optogenetic stimulation system. The authors show multiple applications of the FED3 platform including detailed food intake tracking, fixed-ratio operant behavior experiments, and optogenetic self-stimulation. Importantly, they also highlight the ability to do studies across multiple, remote laboratories by leveraging the standardization of such a food intake platform.

      Strengths:

      The FED3 platform is well thought out and clearly builds off the authors' experience designing and working with their previous generation systems. The specific open-source approach taken by the authors includes not just openly providing design files but also building an understandable and open ecosystem of tools and libraries for laboratories to customize the platform to fit a broad range of experiments. By including data visualization tools and a Python library for working with FED3 data, the authors effectively lower the technical entry point for using such a platform and streamline the process of implementing the system in one's own experiments. The paper provides strong evidence of the FED3's capabilities and relevance of data generated across a range of use cases. There is compelling evidence of the usefulness of developing an open standard for food intake tracking, allowing for multi-site studies and across-laboratory comparisons. Finally, the system is significantly more affordable than other commercial options, lowering the economic barrier for implementing food intake tracking and operant behavior experiments.

      Weaknesses:

      While this paper presents a very useful, customizable, and flexible approach to food intake and operant behavior studies, certain aspects of the device could be better described in the paper. This is only a minor weakness as all hardware and code is openly available online, allowing for a more detailed understanding of the system beyond what is presented in the paper. It would be helpful to identify the major electronics components on the custom printed circuit board to aid in customization of the system. It would also be useful to provide more details as to the mechanical mechanism used to deliver food pellets and the optical beam breaks for detecting nose-pokes and food pellets.

      Some potential limitations of the system include the inability to detect food pellet hoarding, lack of wireless option to access and configure the system, limited battery life, complications when using granular bedding, and no way to identify individual mice. The authors identify and discuss these limitations within the paper which is appreciated.

    2. Reviewer #2 (Public Review):

      "Feeding Experimentation Device version 3 (FED3): An open-source device for measuring food intake and operant behavior" describes the third iteration of an open-source automatic feeding device to be used with mice. I have no concerns about this paper and would recommend it as is. The authors have provided an incredible resource for the fields of feeding and reward-related behaviors, and provide all the details needed for assembly and use. Moreover, the data that they have collected using this device constitutes an advance, particularly the circadian rhythms of feeding, as well as the increase in operant responding during the light cycle. This device enables homecage measurement of feeding and training for motivational behavior, enabling most any laboratory to examine feeding behaviors in their experiments.

    3. Reviewer #1 (Public Review):

      In this manuscript the authors present a new and improved open-source option for a home cage pellet dispensing device that carries with it the ability to offer continuous monitoring of feeding behavior as well as home-cage operant testing. This device solves many issues in the way individuals typically go about studying animal feeding behavior, including but not limited to testing at only certain times of the day for limited amounts of time and food restriction, in a manner that optimizes cost, functionality, scalability, and customizability over traditional or commercial options. Of note, besides offering the ability to capture massive amounts of home cage feeding and operant data directly in the vivarium of animal housing facilities, a major strength of this approach is that the authors demonstrate that the same amount of learning that would typically require 16 days (one-hour testing sessions) can be accomplished overnight (and with interesting circadian effects on decision-making that are often overlooked). The authors demonstrate usability of this device across institutions in other labs and integration with optogenetics (as well as citing recent studies integrating the device with recording systems).

    4. Evaluation Summary:

      All three reviewers were very enthusiastic about this manuscript describing FED3, a new and improved open-source option for a home cage pellet dispensing device. They all agreed that this open-source tool would be of wide interest to neuroscience laboratories, that the manuscript was well-written and clear, and that the cross-lab validation was informative. They also appreciated that this Tools & Resource manuscript provides all necessary open-source hardware, firmware, visualization code, and Arduino and Python libraries for user customization of experiments and analysis. Minor concerns were identified with the extent to which the manuscript describes and compares to existing systems and with clarity on some details of the system.

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #3 agreed to share their name with the authors.)

    1. Reviewer #3 (Public Review):

      In this study, van Dorp et al. provide new insights into the structure of the C-terminus of STIM1 in the quiescent as well as the active state. By using extensive smFRET and protein crosslinking techniques, the authors substantially advanced our understanding of the orientation of the STIM1 cytosolic domains and revealed inter- and intramolecular interactions within a STIM1 dimer. Structures have been derived for both the resting and the activated state of STIM1. Altogether, this study substantially contributes to a mechanistic and structural understanding of the STIM1 activation process, and it paves the way for the comprehensive dynamic resolution of conformational transitions from the inactive to the fully active state.

      The single molecule studies represent a very elegant approach to derive novel details on STIM1 structure and dynamics. Utilization of these developed smFRET protein probes of ctSTIM1 in the interaction with Orai1, either reconstituted or even in living cells, would be fantastic, but certainly experimentally challenging based on the low fluorescent background required to resolve single molecule FRET.

    2. Reviewer #2 (Public Review):

      Although the major activation steps and general mechanistic underpinnings of SOCE have been reported in a flurry of publications, these reports are largely descriptive and lack quantitative information. One topic of greatest interest to the CRAC channel field is the structural basis of CC1-CAD/SOAR-mediated STIM1 autoinhibition. Using single-molecule Förster resonance energy transfer (smFRET) and protein crosslinking approaches, van Dorp et al. provide a binding model for the CC1-CAD interaction. This model explains the role of CC1 in STIM1 activation, and delineates the activation process of STIM1 CT. It also clarifies the controversy on the two varying structures regarding the packing of the CAD/SOAR domain by favoring the X-ray structure over the NMR structure. The conclusions of this paper are mostly well supported by data. The only minor concern is to reconcile some of the conflicting results (regarding the relative positions of some residues used in the crosslinking study, as well as the CC1-alpha 1 helix) between this study and a recent structural study, i.e., the NMR solution structure of CC1 reported by the Romanin and Muller groups (PMID: 33106661). Overall, this study covers a timely topic to address a long-standing question in the ORAI-STIM signaling field, i.e., the structural basis of CC1-CAD association that keeps STIM1 largely quiescent in the resting condition. This work, regarded by this reviewer as a "tour-de-force" for its meticulous scanning of many key residues within the multiple CC1/CAD helices, certainly warrants immediate publication.

      Notable strengths:

      1) smFRET is increasingly being used to determine distances, structures, and dynamics of biomolecules. It has always been difficult to obtain crystal structures of full-length STIM1 and the STIM1 C-terminus because of their tendency to aggregate and the presence of large disordered regions. Herein, the authors selected smFRET as the major tool to overcome this hurdle and illuminated the CC1-CAD binding models to provide novel mechanistic insights into STIM1 auto-inhibition mediated by the intramolecular cis CC1-CAD association.

      2) The efforts to extend crosslinking of ctSTIM1 to flSTIM1 are particularly commendable, moving one more step closer to the physiological scenario.

      Minor weaknesses:

      1) The authors proposed a CC1 model displaying a "tandem connection" of CC1α1-CC1α2 that shows notable discrepancies with the recent CC1 NMR solution structure (PMID: 33106661). In the latter structure, the three helices are intertwined to form a bundle-like structure. An in-depth discussion is certainly needed to clarify the difference. Some possibilities include: (i) Is this due to an artifact of the CC1 NMR structure (done in the presence of helix-stabilizing reagents)? (ii) Is this due to the introduction of cysteine residues for the assays? (iii) Is this due to the absence of the CAD/SOAR part, or other regulatory components, in the solution structure? Repeating one or two key smFRET/crosslinking experiments under buffer conditions similar to those of the NMR study would provide clues to these possibilities.

      2) Another concern, very minor though, is regarding cysteine crosslinking flSTIM1 by 0.2 mM diamide. Will the addition of diamide cause undesired activation of STIM1 in the absence of cyclopiazonic acid?

    3. Reviewer #1 (Public Review):

      The authors use smFRET and cross linking to constrain relative orientations of CC1-CC3 helices in STIM1 resting and active conformations. The data are excellent and, especially because structures of full length STIM1 are currently lacking, they paint an important picture of the structural basis for STIM1 activation. The number of smFRET pairs examined in the inactive state is fairly large and gives a good picture of the relative orientations of helices. In contrast, only a few pairs of sites were examined in activated STIM1. These give a clear picture of CC1a1 dissociation from CC3, but the remaining postulated conformational changes during activation are inferred primarily from cross linking, and it would have been nice to probe those with smFRET as well. Nonetheless, the data provide very useful constraints on STIM1 conformational rearrangements that will be of great value to further structure-function studies.

    4. Evaluation Summary:

      This study uses complementary approaches to advance our mechanistic understanding of STIM1 activation, with elegant single molecule methods providing new details on STIM1 structure and dynamics. Full length STIM1 in a cellular environment was probed by crosslinking, but the same has not yet been possible with single-molecule Förster resonance energy transfer (smFRET).

      (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

    1. Reviewer #3:

      The authors propose a new method of focused ultrasound (FUS) neuromodulation, namely amplitude modulated FUS, that they propose can differentially affect inhibitory and excitatory cells depending upon the intensity employed. Parameter selection is an issue for this field and the introduction of new methods for efficacious modulation is highly desirable. However, this paper does not explicitly test AM FUS against existing forms of FUS, and thus provides no evidence of its efficacy. While the differential effects are interesting in themselves, we gain no insight into whether AM FUS is the critical factor leading to them.

    2. Reviewer #2:

      Nguyen et al. developed a novel method of transcranial focused ultrasound stimulation and used it to stimulate anesthetized rats while performing extracellular recordings in the hippocampus. They find that the stimulation has different amplitude-dependent effects on putative inhibitory interneurons and excitatory principal cells. This finding is exciting because it suggests that transcranial ultrasound could be used to specifically reduce or increase firing rates in excitatory or inhibitory neurons in a particular part of the brain (resolution in the mm range). In principle, this could also be applied to humans. Simultaneously measured oscillations of the local field potential, particularly (but not exclusively) in the theta band (3-10 Hz) could also be manipulated in a bidirectional manner depending on the stimulation amplitude. Such cortical oscillations have been strongly linked to a wide range of functions including memory, and the potential to manipulate them in an anatomically precise manner is exciting and could even lead to new therapy approaches. Although it is not new that ultrasound can be used to modulate neuronal activity, this paper reaches a new level of precision by demonstrating that bidirectional effects can in principle be limited to one cell class or one frequency band. Thus, it could provide a great alternative to current methods that either provide much less precision (e.g. transcranial magnetic stimulation) or rely on more invasive methods (e.g. deep brain stimulation) or genetics (e.g. optogenetics).

      The study is well-designed with stimulation at 3 different amplitudes applied in the same rat, whereby each 3-minute stimulation is compared to a 3-minute sham session where the transducer is 1 cm above the skull. Baseline sessions before each stimulation and sham session did not show any differences, showing no spillover-effects from the previous stimulus. Effects on brain temperature were also measured and shown to be negligible compared to normal variability.

      Low intensity stimuli lead to a reduction in firing rates in putative interneurons and a reduction in theta oscillation power, whereas high intensity stimuli lead to an increase in firing rates in putative principal cells, with intermediate intensities having largely no effect.

      In principle, these findings could provide novel insights into the mechanisms underlying ultrasound stimulation, but neither of the two discussed main modes of action (mechanical and thermal) appears consistent with the results. Thus, no model could be offered that might give some insight into the underlying mechanisms of ultrasound modulation of neuronal activity. This might be an issue for future work, and if the results were more robust perhaps this would not matter as much. However, the overall size of the effects appears to be too small to be of practical use as a reliable tool for manipulation of neural circuits. Although the authors show statistical significance, some details of the analysis are not fully clear and may need to be further corrected for multiple testing. It remains unclear if perhaps larger or different effects would be achieved when recording through the skin, without anesthesia, in different brain areas, in differently defined subclasses of neurons or with a different stimulation protocol (frequency, duration, amplitude). Thus, although the technique appears promising, more work is needed.
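One way to read the multiple-testing point above is as a family-wise correction across the reported comparisons. A minimal sketch using the Holm-Bonferroni step-down procedure follows. The choice of Holm's method, the significance level, and the p-values are illustrative assumptions, not values or methods taken from the manuscript.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the null hypothesis is
    rejected after Holm-Bonferroni step-down correction."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return rejected


# Hypothetical p-values for three comparisons (not from the paper).
example_p = [0.01, 0.04, 0.03]
example_rejected = holm_bonferroni(example_p)
```

In this made-up example all three uncorrected p-values fall below 0.05, but only the smallest one survives the correction.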

    3. Reviewer #1:

      Major points:

      1) On the conceptual level, the authors claim that low-intensity amplitude-modulated transcranial focused ultrasound stimulation (AM-tFUS) inhibits local inhibitory interneurons and excites excitatory neurons at high intensity. However, the problem I have with this is that these cell types are highly interconnected within the local circuits, and changing the activity of the inhibitory cell type should have the opposite effect on the excitatory cell type. This has been documented in many experiments (Babl et al., Cell Reports, 2019; Royer et al., Europ.Journ. of Neurosc., 2010), and it is unclear why the authors did not see similar effects. Furthermore, it is particularly troubling that the authors observe sustained suppression (five minutes) of the inhibitory neurons yet fail to see any effects on the excitatory neurons (Fig. 3B,D). This conceptual problem raises questions about the experimental setup, which I address below.

      2) The authors performed electrophysiological recordings while delivering AM-tFUS with different intensities. To claim the differential effects on the excitatory and inhibitory interneurons, the authors first need to isolate single units in their recordings. However, the authors fail to cluster single units, as documented in the methods section (line 338). There could be several reasons why the authors failed to complete this step. I suggest the following ways to remedy this problem: A. The authors should use a silicon probe with a higher density of recording sites (the distance between the individual sites can be as small as 25 um in some NeuroNexus probes) than the one used in the MS, or use a Neuropixels probe so that the clustering algorithms have a chance to isolate single units. Using NeuroNexus probes with 100 um separation between the recording sites makes it impossible for different channels to "see" the same neuron and severely limits the spike sorting algorithms that separate units based on their unique spatio-temporal waveforms. B. After clustering, the authors should use autocorrelograms to verify that the single units do not violate the refractory period (Hill et al., Journal of Neuroscience, 2011). This is particularly important in areas, such as the hippocampus, which have a high density of neurons, and care should be taken to avoid multiunit recordings. C. The authors should perform one long recording session that comprises all experimental manipulations (the delivery of AM-tFUS, the sham control, and the rest period) to trace how the same units change their firing rate as a function of the experimental manipulations. This would also be very helpful in understanding how the firing rate change in one class of neurons is accompanied by changes in another class. D. Although this might be tricky, the authors could try to perform electrophysiological recordings by lowering the electrode perpendicular to the brain surface. This would allow them to record excitatory neurons and inhibitory interneurons that are connected to each other within the local circuit. This type of recording would give the authors a greater chance of observing how changes in the firing of the inhibitory cell type affect the activity of the excitatory cell type and vice versa. This type of recording would also be highly desirable for understanding changes in oscillations of the local field potential (LFP) (see below).
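The refractory-period check suggested in point B can be sketched as a count of inter-spike intervals (ISIs) that are too short to come from a single neuron. This is a minimal illustration only: the 2 ms threshold and the spike times are assumed for the example and are not taken from the manuscript or from the cited paper.

```python
def refractory_violation_fraction(spike_times_s, refractory_s=0.002):
    """Fraction of inter-spike intervals shorter than the refractory period.

    spike_times_s: spike times of one putative unit, in seconds.
    refractory_s:  assumed absolute refractory period (2 ms here).
    """
    times = sorted(spike_times_s)
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if not intervals:
        return 0.0  # fewer than two spikes: nothing to evaluate
    violations = sum(1 for isi in intervals if isi < refractory_s)
    return violations / len(intervals)


# Hypothetical spike train; the 0.050 s and 0.051 s spikes are 1 ms apart.
spikes = [0.010, 0.050, 0.051, 0.120, 0.300]
fraction = refractory_violation_fraction(spikes)
```

A well-isolated single unit would be expected to show a violation fraction close to zero, whereas a multiunit cluster would typically show many short intervals. The acceptable cutoff is a judgment call that the authors would need to state.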

      3) The authors should report the sites that they have recorded by labelling the electrode with fluorescent dye or performing lesions at the recording sites.

      4) When analyzing the effect of AM-tFUS on theta frequency oscillations, the authors should perform current source density (CSD) analysis to verify that the observed effects are local and do not originate from distant sources by volume conduction (Buzsaki et al., Nat. Rev. Neurosc. 2016). Performing electrophysiological recordings perpendicular to the brain surface, as I recommend in 2D, would be necessary for this. The CSD analysis would identify the location in the hippocampus where the change in theta power occurs.
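The CSD analysis requested in point 4 is commonly approximated by the second spatial difference of the LFP along a linear array of equally spaced contacts. The sketch below assumes that standard approximation, a homogeneous conductivity of 0.3 S/m, and made-up voltage profiles. None of these come from the manuscript, and the sign convention (negative values read as sinks) is also an assumption.

```python
def csd_second_difference(lfp_by_depth, spacing_m, conductivity_s_per_m=0.3):
    """Estimate current source density at the interior channels of a
    linear probe from a single time sample of LFP.

    lfp_by_depth: LFP values (volts) ordered by depth, equally spaced.
    spacing_m:    distance between neighbouring contacts, in metres.
    Returns one CSD value per interior channel (first and last are dropped).
    """
    csd = []
    for i in range(1, len(lfp_by_depth) - 1):
        second_diff = (
            lfp_by_depth[i + 1] - 2 * lfp_by_depth[i] + lfp_by_depth[i - 1]
        )
        csd.append(-conductivity_s_per_m * second_diff / spacing_m ** 2)
    return csd


# A potential that changes linearly with depth has no local generator.
flat = csd_second_difference([0.0, 1.0, 2.0, 3.0], 1.0, 1.0)
# A localized negative deflection yields a sink flanked by two sources.
local = csd_second_difference([0.0, 0.0, -1.0, 0.0, 0.0], 1.0, 1.0)
```

The linear profile gives zero CSD at every interior channel, which is the behaviour one would expect for a signal that arrives by volume conduction from a distant source, while the localized deflection gives a nonzero estimate at the channels around it.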

      5) The authors argue that temperature changes of 0.2 degrees were not sufficient to alter the firing rate of the neurons. However, the paper to which they refer (Darrow et al., Brain Stim, 2019) shows, in Fig. 7, that heating up brain tissue with a laser by even 0.2 degrees can induce changes in somatosensory evoked LFPs. The authors should perform control experiments that are analogous to those in the cited paper to manipulate the temperature while recording the neurons in order to verify that the observed effects are not due to the changes in temperature.

      Minor points:

      1) The authors should not use the label "cells" in Fig. 3, as they cannot claim that they recorded single units.

      2) In Fig. 5C, Fig. S3B,C, and Fig. S4B,C, the authors should show the full scale of the values. Furthermore, the outliers in these plots (not seen in the figures) may drive the general trends, and removing them should be considered.

      3) During AM-tFUS at intermediate power intensity (Fig. 4D,G), the authors observe a very dramatic change in LFP power in the 1-3 Hz frequency range. Although there is no clear underlying change in the firing of neurons at this intensity (Fig. 3E,F,G,H), the authors could speculate on what is happening in this case.

      4) Fig. 5B shows a clear reduction of power in the theta frequency range after AM-tFUS in the dentate gyrus as well as in CA1 and CA3. This effect is also seen in Fig. 4G and Fig. S1,2. Although this effect does not reach the level of statistical significance, the authors should report the p-values.

      5) Although the suppression of firing rates for a five-minute period after low-intensity AM-tFUS application is interesting, I am not sure if such prolonged after-stimulation effects have ever been documented using other modes of neuromodulation. Therefore, the authors should discuss this effect in line with previous work.

    1. Reviewer #3:

      This is an interesting manuscript in which the authors have investigated the effect of intracellular injection of oligomeric beta-amyloid into hippocampal neurons both in cultures and adult animals. They find that starting from 500 pM, intracellular injection of oligomeric beta-amyloid rapidly increases the frequency of synaptic currents and higher concentrations potentiate the AMPA receptor-mediated current. Both effects were PKC-dependent. Furthermore, they find that following PKC activation there is release of NO which in turn increases release of neurotransmitter not only in the nearby pre-synaptic site, but also in neighboring cells. This suggests that intracellular injections of oligomeric beta-amyloid into the postsynaptic neuron can increase network excitability at a distance. The effect on neuronal excitability would involve AMPA-driven synaptic activity without altering membrane intrinsic properties. The conclusions are sound. However, there are two main aspects of the observed phenomenon that have not been taken adequately into account, or have been avoided by the authors. The authors have not investigated the effects of application of oligomeric beta-amyloid into the extracellular space and the presynaptic neurons, two other compartments of the synapse. They might have performed experiments comparing findings from experiments with intracellular injections of oligomeric beta-amyloid into the post-synaptic neurons, with effects of extracellular application and those of injections into the presynaptic neuron.

      Additional minor concerns are related to the following issues:

      a) The raw data on Figure 3 suggest that not only excitatory transmission is affected but also inhibitory transmission is somewhat modified. Measurement of the charge might be misleading.

      b) This reviewer is not clear on the meaning of the following sentence in the discussion "Contrary to previously published data using extracellular Aβ or with more chronic application models [45-50], we did not find any synaptic deficits". The current work shows synaptic changes!


      c) There is a mistake in the numbering of figures in the discussion. The paper has no figure 11. When referring to figure 10, they must mean something else.

      d) The model on Figure 10 needs work. The authors should explain what various elements of the drawing mean, or better label them directly on the figure.

    2. Reviewer #2:

      Epilepsy is often an early sign observed in Alzheimer patients and there are several mechanisms that may contribute to this hyperexcitability. In this study, the authors focused on an important observation suggesting that intracellular Amyloid beta, a protein often found in plaques in the brain, is found early on inside neurons of the hippocampus, the learning and memory center of the brain. Interestingly, when unique early forms of Ab named oligomers were introduced inside neurons, the cells and surrounding circuits became hyperexcitable. This increased excitability was mediated mainly by the release of glutamate on AMPA glutamate receptors. Remarkably, these excitatory effects were triggered by intracellular amyloid oligomers through a retrograde signal named nitric oxide. This manuscript suggests that early stages of the disease may comprise significant increases in network excitability that may trigger a cascade of synaptic dysfunction and cognitive deficits such as memory loss.

      Here are my comments to strengthen the manuscript. Overall this is a strong study with an interesting take on the role of intracellular amyloid and how it contributes to increased network excitability in AD.

      There is an interest to determine the mechanisms responsible for the hyperexcitability often associated with familial and sporadic forms of Alzheimer's disease. Many have focused on possible reduction in inhibitory interneuron function as essential drivers of the increased excitability of the network. Although there exist a large number of investigations determining the effects of extracellular Ab on synaptic transmission, the intracellular effects of Ab and its contribution to disruptions of synaptic transmission remains less well understood. A couple of studies have shown that intracellular application of Ab (Ab42) induces decreases in long-term potentiation and basal synaptic transmission. In this study, the authors have investigated how intracellular Ab oligomers (iAbo) contribute to enhanced excitability in the CA1 region of the hippocampus. To do so, they have intracellularly applied human brain-derived and synthetic Ab oligomers through the patch-pipette in principal neurons recorded in vitro and in vivo.

      In this study, the authors show that intracellular application of Ab oligomers increased the frequency and the amplitude of excitatory currents and spiking in ex vivo hippocampal slices, effects that were mimicked by human oligomers. The intracellular amyloid mediated effects were through the amplification of AMPAergic spontaneous activity and currents, and, to a lesser extent, spontaneous GABAA mediated currents. Miniature frequency and amplitude of AMPA-mediated EPSCs were also increased and were sensitive to PKC blockers. Interestingly, since intracellular Ab increased the frequency of EPSCs, which is a presynaptic effect, a signaling molecule is likely to be released postsynaptically to modulate presynaptic terminals. The hypothesis that the retrograde signal NO was involved was tested by determining the sensitivity to the NOS inhibitor L-NAME. L-NAME reduced the iAbo-mediated increase in the frequency of spontaneous post-synaptic excitatory currents in cultured neurons. L-NAME was also shown to reduce the iAbo-mediated NO from both the recorded and neighboring neurons, providing further evidence that intracellular Ab oligomers triggered NO release and increased glutamate release. Increases in the excitability of CA1 pyramidal cells were also observed in vivo by intracellular application of Ab oligomers. Overall, this is a well written study that demonstrates a novel perspective of the effects of intracellular Ab oligomers on CA1 principal neurons and suggests possible mechanisms underlying hyperexcitability.

      Novelty:

      1) Use of intracellular oligomers, both synthetic and human-derived.

      2) Showing that iAb oligomers increased post- and presynaptic AMPA-mediated EPSCs.

      3) The presynaptic increases in EPSCs were mediated by NOS and NO; this could potentially spread widely across the network.

      4) Spontaneous IPSCs were also increased (through an undetermined mechanism).

      5) The iAbo increase in excitation was also observed in vivo.

      Questions:

      Intracellular Ab produces an increase in both EPSCs and IPSCs. However, in Fig 3, the IPSCs, measured using a charge transfer quantification, did not show a significant change in response to iAbo, in contrast to EPSCs. This spontaneous inhibition here was measured as charge transfer, which depends on the amount of charge over time. I wonder why this was not significant since this measurement should have picked up a possible increase in spontaneous IPSCs?
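The reasoning behind this question can be made concrete: charge transfer is the time integral of the synaptic current, so it should grow when events become more frequent even if their amplitude is unchanged. The sketch below assumes a baseline-subtracted trace and trapezoidal integration. The traces are invented for illustration and do not represent the authors' data or analysis.

```python
def charge_transfer(current_pa, dt_ms):
    """Integrate a baseline-subtracted current trace with the trapezoid rule.

    current_pa: current samples in picoamperes.
    dt_ms:      sampling interval in milliseconds.
    Returns charge in pA*ms, which is numerically equal to femtocoulombs.
    """
    total = 0.0
    for earlier, later in zip(current_pa, current_pa[1:]):
        total += 0.5 * (earlier + later) * dt_ms
    return total


# Hypothetical traces: one event versus two identical events.
one_event = charge_transfer([0.0, 10.0, 0.0], 1.0)
two_events = charge_transfer([0.0, 10.0, 0.0, 10.0, 0.0], 1.0)
```

With identical event shapes, doubling the number of events doubles the integrated charge, which is why an increase in sIPSC frequency would be expected to appear in this measure.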

      With regard to the inhibition, I find the schematic in Fig. 10 incomplete and slightly inaccurate since it shows one terminal releasing both glutamate and GABA with NO increasing both. While this is obviously an oversimplification, it's slightly inaccurate since NO was not directly shown to increase sIPSCs. Were NOS blockers able to disrupt the increase in sIPSCs? Moreover, there are many papers that have shown that PKC can also phosphorylate GABA receptors and increase their conductance. What could be the reason that this was not involved here? This needs to be discussed.

      The experiments were done in cultured neurons, in slices and in vivo. It's not always easily discernible in what conditions the experiments were done when reading the manuscript, especially when looking at the figures and figure legends. This should be at least stated in the figure legends. To help the reader, the conditions in which the currents were recorded (GABA and/or excitatory receptor blockers, other ion blockers) could be indicated in the figure legends to ease the comprehension of how the experiments were done and what was measured. In relation to this, were the iAbo-mediated increases in sIPSCs also blocked by L-NAME?

      Other studies investigating intracellular application of Ab, such as the Ripoli et al., 2014 paper, showed that iAb produced significant reductions in EPSCs in hippocampal neurons. What are the differences explaining this? This should be discussed. Similarly, Gulisano et al., 2019, showed that oligomeric Ab had effects on excitability when it was applied extracellularly but not intracellularly. This should also be discussed.

      In the introduction, it's mentioned that the nature of hyperexcitability is unknown. I agree that it's incompletely known, but what is known is that there is a large variety of possible causes. For example, changes in GABAergic interneuron function (see Hijazi et al., 2019) are well known to be a contributing factor. There are many studies that have shown possible contributing causes of hyperexcitability, therefore, something IS known, and this should be identified in the introduction.

      How do these increases in synaptic transmission by applying pM concentrations of oligomers fit with the data showing that extracellular Ab oligomers of comparable concentrations decrease synaptic transmission through presynaptic reductions in glutamate release? This needs to be put into context and discussed.

    3. Reviewer #1:

      This is an interesting study of the effects of intracellularly-applied amyloid beta (Ab) in primary hippocampal cultures of embryonic rats or in area CA1 of hippocampal slices or anesthetized rats that are less than 35 days old (therefore prepubertal). In vivo, whole cell recordings were made of CA1 neurons which is difficult and therefore a strength. Both synthetic Ab and human-derived Ab were applied by adding them to the internal solution of a patch electrode. Several interesting effects were documented, such as increased evoked and miniature EPSCs (mEPSCs) as well as some effects on IPSCs and neuronal properties. A major question is whether these effects were pharmacological or physiological.

      An intriguing finding was that the increase in EPSCs was reduced by inhibiting a PKC-mediated effect of nitric oxide (NO). Furthermore, the effect of intracellular Ab on the recorded cell had effects on neighboring cells. Whether those were due to diffusion of NO, synaptic inputs from the recorded cell on neighboring cells, or release of Ab from the recorded cell was not clear. The authors suggested this is 'functional spreading of hyperexcitability' similar to the way prions are spread transsynaptically (actually this has been suggested for Ab too; see work by Karen Duff or Brad Hyman's groups), although this seems premature because the work that has been done with prions and Ab involves spread over a long time and a long distance relative to the results of the present study. Still the results are interesting and could be relevant in some way to the development of the disease or hyperexcitability.

      MAJOR CONCERNS

      One major issue is whether the results are relevant to Alzheimer's disease (AD) or represent interesting pharmacological data about what Ab can potentially do in some of its forms in normal tissue. The cultures are from embryonic rats and it is not clear how well they can predict what occurs in aged humans with AD. This issue is not only a question related to the preparation of tissue but also to the use of Ab intracellularly. It is not clear that synthetic or human Ab that is prepared outside the animal and used to fill electrodes to dialyze a cell is similar to the Ab generated in a cell of a person with AD. Independent of the methods to determine whether it is oligomeric outside the cell, once dialyzed it is not clear how it may change and where it would go. In AD, Ab has a particular location and precursor where it forms and a particular route by which it travels to the external milieu. As a product of its precursor APP, several peptides are produced besides Ab and many labs think they are as important as Ab in the disease. Although it is a strength to use atomic force microscopy to attempt to verify the form of Ab being used, it is not clear what form was actually in the dialyzed cell and how that compared to the form in AD.

      How this work relates to other studies that are similar is important. It seems that few other studies that have applied Ab are mentioned because few have studied it intracellularly. However, they are relevant because adding Ab has been shown to increase excitatory activity in hippocampal neurons at low concentrations, whereas at higher concentrations synaptic transmission is weakened. Many studies of mouse models of AD pathology suggest reduced synaptic transmission and plasticity, although many others show hyperexcitability, often without adding Ab at all.

      PKC and NO do a lot of things throughout the brain and body. How do the effects the authors have identified relate to all these other effects? For example, if PKC is activated by another mechanism, would it occlude the effects of Ab? What are the changes in PKC and NO in AD?

      ADDITIONAL CONCERNS

      I am not sure of the validation of Ab using the anti amyloid or 6E10 antibodies. The western blot shows a large region that both antibodies detect and the 6E10 antibody shows an even greater band. It is not clear what the large range of bands that are shown imply except nonspecificity. The antigen that the antibodies recognize should be stated exactly.

      Clarifying sample sizes throughout the study is needed.

      Do the cultures include interneurons? Are the excitatory and inhibitory neurons interconnected? This information will help interpret the results.

      The external solution for cultures contains 5.4 mM K+, which is quite high and can induce hyperexcitability. Therefore it is important to be sure controls did not show hyperexcitability even after persistent recordings. Similarly, the use of 100 µM AMPA and GABA seems very high. Justifying these high concentrations is important. They should lead to hyperexcitability and toxicity (AMPA) over time. Another point of concern is that the concentration of K+ for the slice work is 3 mM, much different than cultures. There are also differences in Mg2+ and Ca2+, making data hard to compare in the two preparations.
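
      To make the size of the K+ difference concrete, the following is a minimal sketch of the Nernst relation applied to the two external K+ concentrations quoted above. The recording temperature (25 °C) is an assumption, since it is not stated here, and the calculation assumes the internal K+ concentration is the same in both preparations, in which case it cancels out of the difference. The function name is hypothetical.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_shift_mv(conc_out_a_mm, conc_out_b_mm, temp_c=25.0, valence=1):
    """Shift in Nernst potential (mV) when the external concentration
    changes from a to b, assuming the internal concentration is unchanged
    (it then cancels out of the difference)."""
    temp_k = temp_c + 273.15
    rt_over_zf = (R * temp_k) / (valence * F)  # volts
    return 1000.0 * rt_over_zf * math.log(conc_out_b_mm / conc_out_a_mm)


# Slice solution (3 mM K+) versus culture solution (5.4 mM K+);
# the temperature is an assumed value.
shift = nernst_shift_mv(3.0, 5.4)
print(f"E_K shift from 3 mM to 5.4 mM external K+: {shift:.1f} mV")
```

      Under these assumptions the K+ equilibrium potential is roughly 15 mV more depolarized in the culture solution than in the slice solution, which is one reason the two preparations may be hard to compare directly.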

      Line 295 mentions 2 min recording periods were used to acquire sufficient events. One wants to know if this was done throughout the paper and if so, how many events per 2 min was considered sufficient?

      Terms related to intrinsic membrane properties and firing need to be explained much more because each lab has a slightly different method.

      In the statistics part of the Methods, why is Welch's ANOVA (followed by Games-Howell) used when variance was unequal? Usually the test to determine inequality is provided, so it is clear it was done objectively and with a reasonable test. Then if the variances are unequal there is often a choice of a nonparametric test, which is common. Some groups transform the data, such as taking the log of all data values. If this reduces the difference in variance between groups sufficiently to pass the test for inequality, it leads to a parametric test like a one-way ANOVA followed by Tukey's posthoc test.
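
      The options discussed above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis: it computes the Welch ANOVA statistic and its degrees of freedom, and shows how a log transform can equalize variances when groups differ by a multiplicative factor. The data values and function names are hypothetical, and obtaining a p-value would additionally require the F distribution from a statistics package.

```python
import math
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (unequal variances)."""
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    w = [ni / variance(g) for ni, g in zip(n, groups)]  # precision weights
    w_sum = sum(w)
    grand = sum(wi * mi for wi, mi in zip(w, m)) / w_sum  # weighted grand mean
    between = sum(wi * (mi - grand) ** 2 for wi, mi in zip(w, m)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    return f_stat, k - 1, (k ** 2 - 1) / (3 * lam)


def variance_ratio(groups):
    """Largest sample variance divided by the smallest."""
    v = [variance(g) for g in groups]
    return max(v) / min(v)


# Hypothetical data: the second group is the first scaled by 2.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

f_stat, df1, df2 = welch_anova([a, b])
raw_ratio = variance_ratio([a, b])
log_ratio = variance_ratio([[math.log(x) for x in a],
                            [math.log(x) for x in b]])
print(f"Welch F = {f_stat:.3f}, df = ({df1}, {df2:.3f})")
print(f"variance ratio raw = {raw_ratio:.2f}, after log = {log_ratio:.2f}")
```

      With two groups the Welch statistic reduces to the square of Welch's t, which provides a simple check on the implementation.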

      In the Results, Line 331 suggests that the authors think they know what a low concentration is for Ab. I don't think it is known in AD what is low and what is high. In other studies of Ab, low concentrations were picomolar (Puzzo et al., listed in the references). So it is not clear the term low is justified for 50 nM.

      The bursts of activity are not quantified. What was defined as a burst? What was the burst frequency and did it change over the recording period?

      In the section about mPSCs in culture, starting on Line 348, were these events EPSCs or IPSCs? It is important because in the section starting on Line 383 there were changes in IPSCs but the authors conclude a major role of EPSCs only. For example, Line 400 suggests that the effects of Ab were on AMPA receptor-mediated activity but it seems from the data there were also some effects on IPSCs.

      Line 434. Provide evidence that the fluorescent probe accurately measures NO.

      At the top of page 19 there is a section that needs to be moved earlier because it relates to the work in cultures. That earlier section needs to be reinterpreted given changes in membrane properties occurred. Also, if there is increased synaptic activity in cells dialyzed with Ab, TTX needs to be added to be sure of intrinsic properties. The increase in excitability the authors discuss could be due to the synaptic activity or changes in properties, or both and this needs clarification.

      The last paragraph on page 20 is not useful because DRG neurons are so different from hippocampal neurons. One could have effects in DRG but not hippocampus, and vice-versa. The paragraph starting on Line 616 should be revised. It is not a series of compelling arguments in its present form. For example, saying that AMPAR are linked to epilepsy seems quite obvious, and does not mean that the work presented here is like epilepsy because AMPAR events increased in several assays. Increased AMPAR events also occur when there is a change in behavioral state, plasticity, etc.

      In the conclusions, I don't think the data suggest a synaptic change in AMPAR alone. There are intrinsic changes and changes in GABAergic events. Many sites in the brain could have different effects but were not studied. It is not clear effects of NO were coordinated in the way they affected adjacent neurons to the recorded cell. NO simply could have diffused to an area around the recorded cell. I may have missed evidence to the contrary, but effects could have been mediated by axons of the recorded cell and not NO.

      In Figure 1b, there is a representative example. Could the neurons be shown? Then one knows the relationship of the signal to the location of neurons.

      Graphs should show points. This is one way to clarify sample size easily also.

      MINOR POINTS

      Line 169 mentions stable access resistance and one usually provides a number indicating how little it increased over time, such as 10-20%. Similarly, the way synaptic events were discriminated from noise is not provided (line 291). Instead, a brief description is provided.

      Line 292 mentions noise ~2 pA but it is much higher in the data shown in the figures.

      Solvents of drugs are not listed at all, and controls that show no effect of vehicle need clarification in some cases.

      On Line 371, Ab-mediated neurotransmission is used. I believe this needs to be modulated rather than mediated, or an explanation is needed.

      On Line 381, how do the authors know that EPSCs are mediated primarily by AMPA receptors in this preparation?

      On Line 393, what is the comparison of AMPA-mediated events to [where it is stated they are what is mostly changing]?

      In all of the sections where drugs were applied, abbreviations need to be spelled out before the first use, concentrations need to be confirmed as acting specifically on the intended receptor, and indirect effects on other cells need to be discussed if bath-applied.

      The sentence starting on Line 417 is a repetition of a prior sentence on the previous page.

      Line 433. Clarify what low concentrations mean here.

      Line 444. mPSCs are referred to here. One needs to know what were the values for E and IPSCs.

      In this section it is often stated that there is a decrease but actually the dialyzed cells are compared to controls so different language is needed.

      Line 461. It is not clear that the hippocampus is the first site to be affected in AD. The entorhinal cortex is earlier in the studies of some, and in the mouse models it is usually the cortex that gets plaque first. In humans, the locus coeruleus may be earlier than the entorhinal cortex.

      How the plots of current vs. spikes were done is important. If there were differences in membrane potential, that could affect the spike output. If there were differences in input resistance or threshold, that also could play a role. One can control for these potential confounds, so explanations are needed.

      Line 472. Vm does not generate fluctuations in this case. Vm changes, and synaptic potentials get larger or smaller, add new components or lose them, etc.

      Line 476. It is not clear why cells are firing at membrane potentials so hyperpolarized to threshold.

      The streptavidin/calbindin labeling is good but the morphology of the cell is not like a pyramidal cell of area CA1 because there is a major branch of the dendrites at almost a right angle to the apical dendrites. The electrophysiology of this cell might be like an interneuron, and two of the figures show firing with a large afterhyperpolarization similar to an interneuron.

      In Figure 3, it would be helpful to point out which events are EPSCs and which are spikes. The concentration, 500 nM, may never be reached in the brain of an individual with AD, or do the authors have evidence that concentration is relevant in vivo?

      There are typos in figure headings, such as "Contro" instead of "Control", and in Figure 4g "AMPAergic" has the "c" below "AMPAergi".

    4. Summary: This study provides new information about how amyloid beta (Ab) oligomers (Abo) may contribute to hyperexcitability which is important because Abo and hyperexcitability have been suggested to occur early in the development of Alzheimer's disease (AD). The authors added Abo intracellularly (iAbo) using dialysis from a patch pipette. Their data suggest iAbo led to increased synaptic excitation mediated presynaptically by retrograde signalling of nitric oxide (NO). Furthermore, they present data suggesting that there is spread of this increase in excitation to neighboring neurons.

      Major Comments:

      1) The nature of the described effects of intracellular iAbo is quite unexpected, occurring within a minute of obtaining intracellular recording configuration, which contrasts with at least one previous study. While some controls for intracellular application of oligomers are provided, with reverse iAbo failing to reproduce the effect (Fig 2S1) and the effect being blocked by the antibody A11 (Fig 2S2), further controls are necessary to explain this rapid effect, which seems faster than that for the diffusion of the fluorescent tag into the cell (Fig 1S1). Note that Pusch and Neher (Pflug Arch 1988) determined diffusion time for different substances. That paper or others should be cited, and then some estimation of equilibrium time based on diffusibility of Ab oligomers should be provided. Equations 17 and 18 in that paper provide some estimates based on molecular weight or diffusion coefficient. One point in Pusch and Neher is that there is extreme variability between access times across cells and that it depends on access resistance, of course. Finally, the Pusch and Neher calculations were for small spherical cells - diffusion into spatially extended cells with long dendrites where the synapses are will take even longer. This is especially critical, as one of the major papers of precedent for this work is that of Ripoli et al., 2014 (cited in the manuscript), in which the authors examined effects of patch-applied Ab42 over the course of 20 minutes, with internal controls showing differences between initial responses, right after break-in, and 20 minutes later when the oligomer and/or monomers will have had a chance to equilibrate with the intracellular contents. It is not clear how such a rapid effect as indicated in the figures could be achieved by such a large molecule as Ab. The data suggest a time to effect of seconds to minutes, and the peak effect occurs before the fluorescence peaks, which seems hard to explain.

      2) The data need reorganization in terms of their results using h-iAbo or iAbo. There needs to be a clear demonstration of why both were used if the results are generalized with both (or not) and if they can actually use both interchangeably.

      3) The authors need to clearly indicate whether the experiments were done in culture or in slices. The authors need to provide a rationale on why specific experiments were done in culture and others in slices.

      4) There are aspects of the observed phenomenon that have not been taken adequately into account. For example, the authors have not investigated the effects of application of oligomeric beta-amyloid to either the extracellular space or the presynaptic neurons, two other compartments of the synapse.

      5) Aspects of the data raise questions: 1) Western blots appear to have multiple bands. 2) Evidence is needed that the fluorescent probe accurately measures NO. 3) The bursts of activity are not quantified. What was defined as a burst? What was the burst frequency and did it change over the recording period? 4) The external solution for cultures contains 5.4 mM K+, which is quite high and can induce hyperexcitability. Similarly, the use of 100 µM AMPA and GABA seems very high. Justifying these high concentrations is important. They should lead to hyperexcitability and toxicity (AMPA) over time. Another point of concern is that the concentration of K+ for the slice work is 3 mM, much different than cultures. There are also differences in Mg2+ and Ca2+, making data hard to compare in the two preparations. 5) Sample sizes are unclear. 6) Intracellular Ab produces increases in both EPSCs and IPSCs. However, in Fig 3, the IPSC measures, using a charge transfer quantification, did not show a significant change in response to iAbo, in contrast to EPSCs. 7) With regard to the inhibition, I find the schematic in Fig. 10 incomplete and slightly inaccurate since it shows one terminal releasing both glutamate and GABA with NO increasing both. While this is obviously an oversimplification, it's slightly inaccurate since NO was not directly shown to increase sIPSCs. Were NOS blockers able to disrupt the increase in sIPSCs? Moreover, there are many papers that have shown that PKC can also phosphorylate GABA receptors and increase their conductance. What could be the reason that this was not involved here? This needs to be discussed.

      6) An account of how this work relates to other studies is necessary. For example, a discussion of how this study relates to others about Ab exposure is lacking. Also, regarding hyperexcitability, many possible causes exist. These should be summarized in the introduction and the authors should comment on how their results fit with these studies. Regarding PKC and NO, PKC and NO have several known actions throughout the brain and body. How do the effects the authors have identified relate to all these other effects? For example, if PKC is activated by another mechanism, would it occlude effects of Ab? What are the changes in PKC and NO in AD? Regarding the ability of the data to address AD, a major issue is whether the results are relevant to AD or represent interesting pharmacological data about what Ab can potentially do in some of its forms in normal tissue.
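
      The loading-time estimate requested in major comment 1 could be sketched along the following lines. This is a rough illustration only: the empirical form tau = c * R_A * M^(1/3) (tau in seconds, R_A in MΩ, M in Da) and the coefficient c = 0.6 are quoted here from memory of Pusch and Neher (1988) and should be checked against the equations in that paper. The access resistance (10 MΩ) and the 12-mer oligomer size are hypothetical values, and, as noted in the comment, the relationship was derived for small, roughly spherical cells, so it is likely to underestimate loading times for neurons with extended dendrites.

```python
def loading_time_constant_s(access_resistance_mohm, molecular_weight_da, coeff=0.6):
    """Approximate time constant (s) for pipette-to-cell diffusional loading.

    Assumed empirical form: tau = coeff * R_A * M**(1/3), with R_A in MOhm
    and M in Da. The default coefficient is an assumption to be verified
    against the original paper.
    """
    return coeff * access_resistance_mohm * molecular_weight_da ** (1.0 / 3.0)


AB42_MONOMER_DA = 4514.0   # approximate molecular weight of the Ab42 monomer
ACCESS_RESISTANCE = 10.0   # MOhm, hypothetical value

for label, n_subunits in [("monomer", 1), ("12-mer (hypothetical)", 12)]:
    tau = loading_time_constant_s(ACCESS_RESISTANCE, n_subunits * AB42_MONOMER_DA)
    print(f"{label}: tau ~ {tau:.0f} s")
```

      Even under these favorable assumptions the estimated time constants are on the order of minutes, which is the basis for asking how an effect could peak within the first minute of recording.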

      Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      Labelling strategies for electron microscopy have so far lacked the ability to clearly visualise genetically expressed probes such as GFP for light microscopy. Building on previous studies by the group of Ellisman, the Parton group have made significant adaptations and improvements to the system. Especially addressing the issue of diffusion of the DAB precipitate and the low visibility of it by silver enhancing the particles is a key step forward. The authors have tested their system in a wide variety of EM workflows and show that it works.

      The quantification part of the manuscript is to me potentially the most interesting part. Quantitation of proteins at physiological levels at the ultrastructural level would be a significant achievement. This part is a bit under-represented, although there are some issues with it. The silver-enhanced particles on the added external standard appear to be larger than the ones inside. Does that result in lower detection?

      Overall, this is a manuscript that is very clearly written and very easy to follow for non-experts.

    2. Reviewer #2:

      In this study, entitled “APEX-Gold: A genetically-encoded particulate marker for robust 3D electron microscopy” Rae et al. describe a method to improve the visualization of the staining for genetically encoded probes that they described in previous studies (namely APEX2 constructs).

      These techniques are very powerful as they increase the sensitivity for detecting low level of expression of the tagged proteins (e.g. compared to GFP tagged proteins). The novelty in this study is that the reaction product (DAB precipitates) is revealed by the nucleation of silver/gold precipitates. Such enhancement has been used extensively in the past for pre-embedding immuno-peroxidase techniques, but has never been combined with the use of APEX2. This has one major advantage: that the contrast of the positive staining can stand out from the contrast of the surrounding ultrastructure, making the sample preparation more adapted to 3D EM techniques, especially volume SEM where contrast is a bottleneck. Moreover, the sensitivity of the technique is shown to be compatible with the detection of endogenous levels of expression.

      The technique is very well detailed and elegantly illustrated by convincing applications on cultured cell systems. The apparent simplicity of use, together with a growing interest in the community for the APEX2 based techniques (also in correlative imaging), significantly raises the potential for it to become a standard in the field, and should thus be shared with the community.

    3. Reviewer #1:

      The manuscript by Rae et al. reports the development of a new protocol for labeling genetically-tagged proteins of interest with heavy atom particles for visualization by electron microscopy. The optimized protocol builds on the established use of the enzyme APEX fused to the target of interest. APEX oxidizes diaminobenzidine, DAB, which in turn converts silver and gold metal salts to particulates in close proximity to the APEX-fused protein of interest. The optimized protocol is related to the contrast-enhancement method reported by Sedmak et al., 2009 and Mavlyutov et al., 2017. The changes to the method may improve the proportionality of the signal such that the number of APEX tags present in a sample is better correlated with the number of heavy atom particles. While the study appears to be sound, it is an extension of an established labeling method.

    4. Summary: The manuscript by Rae et al. reports a new protocol for labeling genetically-tagged proteins of interest with heavy atom particles for visualization by electron microscopy. The optimized protocol builds on the use of the enzyme APEX2, fused to the target protein of interest. The contrast enhancement may be useful in diverse 3D EM techniques. Also, reviewers were enthusiastic about the prospects for quantitative studies, even for low-levels of endogenous expression. Semi-quantitative studies may be enabled because the new method appears to improve the proportionality of the signal such that the number of APEX2 tags in a sample correlates with the number of heavy atom particles. The apparent simplicity of the protocol raises the potential for it to become a standard in the field of EM labeling.

      Reviewer #1, Reviewer #2 and Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Author Response:

      Summary:

      While the work addresses an interesting research question, several shortcomings have been raised by three independent reviewers. A first issue is the lack of theoretical clarity and linkage with prior work, as discussed by Reviewer 1 and Reviewer 2. A second critical set of concerns is raised by all reviewers with the need for several additional analyses to nail down the interpretations proposed by the authors. Reviewer 2 specifically raised concerns regarding the interpretability of activation in auditory cortices, while Reviewer 3 provides insights on the MVPA analysis and suggests the possible use of RSA to clarify the main findings.

      While we respect the editor’s decision, we think that all points raised by Reviewer 1 and Reviewer 3 can be easily addressed through editing of the text and additional analyses. As we describe below, these revisions do not undermine the findings reported in our study – instead, they improve the clarity of the manuscript and further demonstrate that our results are genuine and robust. Furthermore, we believe that points raised by Reviewer 2 are based on a misunderstanding. Differences in auditory properties across sound categories in our experiment cannot explain the pattern of results reported. Thus, additional analyses in the auditory cortex, proposed by Reviewer 2, can neither support nor undermine the claims made in our study. Nevertheless, we performed all the analyses suggested by Reviewer 2.

      We also want to stress that all reviewers find our study timely and interesting for a broad readership. Furthermore, Reviewer 1 and Reviewer 3 made a number of positive comments on study methodology. Overall, we believe that there are no doubts regarding the novelty and importance of our study, and that we are able to address all additional methodological concerns raised by the reviewers.

      Reviewer #1:

      Bola and colleagues asked whether the coupling in perception-action systems may be reflected in early representations of the face. The authors used fMRI to assess the responses of the human occipital temporal cortex (FFA in particular) to the presentation of emotional (laughing/crying), non-emotional (yawning/sneezing), speech (Chinese), object and animal sounds of congenitally blind and sighted participants. The authors present a detailed set of independent and direct univariate and multivariate contrasts, which highlight a striking difference of engagement to facial expressions in the OTC of the congenitally blind compared to the sighted participants. The specificity of facial expression sounds in OTC for the congenitally blind is well captured in the final MVPA analysis presented in Fig.5.

      We would like to thank the reviewer for an overall positive assessment of our work.

      -The use of "transparency of mapping" is rather metaphorical and hand-wavy for a non-expert audience. If the issue relates to the notion of compatibility of representational formats, then it should be expressed formally.

      Following the reviewer’s suggestion, we revised the introduction and clarified what we mean by “transparency of mapping”, and how this concept might be related to the compatibility of representations computed in different areas of the brain. As is now extensively explained, we propose that shape features of inanimate objects are directly relevant to our actions. In contrast, a relationship between shape and relevant actions is much less clear in the case of most animate objects. We hypothesized that this inherent difference between the inanimate and the animate domain, combined with evolutionary pressures for quick, accurate, and efficient object-directed responses, resulted in the inanimate vOTC areas being more strongly coupled with the action system, both in terms of manipulability and navigation, than the animate vOTC areas. The stronger coupling is likely to be reflected in the format of vOTC shape representation of inanimate objects being more compatible with the format of representations computed in the action system.

      -The theoretical stance of the authors does not clearly predict why blind individuals should show more precise emotional expressions in FFA as compared to sighted - as the authors start addressing in their Discussion. In the context of the action-perception loop, it is even more surprising considering that the sighted have direct training and visual access to the facial gestures of interlocutors, which they can internalize. Can the authors entertain alternative scenarios such as the need to rely on mental imagery for congenitally blind for instance?

      We agree that our approach does not predict the difference between the blind and the sighted subjects, and we openly discuss this in the discussion: “An unexpected finding in our study is the clear difference in vOTC univariate response to facial expression sounds across the congenitally blind and the sighted group”. We also propose an explanation of this unexpected difference. Specifically, we suggest that the interactions between the action system and the animate areas in the vOTC are relatively weak, even in the case of facial expressions – thus, they can be captured mostly in blind individuals, whose visual areas are known to increase their sensitivity to non-visual stimulation. This explanation can account for this unexpected between-group difference and is consistent with our theoretical proposal.

      The “mental imagery account” can be, in our opinion, divided into two distinct hypotheses. One version of this account would be to assume that the representation of animate entities typically computed in the vOTC (i.e., also in sighted people) can be activated through visual mental imagery (as suggested by several previous studies), and that this would affect our between-group comparisons. In that case, however, we should observe an effect opposite to that obtained in our study – namely, the activation in the vOTC animate areas should be stronger in the sighted subjects, since they, but not the congenitally blind participants, can create visual mental images (as the reviewer pointed out). This is clearly not what we observed. A second version of the mental imagery account would be to assume representational plasticity in the vOTC of blind individuals – that is, to assume that vOTC animate areas in this population switch from representing visual, face-related information to representing motor mental imagery, which presumably they can generate just like sighted individuals. However, such an account does not, on its own, explain why the animate vOTC areas in the congenitally blind participants are more strongly activated than they are in the sighted subjects, who can generate both visual and motor mental imagery. Based on these considerations, we do not think that the mental imagery account provides a sufficient explanation. Nonetheless, it is certainly a factor worth considering, which we add in a revised discussion of the reported results. Similar reasoning can be applied to other accounts which assume that the observed difference between the blind and the sighted group is a result of representational plasticity in this region in the blind group. Such accounts would need to propose a plausible dimension, different than face shape and its relation to the action system, that is captured by the animate vOTC areas in blind individuals. Since the effect we report is independent of auditory, emotional, social or linguistic dimensions present in our stimuli, it is hard to say what this dimension might be.

      We now elaborate on these important points in the Discussion section.

      Reviewer #2:

      The study by Bola and colleagues tested the specific hypothesis that visual shape representations can be reliably activated through different sensory modalities only when they systematically map onto action system computations. To this aim, the authors scanned a group of congenitally blind individuals and a group of sighted controls while subjects listened to multiple sound categories.

      While I find the study of general interest, I think that there are main methodological limitations, which do not allow to support the general claim.

      We would like to thank the reviewer for this assessment. Below, we argue that the results presented in the paper support our claim, and that they cannot be explained by alternative accounts described by the reviewer.

      Main concerns

      1) Auditory stimuli have been equalized to have the same RMS (-20 dB). In my opinion, this is not a sufficient control. As shown in Figure 3 - figure supplement 1, the different sound categories elicited extremely different patterns of response in A1. This is clearly linked to intrinsic sound properties. In my opinion without a precise characterization of sound properties across categories, it is not possible to conclude that the observed effects in face responsive regions (incidentally, as assessed using an atlas and not a localizer) are explained by the different category types. On the stimulus side, authors should at least provide (a) spectrograms and (b) envelope dynamics; in case sound properties would differ across categories all results might have a confound associated to stimuli selection.

      We now present spectrograms and waveforms for sounds used in the study in the Methods section. We did not present this information in the original version of the paper because, in our opinion, it is quite obvious that sounds from different categories will differ in terms of their auditory properties – after all, this is why we can distinguish among human speech, animal sounds or object sounds. Thus, differences in sound properties across conditions are an inherent characteristic of every study comparing sounds from several domains or semantic categories (e.g., human vs. non-human), including our own study. We now clarify this issue in the Methods section of the manuscript.

      Having said that, we believe that differences in acoustic properties across sound categories cannot explain the results in the vOTC reported in our work. We report that, in blind subjects, the vOTC face areas respond more strongly to sounds of emotional facial expressions and non-emotional facial expressions than to speech sounds, animal sounds and object sounds. These brain areas did not show differential responses to the two expression categories or to the three other sound categories. To explain this pattern of results, the “acoustic confound account” would need to assume that there is some special auditory property that differentiates both types of expression sounds from the other categories, but does not differentiate sound categories in any other comparison. Moreover, this account would need to further assume that this is precisely the auditory dimension to which the vOTC face areas are sensitive, while being insensitive to other auditory characteristics that differ across the other sound categories (e.g., across object sounds and animal sounds, or expression sounds and speech sounds - as the reviewer pointed out, all categories are acoustically very different, as indicated by the activation of A1). We find this account extremely unlikely. We now comment on these points in the Methods and the Results section.
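
      The point of disagreement in this exchange, namely that RMS equalization constrains only overall level and leaves other acoustic properties free to vary, can be illustrated with a minimal sketch. The signals below are synthetic and hypothetical (a tone and a click train), and the -20 dB target is interpreted as dB relative to full scale, which is an assumption; this is not the authors' stimulus pipeline.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_db=-20.0):
    """Scale samples so that their RMS equals target_db (dB re full scale)."""
    target = 10.0 ** (target_db / 20.0)
    gain = target / rms(samples)
    return [s * gain for s in samples]


# Two synthetic sounds with very different temporal envelopes.
tone = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
clicks = [1.0 if n % 25 == 0 else 0.0 for n in range(100)]

tone_n = normalize_rms(tone)
clicks_n = normalize_rms(clicks)

print(f"RMS:  tone {rms(tone_n):.3f}, clicks {rms(clicks_n):.3f}")
print(f"Peak: tone {max(tone_n):.3f}, clicks {max(clicks_n):.3f}")
```

      After normalization the two signals have identical RMS but clearly different peak amplitudes and envelopes, which is why spectrograms and envelope dynamics carry information that the RMS value alone does not.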

      2) More on the same point: the authors use the activation of A1 as a further validation of the results in face selective areas. Page 16 line 304 "We observed activation pattern that was the same for the blind and the sighted subjects, and markedly different from the pattern that was observed in the fusiform gyrus in the blind group (see Fig. 1D). This suggests that the effects detected in this region in the blind subjects were not driven by the differences in acoustic characteristics of sounds, as such characteristics are likely to be captured by activation patterns of the primary auditory cortex." It is the opinion of this reader that this control, despite being important, does not support the claim. A1 is certainly a good region to show how basic sound properties are mapped. However, the same type of analysis should be performed in higher auditory areas, as STS. If result patterns would be similar to the FFA region, I guess that the current interpretation of results would not hold.

      As we discuss above, we believe that the explanation of the results observed in the vOTC in terms of an “acoustic confound” does not hold, even without any empirical analysis in the auditory cortex. The analysis in A1 was planned to clearly illustrate this point and to support the interpretation of a potential unexpected pattern of results across sound categories (such an unexpected pattern was not observed).

      However, per the reviewer’s request, we performed an ROI analysis also in the STS. Specifically, we chose two ROIs – a broad and bilateral ROI covering the whole STS, and a more constrained ROI covering the right posterior STS (rpSTS), known to be a part of the face processing network and to respond primarily to dynamic aspects of the face shape. As can be seen in Supplementary Materials, the broad STS ROI pattern of responses is markedly different from the one observed in the FFA. Particularly, the magnitude of the STS activation is clearly different for speech sounds, animal sounds, and object sounds, in both the blind and the sighted group. In the case of the FFA, the activation magnitudes for these three sound categories were indistinguishable. Furthermore, in the blind group, the STS showed stronger activation for emotional facial expression sounds than for non-emotional expression sounds. Again, such a difference was not observed in the FFA (if anything, the FFA showed slightly stronger activation for non-emotional expression sounds in the blind group). The pattern of the rpSTS responses is more similar to the responses observed in the FFA. This is exactly what can be expected based on our hypothesis that the FFA in the blind group is sensitive primarily to dynamic facial reconfigurations, with a transparent link between the motoric and visual shape representations. Overall, we think that the pattern of results observed in the auditory cortex is fully in line with our hypothesis – the auditory regions (A1 and STS, defined broadly) show responses that are different than the responses observed in the FFA (one may hypothesize that responses in the auditory regions are driven by low-level auditory features of stimuli to a larger extent); the rpSTS, which is specialized in the processing of dynamic aspects of the face shape, shows the pattern of responses that is more similar to the pattern of responses observed in the FFA. Importantly, the responses in the rpSTS were not different across subject groups. As we describe below, this is the pattern of results that was observed also in MVPA. We now report all the above-described results in the paper.

      3) Linked to the previous point. Given that the authors implemented an MVPA pipeline at the ROI level, it is important to perform the same analysis in both groups, but especially in the blind, in areas such as STS as well as in a control region, engaged by the task (with signal) to check the specificity of the FFA activation.

      Per the reviewer’s request, we additionally performed the MVPA in three control regions. Firstly, we performed the analysis in the auditory cortex, defined as A1 and the STS combined. We treated this area as a positive control – particularly, given the acoustic differences between sound categories, we expected to successfully decode all sound categories from the activity of this ROI. Secondly, we performed the analysis in the parahippocampal place area (PPA). We treated the PPA as a negative control – given that this area does not seem to contain much information about animate entities, we did not expect to find effects there for most of our comparisons. Furthermore, as the PPA is the vOTC area bordering the FFA, negative results in this area would demonstrate the spatial specificity of our results. Thirdly, we performed the analysis in the rpSTS – here, we expected to observe results similar to the ones observed in the FFA, for the reasons provided above. We now present the results of these analyses as supplementary figures.

      We were able to successfully distinguish all sound categories, in both groups, based on the activation of the auditory cortex (all p = 0.001; the lowest value that can be achieved in our permutation analysis). Furthermore, based on the activation of this area, we were able to classify specific facial expressions, specific speech sounds, and the gender of the actor, in contrast to the result from the FFA, where the decoding of facial expressions was the only positive result.

      As expected, the decoding of animate sound categories was generally not successful in the PPA. However, as one might expect, activation of this area allowed us, to some extent, to distinguish object sounds from animate sounds – especially in the blind group. Furthermore, based on the PPA activation, we were not able to classify specific facial expressions, speech sounds, or the gender of the actor. These results confirm that the results reported for the FFA are specific to only certain parts of the brain and even certain parts of the vOTC.

      As can be expected, the results in the rpSTS were the most similar to the results observed in the FFA – while the activation of this region was diagnostic of all categorical distinctions, the more detailed analysis showed that this region represented differences between specific facial expressions, but not between the speech sounds or the gender of actors acting the expressions (a similar pattern of results was observed in both groups). This is the same specificity that the FFA in blind people shows.

      Finally, we would like to stress that the difference between results observed in the FFA and the PPA is yet another argument against interpreting the results in the FFA as being driven by auditory properties of stimuli – the issue that we discussed in detail above. We see no reason why putative acoustic influences on the vOTC responses in the blind group should be present in the FFA, but not in the PPA.

      4) I find the manuscript rather biased with regard to the literature. This is a topic which has been extensively investigated in the past. For instance, the manuscript does not include relevant references for the present context, such as:

      Plaza, P., Renier, L., De Volder, A., & Rauschecker, J. (2015). Seeing faces with your ears activates the left fusiform face area, especially when you're blind. Journal of vision, 15(12), 197-197.

      Kitada, R., Okamoto, Y., Sasaki, A. T., Kochiyama, T., Miyahara, M., Lederman, S. J., & Sadato, N. (2013). Early visual experience and the recognition of basic facial expressions: involvement of the middle temporal and inferior frontal gyri during haptic identification by the early blind. Frontiers in human neuroscience, 7, 7.

      Pietrini, P., Furey, M. L., Ricciardi, E., Gobbini, M. I., Wu, W. H. C., Cohen, L., ... & Haxby, J. V. (2004). Beyond sensory images: Object-based representation in the human ventral pathway. Proceedings of the National Academy of Sciences, 101(15), 5658-5663.

      The first reference listed by the reviewer is actually a conference abstract. Thus, we feel that it would be premature to give it comparable weight to peer-reviewed papers. Furthermore, based on the abstract, without the published paper, we cannot assess the robustness of the results and their relevance to our study (particularly, it is unclear whether some effects were observed in the right FFA, and whether a statistically significant difference between blind and sighted subjects was detected).

      In the second reference, the authors did not observe effects in the FFA in the visual version of their experiment with sighted subjects, at the threshold of p < 0.05, corrected for multiple comparisons. In our opinion, this makes the null result of the tactile experiment, reported for the FFA, hard to interpret – thus, while the paper is very interesting in certain contexts, it is not particularly informative when it comes to the question addressed here.

      While the third reference reports interesting results, it does not investigate preference for inanimate objects or animate objects in the vOTC, which is the main topic of our paper (only comparisons vs. rest and between- and within-category correlations are reported). Furthermore, based on that study, we cannot conclude whether effects reported for faces are found in the face areas or in other parts of the vOTC (no analyses in specific vOTC areas were reported).

      These were the reasons why we did not refer to these materials in the previous version of the manuscript. Importantly, none of them compel us to revise our claims, and we refer to a number of other papers, directly relevant to the question we are interested in – that is, the difference between vOTC animate and inanimate areas in sensitivity to non-visual stimulation. Nevertheless, we agree that referring to materials suggested by the reviewer might be informative for non-expert readers – thus, we cite them in the revised version of our paper.

      Reviewer #3:

      Bola and colleagues set out to test the hypothesis that vOT domain specific organization is due to the evolutionary pressure to couple visual representations and downstream computations (e.g., action programs). A prediction of such theory is that cross-modal activations (e.g., response in FFA to face-related sounds) should be detected as a function of the transparency of such coupling (e.g., sounds associated with facial expression > speech).

      To this end, the Authors compared brain activity of 20 congenitally blind and 22 sighted subjects undergoing fMRI while performing a semantic judgment task (i.e., is it produced by a human?) on sounds belonging to 5 different categories (emotional and non-emotional facial expressions, speech, object sounds and animal sounds). The results indicate preferential response to sounds associated with facial expressions (vs. speech or animal/object sounds) in the fusiform gyrus of blind individuals regardless of the emotional content.

      The issue tackled is relevant and timely for the field, and the method chosen (i.e., clinical model + univariate and multivariate fMRI analyses) well suited to address it. The analyses performed are overall sound and the paper clear and exhaustive.

      We thank the reviewer for this positive assessment.

      1) While I overall understand why the Authors would choose a broader ROI for multivariate (vs. univariate) analyses, I believe it would be appropriate to show both analyses on both ROIs. In particular, the fact that the ROI used for the univariate analyses is right-hemisphere only, while the multivariate one is bilateral should be (at least) discussed.

      We briefly discuss this issue in the Methods section: “The reason behind broader and bilateral ROI definition was that the multivariate analysis relies on dispersed and subthreshold activation and deactivation patterns, which might be well represented also by cross-talk between hemispheres (for example, a certain subcategory might be represented by activation of the right FFA and deactivation of the left analog of this area).”

      Constraining the FFA ROI in the multivariate analysis (i.e., using the same ROI as was used in the univariate analysis) makes the results slightly weaker, in both groups. However, the pattern of results is qualitatively comparable. A slight decrease in statistical power can be expected, for the reasons described in the Methods and cited above.

      Similarly, using the broader FFA ROI in the univariate analysis (i.e., using the same ROI as was used in the multivariate analysis) results in qualitatively comparable, but slightly weaker effects in the blind group and no change in sighted subjects (no difference between sound categories). Again, this is to be expected: visual studies show that the functional sensitivity to face-related stimuli is weaker in the left counterpart of the FFA than in the right FFA. This is also the case in our data: using a broader, bilateral ROI essentially averages a stronger effect in the right FFA and a more subtle effect in the left counterpart of the FFA.

      We now clarify this issue in the Methods section.

      2) The significance of the multivariate results is established testing the cross-validated classification accuracy against chance-level with t-tests. Did these tests consider the hypothetical chance level based on class number? A permutation scheme assessing the null distribution would be advisable. In general, more details should be provided with respect to the multivariate analyses performed, for instance the confusion matrix in Figure 5B is never mentioned in the text.

      Yes, the chance level was calculated in a standard way, by dividing 100% by the number of conditions/classes included in the analysis (note that all stimulus classes were presented an equal number of times). To respond to the reviewer’s comment, we used a permutation approach to recalculate the significance of all MVPA analyses reported in the paper (note that the whole-brain univariate analyses are already performed within the permutation framework). To this aim, we reran each analysis 1000 times with condition labels randomized and compared the actual result of this analysis with the null distribution created in this way (see the Methods section for details). We replicated all results reported in the paper. We now report this new analysis in the manuscript, changing the figure legends and the Methods section accordingly.
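
      The permutation scheme described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy one-dimensional features, the leave-one-out nearest-mean classifier, and all function names are assumptions made for illustration. The response also does not state the exact p-value formula, so the sketch assumes the common (k + 1) / (n + 1) convention, which gives a floor of about 0.001 for 1000 permutations.

```python
import random


def loo_nearest_mean_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-class-mean classifier (toy stand-in for MVPA)."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        sums, counts = {}, {}
        for j, (xj, yj) in enumerate(zip(features, labels)):
            if j != i:  # the held-out sample never contributes to the class means
                sums[yj] = sums.get(yj, 0.0) + xj
                counts[yj] = counts.get(yj, 0) + 1
        predicted = min(sums, key=lambda c: abs(x - sums[c] / counts[c]))
        correct += predicted == y
    return correct / len(features)


def label_permutation_null(features, labels, n_permutations=1000, seed=0):
    """Rerun the analysis with shuffled condition labels to build a null distribution."""
    rng = random.Random(seed)
    null_scores = []
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null_scores.append(loo_nearest_mean_accuracy(features, shuffled))
    return null_scores


def permutation_p_value(observed, null_scores):
    """One-sided p-value; the +1 terms (an assumed convention) keep p above zero."""
    n_extreme = sum(score >= observed for score in null_scores)
    return (n_extreme + 1) / (len(null_scores) + 1)


# Hypothetical data: two well-separated classes, four samples each.
features = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 1.3]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
observed = loo_nearest_mean_accuracy(features, labels)
null_scores = label_permutation_null(features, labels)
p_value = permutation_p_value(observed, null_scores)
```

      Because the null distribution comes from the same analysis run on shuffled labels, no theoretical chance level has to be assumed, which is the point the reviewer raised.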

      The confusion matrix was not mentioned in the text because it is not a separate analysis. As explained in the figure legend, it is just a graphical representation of the classifier’s performance (i.e., its choices for specific stimulus classes) during the decoding analysis reported in Fig. 5A. To clarify this, we now briefly mention the graph presented in Fig. 5B in the main text.

      3) I wonder whether a representational similarity approach could be useful in better delineating similarities and differences between blind and sighted participants’ sound representations in vOT. Such analysis could also help further exploring potential graded effects: i.e., sounds associated with facial expression (face related, with salient link to movement) > speech (face related, with less salient link with movement) > animal sounds (non-human face related) > object sounds (not face related at all). The above-mentioned confusion matrix could be the starting point of such investigation.

      We thank the reviewer for this interesting suggestion. In response to this comment, we performed an additional RSA, aimed at investigating graded similarity in the FFA response patterns, across categories used in the experiment. Based on our hypothesis, we created a simple theoretical model assuming that responses to both types of facial expression sounds are the most similar to each other (animate sounds with high shape-action mapping transparency), somewhat similar to speech sounds (animate sounds with weaker shape-action mapping transparency), and the least similar to animal and object sounds (animate sounds with no clear shape-action mapping transparency and inanimate sounds). We observed a significant correlation between this theoretical model and FFA response patterns in the blind group (pFDR = 0.012), but not in the sighted group (pFDR = 0.223). We believe that the RSA further supports our visual-shape-to-action mapping conjecture, at least when it comes to blind subjects (see the Discussion section for our interpretation of the observed differences between the blind and the sighted subjects). We describe this new analysis in the revised text.
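
      A graded model of this kind can be compared with response patterns roughly as follows. This is only a sketch under stated assumptions: the category names, the toy four-voxel patterns, the use of 1 minus Pearson correlation as neural dissimilarity, and the Spearman comparison are all illustrative choices that the response does not specify. The model covers only pairs involving a facial expression category, since those are the only predictions stated above.

```python
from itertools import combinations
from math import sqrt

CATEGORIES = ["emotional_expr", "neutral_expr", "speech", "animal", "object"]
EXPRESSIONS = {"emotional_expr", "neutral_expr"}


def model_dissimilarity(a, b):
    """Predicted dissimilarity for pairs involving an expression category, else None."""
    pair = {a, b}
    if pair <= EXPRESSIONS:
        return 0  # both expression types: most similar
    if pair & EXPRESSIONS:
        return 1 if "speech" in pair else 2  # speech closer than animal/object sounds
    return None  # pairs the stated hypothesis makes no prediction about


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def model_fit(patterns):
    """Spearman correlation between model and neural dissimilarities."""
    model, neural = [], []
    for a, b in combinations(CATEGORIES, 2):
        predicted = model_dissimilarity(a, b)
        if predicted is not None:
            model.append(predicted)
            neural.append(1 - pearson(patterns[a], patterns[b]))
    return pearson(average_ranks(model), average_ranks(neural))


# Hypothetical response patterns (one value per voxel) for a single subject.
toy_patterns = {
    "emotional_expr": [1.0, 2.0, 3.0, 4.0],
    "neutral_expr": [1.1, 2.0, 3.1, 4.0],
    "speech": [1.0, 3.0, 2.0, 4.0],
    "animal": [4.0, 1.0, 3.0, 2.0],
    "object": [3.0, 4.0, 1.0, 2.0],
}
fit = model_fit(toy_patterns)
```

      In a group analysis, one such fit value per subject would then be tested against zero, with correction across ROIs or groups as appropriate.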

    1. Reviewer #2:

      In this work, Hofmann and colleagues conduct a study investigating the relationship between EEG alpha and subjective arousal in naturalistic (as opposed to controlled experimental) settings. Participants completed an immersive virtual reality experience while EEG was recorded, and continuously rated their subjective arousal while a video of the experience was replayed. Three different decoding methods were evaluated (Source Power Comodulation, Common Spatial Patterns, and Long Short-Term Memory Recurrent Neural Networks), each of which demonstrated above chance levels of performance, substantiating a link between lower levels of parietal/occipital alpha and subjective arousal. This work is notable because the roller-coaster simulation is a well-controlled, yet dynamic manipulation of arousal, and in its comparison of multiple decoding approaches (that can model the dynamics of affective responses). Indeed, this is an interesting proof of concept that shows it is possible to decode affective experience from brain activity measured during immersive virtual reality.

      Major concerns:

      The authors advocate that naturalistic experiments are needed to study emotional arousal, because "static" manipulations are not well-suited to capture the continuity and dynamics of arousal. This point is well-taken, but no comparisons were made between static and dynamic methods. Thus, although the work succeeds in showing it is possible to use machine learning to decode the subjective experience of arousal during virtual reality, it is not clear what new insights naturalistic manipulations and the machine learning approaches employed have to offer.

      The methods used to assess model performance are also a concern. Decoding models were evaluated separately for each subject using 10-fold cross-validation, and inference on performance was made using group-level statistics. Because time-series data are being decoded, if standard cross-validation was performed the results could be overly optimistic. Additionally, hyperparameters were selected to maximize model performance which can also lead to biased estimates. This is particularly problematic because overall decoding performance is not very high.
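
      The cross-validation concern can be made concrete with a small sketch. Neighbouring samples in a time series are correlated, so randomly shuffled folds let near-duplicates of test samples leak into training. One common remedy, shown here as an illustration and not as the authors' procedure, is to use contiguous test blocks and to drop a buffer of samples around each block from the training set. The function name and the `gap` parameter are hypothetical.

```python
def blocked_folds(n_samples, n_folds=10, gap=0):
    """Yield (train_indices, test_indices) with contiguous test blocks.

    Samples within `gap` positions of the test block are excluded from
    training to limit leakage through temporal autocorrelation.
    """
    bounds = [round(k * n_samples / n_folds) for k in range(n_folds + 1)]
    for start, stop in zip(bounds[:-1], bounds[1:]):
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start - gap or i >= stop + gap]
        yield train, test


# Example: 100 time points, 10 folds, 5-sample buffer on each side of the test block.
folds = list(blocked_folds(100, n_folds=10, gap=5))
```

      Hyperparameter selection would additionally need to happen inside each training set (nested cross-validation) for the performance estimate to stay unbiased.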

    2. Reviewer #1:

      Hofmann et al. investigate the link between two phenomena, emotional arousal and oscillatory alpha activity in the cerebral cortex, each of which is of central interest in its respective field. Although alpha activity is tightly linked to the first reports of electric activity in the brain nearly 100 years ago, a comprehensive characterization of this phenomenon is elusive. One of the reasons is that EEG, the major method to investigate electric activity in the human brain, is susceptible to motion artifacts and, thus, mostly used in laboratory settings. Here, the authors combine EEG with a virtual reality setup to give experimental participants a roller-coaster ride with high immersion. The ride, literally, leads to large ups and downs in emotional arousal, which is quantified by the subjects during a later rerun. Next, the authors decode the degree of emotional arousal as stated in the rerun based on the EEG signals recorded during the VR session. They demonstrate convincingly a negative dependence of alpha activity on the degree of emotional arousal. Further, they demonstrate the differential involvement of parietal and occipital regions in this process. The sequencing of the description of methods and results could be improved, but is acceptable as it stands. This investigation is timely and makes an important contribution to our understanding of the relation of emotions and sensory processing.

    3. Summary: Hofmann et al. investigate the link between two phenomena, emotional arousal and cortical alpha activity. Although alpha activity is tightly linked to the first reports of electric activity in the brain nearly 100 years ago, a comprehensive characterization of this phenomenon is elusive. One of the reasons is that EEG, the major method to investigate electric activity in the human brain, is susceptible to motion artifacts and, thus, mostly used in laboratory settings. Here, the authors combine EEG with virtual reality (VR) to give experimental participants a roller-coaster ride with high immersion. The ride, literally, leads to large ups and downs in emotional arousal, which is quantified by the subjects during a later rerun. Three different decoding methods were evaluated (Source Power Comodulation, Common Spatial Patterns, and Long Short-Term Memory Recurrent Neural Networks), each of which demonstrated above-chance levels of performance, substantiating a link between lower levels of parietal/occipital alpha and subjective arousal in a quasi-naturalistic setting.

      The reviewers both expressed some enthusiasm for the MS:

      The study is timely and makes an important contribution to our understanding of the relation of emotions and sensory processing

      Of potentially great interest to a broad audience

      The embedding in historic literature is excellent. I like it a lot.

      This work is notable because the roller-coaster simulation is a well-controlled, yet dynamic manipulation of arousal, and in its comparison of multiple decoding approaches (that can model the dynamics of affective responses). Indeed, this is an interesting proof of concept that shows it is possible to decode affective experience from brain activity measured during immersive virtual reality.

      Reviewer #1 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #2:

      This paper proposes a novel and relevant evolutionary model that explains many aspects of replication origin statistics in a family of yeast species. It is a step forward in our understanding of the evolutionary pressures that affect the distribution of replication origins in Eukaryotes. I recommend the authors address the following issue:

      1) Many of the conclusions of the paper are based on the claim that extending the model by adding an efficiency bias to the origin death rate makes the model fit the data better; in particular, they say in line 213 that "the observed huge divergence in efficiency between lost origins and their neighbors is absent in the model simulations." This is reinforced in line 243, and in other parts of the text. But inspecting Fig 3, the two models (with and without a death rate bias) yield almost identical box-plots; if anything, the box-plots for the lost/nearest fractions of the pure double-stall aversion model seem visually to match the data marginally better. So why do the authors claim that the model with death rate bias is a much better fit? This is far from clear by just inspecting the data. I see no "huge difference" in the plots. There is a difference, but it is far from huge - the differences in the mean are much smaller than the size of the boxes. It seems to me unjustified to use this to choose one model over another. One way to ascertain this is to do rigorous statistical tests to determine if the differences in the means of the simulated and observed data are statistically significant; for example, a t-test.
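
      As an illustration of the kind of test suggested here, the following sketch computes Welch's t statistic and its approximate degrees of freedom for two samples with possibly unequal variances. The variant (Welch rather than Student) and the sample values are assumptions; the p-value would be obtained from a t distribution with the returned degrees of freedom using a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(sample_a) / len(sample_a)  # squared standard error of mean A
    vb = variance(sample_b) / len(sample_b)  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical efficiency ratios: observed data versus one model's simulations.
observed_ratios = [1, 2, 3, 4, 5]
simulated_ratios = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(observed_ratios, simulated_ratios)
```

      Running the same comparison for each candidate model against the observed data would show whether either model's mean differs detectably from the data, rather than relying on visual inspection of the box-plots.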

    2. Reviewer #1:

      The manuscript entitled "An evolutionary model identifies the main selective pressures for the evolution of genome-replication profiles" is an examination of the principles shaping evolution of replication origin placement. Overall I found the manuscript to be engaging and interesting, and the topic of general importance. It is quite compelling that with just two parameters, origin efficiency and distance between origins, a good model can be built to describe the dynamics of origin birth and death. While this work on its own is sufficiently important for publication, it would be very interesting to see whether the model can be updated in the future to address whether there are fork-stalling or origin-generating mechanisms that shape evolution of specific inter-origin spaces. This work provides a very good foundation for such efforts.

      I have a few major, general concerns I would like the authors to address.

      If I'm interpreting the methods correctly, it seems the parameters used in these simulations, such as mean birth rate, mean death rate, gamma, and beta, were fit to the data once, and used as point estimates during simulation. If true, I expect the simulations to be yielding estimates of birth and death rates with a much narrower distribution of outcomes than is likely to be realistic given what an appropriate level of confidence in those parameter estimates would be. Could the parameters be fit to data in such a way that we attain an estimate of confidence in the parameter values, from which a distribution could be generated and sampled from during simulation?

      Closely related to my prior concern, I would like the authors to demonstrate the general predictive value of their model on out-of-sample data. Can the model be applied to other data on replication timing? Without such an attempt to demonstrate the model's applicability to out-of-sample prediction, the reader cannot ascertain whether the model is overfit to the Lachancea data from Agier et al, 2018. Also, what keeps the parameter estimates here from being overfit to better predict origin birth and death events in closely related branches of the Lachancea tree in Figure S1? Are gamma and beta inferred in a way that accounts for the higher correlation in birth and death events in closer-related branches than in distal branches, or has the fit ignored those correlations?

      The authors state that their model identifies selective pressures. The authors imply, and specifically state in lines 238-242, that increased death rate of origins which happen to be nearby highly efficient origins represents selective pressure against the less efficient origins. It isn't until the discussion that the authors raise the possibility that there may simply be a lack of selective pressure to retain inefficient origins that are near highly efficient origins. In my view, it's more likely that selection for the existence of an inefficient origin is simply lower than the drift barrier, so mutagenesis and drift can passively remove such origins over time without the need to invoke selection against inefficient origins.

      Figure 3 is intended to show that the stall-aversion and interference model performs better at predicting correlations between efficiency of lost origins and their nearest neighbor. I agree, but I do not think Figure 3 presents a strong case for this conclusion. Fig S6 presents stronger evidence to me. While fig 3 does qualitatively suggest that the joint model may predict the correlation between neighboring origin efficiency and origin loss better than the double-stall model alone, it almost appears to me that the model with fork stalling and interference has significantly overestimated the correlation. Is there a quantitative way, perhaps using information criteria, though I admittedly am not sure how one would go about doing that with simulations such as these, to demonstrate that the model with both effects has better predictive value than the one with only fork stalling?

      There are a couple of assumptions of the model that I would like the authors to examine in further detail. First, that origin birth events occur in the middle of an inter-origin space. I am not aware of evidence pointing to this being a good a priori assumption. Can you re-run the simulations, allowing origins to arise at a random site within the inter-origin space into which it is born? Second, is it reasonable to expect origin firing rates to reshuffle to a new value randomly, without any dependence on their prior rate? Perhaps I'm mistaken, but it seems to me that an origin's firing rate should evolve more gradually, and should have a higher probability of sampling from values near its current value than from values very far from its current value.

    3. Summary: The reviewers appreciate that the manuscript presents a simple but compelling model that explains the dynamics of replication origin birth and death, which enhances our understanding of the selection pressures that have shaped the distribution of replication origins. However, both reviewers had a series of concerns.

    1. Author Response:

      Summary:

      This paper uses numerical simulations to model DNA replication dynamics in an in vitro Xenopus DNA replication system, both in unperturbed conditions and upon intra-S-checkpoint inhibition. The current work extends previous studies by the authors that recapitulated some but not all features of the replication program. The new model is superior as it can model both the frequency and the distribution of observed initiation events. Although the reviewers found the work in principle interesting and well executed, they have identified limitations of the study, both with respect to model validation and the extent to which the findings represent new biological insights into origin regulation and replication dynamics.

      We would like to thank the referees and the editor for reading and commenting on our work. The main message that we take from the three referees’ comments is that this work lacks “new biological insights into origin regulation and replication dynamics”.

      To our knowledge, this work is the first one to clearly show that:

      • The origin clustering is not regulated by intra-S checkpoint in Xenopus egg extract as was proposed previously [1].

      • The variability of the rate of DNA synthesis close to replication forks is a necessary ingredient to describe the dynamics of replication origin firing.

      • Firing probabilities are heterogeneous in the embryonic Xenopus system.

      We believe that the referees’ shared conclusion arises because these important conclusions were not clearly and explicitly stated in our manuscript. Hence, we modified our manuscript to explicitly state these new insights. Please find below our detailed answers to the referees’ comments, criticisms and suggestions.

      Reviewer #1:

      The current work by Goldar and colleagues uses numerical simulations to model the spatiotemporal DNA replication program in an in vitro Xenopus DNA replication system. By comparing modeled data and experimental DNA combing data generated during unperturbed S-phase replication and upon intra-S checkpoint inhibition (which the authors published previously), the authors find that DNA replication in Xenopus extracts can be modeled by segmenting the genome in regions of high and low probability of origin activation, with the intra-S-phase checkpoint regulating origins with low but not high firing probability. Recapitulating the kinetics of global and local S-phase replication under different conditions through mathematical simulations represents an important contribution to the field. However, one concern I have pertains to the generality of the model, as the authors did not explore whether the model can accurately simulate replication under other conditions (e.g., checkpoint activation).

      In this work we showed that the same combination of processes can recapitulate several observations on the spatio-temporal pattern of DNA replication (as measured by DNA combing) in unchallenged and checkpoint inhibited conditions. Following the referee’s suggestion to “explore whether the model can accurately simulate replication under other conditions”, we also applied our methodology to a condition where Chk1 is over-expressed. We were able to reproduce the pattern of DNA replication as measured by DNA combing and found, as expected, that the over-expression of Chk1 reduces the rate of origin firing, but only by reducing the number of available limiting factors and not the capacity of potential origins to fire. This analysis was added to our manuscript and discussed.

      Major comments:

      1) In figure 1a and 1c, the authors show data that were previously published by the authors. Yet, the displayed values in 1a and 1c differ from those displayed in Figure 10 of Platel et al, 2015. This discrepancy should be explained.

      The discrepancy results from the thresholding of the optical signal and the smoothing of the experimental data in Platel et al, 2015. In the work presented here, we decided to model raw profiles after the thresholding. While the absolute values of the extracted data are different from those in Platel et al 2015, the trends of I(f) and fork density profiles are similar. We stated this point clearly in the caption of figure 2.

      2) The authors test whether their model can simulate replication when S-phase is perturbed by Chk1 inhibition, but not under opposite conditions of Chk1 activation. This important analysis should be included.

      The experimental means chosen to activate or inhibit the checkpoint in Platel et al 2015 were, respectively, overexpression of the Chk1 protein and inhibition of its activity with the specific inhibitor UCN-01. We further analysed combed fibres from the Chk1 overexpression condition and added this new analysis to our manuscript (see above).

      3) Although the MM4 model developed by the authors is in agreement with previously published experimental DNA combing data measured in the Xenopus system, it is unclear whether it can also accurately predict the replication program in other systems. Comparing simulated data with experimental data from another metazoan system would serve as an important additional validation of the authors' model.

      We agree with the referee that the generality of this model has to be tested by comparing it with experimental data from other metazoans. Unfortunately, to our knowledge, there are no available DNA combing data in other metazoans where the effect of inhibition (and now “activation”) of the intra-S checkpoint has been measured concomitantly with cells under unchallenged growth conditions. If the referee is aware of such data, we will be happy to analyse them. It is possible to compare a simulation of our model with replication timing profiles measured by NGS techniques, by introducing in the model a distribution of lengths for regions where the probability of origin firing is high. This will result in a timing profile where we can define TTR and CTR as has been done in human cell lines [2]. However, this requires the addition of a supplementary parameter: the length of domains with high probability of origin firing. This would complicate the model and cannot be justified on statistical grounds based on combing data (see annex 1 of our new manuscript; this model corresponds to MM6).

      Reviewer #2:

      Here the authors expand on their prior modeling of origin activity (Platel 2015) in Xenopus extracts. Their prior work, while successful in some estimates, failed to reproduce the tight distribution of interorigin ("eye to eye") distances. Here the authors generate a series of nested models (MM1-MM4) of increasing complexity to describe the distribution and frequency of observed initiation events in an unperturbed S-phase. Not surprisingly, the fit improves with the increasing complexity of each model.

      The improvement of the concordance between the model and the data was assessed by two statistical methods (F test and AIC) in order to avoid overfitting of data. Both tests showed that the increasing complexity of the model was necessary to explain the variability of the measured data. In fact, one could still increase the complexity of the model (for example, one could use our fictitious model to fit the data). In this case, the F test and AIC score show that the better representation of the data by the model is due to the increase in complexity and not to the necessity of the considered processes. We included this discussion in annex 1 of our new manuscript.
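      As an illustration of the nested-model comparison described above, the following is a minimal sketch of an F statistic and an AIC difference for two least-squares fits. It is not the authors' code: the function names, the Gaussian-error form of the AIC, and all numbers are assumptions chosen only to show the arithmetic.

```python
import math


def aic_least_squares(rss, n_points, n_params):
    """AIC for a least-squares fit with Gaussian errors (up to a constant)."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def f_statistic(rss_simple, k_simple, rss_complex, k_complex, n_points):
    """F statistic for two nested least-squares models.

    The simpler model must be a special case of the more complex one.
    """
    gain_per_extra_param = (rss_simple - rss_complex) / (k_complex - k_simple)
    residual_variance = rss_complex / (n_points - k_complex)
    return gain_per_extra_param / residual_variance


# Hypothetical residual sums of squares and parameter counts.
n_points = 200
rss_simple, k_simple = 50.0, 3
rss_complex, k_complex = 30.0, 5

f_value = f_statistic(rss_simple, k_simple, rss_complex, k_complex, n_points)
delta_aic = (aic_least_squares(rss_complex, n_points, k_complex)
             - aic_least_squares(rss_simple, n_points, k_simple))
```

      A large F value (to be compared with the F distribution with 2 and 195 degrees of freedom in this toy case) and a negative AIC difference both favour the more complex model; when extra parameters buy too little reduction in residuals, the AIC difference turns positive.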

      The authors then built an even more complex model based on prior published work to generate in silico data for which they tested their MM4 model. I admit to being a little lost at this point as to why the authors were using simulated data to assess their model and identify key parameters.

      The in silico data helps us to verify the quantitative ability of our model and validate the analysis process that we propose.

      Finally, the authors compare prior published experimental data from an unperturbed S-phase and one with an abrogated intra-S-phase checkpoint (Chk1 inhibition). Three parameters stood out: J (rate-limiting factor), 𝜃 (fraction of the genome with high origin initiation activity), and Pout (probability of remaining origins to fire). This suggests that Chk1 limits the probability of origin activation outside of the regions of the genome with high origin activation efficiency and modulates the activity of the rate-limiting factor (J). These conclusions are consistent with prior observations in other systems. In summary, the authors apply elegant modeling approaches to describe Xenopus in vitro replication dynamics and the effects of Chk1 inhibition, but the work fails to reveal new principles of eukaryotic origin regulation and replication dynamics.

      See above

      The most powerful modeling approaches are those that reveal a new or unexpected mode of regulation (or parameter) that can then be experimentally tested.

      We agree with the referee, and thank him for his comment. We re-wrote part of our manuscript to explicitly indicate “the new principles of eukaryotic origin regulation and replication dynamics” that our analysis implies.

      Additional points:

      This was a very specialized manuscript and would be difficult to read for general biologists. The terms/parameters were only defined in a table and many of the figures would not be parsable by a broad audience.

      We re-wrote part of the manuscript to make it more readable, and transferred technical details to annexes. We added a new subfigure, Fig 1a, to better explain combing parameters.

      Figure 1 sets up the challenge at hand, namely that the previous model couldn't account for the distribution of "eye to eye" distances; but this is never assessed in a similar format with the newer model. I assume this is captured in the appendix 1 figures, but it was unclear if this was eye length or gap length.

      The referee is correct: this is represented in the figures in the annexes, where we show that our modelling approach can satisfactorily reproduce the replicated fraction of measured fibres, I(f), fork density, eye length distribution, gap length distribution and eye-to-eye distribution in all considered conditions. Following the referee’s suggestion, we added to the main text of our new manuscript a figure comparable to figure 2 for our new model.

      Reviewer #3:

      General assessment:

      The authors arrive at a plausible model of DNA replication kinetics that reasonably fits six types of plots from fiber-combing data on Xenopus cell-free extracts, for normal and challenged cases. However, although the mechanisms postulated and the parameters inferred all seem reasonable, they rely on untested hypotheses and a single type of data (combing).

      All hypotheses used in this model have already been proposed and tested in the existing literature, as stated in the discussion (lines 309-315 in the new manuscript), where all hypotheses used are explained and referenced.

      We use DNA combing data, and compare our conclusions to observations in the literature obtained by other techniques. Indeed, DNA combing (and, in general, DNA fibre stretching techniques combined with optical detection) has the unique ability to allow working directly on distributions of parameters such as eye-to-eye distances and eye lengths. Hence the data are not biased by any type of population averaging (as is the case in NGS or other classical biochemical techniques).

      To truly convince, the authors need further experiments to test specific hypothesized mechanisms.

      This is not the purpose of this work, and we do not propose any molecular mechanism. We look for the essential ingredients necessary to reproduce the spatio-temporal dynamics of DNA replication.

      Techniques such as Repli-Seq or perhaps FORK-seq (recently developed by one of the authors here) might give direct information on the variation of initiation efficiency across the genome.

      We analyse data from the Xenopus in vitro system, which has been extensively used to investigate the spatio-temporal pattern of DNA replication. Unfortunately, the reference genome of this organism is not assembled accurately enough to allow techniques such as Repli-Seq or FORK-seq, which require a mapping procedure on a reference genome. Furthermore, these techniques require a cell population containing more than 10^7 individuals [3]; here we are working with 200,000 to 500,000 nuclei. Hence, without changing the model system, these techniques could not be applied.

      Substantive Concerns:

      1) The authors refer to each case (MM1-5) as a unique model, but each has more complexity and defines a class of models.

      MM1-5 all belong to the single class of nucleation-and-growth processes defined as the KJMA model. All models are variants of this model. We do not understand the referee's point. If the referee means that each case can represent the data in a non-unique manner, we agree with him/her, and this is the reason we used a genetic algorithm and not a gradient descent algorithm to minimise the difference between the data and the considered model.

      For example, in fitting MM1, the simplest of all the cases (and with, by far, the worst fit), the fork velocity was fixed at 0.5 kb/min. And yet the real fork velocity is described as having v ~ 0.5 kb/min. Shouldn't this also be a parameter in the fit?

      We chose to keep the velocity constant and close to the observed experimental value, as in Xenopus egg extracts the fork velocity is assumed to be constant [4]. One could indeed consider fork velocity as a fitting parameter (see the answer to the next point), but this is not in accordance with experimental observations.
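      For orientation on the KJMA framing and the fixed fork velocity discussed in this point, here is a minimal sketch of the textbook one-dimensional KJMA result for a constant initiation rate I and bidirectional forks moving at constant velocity v, where the replicated fraction is f(t) = 1 - exp(-I v t^2). This is the simplest case only, not any of the MM1-5 variants; the function name and the initiation rate are hypothetical, and only the 0.5 kb/min velocity comes from the text.

```python
import math


def kjma_replicated_fraction(t_min, initiation_rate, fork_velocity=0.5):
    """Replicated fraction f(t) = 1 - exp(-I * v * t^2) for 1D KJMA.

    t_min: time since the start of S phase, in minutes
    initiation_rate: I, initiations per kb per minute (assumed constant)
    fork_velocity: v in kb/min (0.5 kb/min is the value quoted in the text)
    """
    return 1.0 - math.exp(-initiation_rate * fork_velocity * t_min ** 2)


# Hypothetical initiation rate, for illustration only.
example_rate = 0.01  # initiations per kb per minute
fractions = [kjma_replicated_fraction(t, example_rate) for t in (0, 5, 10, 20)]
```

      In this simplest form, I and v enter only through their product, so kinetics of f(t) alone cannot separate them; fixing v at the measured value, as the authors do, removes that degeneracy.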

      2) Under replication stress, forks can stall, giving effectively two populations of forks, as proposed by the authors in an earlier work (Ciardo et al., Genes 2019; cf. Fig. 1). Strangely, that paper is not referred to or discussed in this manuscript. Why not?

      Indeed, instead of self-citing a review article, we preferred to refer to original experimental works. Furthermore, in order to change the mean of the eye-to-eye distribution by only changing the speed of replication forks, the speed would need to be higher than 10 kb/min, which has not been reported in any organism. To be conservative, we ran a model where the speed of replication forks could take several values ranging between 0 and 3 kb/min. The model failed to fit the experimental data (see the new manuscript and annex 1). Hence, we consider that the best model is the one with constant speed.

      3) Continuous vs. discrete potential origins: The density was fixed to be random at 1 potential origin per 2.3 kb (or 1 kb in part of the paper). How robust are findings to these assumed densities?

      If we consider the density as a free parameter, the model converges with a density of 1 origin every 2.3 kb.

      In general, there does not seem to be a huge difference between the two cases, for the type of data explored. Perhaps it is not worth looking at the discrete case here?

      The difference is that in the “discrete” case the distribution of origins is not continuous, and hence there naturally exists a distance between two fired origins within which origin firing is inhibited. The existence of such an origin firing exclusion zone was shown to be necessary to model replication dynamics as measured by DNA combing [5,6].

      4) The definition of goodness of the fit (GoF) should be made more explicit. How is the norm calculated? There is an implicit sum - the elements should be defined explicitly. Also, the ensemble average < yexp > is not defined. More broadly, it is not clear why we need a custom GoF statistic when it would seem that standard ones (chi square, or - ln likelihood) could serve equally well.

      The defined GoF is a classical normalised chi squared as defined in annex 1. We modified the text to include explicitly the summation over the data points. By definition <yexp> is the average value of an experimental data series. GoF is not a custom defined criterion but the classical normalised chi square [7].

      Note that those statistics (when proper normalization is used) can also work for global fits where each local fit is to a quantity with different units.
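      To make the summation in this exchange concrete, the sketch below shows one plausible reading of a normalised GoF: the squared residuals summed over data points, divided by the spread of the experimental series around its mean <yexp>. The exact definition in annex 1 is not reproduced here, so the formula, the function name and the toy values are all assumptions.

```python
def normalised_gof(y_exp, y_sim):
    """Sum of squared residuals, normalised by the spread of the data.

    y_exp: experimental values of one observable
    y_sim: simulated values at the same points
    The result is dimensionless, whatever the units of the observable.
    """
    mean_exp = sum(y_exp) / len(y_exp)  # <yexp>, average of the data series
    residual = sum((e - s) ** 2 for e, s in zip(y_exp, y_sim))
    spread = sum((e - mean_exp) ** 2 for e in y_exp)
    return residual / spread


# Hypothetical values of one observable, expressed in two different units.
gof_kb = normalised_gof([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
gof_bp = normalised_gof([1000.0, 2000.0, 3000.0], [1000.0, 2000.0, 4000.0])

# A global score can then be a plain sum over observables.
global_gof = gof_kb + gof_bp
```

      Because each term is dimensionless, quantities such as fork density and eye length can be added into a single global score, which is the property the referee's last remark asks for.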

      References:

      1. Ge XQ, Blow JJ. Chk1 inhibits replication factory activation but allows dormant origin firing in existing factories. The Journal of Cell Biology. 2010;191: 1285–1297. doi:10.1083/jcb.201007074

      2. Pope BD, Ryba T, Dileep V, Yue F, Wu W, Denas O, et al. Topologically associating domains are stable units of replication-timing regulation. Nature. 2014;515: 402–405. doi:10.1038/nature13986

      3. Petryk N, Kahli M, d’Aubenton-Carafa Y, Jaszczyszyn Y, Shen Y, Silvain M, et al. Replication landscape of the human genome. Nat Commun. 2016;7: 10208. doi:10.1038/ncomms10208

      4. Marheineke K, Hyrien O. Control of Replication Origin Density and Firing Time in Xenopus Egg Extracts: Role of a Caffeine-Sensitive, ATR-Dependent Checkpoint. J Biol Chem. 2004;279: 28071–28081. doi:10.1074/jbc.M401574200

      5. Löb D, Lengert N, Chagin VO, Reinhart M, Casas-Delucchi CS, Cardoso MC, et al. 3D replicon distributions arise from stochastic initiation and domino-like DNA replication progression. Nature Communications. 2016;7: 11207. doi:10.1038/ncomms11207

      6. Jun S, Herrick J, Bensimon A, Bechhoefer J. Persistence length of chromatin determines origin spacing in Xenopus early-embryo DNA replication: quantitative comparisons between theory and experiment. Cell Cycle. 2004;3: 223–229.

      7. Bevington P, Robinson DK. Data Reduction and Error Analysis for the Physical Sciences. McGraw-Hill Education; 2003.

    2. Reviewer #3:

      General assessment:

      The authors arrive at a plausible model of DNA replication kinetics that reasonably fits six types of plots from fiber-combing data on Xenopus cell-free extracts, for normal and challenged cases. However, although the mechanisms postulated and the parameters inferred all seem reasonable, they rely on untested hypotheses and a single type of data (combing). To truly convince, the authors need further experiments to test specific hypothesized mechanisms. Techniques such as Repli-Seq or perhaps FORK-seq (recently developed by one of the authors here) might give direct information on the variation of initiation efficiency across the genome.

      Substantive Concerns:

      1) The authors refer to each case (MM1-5) as a unique model, but each has more complexity and defines a class of models. For example, in fitting MM1, the simplest of all the cases (and with, by far, the worst fit), the fork velocity was fixed at 0.5 kb/min. And yet the real fork velocity is described as having v ~ 0.5 kb/min. Shouldn't this also be a parameter in the fit?

      2) Under replication stress, forks can stall, giving effectively two populations of forks, as proposed by the authors in an earlier work (Ciardo et al., Genes 2019; cf. Fig. 1). Strangely, that paper is not referred to or discussed in this manuscript. Why not?

      3) Continuous vs. discrete potential origins: The density was fixed to be random at 1 potential origin per 2.3 kb (or 1 kb in part of the paper). How robust are findings to these assumed densities? In general, there does not seem to be a huge difference between the two cases, for the type of data explored. Perhaps it is not worth looking at the discrete case here?

      4) The definition of goodness of the fit (GoF) should be made more explicit. How is the norm calculated? There is an implicit sum - the elements should be defined explicitly. Also, the ensemble average < yexp > is not defined. More broadly, it is not clear why we need a custom GoF statistic when it would seem that standard ones (chi square, or - ln likelihood) could serve equally well. Note that those statistics (when proper normalization is used) can also work for global fits where each local fit is to a quantity with different units.

  3. Jan 2021
    1. Reviewer #3:

      In this manuscript, the authors utilize single-cell/single-nucleus RNA-sequencing to perform a comparative analysis of the cellular composition of the dorsal lateral geniculate nucleus (dLGN) in mice, non-human primates, and humans. This topic is important for a number of reasons, including (1) the dLGN is a critical center of visual processing about which we know relatively little; (2) the dLGN has emerged as a widely used experimental model of neural circuit development; and (3) in general, the integration of cross-species data at the transcriptomic level is important for identifying conserved mechanisms of brain function that may shed light upon neurological disease states. By employing a relatively deep RNA-sequencing approach (Smart-Seq) the authors identify major excitatory and inhibitory dLGN cell types within each species. While the multiple inhibitory neuron subtypes were relatively similar across species, excitatory neurons displayed major differences particularly between mouse and both primate classes. The authors identified four major excitatory cell types in primate and human dLGN corresponding with known functional heterogeneity that places these neurons into magnocellular, parvocellular, and koniocellular populations. Interestingly, koniocellular neurons could be broken into two distinct subtypes. Somewhat surprisingly, the authors noted a lack of excitatory neuron diversity in the mouse, despite prior evidence suggesting these neurons can have different morphological and physiological features. Yet, although all excitatory neurons in the mouse clustered together, there were subtle differences in excitatory neurons in the mouse that aligned with different regions of mouse dLGN (shell vs core), suggesting that excitatory neuron heterogeneity may still exist along a more subtle continuum. 
      Consistently, neurons in the shell region in mouse dLGN more strongly resembled koniocellular neurons in primates versus the core region, suggesting some level of conservation between excitatory neuron identity across species. While the study is largely descriptive, the authors are creative in their use of bioinformatics to uncover particularly interesting observations that the transcriptomic analysis yielded, and the paper is very interesting because of that. The major weakness of the paper is a paucity of robust FISH analyses to quantitatively validate the transcriptomic findings in all species. Overall, it is my opinion that this work is very important and that, at a broader level, it may help to define the relationship between transcriptomic cell type, functional/physiological cell type, and anatomical cell type within a brain region that is critical for visual function and that has emerged as a fascinating model of neural circuit development in the mouse.

      Strengths:

      The Smart-Seq transcriptomic technique chosen is appropriate to address the authors' questions.

      The data were generated rigorously and subjected to an in-depth quality control pipeline prior to analysis. As a result, the quality of the transcriptomic data is high.

      The paper includes a detailed, transparent description of the approach taken in the Results and Methods. The authors point out caveats and weaknesses - and how they were addressed - throughout the text.

      The inclusion of tissue from thalamic nuclei surrounding the dLGN as a way to control for the unintentional inclusion of non-dLGN tissue in the experimental dissection was well-designed and effective.

      Despite a couple of exceptions, the authors do an excellent job of placing their findings within the context of what is already known about dLGN cell types across different species, and how these cell types function differently in physiological, morphological, and anatomical terms.

      The study is descriptive in nature but the authors do a nice job of laying out several interesting findings, such as the observation that GABAergic neurons are more conserved across species than relay neurons, with mouse neurons being particularly distinct. Another fascinating observation is that shell-located neurons in mouse dLGN are transcriptomically related to koniocellular neurons suggesting the possibility of a close relationship between thalamocortical connectivity and molecular identity across species.

      Weaknesses:

      The characterization of gene expression patterns through sequencing-based transcriptomics has emerged as a powerful tool for dissecting the brain, but it is important to couple such approaches with techniques like fluorescence in situ hybridization (FISH) to verify sequencing results in a histological context. While here the authors show 3 - 4 validations of mouse genes that seem to be restricted to or excluded from the shell versus the core dLGN regions (Figures 4G and S4E), the conclusions of the study would be better supported by a more extensive and rigorous analysis of cell-type-specific gene expression within all species described.

      It is not entirely clear from the manuscript how the authors dissected the shell from the core region of the dLGN, given these regions are not as clearly distinct as the dLGN lamina in other species. One possibility would be to take advantage of the fact that the shell receives input from specific RGCs that can be targeted genetically by crossing a Cre driver to the TdTomato line, but I do not believe that that is what was done here. I also noted that the authors use ventral LGN (vLGN) as one of their controls for the precision of their micro-dissections, but given that the vLGN does not directly contact the dLGN, this had me wondering exactly how cleanly the shell and core regions of the mouse dLGN were isolated.

      On lines 101 - 103, the authors state "...differentially expressed genes between donors were related to neuronal signaling and connectivity and not to metabolic or activity-dependent effects." Table S2 is cited, but the columns are not labeled such that a common reader could interpret them and confirm the statement in the text. Moreover, the text does not state how the authors made the determination that these differentially expressed genes are not related to "activity-dependent effects".

    2. Reviewer #2:

      The conclusion is quite surprising given the anatomical differences and connectivity to the cortex; however, it implies that different mechanisms underlie species-specific circuit organization.

      The manuscript is well-organized and well-written with strong figures. I have only a few comments/suggestions to further improve the overall quality of this manuscript.

      I understand that obtaining human and NHP tissue is difficult, which makes it hard to perform large numbers of ISH experiments. However, there is a database that provides additional information on gene expression in NHP LGN (https://gene-atlas.brainminds.riken.jp/). From this database, it is possible to obtain parvocellular-specific and magnocellular-specific gene expression by fine structure search, which may be worth comparing with the results in the current paper. Many researchers have realized that marmoset is one of the good animal models to understand human brain function and dysfunction; therefore, it is worth including marmoset in the comparative analysis for community interest.

    3. Reviewer #1:

      In this manuscript, Bakken et al use single cell and single nucleus RNA-sequencing to conduct comparative analysis of dLGN in humans, macaques and mice. dLGN exhibits a dramatic reorganization and lamination in primates relative to mice. Other components of the visual system (retina, V1) have previously been explored with cross-species transcriptomic analyses to reveal species-specific or evolutionary modifications. How dLGN fits in this picture, and the extent to which differences amongst previously identified cell types can be discerned from transcriptomic data, is an important question.

      The conclusions are supported by the data, but the paper could better motivate what the main questions or debates are.

      Strengths:

      The authors use highly sensitive SMART-seq v4 to collect and analyze thousands of cells from dLGN and some adjacent nuclei. The gene detection rate using this method is impressive, and the plate/strip-based workflow has distinct advantages in terms of lower ambient contamination and risk of doublets compared to microfluidics-based single cell platforms. Cells or nuclei are sorted to enrich for neurons, which are the main focus of this paper. Key results are validated by smFISH or by examining publicly available Allen Brain Atlas ISH data. By examining conservation and divergence of cell types in an evolutionarily conserved thalamic nucleus that has nonetheless undergone dramatic anatomical reorganization, these data and analyses add to our understanding of how cell types evolve in mammalian brains. They also contribute nuance to the ongoing debate of the extent to which transcriptomic data alone can be used to identify and discriminate cell types that have been described using other methodologies.

      Weaknesses:

      The Introduction does a nice job of describing what is known about the anatomy and cell types of the dLGN in each species, but it is less obvious what the motivating cross-species question is. Similarly, the Discussion focuses on technical details but the take-away is not clear.

      dLGN is collected from all species, but in some species (macaque, mouse), additional thalamic nuclei are also collected. These are useful for examining cell type correspondences across regions or shifts between species, but their inclusion in cross-species integrations can also distort results (e.g. with some integration approaches, inclusion of very different, dataset-specific cell types can distort integration of more similar types). Analyses could be done to better distinguish the evolutionary comparisons within dLGN itself vs. what is additionally learned from inclusion of extra-dLGN nuclei.

      One major evolutionary difference can involve differences in cell type proportions. Some proportion results are described but mainly for individual species (some of which include extra-dLGN regions) rather than in the integrated maps, so they can't be compared across species. The FISH results could also be used to corroborate proportion changes when such data are available.

      Parameters for clustering analysis (using CCA/Seurat) are not described. Often changes in parameters can change the clusters, and it would be important to know if species integration results are robust across a range of parameters and to the inclusion of extra-dLGN regions.

      Some expected genes (PVALB) are barely detected in the macaque neurons, raising the question of whether this is due to tissue or annotation/alignment quality.

    4. Summary: This manuscript provides a comparative analysis of the cell variety present in the dorsal lateral geniculate nucleus (dLGN) of mice, non-human primates, and humans using single-cell/single-nucleus RNA-sequencing (Smart-seq). The study identifies excitatory and inhibitory dLGN cell types in the three species and shows that the different subclasses of inhibitory neurons are relatively similar across species. In contrast, excitatory neurons appear to bear cross-species differences particularly between mouse and primates. The study provides an extensive description of the dLGN neurons, an important visual relay nucleus that has been so far poorly studied. As such, these data are very welcome and will likely attract the interest of researchers working in visual function and beyond. The strong and creative bioinformatics analysis has uncovered interesting and subtle cross-species links between different types of neurons.

      Reviewer #1, Reviewer #2 and Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #2:

      The paper investigates the temporal signatures of single-neuron activity (the autocorrelation timescale and latency) in two frontal areas, MCC and LPFC. These signatures differ between the two areas and cell classes, and form an anatomical gradient in MCC. Moreover, the intrinsic timescales of single neurons correspond with their coding of behaviorally relevant information on different timescales. The authors develop a detailed biophysical network model which suggests that after-hyperpolarization potassium and inhibitory GABA-B conductances may underpin the potential biophysical mechanism that explains diverse temporal signatures observed in the data. The results appear exciting, as the proposed relationship between the intrinsic timescales, coding of behavioral timescales, and anatomical properties (e.g., the amount of local inhibition) in the two frontal areas is novel. The use of the biophysically detailed model is creative and interesting. However, there are serious methodological concerns undermining the key conclusions of this study, which need to be addressed before the results can be credited.

      Major Concerns:

      1) One of the key findings is the correspondence between the intrinsic timescales of single neurons and their coding of information on different behavioral timescales (Fig. 4). However, the method for estimating the intrinsic timescales has serious problems which can undermine the finding.

      1.1 The authors developed a new method for estimating autocorrelograms from spike data but the details of this method are not specified. It is stated that the method computes the distribution of inter-spike-intervals (ISIs) up to order 100, which was "normalized", but how it was normalized is not described. The correct normalization is crucial, as it converts the counts of spike coincidences (ISI distribution) into autocorrelogram (where the coincidence counts expected by chance are subtracted) and can produce artifacts if not performed correctly.

      1.2 The new method, described as superior to the previous method by Murray et al, 2014, appears to have access to more spikes than Murray's method (Fig. 2). Where is this additional data coming from? While Murray's method was applied to the pre-cue period, the time epoch used for the analysis with the new method is not stated clearly. It seems that the new method was applied to the data through the entire trial duration and across all trials, hence more spikes were available. If so, then changes in firing rates related to behavioral events contribute to the autocorrelation, if not appropriately removed. For example, Murray's method subtracts trial-averaged activity (PSTH) from spike-counts, similar to shuffle-correction methods. If a similar correction was not part of the new method, then changes in firing rates due to coding of task variables will appear in the autocorrelogram and estimated timescales. This is a serious confound for interpretation of the results in Fig. 4. For example, if the firing rate of a neuron varies slowly coding for the gauge size across trials, this will appear as a slow timescale if the autocorrelogram was not corrected to remove these rate changes. In this case, the timescale and GLM are just different metrics for the same rate changes, and the correspondence between them is expected. Before results in Fig. 4 can be interpreted, details of the method need to be provided to make sure that the method measures intrinsic timescales, and not timescales of rate changes triggered by the task events. This is an important concern also because recent work showed that there is no correlation between task dependent and intrinsic timescales of single neurons, including in cingulate cortex and PFC (Spitmaan et al., PNAS, 2020).
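      The correction discussed in 1.2 can be sketched as follows: spike counts are binned per trial, the trial-averaged activity (PSTH) is subtracted in each bin, and the autocorrelation is taken as the across-trial Pearson correlation between bins at a given lag. This is an illustrative sketch in the spirit of the PSTH subtraction described above, not the procedure of either method under discussion; the function name and toy data are hypothetical.

```python
import math


def psth_corrected_autocorr(counts, max_lag):
    """Spike-count autocorrelation after removing trial-averaged activity.

    counts: list of trials, each a list of spike counts per time bin
            (all trials aligned to the same task event).
    Returns one value per lag (1..max_lag): the Pearson correlation across
    trials between bins separated by that lag, averaged over bin pairs.
    """
    n_trials = len(counts)
    n_bins = len(counts[0])
    # Trial-averaged activity (PSTH) in each bin.
    psth = [sum(trial[b] for trial in counts) / n_trials for b in range(n_bins)]
    # Residual counts once the PSTH has been subtracted.
    resid = [[trial[b] - psth[b] for b in range(n_bins)] for trial in counts]
    # Across-trial variance of the residuals in each bin.
    var = [sum(r[b] ** 2 for r in resid) / n_trials for b in range(n_bins)]

    autocorr = []
    for lag in range(1, max_lag + 1):
        values = []
        for b in range(n_bins - lag):
            cov = sum(r[b] * r[b + lag] for r in resid) / n_trials
            denom = math.sqrt(var[b] * var[b + lag])
            if denom > 0:  # skip bins with no trial-to-trial variability
                values.append(cov / denom)
        autocorr.append(sum(values) / len(values) if values else float("nan"))
    return autocorr


# Hypothetical toy data: 4 trials x 5 bins of spike counts.
base_counts = [
    [2, 1, 0, 1, 0],
    [0, 1, 2, 1, 3],
    [1, 3, 1, 0, 1],
    [1, 0, 2, 3, 1],
]
# The same trials with a task-locked rate ramp added identically to every trial.
ramp = [0, 2, 4, 6, 8]
ramped_counts = [[c + r for c, r in zip(trial, ramp)] for trial in base_counts]

ac_base = psth_corrected_autocorr(base_counts, max_lag=3)
ac_ramped = psth_corrected_autocorr(ramped_counts, max_lag=3)
```

      The two results coincide because a rate change shared by every trial is absorbed by the PSTH. Note that this only removes rate changes locked to the trial; slow changes across trials, such as those tracking the gauge, would remain in the residuals and need separate treatment.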

      2) The balanced network model with a variety of biophysical currents is interesting and it is impressive that the model reproduces the autocorrelation signatures in the data. However, we need to better understand the network mechanism by which the model operates.

      2.1 The classical balanced network (without biophysical currents such as after-hyperpolarization potassium) generates asynchronous activity without temporal correlations (Renart et al., Science, 2010). Balanced networks with slow adaptation currents can generate persistent Up and Down states that produce correlations on slow timescales (Jercog et al., eLife, 2017). Since the slow after-hyperpolarization potassium current was identified as a key ingredient, is the mechanism in the model similar to the one generating Up and Down states, or is it different? Although the biophysical ingredients necessary to match the data were identified, the network mechanism has not been studied. Describing this network mechanism and presenting the model in the context of existing literature is necessary, otherwise the results are difficult to interpret for the reader.

      2.2 Does the model operate in a physiologically relevant regime where the firing rates, Fano factor etc. are similar to the data? It is hard to judge from Fig. 5b and needs to be quantified.

      2.3 The latency of autocorrelation is an interesting feature in the data. Since the model replicates this feature (which is not intuitive), it is important to know what mechanism in the model generates autocorrelation latency.

      3) HMM analysis is used to demonstrate metastability in the model and data, but there are some technical concerns that can undermine these conclusions.

      3.1 HMM with 4 states was fitted to the data and model. The ability to fit a four-state HMM to the data does not prove the existence of metastable states. HMM assumes a constant firing rate in each "state", and any deviation from this assumption is modeled as state transitions. For example, if some neurons gradually increase/decrease their firing rates over time, then HMM would generate a sequence of states with progressively higher/lower firing rates to capture this ramping activity. In addition, metastability implies exponential distributions of state durations, which was not verified. No model selection was performed to determine the necessary number of states. Therefore, the claims of metastable dynamics are not supported by the presented analysis.

      3.2 HMM was fit to a continuous segment of data lasting 600s, and the data was pooled across different recording sessions. However, different sessions have potentially different trial sequences due to the flexibility of the task. How were different trial types matched across the sessions? If trial-types were not matched/aligned in time, then the states inferred by the HMM may trivially reflect a concatenation of different trial types in different sessions. For example, the same time point can correspond to the gauge onset in one session and to the work trial in another session, and vice versa at a different time. If some neurons respond to the gauge and others to the work, then the HMM would need different states to capture firing patterns arising solely from concatenating the neural responses in this way. This confound needs to be addressed before the results can be interpreted.

    2. Reviewer #1:

      This is an interesting manuscript which covers an important topic in the field of computational neuroscience - the 'temporal signatures' of individual neurons. The authors set out to address several important questions using a single-neuron electrophysiology dataset, recorded from monkeys, which has previously been published. The behavioural paradigm is well designed, and particularly well suited to investigating the functional importance of different temporal signatures - as it simultaneously requires the subjects to monitor feedback across a short timescale, as well as integrate multiple outcomes across a longer timescale. The neural data are of high quality, and include recordings from lateral prefrontal cortex (LPFC) and mid-cingulate cortex (MCC). First, the authors modify an existing method to quantify the temporal signatures of individual neurons. This modification appears helpful, and an improvement on similar previously published methods, as the authors are able to capture the temporal signatures of the vast majority of neurons they recorded from. The temporal signatures differ across brain regions, and according to the neurons' spike width. The authors argue that the temporal signatures of a subset of neurons are modified by the subjects' degree of task engagement, and that neurons with different temporal signatures play dissociable roles in task-related encoding. However, I have several concerns about these conclusions which I will outline below. The authors then present a biophysical network model, and show that by varying certain parameters in their model (AHP and GABAB conductances) the temporal signatures of the monkey data can be reproduced. Although I cannot comment on the technical specifics of their models, this seems to be an important advance. Finally, they perform a Hidden-Markov Model analysis to investigate the metastability of activity in MCC, LPFC, and their network model. 
However, there are a few important differences between the model and experimental data (e.g. neurons recorded asynchronously, and the network model not performing a task) that limit the interpretation of these analyses. Overall, I found the manuscript interesting - and the insights from the biophysical modelling are exciting. However, in its current form, the conclusions drawn from the experimental data are not supported by sufficient evidence.

      Major Comments:

      1) The authors use a hierarchical clustering algorithm to divide neurons into separate groups according to their spike width and amplitude (Fig 1C). There are three groups: FS, RS1, and RS2. The authors ultimately pool RS1 and RS2 groups to form a single 'RS' category. They then go on to suggest that RS neurons may correspond to pyramidal neurons, and FS neurons to interneurons. I have a few concerns about this. Firstly, the suggestion that spike width determined from extracellular recordings in macaques can be used as an indicator of cell type is controversial. A few studies have presented evidence against this idea (e.g. Vigneswaran et al. 2011 JNeuro; Casale et al. 2015 JNeuro). The authors should at least acknowledge the limitation of the inference they are making in the discussion section. Secondly, visualising the data alone in Fig 1C, it is far from clear that there are three (or two) relatively distinct clusters of neurons to warrant treating them differently in subsequent analyses. In the methods section, the authors mention some analyses they performed to justify the cluster boundaries. However, this data is not presented. A recent study approached this problem by fitting one gaussian to the spike waveform distribution, then performing a model comparison to a 2-gaussian model (Torres-Gomez et al. 2020 Cer Cortex). Including an analysis such as this would provide a stronger justification for their decision to divide cells based on spike waveform.
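
      To make the suggested model comparison concrete, here is a minimal sketch that fits one Gaussian and a two-Gaussian mixture to a one-dimensional feature and compares them by BIC. The data are synthetic stand-ins for spike widths, the EM routine is a bare-bones implementation written for this illustration, and none of it is taken from the cited study.

```python
import math
import random

def log_normal_pdf(x, mu, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)

def bic(log_lik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2.0 * log_lik

def loglik_one_gaussian(xs):
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return sum(log_normal_pdf(x, mu, var) for x in xs)

def loglik_two_gaussians(xs, n_iter=100, var_floor=1e-3):
    """EM for a two-component 1-D mixture, initialised by a median split."""
    n = len(xs)
    s = sorted(xs)
    halves = [s[: n // 2], s[n // 2:]]
    mus = [sum(h) / len(h) for h in halves]
    grand_mean = sum(xs) / n
    var0 = max(var_floor, sum((x - grand_mean) ** 2 for x in xs) / n)
    variances = [var0, var0]
    weights = [0.5, 0.5]

    def component_logs(x):
        return [math.log(weights[k]) + log_normal_pdf(x, mus[k], variances[k])
                for k in range(2)]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            logs = component_logs(x)
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update weights, means and (floored) variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-9)
            weights[k] = nk / n
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_floor, spread)

    log_lik = 0.0
    for x in xs:
        logs = component_logs(x)
        top = max(logs)
        log_lik += top + math.log(sum(math.exp(v - top) for v in logs))
    return log_lik

def preferred_n_components(xs):
    n = len(xs)
    bic_one = bic(loglik_one_gaussian(xs), 2, n)   # mean, variance
    bic_two = bic(loglik_two_gaussians(xs), 5, n)  # 2 means, 2 variances, 1 weight
    return 1 if bic_one <= bic_two else 2

rng = random.Random(1)
unimodal = [rng.gauss(0.0, 1.0) for _ in range(600)]
bimodal = ([rng.gauss(0.0, 1.0) for _ in range(300)]
           + [rng.gauss(5.0, 1.0) for _ in range(300)])
print(preferred_n_components(unimodal), preferred_n_components(bimodal))
```

      If the single-Gaussian model is preferred for the recorded waveform features, dividing the units into FS and RS groups would be difficult to justify.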

      2) The authors conclude that the results in Fig 3 show that MCC temporal signatures are modulated by current behavioural state. However, this conclusion seems a bit of a stretch from the data currently presented. I can understand why the authors used the 'pause' periods as a proxy for a different behavioural state, but the experiment clearly was not designed for this purpose. As the authors acknowledge, there is only a very limited amount (e.g. a few minutes) of 'pause' data available for the fitting process compared with 'engage' data. Do the authors observe the same results if they constrain the amount of included 'engage' data to match the length of the 'pause' data? Also, presumably the subjects are more likely to 'pause' later on in the behavioural session once they are tired/sated. Could this difference between 'pause' and 'engage' data be responsible for the difference in taus? For instance, there may have been more across-session drift in the electrode position by the time the 'pause' data is acquired, and this could possibly account for the difference with the 'engage' data. Is the firing rate different between 'pause' and 'engage' periods - if so, this should be controlled for as a covariate in the analyses. Finally it is not really clear to me, or more importantly addressed by the authors, as to why they would expect/explain this effect only being present in MCC RS neurons (but not FS or LPFC neurons).

      3) At many points in the manuscript, the authors seem to be suggesting that the results of Fig 4 demonstrate that neurons with longer (shorter) timescales are more involved in encoding the task information which is used across longer (shorter) behavioural timescales (e.g. "long TAU were mostly involved in encoding gauge information", and "population of MCC RS units with short TAU was mostly involved in encoding feedback information"). However, I disagree that this conclusion can be reached based on the way the analysis has currently been performed. A high coefficient simply indicates that the population is biased to be more responsive depending on a particular trial type / condition - i.e. the valence of encoding. This does not necessarily tell us how much information the population of neurons is encoding, as the authors suggest. For instance, every neuron in the population could be extremely selective to a particular parameter (i.e. positive feedback), but if half the neurons encode this attribute by increasing their firing but the other half of neurons encode it by decreasing their firing, the effects will be lost in the authors' regression model (i.e. the beta coefficient would equal 0). I would suggest that the authors consider using an alternative analysis method (e.g. a percentage of explained variance or coefficient of partial determination statistic for each neuron) to quantify coding strength - then compare this metric between the high and low tau neurons.
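
      The cancellation described above can be shown with a small simulation. In this sketch (synthetic firing rates, hypothetical names, simple one-regressor fits rather than the authors' regression model), every neuron is strongly selective for feedback, but half increase and half decrease their rate. The mean coefficient is near zero while the mean explained variance is high.

```python
import random
import statistics

def fit_simple_regression(x, y):
    """Ordinary least squares for y = a + b*x; returns slope b and R^2."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, r_squared

rng = random.Random(0)
feedback = [t % 2 for t in range(100)]  # 0 = negative, 1 = positive feedback

slopes, r2s = [], []
for neuron in range(40):
    # Half the population increases its rate with positive feedback,
    # the other half decreases it by the same amount.
    sign = 1.0 if neuron % 2 == 0 else -1.0
    rates = [10.0 + sign * 3.0 * f + rng.gauss(0.0, 1.0) for f in feedback]
    slope, r2 = fit_simple_regression(feedback, rates)
    slopes.append(slope)
    r2s.append(r2)

mean_slope = statistics.mean(slopes)
mean_r2 = statistics.mean(r2s)
print(mean_slope, mean_r2)
```

      A sign-independent per-neuron measure such as the explained variance is therefore the more appropriate quantity to compare between high and low TAU groups.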

      4) Similarly, in Fig 4 the authors suggest that the information is coded differently in the short and long tau neurons. However, they do not perform any statistical test to directly compare these two populations. One option would be to perform a permutation test, where the neurons are randomly allocated into the 'High TAU' or 'Low TAU' group. A similar comment applies to the different groups of neurons qualitatively compared in panel Fig 4C.
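
      A minimal sketch of the permutation test suggested above follows. It assumes one scalar coding metric per neuron and compares the group means; the function name, the test statistic, and the example values are illustrative choices only.

```python
import random

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Group labels are shuffled n_perm times; the p-value is the fraction of
    shuffles whose absolute mean difference is at least the observed one.
    """
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so that the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical coding-strength values for 'High TAU' and 'Low TAU' neurons.
high_tau = [float(v) for v in range(10, 20)]
low_tau = [float(v) for v in range(0, 10)]
print(permutation_test(high_tau, low_tau))
```

      The same routine could be applied to each pair of groups compared in Fig 4C, with a correction for multiple comparisons.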

      5) The authors make an interesting and well-supported case for why changing the AHP and GABA-B parameters in their model may be one mechanism which is sufficient to explain the differences in temporal signatures they observed between MCC and LPFC experimentally. However, I think in places the conclusions they draw from this are overstated (e.g. "This suggests that GABA-B inhibitory - rather than excitatory - transmission is the causal determinant of longer spiking timescales, at least in the LPFC and MCC."). There are many other biophysical differences between different cortical regions - which are not explored in the authors' modelling - which could also account for the differences in their temporal signatures. These could include differences in extra-regional input, the position of the region in an anatomical hierarchy, proportion of excitatory to inhibitory neurons, neurotransmitter receptor/receptor subunit expression, connectivity architecture etc. I think the authors should tone down the conclusions a little, and address some more of these possibilities in more detail in their discussion.

      6) For the Hidden Markov Model, I think there are a couple of really important limitations that the authors only touch upon very briefly. Firstly, the authors are performing a population-level analysis on neurons which were not simultaneously recorded during the experiment (only mentioned in the methods). This really affects the interpretation of their results, as presumably the number of states and their duration is greatly influenced by the overall pattern of population activity which the authors are not able to capture. At this stage of the study, I am not sure how the authors can address this point. Secondly, the experimental data is compared to the network model which is not performing any specific task (i.e. without temporal structure). The authors suggest this may be the reason why their predictions for the state durations (Fig 7B) are roughly an order of magnitude out. Presumably, the authors could consider designing a network model which could perform the same task (or a simplified version with a similar temporal structure) as the subjects perform. This would be very helpful in helping to relate the experimental data to the model, and may also provide a better understanding of the functional importance of the metastability they have identified in behaviour.

      7) It is not clear to me how many neurons the authors included in their dataset, as there appear to be inconsistencies throughout the manuscript (Line 73, Fig 1A-B: MCC = 140, LPFC = 159; line 97-98: MCC = 294, LPFC=276; Fig2: MCC = 266, LPFC = 258; Methods section line 734-735 and Fig 2S2: MCC = 298, LPFC = 272). While this is likely a combination of typos and excluding some neurons from certain analyses, this will need to be resolved. It will be important for the authors to check their analyses, and also add a bit more clarity in the text as to which neurons are being included/excluded in each analysis, and justify this.

    3. Summary: Both reviewers found this study's findings to be interesting and novel, and they appreciated the integration of intrinsic timescale analysis, coding of behavioral signals, and exploration of mechanisms in a biophysical circuit model. However, they both raised serious concerns about methodological and interpretational aspects of the analyses which would need to be satisfactorily addressed in subsequent review. In consultation, both reviewers were in agreement with all points raised in the other's review. I view the two reviewers' requests and suggestions as appropriate and complementary, and all of them need a response.

      I highlight here two of the major concerns raised by the reviewers (but all points raised by the reviewers merit addressing). First is the need for intrinsic timescale analysis to not be confounded by slowly varying changes in firing rate that code task variables (Point 1.2 of Reviewer 2). This is important for the interpretation of intrinsic timescale and its correlation with task variable coding as a major result. Second is the interpretation and application of Hidden Markov Models to non-simultaneously recorded neurons (Point 6 of Reviewer 1 and Point 3 of Reviewer 2). Here again the HMM states may be driven by task variable coding which is not corrected for, which could confound interpretation of the results in terms of metastable states and their link to the circuit model. Furthermore, HMM analysis of the circuit model does not match its methodology for experimental data, but could do so through non-simultaneous model spike trains, which the manuscript does not justify. I will add my own suggestion that perhaps both of these methodological data analysis concerns could potentially be addressed through comparison to the null model of a non-homogeneous Poisson process with firing rate given by the variable-coding PSTH. I do not consider it necessary to study a network model that performs the task, as suggested for consideration by Reviewer 1.
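
      The null model mentioned above could be generated along the following lines. This is a sketch under stated assumptions: the PSTH is given in Hz on a fine time grid, each bin is treated as an independent Bernoulli draw (which approximates a Poisson process when rate times bin width is small), and the function name and example PSTH are hypothetical.

```python
import random

def surrogate_trials(psth_hz, bin_s, n_trials, seed=0):
    """Draw independent spike trains whose rate follows the given PSTH.

    Each bin is a Bernoulli draw with probability rate * bin width, so the
    surrogates keep the task-locked rate but carry no intrinsic correlations.
    """
    rng = random.Random(seed)
    probs = [rate * bin_s for rate in psth_hz]
    if any(p > 0.1 for p in probs):
        raise ValueError("bins are too wide for the Bernoulli approximation")
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(n_trials)]

# Hypothetical PSTH: 5 Hz for 0.5 s, then 50 Hz for 0.5 s, in 1 ms bins.
psth = [5.0] * 500 + [50.0] * 500
trials = surrogate_trials(psth, bin_s=0.001, n_trials=200)

early = sum(sum(trial[:500]) for trial in trials)
late = sum(sum(trial[500:]) for trial in trials)
print(early, late)
```

      Running the timescale and HMM analyses on such surrogates would indicate how much of the reported structure can be explained by task-variable coding alone.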

    1. Reviewer #3:

      In this study, Michaluk et al. explored the membrane dynamics of the main glial glutamate transporter GLT1 in hippocampal astrocytes, which was previously shown to shape synaptic transmission through regulating extracellular levels of glutamate and whose dysfunction may lead to pathologic conditions. Their results underscore the importance of the GLT1 C-terminus in the membrane turnover as well as in the activity-dependent lateral diffusion of the transporter at the plasma membrane.

      To assess GLT1 dynamics, the authors generated and imaged a pH-sensitive fluorescent analogue of the GLT1a isoform, namely GLT1-SEP, which fluoresces when exposed to the extracellular space but not in low pH intracellular compartments. By performing Fluorescence Recovery After Photobleaching (FRAP) in astrocytes from cell cultures, they show that about 75% of GLT1-SEP dwell at the cell membrane with a lifetime of about 22 s. Super-resolution dSTORM imaging further revealed that surface GLT1 distributes in clusters showing a spatial correlation with the PSD-95 synaptic marker. In astrocytes from cell cultures or brain slices, the authors were able to monitor lateral diffusion of GLT1-SEP at the plasma membrane with FRAP; they recapitulated previous findings based on single molecule tracking experiments and showed that 25% of surface GLT1-SEP remains immobile (or slowly mobile) and that this immobile fraction decreases upon elevated network activity. Interestingly, deleting the C-terminus of GLT1-SEP does not much alter the intracellular fraction of GLT1-SEP, the fraction of immobile GLT1-SEP at the membrane or its ability to organize in clusters under basal conditions. However, GLT1-SEP lacking the C-terminus shows a higher turnover at the membrane under basal conditions, and surface GLT1-SEP clusters are no longer associated with synaptic markers. Finally, removing the GLT1 C-terminus blocks the increase in the mobile fraction that is normally observed upon elevating neuronal activity.
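
      For readers less familiar with FRAP, the mobile and immobile fractions quoted above are conventionally derived from three fluorescence levels: before bleaching, immediately after bleaching, and at the recovery plateau. The sketch below shows that standard normalisation with invented values; it is not necessarily the exact procedure used in the manuscript.

```python
def frap_fractions(f_pre, f_bleach, f_plateau):
    """Return (mobile, immobile) fractions from a FRAP recovery curve.

    f_pre:     fluorescence before bleaching
    f_bleach:  fluorescence immediately after bleaching
    f_plateau: fluorescence at the recovery plateau
    """
    if f_pre <= f_bleach:
        raise ValueError("bleaching must reduce the fluorescence")
    mobile = (f_plateau - f_bleach) / (f_pre - f_bleach)
    return mobile, 1.0 - mobile

# Invented, normalised fluorescence levels for illustration only.
mobile, immobile = frap_fractions(f_pre=1.0, f_bleach=0.2, f_plateau=0.8)
print(mobile, immobile)
```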

      Strengths:

      While previous studies have unveiled a role for the lateral diffusion of GLT1 in controlling the recruitment of GLT1 near active synapses, the present study uses powerful optical approaches and analysis tools that allow for the monitoring of both lateral mobility and the exchange between membrane and intracellular fractions of GLT1. Furthermore, important and original information is provided about the nanoscale organization of the GLT1 transporter in the proximity of synapses and the fact that this organization depends on the C-terminal domain of GLT1. The results unveil an important role for membrane turnover as a possible 'redeployment' route for the immobile fraction of GLT1 at the plasma membrane.

      Weaknesses:

      1) Although overexpressed GLT1-SEP displays a similar expression pattern as endogenous GLT1 (assessed through dSTORM experiments), the expression level of GLT1-SEP relative to endogenous GLT1 has not been addressed by the authors. In particular, whether overexpressing GLT1-SEP impacts glutamate uptake currents and whether this could affect membrane turnover has not been measured.

      2) The authors did not test the impact of neuronal activity on membrane turnover or surface distribution of GLT1-SEP, like they did for lateral mobility. This would be important to provide support for the 'redeployment route' hypothesis that the authors propose.

      3) The FRAP data in organotypic slices looking at the effect of deleting GLT1 C-terminus, blocking mGluRs or buffering Ca2+ with BAPTA on GLT1 lateral mobility (Figure 5C-G) is not very convincing. The trend for lower immobile fraction upon 4AP compared to control is maintained across conditions. The lack of statistical difference between control and 4AP in Fig. 5D, 5E and 5F might come from the smaller n number (n = 30-48) compared to the control condition (n = 72) and/or higher variability.

      4) An assessment of the importance of GLT1 membrane turnover for controlling glutamate spillover ('extrasynaptic glutamate escape') and synaptic transmission/plasticity is missing.

      5) While providing new information about the turnover of the GLT1a isoform, this study does not provide information about other GLT1 isoforms, in particular GLT1b, which contain unique C-terminal domains and which could thus display different membrane and lateral diffusion dynamics. The authors should justify why they focused on this specific isoform.

    2. Reviewer #2:

      State-of-the-art imaging of the dynamics of astroglial glutamate transporters that certainly adds a novel perspective to this quite important and hot field. The experiments are clean and convincing, and the data obtained fully support the conclusions.

      Comments:

      1) The authors mention the importance of efficient glutamate uptake in the development of neuropathological conditions, but do not discuss this in regards to their results. Such a discussion would seem relevant.

      2) Authors conclude that the membrane turnover pathway should be a particularly important GLT-1 resupply mechanism near excitatory synapses as some earlier studies have found the lowest lateral membrane mobility of GLT-1 there. In this context, it would be of interest to have some quantitative tips as to the relationship between the level of excitatory activity and the occupancy of local GLT-1.

      3) There is a recent work implicating the C-terminus in the surface assembly of GLT-1 (Peacy et al Mol Pharm 2020), which seems relevant to the present findings. Please discuss further.

      4) The functional activity of glutamate transporters is linked to (and regulated by) astroglial Na+ signalling. Do the authors have any suggestions as to how the proposed turnover cycle may affect cytosolic Na+ dynamics?

      5) The Fig. 5A imaging data seem to nicely provide both surface and cytosol labelling of the same cell. Perhaps the authors could thus assess the distribution of surface-to-volume ratios across live astroglia: to my knowledge, such data has not yet been available.

      6) Figure 1B: Please provide further detail regarding fast-exchange solution application, its physical arrangement, etc.

      7) Figure 2, whole-cell photobleaching: Please expand on what is 'tornado' mode scanning and how it has been applied.

      8) Figure 3, dSTORM data: Please provide further details regarding the numbers of sampled ROIs and/or individual molecules / distances analysed.

    3. Reviewer #1:

      The astrocyte glutamate transporter GLT1 plays a crucial role in confining the levels of extrasynaptic glutamate, and therefore, understanding the cellular basis by which surface dynamics of GLT1 is regulated has implications for regulating glutamatergic transmission. Here, Michaluk et al. perform FRAP experiments using pHluorin (SEP)-tagged GLT1, and present a careful quantitative characterization of GLT1 surface dynamics that takes into account both lateral diffusion and exocytic delivery. The authors report that 25-30% of surface GLT1 represents an immobile fraction which may be subject to slower exchange via exocytic delivery from intracellular compartments. In addition, the cytoplasmic domain of GLT1 plays a role in regulating GLT1 subcellular localization patterns and its activity-dependent dynamics. While the roles of mGluR and calcium-signaling mechanisms are explored, given that the drugs were applied under conditions in which neurons are equally affected, whether mGluR and calcium signaling involving calcineurin are engaged in astrocytes to impact GLT1 remains to be established. In addition, the super-resolution imaging, which does not discriminate between surface and intracellular pools of GLT1, is not well connected to the FRAP results, which were obtained blind to the location of synapses.

    4. Summary: In this study, Michaluk et al. examined the membrane dynamics of the main glial glutamate transporter GLT1 in hippocampal astrocytes, which was previously shown to shape synaptic transmission through regulating extracellular levels of glutamate. Using GLT1 tagged on its surface with a pH-sensitive fluorescent marker, GLT1-SEP, the authors performed (1) fluorescence recovery after photobleaching (FRAP) experiments to assess the basal and activity-dependent dynamics of surface GLT1-SEP and (2) super-resolution dSTORM imaging to determine the relationship between GLT1 and PSD-95, an excitatory synapse marker. A large proportion of surface GLT1-SEP underwent turnover with a surface lifetime of 22 s, whereas a smaller fraction (~25%) remained largely immobile, which was decreased upon increased activity. Notably, the cytoplasmic domain of GLT1-SEP was shown to attenuate the basal turnover of surface GLT1 and to facilitate its proximal localization to synapses; moreover, GLT1 cytoplasmic domain was required for activity-dependent increase in the mobile fraction.

      While previous studies using single molecule tracking have demonstrated a role for the lateral diffusion of GLT1 in controlling the recruitment of GLT1 near active synapses, the present study uses powerful optical approaches and analysis tools to assess both the surface lateral mobility and the exchange between surface and intracellular pools of GLT1. Furthermore, characterization of the nanoscale organization of GLT1 relative to synapses and its dependence on the C-terminal domain of GLT1 is presented. Altogether, the results are interesting and valuable, and underscore the importance of the GLT1 C-terminus in the membrane turnover and in the activity-dependent lateral diffusion of the surface GLT1. Nevertheless, some of the conclusions are not strongly supported by the data shown.

      Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      One example of this problem is in the estimation of cancer risk. The risk is estimated on the basis of body size and lifespan. However, that lifespan is itself phylogenetically estimated from body size at least for the non-extant species. It is not clear to me from the manuscript whether all lifespans are so estimated, or whether observations are used for the lifespan of the extant species. If the latter, caution is indicated, because lifespan data are highly uneven and often given as observed maximal lifespans, which can be misleading if taken from, for instance, zoo specimens. In either case, the manuscript needs to more clearly emphasize that these are statistically-predicted risks, not measured risks.

      At a larger scale, the authors have done their best with a dataset that suffers from a couple of problems. First, all of the extant very large-bodied animals form a single clade, with the hyrax as the sole small-bodied member of that clade. And since the titanohyrax is extinct, among the extant organisms (the available large-bodied species with genomes) there is then a true large-bodied clade of the Sirenia and elephants and relatives. I understand that other evolutionary data make it clear that these represent two (three including titanohyrax) independent transitions to large body sizes. But with only the modern or nearly modern genomes to work with, I am not sure that the duplication inference procedures and their coupling to the body size analysis statistically represent more than a single observation (e.g., a default of a single transition to large size along the Tethytheria branch).

      Similarly, the authors observe what appears to be a number of independent duplications of tumor suppressors in African and Asian elephants: duplications that are lacking in many of the ancient genomes considered. I know that the authors used rigorous statistical methods to correct for the fragmented nature of these ancient genomes, but it is very hard not to wonder whether some of the data in Figure 4 are in fact an artifact of using ancient genomes, where detecting recent gene duplications may be very difficult (several of the Asian and African elephant duplications in Figure 4 appear to be of the same genes). If these events are truly independent and not genome assembly/annotation artifacts, there is then an alternative hypothesis to propose. Thus, are the authors suggesting that there is a rapid turnover in the duplication of tumor suppressors, such that all elephants have such duplicates, but the particular duplications have short life spans and differ from species to species?

      Finally, it would be nice to see a few more comments on the manatee genome and why it does (or doesn't) show the expected patterns for the genome evolution in the face of the evolution of larger body sizes.

      I would also note that Figures 3 and 4 would benefit from greatly expanded captions: I do not fully understand what is being illustrated in, for instance, Figure 3B. Why are certain dots connected with lines? Intersections between what in the y-axis label?

    2. Reviewer #2:

      This manuscript addresses the question of whether duplication of tumor suppressors occurred coincidently with the enlarged body size and reduced cancer risk that evolved independently in Afrotherians. Using the human genome as reference, the authors systematically searched for gene duplications in 13 publicly available Afrotherian genomes, including 9 extant and 4 extinct species. The authors also reconstructed the ancestral body sizes, cancer risks and gene duplication events across the Afrotherian phylogeny. These data showed that both increased body sizes and reduced cancer risks evolved gradually. Reactome pathway enrichment analysis for gene duplicates showed unexpectedly that gene duplicates in lineages both with and without major increases in body size/lifespan or decreases in cancer risk are enriched in many cancer-related pathways. However, the authors found that 157 genes duplicated in the Proboscidean stem-lineage, in which extremely large species evolved, were uniquely enriched in 12 cancer pathways. These genes might have facilitated further body enlargement and the evolution of cancer resistance in Proboscideans. Most interestingly, the authors found that several genes both upstream and downstream of the well-known tumor suppressor TP53 have also been duplicated, either before or after the initial TP53 duplication. These genes are involved in transcriptional regulation of TP53 and may have facilitated re-functionalization of TP53 retroduplicates. Overall, this is an important and interesting study that can help us understand the evolution of body size, lifespan and cancer risk in mammals more deeply.

      Major comments:

      1) In general, the possible evolutionary fates of a gene duplicate include: 1) Conservation of gene function; 2) Neofunctionalization; 3) Pseudogenization; 4) Subfunctionalization (doi:10.1016/S01695347(03)00033-8). To carry out the tumor suppression function that this study focuses on, gene duplicates must be functionally conserved or subfunctionalized. Gene duplicates that have been neofunctionalized or pseudogenized will not be helpful (as also mentioned by the authors in the Caveats section). Therefore, it would be more convincing to investigate the functional status of each gene duplicate, especially those in Fig 4C/D. Because in many cases a related function, rather than an entirely new function, evolves by neofunctionalization after gene duplication, and because checking for new functions across a batch of genes is not realistic, the authors could simply check the coding sequences to ensure that these gene duplicates are not pseudogenes and are functional. This is necessary because in Fig 4D, many genes have only 2 copies expressed. If one of them is a young pseudogene, it could be stochastically expressed and would encode a dysfunctional protein.
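
      As an indication of how simple such a coding-sequence check could be, the sketch below flags the most obvious signs of pseudogenization in a predicted coding sequence: a length that is not a multiple of three, a missing start or terminal stop codon, and premature stop codons. It is a crude first-pass screen with hypothetical names and toy sequences; it does not replace alignment-based inspection against the parental copy.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def coding_sequence_problems(cds):
    """Return a list of obvious defects in a predicted coding sequence."""
    seq = cds.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of three (possible frameshift)")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("missing start codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("missing terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    return problems

# Toy sequences for illustration only.
print(coding_sequence_problems("ATGGCCTAA"))
print(coding_sequence_problems("ATGTAAGCCTAA"))
```

      Duplicates that pass such a screen would still need to be compared with the parental gene, but those that fail could be excluded or flagged in Fig 4C/D.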

      2) In Results section 3, the cancer pathway frequency data for many nodes seem inconsistent with the data shown in Table 2. For example, lines 293-296 state: "55.8% (29/52) of the pathways that were enriched in the Tethytherian stem-lineage..., 27.8% (20/72) of the pathways that were enriched in the Proboscidean stem-lineage...were related to tumor suppression", whereas the cancer pathway percentages shown in Table 2 for these 2 nodes are 63.4% and 38.81%, respectively. The frequency data in Table 2 are, however, consistent with Supplementary Data File S3: "Atlantogenata_Reactome_ORA.xlsx". It is possible that the frequency data shown in the main text are specific to pathways of tumor suppression, rather than cancer-related pathways. If this is the case, more detailed data should be shown somewhere else.

      3) The titles of Results section 3 and section 4 are highly similar and actually the data in section 4 seems to be used to further solidify the conclusion of section 3. Therefore, is it possible to merge them into one single section?

    3. Reviewer #1:

      The strength of this paper is its coupling of careful phylogenetic work with genomics to demonstrate the take-home message: all afrotherians are equal, but some are more equal than others with respect to mechanisms that reduce cancer risk. This is a significant advance in our understanding of the evolution of cancer risk with body size, and in so doing it considerably lengthens the list of genes of interest. It also has interesting examples illustrating the logical criteria of consistency, necessity, and sufficiency that will make it quite useful in teaching critical thinking to students.

    1. Reviewer #3:

      The present manuscript focuses on a subpopulation of layer 5 neurons in medial and lateral entorhinal cortex and its functional connections to target neurons in layers 2, 3 and 5. The authors show a difference in LVb-to-LVa connectivity between MEC and LEC. The results suggest that the entorhinal output circuit via LVb-to-LVa is present primarily in LEC.

      The work relies on and is made possible by a newly described transgenic mouse (TG) where LVb neurons can be labeled and stimulated with light. The authors showed that these neurons are largely co-labeled with PCP4, a marker for LVb. They compared the apical dendritic extent from TG labeled cells (LVb) and Nac retrogradely labeled cells (LVa) in medial and lateral EC. The intrinsic electrophysiological properties of LVa and LVb neurons were measured and used for PCA showing segregation according to sublayer and region. The axonal distribution and translaminar local connections of LVb neurons from the TG mice were then examined. Cells were recorded in vitro and filled with biocytin, both from MEC and LEC, with multiple cells in the same slice, documented with high quality images. The study of the LVb translaminar connectivity via a direct comparison of postsynaptic responses in neurons in different layers in the same slice is the gold standard for this type of functional connectivity analysis. There is also an investigation of mixed excitatory-inhibitory postsynaptic response sequences, and evidence for a dorso-ventral gradient in LVb-to-LVa connectivity in MEC is given.

      The study combines TG mice, immunolabeling, retrograde labeling, morphological analysis and in vitro electrophysiology with optogenetic photo-stimulation. While it builds on already published work by the same group and others, by comparing the local target neurons of LVb in MEC and in LEC, the manuscript provides a unique contribution to the literature on the laminar circuit organization in the Entorhinal Cortex. In view of the central position of this area in the hippocampal memory systems of the rodent brain, these results are of interest to a broader neuroscience audience. It is also a nice example of a bottom-up approach, where data on the entorhinal translaminar connectivity may influence and constrain theories of hippocampal-cortical processing.

      Major Comments:

      1) Almost all TG-labelled neurons are positive for PCP4, but not vice versa: only 45.9 and 30.P% of PCP4+ neurons in LEC and MEC are labeled in the TG mouse (page 5), leaving open the possibility that the TG mouse labels a (specific?) subset of LVb neurons. Did you test whether TG-labeled LVb cells co-localize with Ctip2?

      2) The direct comparison of translaminar connectivity of LVb neurons is very convincing. But if your main conclusion (title) concerns the difference of LVb-to-LVa connectivity between MEC and LEC, it would have been more appropriate to test that in the same slice. While the data strongly support conclusions on the laminar differences of LVb connectivity, the evidence for differences in LVb-to-LVa connectivity between MEC and LEC is a bit weaker and more indirect.

      3) Postsynaptic responses (in mV) in LEC are about twice as high in amplitude as in MEC (Fig. 4E vs Fig 5E), across all layers. Please discuss possible reasons, and the possible impact on circuit function. Is the probability of initiating action potentials higher in LEC?

      4) Give the onset latencies of postsynaptic excitatory potentials induced by LVb photostimulation. Are the latencies consistent with monosynaptic connections, or are some polysynaptic? Ideally this could be tested by applying a cocktail of TTX and 4-AP.

      5) Figure 4 S3, Fig 5 S2. Analysis of inhibition. What is the cut-off criterion for deciding whether inhibition is present? It might be more appropriate to give the I/E ratio.

    2. Reviewer #2:

      The study investigates key components of the entorhinal circuits through which signals from the hippocampus are relayed to the neocortex. The question addressed is important but the stated claim that layer 5b (L5b) to layer 5a (L5a) connections mediate hippocampal-cortical outputs in LEC but not MEC appears to be an over-interpretation of the data. First, the experiments do not test hippocampal to L5a connections, but instead look at L5b to L5a connections. Second, the data provide evidence that there are L5b to L5a projections in LEC and MEC, which contradicts the claim made in the title. These projections do appear denser in LEC under the experimental conditions used, but possible technical explanations for the difference are not carefully addressed. If these technical concerns were addressed, and the conclusions modified appropriately, then I think this study could be very important for the field and would complement well recent work from several labs that collectively suggests that information processing in deep layers of MEC is more complex than has been appreciated (e.g. Sürmeli et al. 2015, Ohara et al. 2018, Wozny et al. 2018, Rozov et al. 2020).

      Major Concerns:

      1) An impressive component of the study is the introduction of a new mouse line that labels neurons in layer 5b of MEC and LEC. However, in each area the line appears to label only a subset (30-50%) of the principal cell population. It's unclear whether the unlabelled neurons have similar connectivity to the labelled neurons. If the unlabelled neurons are a distinct subpopulation then it's difficult to see how the experiments presented could support the conclusion that L5b does not project to L5a; perhaps there is a projection mediated by the unlabelled neurons? I don't think the authors need to include experiments to investigate the unlabelled population, but given that the labelling is incomplete they should be more cautious about generalising from data obtained with the line.

      2) For experiments using the AAV conditionally expressing oChIEF-citrine, the extent to which the injections are specific to LEC/MEC is unclear. This is a particular concern for injections into LEC where the possibility that perirhinal or postrhinal cortex are also labelled needs to be carefully considered. For example, in Figure 3D it appears the virus has spread to the perirhinal cortex. If this is the case then axonal projections/responses could originate there rather than from L5b of LEC. I suggest excluding any experiments where there is any suggestion of expression outside LEC/MEC or where this can not be ruled out through verification of the labelling. Alternatively, one might include control experiments in which the AAV is targeted to the perirhinal and postrhinal cortex. Similar concerns should be addressed for injections that target the MEC to rule out spread to the pre/parasubiculum.

      3) It appears likely from the biocytin fills shown that the apical dendrites of some of the recorded L5a neurons have been cut (e.g. Figure 4A, Figure 4-Supplement 1D, neuron v). Where the apical dendrite is clearly intact and undamaged, synaptic responses to activation of L5b neurons are quite clear (e.g. Figure 4-Supplement 1D, neuron x). Given that axons of L5b cells branch extensively in L3, it is possible that any synapses they make with L5a neurons would be on their apical dendrites within L3. It therefore seems important to restrict the analysis only to L5a neurons with intact apical dendrites; a reasonable criterion would be that the dendrite extends through L3 at a reasonable distance (> 30 μm?) below the surface of the slice.

      4) Throughout the manuscript the data is over-interpreted. Here are some examples:

      • The title over-extrapolates from the results and should be changed. A more accurate title would be along the lines of "Evidence that L5b to L5a connections are more effective in lateral compared to medial entorhinal cortex".

      • "the conclusion that the dorsal parts of MEC lack the canonical hippocampal-cortical output system" seems over-stated given the evidence (see comments above).

      • Discussion, para 1, "Our key finding is that LEC and MEC are strikingly different with respect to the hippocampal-cortical pathway mediated by LV neurons, in that we obtained electrophysiological evidence for the presence of this postulated crucial circuit in LEC, but not in MEC". This is misleading as there is also evidence for L5b to L5a connections in MEC, although this projection may be relatively weak. Recent work by Rozov et al. demonstrating a projection from intermediate hippocampus to L5a provides good evidence for an alternative model in which MEC does relay hippocampal outputs. This needs to be considered.

      5) What proportion of responses are mono-synaptic? How was this tested?

    3. Reviewer #1:

      The current study by Ohara et al. describes differences in the connectivity patterns between LVb and LVa. The study builds on the authors' previous study (Ohara et al., 2018), in which they showed the intrinsic connectivity of LVb neurons in the MEC and LEC. The focus of the current study is the difference the authors observed in the strength of connectivity between LVb and LVa in the MEC and LEC. The authors suggest that in the MEC, LVb neurons do not provide substantial direct input to LVa neurons. The manuscript emphasizes the functional importance of this difference, as the authors suggest that "...hippocampal -cortex output circuit is present only in LEC, suggesting that episodic systems consolidation predominantly uses LEC-derived information and not allocentric spatial information from MEC." The study uses a newly developed mouse line to investigate connectivity differences; this is a nice technical approach and the experimental data are of high quality. While the data are solid, the authors tend to over-interpret their findings from the functional point of view. While the observed difference is quite interesting, it is unclear what the impact is on information flow in the MEC and LEC and to what degree they differ functionally. The authors assume major differences and their work is framed around these expected differences, but the manuscript does not provide data that would demonstrate functionally distinct features.

      Major Comments:

      1) Throughout the text the authors treat their findings as if they were 'all-or-none', i.e., the LEC has a direct connection between LVb and LVa while the MEC does not. This does not seem to be the case based on their data: connectivity in the MEC is less robust, but it is definitely there. The difference seems to be quantitative and not qualitative.

      2) Due to this problem, the authors seem to be over-interpreting their data by suggesting that the information flow must be significantly different conceptually in the LEC and the MEC and this would have important implications for memory consolidation. It is impossible to draw these conclusions based on the data presented, as there are no experiments investigating the functional, network level consequences of these connectivity differences.

      3) The electrophysiology experiments provide information about the basic parameters of the investigated cells, but these lack a physiological context that would allow the authors to evaluate the consequences of these differences on information flow and/or processing in the MEC and the LEC.

      4) The study uses a novel transgenic mouse line to differentiate between LVb and LVa neurons. While this is definitely a strength of the study, this strategy allows the authors to visualize only ~50% of LEC and ~30% of MEC neurons. Since the authors aim to prove a negative (MEC does not have a direct connection), the fact that ~70% of the neurons are not labelled could be problematic.

    4. Summary: The study addresses a fundamentally important question regarding the connectivity of LVb and LVa in the medial and lateral entorhinal cortex. The authors suggest that the LVb-to-LVa connection exists in the LEC but not in the MEC. This finding would have important implications for studies investigating circuit function between the hippocampus and the EC. All three reviewers found the central question important and the data novel. However, there are several technical issues that limit the robustness of the authors' claim.

      1) While the transgenic animal used in these experiments is elegant and novel, it only labels a subpopulation of the neurons. There is a possibility of selective labelling of neurons with distinct connectivity patterns. The authors would need to show that their approach is not leading to false negative results due to the selective visualization of those neurons that project more modestly to the LVa.

      2) The specificity of the injection to the LEC/MEC should be better documented and potential spread to the perirhinal or postrhinal cortex carefully excluded.

      3) The findings are presented as if the LVb-to-LVa connection did not exist at all in the MEC; however, the data show that the connection is there but is significantly less dense than in the LEC. Given the graded finding, if the authors aim to support their central claim that this connection does not mediate hippocampo-cortical outputs in the MEC, this would require the addition of functional studies.

    1. Reviewer #2:

      Overall, this is a very well written paper that presents software that fills an interesting niche: interactive, real-time simulations of complex multicellular systems that can run in a web browser, without any need for users to install or configure software. As the authors describe, this enables new modes of education, science communication, and multidisciplinary collaboration. The software itself is impressive, and the supplied examples are clean and beautifully fluid. It is eye-opening that Javascript can run these models so well. The authors also did a fantastic and complete job in sharing their full source code, from the overall software down to individual scripts used to generate figures.

      Some points that the authors should address in a revision:

      1) Suitability of the software for researchers:

      a. Artistoo simulations do not appear to have any method to save data for external manipulation and archival. This makes their use somewhat less applicable to robust simulation-driven investigations, particularly where postprocessing and further analyses are required.

      b. It is unclear if Artistoo-based models can be exported into other cellular Potts (CP) frameworks such as CC3D or Morpheus. This may leave researcher end users without a clear "upgrade path" after exploring model ideas in Artistoo and moving to larger simulations (e.g., larger or more complex domains), running simulations in high throughput on HPC resources, or adapting approximate Bayesian techniques for parameter estimation that require automating many simulation runs. Without an upgrade path, such users may wish to immediately begin in research-focused platforms rather than start with Artistoo and re-implement in another framework later.

      c. Similarly, it is unclear if a model developed in Morpheus or CC3D can be directly imported into Artistoo. If such an import were possible rather than re-implementing models in Artistoo, research-focused users would be more likely to use Artistoo for scientific communication and outreach.

      2) Need for improved educational scaffolding: The examples provided in the paper are excellent. However, they lack context on what the parameters mean or do. (For example, what are max_act and lambda_act in the cell migration model?) This may limit the educational impact because users will be unclear on what to change, and how the parameters relate to cell biophysical processes.

      The authors should include more background information with each model, define parameters, and give end users some idea of what to expect when parameters are changed. We have also found it useful to help guide a new user's exploration of a model by suggesting parameter sets and describing what they should see. This can serve as an educational scaffolding to help learners build and grow.

      The authors' sample models should serve as a template to Artistoo users on best practices for communicating models to diverse audiences.

      3) New developments in online cellular Potts simulators: The authors should note that CompuCell3D has recently been ported to run interactively online in a web browser. See https://nanohub.org/resources/compucell3d. This recent development should be addressed in the paper.

      4) Narrow review of interactive, "zero install" simulation frameworks: The authors focus too narrowly by only comparing Artistoo with other cellular Potts frameworks, while the main use case for Artistoo is for interactively sharing and communicating complex simulation models online.

      The authors should discuss non-CP frameworks that worked towards this, such as CC3D on nanoHUB (see above), online Tellurium (https://nanohub.org/resources/tellurium), current practice to share R models online as Shiny apps, and recent work to use xml2jupyter to automatically convert research-focused (command line) PhysiCell models to interactive Jupyter notebooks that can be shared as interactive webapps on nanoHUB (e.g., https://nanohub.org/tools/pc4cancerimmune). All of these serve similar purposes of creating zero-install, interactive versions of models for science education and communication. The authors should briefly discuss these to further contextualize their work.

      5) While this is a more minor point, I would feel more comfortable if the supplementary information had convergence and accuracy testing. Are there limits on computational step sizes for numerically accurate simulations, particularly for large energies or when including diffusion processes?

      Overall, this is some fantastic work.

    2. Reviewer #1:

      The authors present a novel framework for running CPM simulations in the web browser. The CPM framework is a well-established modelling methodology for cells and tissues. Several other well-established simulation platforms exist; however, they do not run in the web browser, and require varying amounts of setup. This often presents an insurmountable roadblock since many researchers do not have the required software packages or expertise to read, execute, and run models in different formats. Artistoo on the other hand promises a zero-install experience for end-users, and ease of model construction for modellers.

      The unique feature of Artistoo is that it runs in the web browser. This allows users to execute simulations in a zero-install setting. In the web browser users can change model parameters, and observe resulting effects instantly. Extending or modifying models requires the user to know JS. Artistoo implements core modern CPM features. Artistoo is successfully benchmarked against the existing software Morpheus. The source code is available on GitHub. A wiki with apparently complete and extensive documentation is available.

      The authors argue for three main avenues of impact: (1) accelerated feedback loops on models with experimental collaborators, (2) science communication, and (3) in teaching.

      The authors' points have merit, point (1) in particular. Installation and execution of tissue modelling software by non-experts is a well known challenge. Artistoo elegantly avoids this issue, by allowing models to be shared via the web browser. The non-expert is able to gain insights into model dynamics, and can explore the model's parameter space at ease. This approach has the potential of stimulating more frequent feedback between experimentalists and modellers, and maybe even the adoption of such a model by experimentalists.

      There is no markup language support. The software package Morpheus describes simulations using a markup language, allowing non-expert users to assemble complex models without writing a single line of C++, while at the same time preserving exact details of each simulation run. Morpheus is (as far as I know) the only platform based on a markup language. It would be fantastic if Artistoo could read and execute Morpheus ML files. From a technical point of view this should be possible. This would mean that all 'Morpheus' models become 'Artistoo' models, so that Artistoo would become the standard for sharing CPM models with collaborators. Finally, the markup language would allow novices to implement new models without being discouraged by the JS requirement. Adopting a common markup language between projects would be the first example of standardization across open-source CPM software packages.

      I can see Artistoo being adopted by CPM modellers who want to share models with collaborators or with a wider audience (science communication). It may also find adoption in teaching. At the same time, the adoption of Artistoo faces some challenges: (1) among modellers, existing platforms have more features, are familiar, and have similar computational efficiency; (2) existing models would have to be rewritten in the Artistoo framework.

    1. Reviewer #3:

      Park et al. present an analysis of how structural connectomes (estimated with diffusion MRI) change from childhood to young adulthood. To characterize the changes, they embed each connectome into a 3-dimensional space using nonlinear dimensionality reduction (and alignment to a template sample), and then perform a range of analyses of the statistics derived from this space (notably, distances to the template centroid, 'eccentricity'). The paper is well written, the data are fantastic, and the analyses are interesting, but I have a range of methodological concerns.

      1) Interpretability and Lack of Comparison: The authors claim repeatedly that they are "capitalizing on advanced manifold learning techniques". One could imagine an infinite number of papers that take a dataset, use a technique to extract a metric, X (e.g., eccentricity), and then write about the changes in X with some property of interest, Y (e.g., age). Given this set of papers (and the non-independence between the set of possible Xs), the reader ought to be most interested in those Xs that provide the best performance and simplest interpretation, with other papers being redundant. Thus, a nuanced approach to presenting a paper like this is to demonstrate that the metric used represents an advance over alternative, simpler-to-compute, or clearer-to-interpret metrics that already exist. In this paper, however, the authors do not demonstrate the benefits of their particular choice of applying a specific nonlinear dimensionality reduction method using 3 dimensions, aligning to a template manifold, and then computing an eccentricity metric. For example:

      i) Is the nonlinearity required (e.g., does it outperform PCA or MDS)?

      ii) Is there something special about picking 3 dimensions to do the eccentricity calculation? Is dimensionality reduction required at all (e.g., would you get similar results by computing eccentricity in the full-dimensional space?)

      iii) Does it outperform basic connectome measures (e.g., the simple ones the authors compute)?

      There is a clear down-side of how opaque the approach is (and thus difficult to interpret relative to, say, connectivity degree), so one would hope for a correspondingly strong boost in performance. The authors could also do more to develop some intuition for the idea of a low-dimensional connection-pattern-similarity-space, and how to interpret taking Euclidean distances within such a space.
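
      To make the quantity under discussion concrete: eccentricity, as described above, is the distance of each node's embedding coordinates to the template centroid. The following is a minimal sketch on synthetic data, assuming plain Euclidean distance; the function name `eccentricity` and the toy arrays are hypothetical and are not taken from the authors' code. Because the function accepts any number of columns, the same call could be used to compare a 3-dimensional embedding with a higher-dimensional one, as asked in point ii).

```python
import numpy as np

def eccentricity(embedding, centroid):
    """Euclidean distance of each node (row) to a reference centroid.

    embedding: (n_nodes, n_dims) array of manifold coordinates.
    centroid:  (n_dims,) array, e.g. the mean of the template manifold.
    """
    embedding = np.asarray(embedding, dtype=float)
    centroid = np.asarray(centroid, dtype=float)
    return np.linalg.norm(embedding - centroid, axis=1)

# Synthetic stand-ins (assumption): a 200-node, 3-dimensional template
# and one "subject" that is a noisy copy of it.
rng = np.random.default_rng(0)
template = rng.normal(size=(200, 3))
template_centroid = template.mean(axis=0)
subject = template + rng.normal(scale=0.1, size=template.shape)

# One eccentricity value per node for this subject.
ecc = eccentricity(subject, template_centroid)
```

      Comparing such node-wise values against a simple measure like connectivity degree would be one way to address point iii).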

      2) Developmental Enrichment Analysis: Both in the main text and in the Methods, this is described as "genes were fed into a developmental enrichment analysis". Can some explanation be provided as to what happens between the "feeding in" and what comes out? Without clearly described methods, it is impossible to interpret or critique this component of the paper. If the methodological details are opaque, then the significance of the results could be tested numerically relative to some randomized null inputs being 'fed in' to demonstrate specificity of the tested phenotype.

      3) IQ prediction: The predictions seem to be very poor (equality lines, y = x, should be drawn in Fig. 5, to show what perfect predictions would look like; linear regressions are not helpful for a prediction task, and give a misleading impression relative to the appropriate MAE computation). The authors do not perform any comparisons in this section (even to a real baseline model like predicted_IQ = mean(training_set_IQ)). They also do not perform statistical tests (or quote p-values), but nevertheless make a range of claims, including of "significant prediction" or "prediction accuracy was improved", "reemphasize the benefits of incorporating subcortical nodes", etc. All of these claims should be tested using rigorous statistics and comparisons to appropriate baseline/benchmark approaches.

      4) Group Connectome: Given how much the paper relies on estimating a group structural connectome, it should be visualized and characterized. For example, a basic analysis of the distribution of edge weights and degree would be useful, especially as edge weights can vary over orders of magnitude, and high weights (more likely at short distances) may therefore unduly dominate some of the low-dimensional components. The authors may also consider testing robustness to alternative ways of estimating the connectome [e.g., Oldham et al. NeuroImage 222, 117252 (2020)] and its group-level summary [e.g., Roberts et al. NeuroImage 145, 1-42 (2016)].

      5) Individual Alignment: The paper relies on individuals being successfully aligned to the template manifold. Accordingly, some analysis should be performed quantifying how well individuals could be mapped. Presumably some subjects fit very well onto the template, whereas others do not. Is there something interesting about the poorly aligned subjects? Do your results improve when excluding them?
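
      The baseline comparison requested in this point can be sketched as follows. This is a minimal illustration on made-up numbers, not the authors' pipeline: the helper names (`mae`, `mean_baseline_mae`, `permutation_p_value`) and the toy IQ values are hypothetical, and the permutation test shown is only one of several reasonable ways to attach a p-value to a prediction error.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mean_baseline_mae(y_train, y_test):
    """MAE of the trivial model predicted_IQ = mean(training_set_IQ)."""
    constant = np.full(len(y_test), np.mean(y_train))
    return mae(y_test, constant)

def permutation_p_value(y_test, y_pred, n_perm=1000, seed=0):
    """Fraction of label shuffles whose MAE is at least as good as observed."""
    rng = np.random.default_rng(seed)
    observed = mae(y_test, y_pred)
    null = np.array([mae(rng.permutation(y_test), y_pred)
                     for _ in range(n_perm)])
    return (1 + np.sum(null <= observed)) / (1 + n_perm)

# Toy numbers (assumption): held-out IQ scores and some model's predictions.
y_train = np.array([95.0, 105.0, 100.0, 110.0, 90.0])
y_test = np.array([92.0, 101.0, 108.0, 115.0])
y_pred = np.array([98.0, 100.0, 104.0, 107.0])

model_mae = mae(y_test, y_pred)
baseline_mae = mean_baseline_mae(y_train, y_test)
p_value = permutation_p_value(y_test, y_pred)
```

      A claim of "significant prediction" would then rest on `model_mae` being clearly lower than `baseline_mae` on held-out data, together with a small `p_value`.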

    2. Reviewer #2:

      Park et al. report on an analysis of existing semi-longitudinal NSPN 2400 data to learn how the projections of high-dimensional structural connectivity patterns onto a three dimensional subspace change with age during adolescence. They employ a non-linear manifold learning algorithm (diffusion embedding), thereby linking the maturation of global structural connectivity patterns to an emerging approach in understanding brain organization through spatial gradient representations. As might be expected based on the large body of literature indicating changes in structural connectivity in specific brain regions during adolescence, the authors find corresponding changes in the embedding of the structural connectivity patterns.

      While this work touches on an important topic, ties nicely with the increasing body of papers on global brain gradients, and its overall conclusions are warranted, I am not (yet) convinced that it offers fundamentally new insights that could not have been gleaned from previous work (after all, manifold learning simply displays a shadow of the underlying patterns; if the patterns change, so does their shadow). I am also not convinced by the rationale for employing diffusion embedding: the authors state that the ensuing gradients are heritable, conserved across species, capture functional activation patterns during task states, and provide a coordinate system to interrogate brain structure and function, but that would be true for any method that adequately captures biologically meaningful variance in the structural connectivity patterns.

      Other comments:

      The authors show that the maturational change of the manifold features predicts intelligence at follow-up, but did not show that intelligence itself exhibited changes that exceeded the error bounds of the regression line. Why not predict IQ change?

      The slight improvements in prediction accuracy observed after adding maturational change and subcortical features to the features at baseline will necessarily happen by adding more regression parameters and may not be meaningful.
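
      The point about extra regression parameters can be illustrated for the in-sample case with ordinary least squares: adding columns to the design matrix can never increase the residual sum of squares, even when the added columns are pure noise. The sketch below uses synthetic data and hypothetical names (`rss`, `baseline_features`, `noise_features`); it does not reproduce the authors' regression, and whether the effect carries over to their setting depends on how prediction accuracy was evaluated (for example, on held-out data).

```python
import numpy as np

def rss(features, y):
    """In-sample residual sum of squares of an OLS fit with intercept."""
    design = np.column_stack([np.ones(len(y)), features])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return float(residuals @ residuals)

# Synthetic data (assumption): the outcome depends on 3 baseline features only.
rng = np.random.default_rng(0)
n_subjects = 50
baseline_features = rng.normal(size=(n_subjects, 3))
y = baseline_features @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n_subjects)

# Five extra features that carry no information about the outcome.
noise_features = rng.normal(size=(n_subjects, 5))

rss_baseline = rss(baseline_features, y)
rss_extended = rss(np.column_stack([baseline_features, noise_features]), y)
# rss_extended <= rss_baseline holds by construction of least squares.
```

      For this reason an apparent improvement is only informative if it is assessed out-of-sample or with a penalty for model complexity.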

    3. Reviewer #1:

      This manuscript describes a longitudinal study of the adolescent structural connectome. The authors find strong effects of expansion of structural connectomes in transmodal brain regions during adolescence. They also report findings centered on the caudate and thalamus, and supplement the structural connectivity analyses with transcriptome association analyses revealing genes enriched in specific brain regions. Finally, intelligence measures are predicted from baseline structural measures. This is an interesting and comprehensive set of analyses on an important topic. Overall, the figures are lovely. The sensitivity analyses are particularly commendable. Some suggestions and points for clarification are below.

      There is not much in the introduction about why co-localized gene sets are of interest to explore. What is already known about brain development using this approach, and how does the current work fill a gap in our knowledge?

      Similarly, the introduction states that the study aims to "predict future measures of cognitive function". What cognitive functions specifically were of interest in this study, and why? No rationale or background is provided for conducting these analyses.

      The authors claim that their study examines "the entire adolescent time period", however some would argue that age 14 does not represent the earliest age at which adolescence onsets. I think it would be more accurate to say the study covers the mid to late adolescent period.

      In the results (page 4) it is stated that three eigenvectors explained approximately 50% of the variance in the template affinity matrix. Here it would be helpful to report exactly how much of the variance was explained by each (E1, E2, E3).

      Pubertal development occurs across the age range investigated, and affects brain structure and function. Was information on pubertal stage of participants available? Did some participants undergo changes in pubertal status from timepoint 1 to timepoint 2?

      The introduction does not mention cortical thickness much, therefore these analyses come as a bit of surprise in the results.

      As in the introduction, there is not much interpretation of the transcriptome findings in the discussion.

      For constructing the structural connectome, the Schaefer 7-network atlas was utilized. Can the authors comment on why a functional atlas (rather than a structural atlas) was used here?

    4. Summary: This manuscript describes a longitudinal study of the adolescent structural connectome. Park et al. report on an analysis of existing semi-longitudinal NSPN 2400 data to learn how the projections of high-dimensional structural connectivity patterns onto a three dimensional subspace change with age during adolescence. They employ a non-linear manifold learning algorithm (diffusion embedding), thereby linking the maturation of global structural connectivity patterns to an emerging approach in understanding brain organization through spatial gradient representations. The authors find strong effects of expansion of structural connectomes in transmodal brain regions during adolescence. They also report findings centered on the caudate and thalamus, and supplement the structural connectivity analyses with transcriptome association analyses revealing genes enriched in specific brain regions. Finally, intelligence measures are predicted from baseline structural measures.

      This is an interesting and comprehensive set of analyses on an important topic. Overall, the figures are lovely. The sensitivity analyses are particularly commendable. The paper is well written, the data are fantastic, and the analyses are interesting. Some suggestions and points for clarification (both theoretical and methodological) are below.

      Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      -The authors claim in the first part of the results that the frequency of CSF-cN spontaneous activity is the same in juvenile and adult mice. In Fig.1G, 61 neurons from 7 animals are illustrated. The authors should state how many juvenile (P14-P24) and adult (P36-P47) mice have been included in the analysis (3 and 4 is different from 5 and 2) and how many neurons have been recorded in each animal. In the methods section, they indicate that acute slices were obtained from P14 to P55 mice. If the reviewer is correct, neurons from P55 mice are not included in Fig. 1G?

      -The immunohistochemical data have been obtained in P30-P52 mice. Are P14 CSF-cNs all VGaT positive?

      -The frequency of CSF-cN spontaneous activity could be the same, but the underlying mechanisms could differ completely with age. In Fig. 3, TTX fails to alter spontaneous Ca2+ spike expression in 3 animals. How old are these mice? The same question applies to the results with Cd (2 animals, a rather small sample), ML218 (4 animals), etc.

      -The focal ejection of 40mM K+ triggers a depolarization of all CSF-cNs "including those previously silent". This is the first time (page 9) that the authors mention the fact that some CSF-cNs are not spontaneously active. Is the proportion of silent CSF-cNs different with age? The effects of Cd have been tested in 1 animal. The same holds for the effect of MCA on Ach-evoked Ca2+ spikes. In my opinion, the sample size has to be increased.

    2. Reviewer #2:

      The present study investigates how CSF-contacting neurons (CSFcNs) of the mouse spinal cord integrate and translate different synaptic inputs using distinct calcium-dependent spike mechanisms. Indeed, two different types of voltage-gated calcium channels can be activated, resulting in the generation of spikes with different amplitudes. T-type Ca2+ channels would be involved in the generation of low-amplitude spikes, while HVA-Ca2+ channels participate in the generation of large-amplitude spikes. These distinct spikes then allow different neurotransmitter systems to be signaled. Consequently, the data provided here argue in favor of CSF-contacting neurons acting as a sensory system that uses Ca2+ channel-dependent spike activity with graded amplitude corresponding to the activation of different neurotransmitter receptors. This study is based on two-photon calcium imaging performed on spinal cord slice preparations obtained from young and adult mice. My comments are as follows:

      1) All data are based on calcium imaging. Therefore, traces correspond to calcium-dependent fluorescence changes in the cells of interest. Can the authors provide at least one sample showing that these calcium events are indeed linked to the generation of spikes, i.e., electrophysiological recordings? In addition, is there any electrophysiological evidence for the existence of calcium-dependent conductances in the CSFcNs? In the same vein, the authors conclude that spontaneous activity of CSFcNs depends upon calcium spikes but not sodium spikes, as TTX has apparently no effect. But are the authors sure that, in their experimental conditions, individual sodium spikes could be detected given the genetically encoded probe used, the kinetics of such spikes and the sampling frequency during image acquisition? Note that this does not preclude the conclusion that CSFcNs express calcium-dependent spikes. See also comment 4 below.

      2) Using the activation of different calcium channels to trigger spikes of different amplitude to code distinct signaling pathways associated with distinct neurotransmitter systems is a very attractive mechanism. I was wondering whether the authors ever observed the two processes in one single cell, meaning: did they ever try to apply Ach and ATP on the same cell? In my view, this would be an extremely elegant way to show that spikes of variable amplitudes imply the activation of distinct calcium-dependent conductances and are linked to different neurotransmitter signaling in one neuron. This should be possible, as they said that 100% of the examined cells responded to Ach, suggesting that the only limitation would be to find a cell that also expresses purinergic receptors (which should be highly feasible). In addition, showing that this coding mechanism is present in a single cell would strongly demonstrate how valuable it is; otherwise, one could consider that the coding system just depends upon each cell, the neurotransmitter and its associated receptor signaling, which by definition can involve distinct calcium-dependent channels. It would then be a mechanism specific to each receptor rather than a sophisticated coding system.

      3) As a general comment on figures, I would suggest that the authors provide samples that are more illustrative of the results they claim. For example, in Figure 3 they state that TTX has no effect on spike amplitude and frequency, but the two traces shown (in blue and green) rather indicate a decrease in spike frequency and even an increase in spike amplitude after a few minutes of recording (green trace). [See also comment 4 below]. Another example is Figure 6, in which one important finding is the distinct amplitude of spikes triggered by either Ach or ATP. While this is properly illustrated in panels C, D and E, the samples chosen for panels A and B show events with the exact same amplitude. Please choose other traces. By the way, panel C is not necessary because the same information is included in panels D and E. I would suggest removing panel C. Finally, for Figure 7 it is stated in the text that in some cells ATP first induced a decrease in fluorescence followed by a large Ca2+ spike, while this specific spike looks much smaller than all the other ones illustrated in the study (Fig 7G). Also, the spike triggered by UTP looks different from the one triggered by ATP. Is it a typical response?

      4) Several experimental details must be provided. First, the justification for the choice of VGAT promoter to drive the GCaMP6f indicator into PKD2L1 neurons is missing. Second, drug concentrations are not justified. This is important as the authors argue that Ach and ATP trigger Ca2+ spikes with different amplitudes, but isn't there the possibility that this is dose-dependent? Did the authors try different concentrations? Third, on TTX experiments (Fig 3), after how long under TTX exposure were measurements performed? While this is a crucial parameter, this is not indicated in the paper. Given the traces provided different conclusions could be reached depending on this timing.

      5) It remains unclear to me why only some of the data (for example Fig 7) make a distinction between dorsal and ventral CSF-contacting neurons. In the zebrafish it is established that ventral and dorsal CSFC neurons have different developmental origins and distinct types of projections related to different functions. Then, if these neurons are suspected to play different roles depending on their ventro-dorsal position also in mice, the entire study should take this into account.

    3. Reviewer #1:

      The authors provide interesting evidence on the properties of CSF-contacting neurons, referred to as 'CSFcNs' in their manuscript, using 2 photon calcium imaging in mice.

      Their work relies on calcium imaging using 2 photon microscopy in slices of the mouse spinal cord. The authors observed calcium transients with two different amplitudes and propose that these transients reflect the activation of different voltage dependent calcium channels (T and L).

      Although the work is of interest, there are numerous issues throughout the manuscript:

      -shortcuts and oversimplified assumptions (calcium transients do not equal spikes!) (see the titles of Figures 2 and 3)

      -the massive ignorance of the relevant literature for this small field on CSF-cNs in mice. In particular, but not only, the authors should know and refer to the work of Orts Dell'Immagine, Wanaverbecq and Trouslard, who have shown since 2012 that CSF-cNs in mice are chemosensory cells whose spontaneous activity is driven by the channel PKD2L1.

      Major comments

      1) The authors assume that calcium transients equal firing (Figure 2) or calcium spikes (Figure 3), but these are far from being the same. No deconvolution algorithm can use calcium transients to infer spiking with better than 70% accuracy.

      In the recordings of the Wanaverbecq group, spontaneous firing in slices was 0.4Hz in control and 0.1Hz in PKD2L1 KO. The authors find here calcium transients occurring at 0.16Hz (n = 63 cells), suggesting that some of the sparse firing activity is missed by the authors.

      Since calcium transients reflect spiking but not in a linear manner, a calibration via cell-attached or loose-patch recordings is necessary in order to infer CSF-cN spiking, or via perforated patch to validate the evidence for calcium spikes.

      2) This assumption of calcium = firing does not hold in cells that have an input resistance of GOhms and whose activity has been shown to be driven by the opening of the channel PKD2L1 (Orts Del Immagine et al Neuropharmacology 2016). In particular, observations of the TTX insensitive calcium transients may be due to the PKD2L1 channel.

      => The authors need to combine recordings with perforated Patch Clamp together with the 2P calcium imaging in order to tackle the question of the role of the channel openings in the generation of the different calcium transients observed in WT or KO for PKD2L1.

      From introduction to discussion, the authors should properly cite the work of the Wanaverbecq group as well as other groups in the field, whose contributions were relevant and ignored.

      3) Activation leading to calcium spikes (K+, ATPergic, Cholinergic inputs, ...) was done without blockage of the neurotransmission in the slices and could therefore originate from indirect sources, including activation of metabotropic receptors presynaptically.

      The authors need to solve these issues.

      4) In Figure 7, there are diverse responses that the authors should better illustrate. Many cells appear not to respond to several of the stimuli tested: what is the rational criterion for deciding whether a cell responded or not? Can the authors quantify the proportion of cells responding? Did the authors take into account the high level of spontaneous activity? Could the negative dip in the response in panels G and H result from a motion artifact?
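
      The question about spontaneous activity can be made concrete with a short calculation. The sketch below treats spontaneous transients as a Poisson process, which is an assumption made here for illustration and not something established in the manuscript. It uses the 0.16 Hz transient rate quoted in comment 1 and a hypothetical 10 s response window; the function name is illustrative.

```python
import math

def p_spontaneous_event(rate_hz, window_s):
    """Probability of at least one spontaneous event in the window,
    assuming Poisson-distributed events at a constant rate."""
    return 1.0 - math.exp(-rate_hz * window_s)

# 0.16 Hz is the rate quoted in the review; the 10 s window is a hypothetical choice.
p = p_spontaneous_event(0.16, 10.0)
print(f"{p:.2f}")  # prints 0.80
```

      Under these assumptions, roughly four out of five windows would contain a spontaneous transient, so "any transient after the stimulus" could not serve as a response criterion on its own; a usable criterion would have to be defined relative to the baseline rate.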

    4. Summary: The reviewers have found the topic of your study of high interest, with very intriguing findings on the different origins of calcium transients in CSFcNs.

      However, after a careful examination of your work, the reviewers have raised the following major concerns:

      1) To conclude on calcium spikes, the imaging data without electrophysiological calibration leave too much unknown. A careful electrophysiological examination, revealing how calcium transients of different amplitude correlate with the electrical activity of the cell, calcium spikes and spontaneous PKD2L1 channel openings (as described extensively in these cells), is absolutely mandatory to conclude.

      2) The manuscript shows a lack of consideration of the importance of the sensory functions and of the role of the channel PKD2L1, both of which are well established in CSFcNs in mice and other models. More work is necessary to relate to these critical aspects.

      3) The number of animals for juvenile and adult mice used by the authors should be clearly stated (the manuscript only refers to the total number) but also largely increased for the authors to reach robust conclusions.

      4) Overall, more rigor should be implemented throughout the entire manuscript, with a thorough improvement of the writing, a careful inspection of figure panels (choice and fair, complete representation of the data) and more information on the conditions used for experiments (promoter used, concentrations of pharmacological agents, selection of ROIs, ventral versus dorsal CSF-cNs, definition and proportion of silent cells, enrichment in the T- and L-type calcium channels, etc.).

      Reviewer #2 and Reviewer #4 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      Jacob and colleagues developed a new experimental "facility" or environment for training macaque monkeys to perform behavioral tasks. Using this facility, the authors trained freely moving macaques to perform a visual "same-different" task using operant conditioning, and under voluntary head restraint. The authors demonstrate that they could obtain reliable eye-tracking data and high performance accuracy from macaques in this facility. They also noted that subordinate macaques can learn to perform basic aspects of the task by observing their dominant conspecifics perform the task in this facility. The authors conclude that this naturalistic environment can facilitate the study of brain activity during natural and controlled behavioral tasks.

      The manuscript is doubtless a hard-fought effort. The new experimental platform introduced by the authors has the capacity to transform how researchers approach the behavioral training of monkeys for some (but not all) tasks. However, in my opinion, the manuscript would have significantly broader impact and appeal if the authors had succeeded in performing wireless neural recordings in this same environment. Without these proof-of-principle neural data, the scope of this manuscript seems more limited. If the authors can obtain these neural data, the manuscript would be substantially stronger.

      There are a few other concerns related to methodology and interpretation that should be addressed.

      Major comments:

      1) In the abstract, the authors state that macaques are widely used to study the neural basis of cognition - but in fact these animals are a valuable model organism for studying many other aspects of brain function beyond cognition. The authors seem to be missing an opportunity to highlight the broad impact of their work.

      2) A gaze window of 3 degrees is rather large for most visual-based experiments. Do the authors think that it would be possible to train animals to maintain tighter fixation windows? And have they tried to do so?

      3) Are these animals water deprived before entering the experimental environment? And how long do the animals typically work in this environment? For how many hours, and for how much fluid?

      4) How did the authors ensure that the macaques do not fight inside the facility? Are the animals continuously housed in this facility or are they moved into this facility only during testing?

      5) Line 227: the authors state the following: "Remarkably, M2 learned the task much faster using social observation and learning than M1 & M3 did using the TAT paradigm". How do the authors rule out the possibility that M2 is simply a "smarter" animal?

      6) Line 354-364: the authors describe their insights about how animals may learn to perform the task in two phases. How can the authors make these strong claims based on data from N=1 macaque?

    2. Reviewer #2:

      The manuscript "A naturalistic environment to study natural social behaviors and cognitive tasks in freely moving monkeys" describes a large-scale system of rooms that potentially allows non-human primates to engage freely in several different behaviors and allows neuroscientific experiments to be performed. The study is well intended, but in its current form, with many claims but few if any results, it does not, in my view, meet scientific standards.

      The paper presents the testing environment consisting of different rooms. Compared to earlier work (e.g. Berger et al., 2018), the main innovation is the inclusion of an eye tracking system. Data supports the notion that this works in principle. But there is no analysis of data quality and accuracy. We also do not know whether the system works on every trial, or how often the eye is not detected or the tracker loses the signal.

      The authors claim novelty of this testing environment, but similar ones have been used in behavioral research for decades and in recent years in neuroscience.

      The authors claim that it is easier to place a testing system into a separate cage than in the home cage. It remains unclear what this claim is based on. Motivation of animals in these social settings should be more difficult than in the home cage environment. So, this is a potentially interesting result. It is also a conceptually important claim for the paper's logic, if the social setting should really be beneficial for training. But the claim needs to be substantiated.

      The authors claim that natural behavior can be analyzed because a CCTV camera is mounted in the cage. There are no results or analyses to demonstrate that.

      The authors mention neural recordings on multiple occasions, but do not show any. EM shielding is neither necessary nor new.

      Automatic training appears to be a one-to-one copy of that in Berger et al. 2018, but citation is missing, except for Supplemental Information.

      The authors report an anecdote of one animal (n=1) learning socially from others. There is no indication that this subject might have performed differently without social learning. The interpretation is a just-so story and appears rather anthropomorphic.

      There are no results in the manuscript.

      The manuscript is not organized well. The Methods section reads like a Discussion, important information on methods is distributed across Supplemental Information and Results. Results, as mentioned, does not contain any results or data.

    3. Reviewer #1:

      I'm quite enthusiastic about the care the authors have taken in designing this cutting edge hybrid environment, and the effort they've gone through to describe it in detail. I believe that this endeavor has great merit, and that seeing the advancements in animal welfare and experimentation should be of interest to the general reader. However, at present, the stated interpretations are not fully justified by the results, and this must be addressed.

      The manuscript should be amended and updated in one of two possible ways: the interpretations of the scientific result here should be tamped down significantly, or additional evidence should be presented for some of the claims in the originally submitted manuscript. I am confident that the authors should be able to carry out either of these to a satisfying degree.

      Major issues:

      1) Throughout the manuscript, stating that the third monkey learned the task "merely by observing two other trained monkeys" is misleading. The naive monkey may have learned very important details about the cognitive testing set-up from observation. But the third monkey learned the task through a unique behavioural shaping paradigm that included, but was not limited to, watching trained monkeys. The authors trained the third monkey on the cognitive task in the absence of the other monkeys, and do not show that the third monkey learned the specific cognitive task from watching other monkeys. Over-interpreting the anecdotal observations here obfuscates what is novel and notable in this manuscript.

      2) The authors repeatedly state that the third monkey learned the task faster than the previous two monkeys. It is quite difficult to parse exactly what the authors mean by this, and exactly what the data is that supports that claim.

      The authors go on to state that M2 learned the "task structure" faster than M1/M3. However, "task structure" is not defined, so it is difficult for a reader to know precisely what was learned faster under social observation. Furthermore, the data showing that M2 learned the task structure faster than M1/M3 is not clear, and it is not known how M1/M3 learned the task structure in isolation. Description of which training steps may be aided by observation of trained monkeys must be clarified. The authors allowed M2 to observe M1 and M3 during initial familiarization of the experimental set-up, but it seems that observation may not have aided M2 in learning the complex same-different task at all.

      Even though M2 may have learned the task structure faster than M1/M3, these observations are anecdotal and should not be over-interpreted. If there is a clear difference in the time to learn basic task structure, it may be due to social observation, but the authors should not favor that interpretation without considering alternatives as well. E.g., monkeys have widely varying personalities (see e.g. Capitanio 1999, Am J Primatology), and this has important implications for the curiosity, exploration behavior, and likelihood to accept and complete new challenges in training. To what extent could the differences in learning rate also be explained by these differences across these 3 monkeys? To what extent does the different training regimen in the task explain differences in learning rate across monkeys (e.g. M2 got two days of repeating correction trials, which significantly alters learning rates)?

      3) There is a vast literature in ethological settings where the gaze of nonhuman primates has been tracked using noninvasive methods that the authors do not acknowledge. Instead, authors state that most infrared eye trackers require head restraint (line 32), though this is demonstrably not the case. For review, see Hopper et al. 2020, Behav Res Methods.

      4) Some important details for introducing monkeys to the testing apparatus during Tailored Automated Training should be described. For example, were animals water-restricted, or on any sort of fluid restriction when TAT began? How did the authors entice the animals to initially explore the testing apparatus?

    4. Summary: This manuscript describes a new experimental environment for training macaque monkeys to perform behavioral tasks. Using this facility, the authors trained freely moving macaques to perform a visual "same-different" task using operant conditioning, and under voluntary head restraint. The authors demonstrate that they could obtain reliable eye-tracking data and high-performance accuracy from macaques in this facility. They also noted that subordinate macaques can learn to perform basic aspects of the task by observing their dominant conspecifics perform the task in this facility. The authors conclude that this naturalistic environment can facilitate the study of brain activity during natural and controlled behavioral tasks.

      The manuscript is broadly organized along three distinct lines of inquiry. First, the authors describe a customized living space for a small group of macaque monkeys. Second, the authors train two of these monkeys to perform a cognitive task in a purpose-built room of the living enclosure. Third, the authors describe their experience training a third monkey to complete the cognitive task.

    1. Reviewer #2:

      In this paper, Numssen and co-workers focus on the functional differences between hemispheres to investigate the "domain-role" of IPL in different types of mental processes. They employ multivariate pattern-learning algorithms to assess the specific involvement of two IPL subregions in three tasks: an attentional task (Attention), a semantic task (Semantics) and a social task (Social cognition). The authors describe how, when involved in different tasks, each right and left IPL subregion recruits a different pattern of connected areas.

      The employed tasks are "well established", and the results confirm previous findings. However, the novelty of the paper lies in the fact that the authors use these results as a tool to observe IPL activity when involved in different domains of cognition.

      The methodology is sound, well explained in the method section, the analyses are appropriate, and the results clear and well explained in the text and in graphic format.

      However, a solid experimental design is required to provide strong results. To the reviewer's view, the employed design can provide interesting results about functional connectivity, but not about the functional role of IPL in the investigated functions.

      I think the study would be correct and much more interesting if only based on functional connectivity data. Note that rewriting the paper accordingly would lead to a thorough discussion about how anatomical circuits are differently recruited based on different cognitive demands and about the variable role of cortical regions in functional tasks. This issue is neglected in the present discussion, and this concept is in disagreement with the main results, suggesting (probably beyond the intention of the authors) that different parts of the right and left IPL are the areas responsible for the studied functions.

      Major points:

      1) The 3 chosen tasks explore functions that are widespread in the brain, and are not specifically aimed at investigating IPL. The results (see, e.g., Fig. 1) confirm this idea, but the authors specifically focus on IPL. This seems a rather arbitrary and unjustified choice. If they want to explore the lateralization issue, they should consider the whole set of involved areas or use tasks that all show their maximal activation in IPL.

      2) The authors aim to study lateralization using an attentional task, considering the violation of a prediction (invalid>valid), a linguistic task, looking for an activation related to word identification (word>pseudoword), and a social task, considering correct perspective taking (false belief>true belief), but they do not consider that in all cases a movement (key press) is required. It is well known that IPL is also a key area for creating motor commands and guiding movements. Accordingly, the lateralization bias observed could be due more to the imbalance between effectors while issuing the motor command than to a different involvement of IPL regions in the specific task functions.

      3) As in point 2, the position of the keys is also crucial if the authors want to explore lateralization. This is especially important if one considers that IPL plays a major role in spatial attention (e.g. Neglect syndrome). In the Methods, the authors simply say "Button assignments were randomized across subjects and kept identical across sessions"; this should be explained in more detail.

      4) The authors clearly know the anatomical complexity of IPL well; however, their results refer to two large multi-areal regions. This seems to the reader at odds with all the descriptions related to Fig. 2. If they do not find any more subtle distinction within these 2 macro-regions, they should at least discuss this discrepancy.

      5) The part about task-specific network connectivity is indeed very interesting; I would suggest that the authors focus exclusively on this part. (Note that the results of this part seem to confirm that only the linguistic task is able to show a clear lateralization.)

    2. Reviewer #1:

      The authors have performed a rare feat in the study of the posterior parietal cortex, which is to achieve a functional parcellation of this crucial area on the basis of its response during a diverse set of tasks. The variety of tasks and the analytical approach married to it are very strong and lead to a division that agrees well with data from patients with lesions and studies in homologous areas of non-human primates.

      Readers are encouraged to note the analytical approach, with particular regard for the permutation testing that establishes the differences between the tasks in the functional connectivity of the area.

      Conceptually, this paper is another strong argument for understanding the broad role of the posterior parietal cortex across tasks and points to the flexibility of its functional response in supporting those roles.

      This manuscript lays out a series of fMRI investigations and analyses centered on examining the response of the IPL during three different tasks (attention, semantics, social cognition). The analyses are largely data-driven and examine functional response and connectivity, to make the argument for a functional parcellation of the IPL into at least two distinct subregions. The manuscript is well-written and the analyses well described. There are some concerns about the analyses that dampen enthusiasm slightly and a lack of consideration of the associated literature in non-human primates, but these problems seem eminently correctable.

      The analyses begin with a data-driven cluster analysis across an anatomically constrained IPL ROI, searching for cluster solutions that efficiently parcellate IPL on the basis of the response of voxels across the three tasks. This analysis is fine, but does constrain the average activity in the identified clusters to differ across the tasks. That makes the univariate activation in 3b a bit circular and hard to interpret. Either the error bars should be removed and a note added that the univariate activity is purely descriptive or the univariate data should be displayed from a slice of the data that did not contribute to the derivation of the clusters. The strongest version of this analysis would hold out entire participants.
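
      The circularity concern can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it uses pure noise, replaces the clustering step with a simple sign-based split of voxels (a stand-in chosen for brevity), and all names and sizes are hypothetical. It compares the task difference measured in the participants used to define the cluster with the same difference measured in held-out participants.

```python
import random

rng = random.Random(1)
n_vox, n_sub = 200, 20

# Pure noise: data[s][v] is (task A response - task B response)
# for participant s and voxel v, with no true effect anywhere.
data = [[rng.gauss(0, 1) - rng.gauss(0, 1) for _ in range(n_vox)]
        for _ in range(n_sub)]

def voxel_means(subjects):
    """Mean A-minus-B difference per voxel across the given participants."""
    return [sum(data[s][v] for s in subjects) / len(subjects)
            for v in range(n_vox)]

def cluster_effect(cluster, subjects):
    """Average A-minus-B difference over the voxels of a cluster."""
    means = voxel_means(subjects)
    return sum(means[v] for v in cluster) / len(cluster)

derive = range(0, 10)     # participants used to define the cluster
held_out = range(10, 20)  # participants never seen during cluster definition

# Stand-in for clustering: keep voxels where A > B in the deriving participants.
derive_means = voxel_means(derive)
cluster = [v for v in range(n_vox) if derive_means[v] > 0]

circular = cluster_effect(cluster, derive)        # same data: biased upward
independent = cluster_effect(cluster, held_out)   # held-out data: near zero

print(f"circular estimate:    {circular:.3f}")
print(f"independent estimate: {independent:.3f}")
```

      In this toy setting the estimate from the deriving participants is clearly positive although no true effect exists, whereas the held-out estimate stays close to zero, which is the reason for preferring univariate summaries from data that did not contribute to the cluster definition.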

      The predictive coding analysis is potentially informative but the details were a bit unclear. In the one versus rest analysis the strongest test would be to build the model on the data from n-1 participants and then test it on the trials of the held-out participant. If this was not done, some justification for not doing it would be in order.
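
      The held-out-participant scheme suggested here can be sketched as follows. This is a minimal illustration on synthetic data, assuming a nearest-centroid classifier in place of whatever model the authors used; the function names, feature dimensions and task means are all hypothetical.

```python
import random
from collections import defaultdict

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def leave_one_participant_out(X, y, groups):
    """Per-participant accuracy of a nearest-centroid classifier."""
    scores = {}
    for held_out in sorted(set(groups)):
        # Fit: one centroid per task from the n-1 training participants only.
        by_task = defaultdict(list)
        for row, label, g in zip(X, y, groups):
            if g != held_out:
                by_task[label].append(row)
        centroids = {label: centroid(rows) for label, rows in by_task.items()}
        # Test: classify every trial of the held-out participant.
        hits = total = 0
        for row, label, g in zip(X, y, groups):
            if g == held_out:
                pred = min(centroids, key=lambda c: sq_dist(row, centroids[c]))
                hits += pred == label
                total += 1
        scores[held_out] = hits / total
    return scores

# Synthetic data (hypothetical): 6 participants x 3 tasks x 10 trials, 5 features.
rng = random.Random(0)
task_means = {
    "attention": [4, 0, 0, 0, 0],
    "semantics": [0, 4, 0, 0, 0],
    "social":    [0, 0, 4, 0, 0],
}
X, y, groups = [], [], []
for participant in range(6):
    for task, means in task_means.items():
        for _ in range(10):
            X.append([m + rng.gauss(0, 1) for m in means])
            y.append(task)
            groups.append(participant)

scores = leave_one_participant_out(X, y, groups)
print(scores)
```

      Because no trial from the tested participant contributes to the centroids, each score reflects generalization to an unseen individual rather than to unseen trials of a participant the model has already seen.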

      Finally, the authors should also consider integrating some of the non-human primate literature as it only strengthens their case. In the human literature the IPL has proved a tough nut to crack, but the single unit physiology has revealed strong differences in the homologous areas of macaque, some of which directly map onto the division argued for here.

    3. Summary: Overall the reviewers felt that the manuscript had a fair amount of promise but raised some issues about the specific tasks used and some details of the analysis. One reviewer in particular felt that the manuscript should be reworked around the functional connectivity results, which would strengthen the manuscript. I tend to agree with this assessment, particularly as concerns the lateralization framing which is not very well explored by these tasks.

    1. Reviewer #3:

      The manuscript explores ageing-associated changes in the Drosophila escape-response (Giant Fiber, GF) circuit and the circuits converging onto the GF. This is a convenient system amenable to detailed physiological analyses and the authors made a good effort in extracting a large amount of useful information using a wide range of electrophysiological readouts. The authors identified several physiological parameters that are potentially useful for indexing ageing progression in flies such as ID spike generation and ECS-evoked seizure threshold. The host lab is well-known for its expertise in the field of GF physiology; consequently, the experiments were done with a high level of technical competence and presented (mostly) in a clear and informative manner. There is, however, one major issue that could restrict the usefulness of the data presented in the manuscript (please see major comment 1).

      Major comments:

      1) Standards for conducting ageing studies in Drosophila and other model systems have risen significantly in the last ~15 years following experimental evidence that genetic background can (and does) have a significant effect on the outcome of 'ageing' experiments (see Partridge and Gems, Nature, 2007). Today, 'backcrossing' relevant lines into a reference wild-type strain multiple times (to remove any second-site mutations) is a gold standard for virtually all ageing studies in Drosophila. Furthermore, this approach is being widely adopted even in studies investigating physiological properties in developing flies (for example, in Imlach, Cell, 2012, the authors obtained very different electrophysiological results after 'isogenizing' the genetic background via backcrossing, and concluded that "the previous finding may have been due to a second site mutation"). As this important step is not mentioned in either the main text or the 'Methods' section, it is reasonable to conclude that the authors did not perform this step prior to conducting the experiments. Recent papers, one of which was referenced by the authors (Augustin et al PLoS Biol 2017 and NeuroAging 2018), repeatedly demonstrated a significant, age-associated increase in the short-response (TTM and DLM) latency in the GF circuit following a strong stimulation of the GF cell bodies in the brain. It is likely that these age-related changes in the GF circuit remained undetected in the flies with non-uniform genetic background likely used in this work. The same problem affects the paper (Martinez, 2007) referenced by the authors throughout the manuscript.

      It is difficult to say which of the findings reported here are most affected by the variability in the genetic background, but any kind of correlation between the lifespans (Figure 1B) and physiological parameters should be taken with a high dose of scepticism.

      2) The manuscript is entirely 'phenomenological' in the sense that it does not investigate the causes of the observed physiological changes. The manuscript (with minor exceptions) does not discuss the possible reasons behind the functional readouts or speculate about what makes the (sub)circuits differentially susceptible to the effect of ageing. For example, when mentioning the effects of temperature and Sod mutation on the fly physiology, the authors limit their comments to generic and obvious statements such as 'oxidative stress exerts strong influences differentially on some of the physiological parameters and the outcomes are distinct from the consequences of high-temperature rearing'. Some of the possible questions the authors could ask are: could changes in the kinetics of relevant ion channels explain some of the results obtained under different temperatures; could the previously demonstrated effect of ROS on voltage-gated sodium channels explain some of the Sod1 phenotypes, etc?

    1. Summary: This work synthesizes bioinformatics, in vivo, and in vitro transport assays to understand the molecular basis for substrate selection and promiscuity of the mitochondrial carrier family (SLC25). This comprehensive work will be of interest to the fields of mitochondrial physiology, transporter specificity and evolutionary dynamics. However, in its current form, it lacks some critical controls for protein expression and some important details about the methodology.

      Reviewer #1 and Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

      Public review:

      This paper takes a novel and comprehensive approach to understand the molecular basis for substrate selection and promiscuity of the mitochondrial carrier family (SLC25). Informed by a deep assessment of evolutionarily conserved features, mutants that selectively impair Pi flux, but retain Cu2+ transport for the mammalian transporter SLC25A3 were established using a variety of in vitro and in vivo transport assays. In addition to providing a molecular perspective on substrate specificity in mitochondrial carrier proteins, this paper provides interesting and convincing insight into how subfamilies of transporters evolved by juggling substrate specificity. However, in its current form, it lacks some critical controls for protein expression and some important details about the methodology, which are enumerated below:

      1) This manuscript does not report any controls for expression levels or membrane localization of the mutants analyzed in Figure 6. These controls are essential to fairly compare the growth phenotypes/transport capacity of the assorted mutants relative to WT Pic2.

      2) The methods section lacks details required to fully understand several different experiments.

      -Figure 1G shows an analysis with reconstituted proteins, but the methods contain no information about purification or reconstitution of the transporters, or the origin of the CuL fluorescent reporter, so it is difficult to evaluate this line of evidence.

      -The methods do not contain information about the NMR experiment shown in Figure S4, and the interpretation of this data as containing a benzene ring is probably not obvious to a broader scientific audience.

      -The details are also sparse regarding the preparation of the homology model. How much sequence similarity do the ATP/ADP translocase and PIC2 share? How large are the insertions and deletions that were addressed by manual alignment? Was an ensemble of models calculated? It is likely that a number of plausible models could be produced - were any alternative models considered? The clustering of the conserved residues shown in Figure 4 is a nice way to validate the model. It would also be nice to analyze whether the homology model shows the expected pattern of hydrophobic residues facing the membrane.

      -The authors should include all details of the bioinformatic pipeline as supplementary data, including the list of gene ids and/or sequences, phylogenetic tree of the initial 2445 sequences (neighbour joining tree) to show PIC2/MIR1 clusters, and the 92 final sequences (gene IDs, multiple sequence alignment). In addition, the authors should show the entire tree of the superfamily.

      3) The manuscript would be strengthened by additional discussion about what is known (if anything) about the functions of other transporters in the PIC2/MIR1 family. Much of the interpretation of the phylogeny regarding the outcomes of gene duplication seems to depend critically on whether the functions and substrate specificities of the yeast and mammalian homologues described here are representative of the entire clade. Likewise, the authors do not indicate whether there is evidence that neighboring sequences outside the core PIC2/MIR1 cluster are not functionally homologous (promiscuous Cu and/or phosphate transport) to PIC1/MIR1.

    1. Reviewer #2:

      In this manuscript, the authors set out to measure participants' decisions about when an item occurred in a short list of 3 or 4 items, where the first and last items were always at the beginning and end, respectively. They report two behavioral studies that examine time judgments to items in the intermediate positions. They show that time judgments (when did you see X item using a continuous line scale) are always a little off but, more importantly, they tend to be anchored to other items presented. The results are interesting and add to our knowledge of the representation of time in the brain mainly by introducing a new paradigm with which to study time. Within the broader context of research on timing capacities, it should not be surprising that participants do not have a continuous representation of time that lasts beyond traditional time interval training of a few hundred milliseconds to a few seconds. Furthermore, research has also shown that 'events' that require attentional resources do morph our perception and memory for time. So while the paradigm is worth expanding on, the behavioral results are not surprising given this past literature. I do feel however that this work is an important first step in developing a firmer model of memory for time.

    2. Reviewer #1:

      This manuscript reports the results of two timing experiments. The experimental paradigm asks participants to judge the time of target items in an unfilled interval between two landmark stimuli. In experiment 1, there is one item that must be judged. In experiment 2, there are two items to be judged. The basic empirical result is that relative order judgments in experiment 2 are more accurate than one might expect from the absolute timing judgments of experiment 1. A model is presented.

      My overall reaction is that this paper does not present a sufficiently noteworthy empirical result. I can't imagine that there is a cognitive psychologist studying memory who would be surprised by the finding that relative order judgments in the second experiment are more accurate than one might expect from the absolute judgments in experiment 1. On the encoding side, in these really short lists (with no secondary task), there is nothing preventing the participant from noting and encoding the order as the items are presented (not unlike the recursive reminding). On the retrieval side, we've known for a very long time that judgments of serial position use temporal landmarks (see for instance a series of remarkable studies by Hintzman and colleagues circa 1970).

      Methodologically, this paper falls short of the standards one would expect for a cognitive psychology paper. There are basically no statistics or description of the distribution of the effect across participants. Although I'm pretty well convinced of the basic finding (distributions in experiment 2 are different from experiment 1), I could not begin to guess at an effect size. The model is not seriously evaluated. The bimodal distributions are a large qualitative discrepancy that is not really discussed.

      Although the title of the paper invites us to understand these results as telling us something about episodic memory, the empirical burden of this claim is not carried. Amnesia patients (and animals with hippocampal lesions) show relatively subtle differences in timing tasks. There is no evidence presented here, nor literature review, to convince the reader of this point.

    1. Summary: This work assesses the role of within-host viral shedding dynamics and contact heterogeneity on distribution of transmission events in SARS-CoV-2 and influenza. Using multi-scale modeling, with similar resulting generation time and serial interval distributions to published work, predictions are made on the manner and contribution of super spreading to transmission. Distinctions are seen when comparing to applying a similar modeling framework to influenza.

      Essential revisions:

      1) Statistical analysis: The model parameters are estimated using an exhaustive grid search, which yields good fits for the best-fit values, but there is no assessment of statistical certainty in the parameter values. The authors essentially adopted a strategy in the spirit of approximate Bayesian computation (ABC), by proposing parameter values, simulating from a model, and comparing summary statistics of the simulated output to known values from the literature. The analysis would be helped by doing a more formal ABC analysis, as this would provide a better sense of how narrowly constrained the parameter values are given the available data. At minimum, it would be more convincing to consider additional parameter sets gridded across a narrowed region of parameter space before selecting an optimal fit.

      2) Model validation: The state of our knowledge about these infections is limited, both by the short time during which this research has been conducted, and the paper's need to rely on data taken from before the introduction of confounding factors such as social distancing and widespread mask usage. For this reason, in addition to the included sensitivity analysis for the model parameters, a sense of the sensitivity of the model's conclusions to the data set to which it is being fitted is needed. How much would these results change if there are errors in our understanding of the distribution of individual R0 values, or serial intervals?

      3) Distinction in assumptions for flu and COVID: The populations on which the histograms for the two diseases are based are quite different. For SARS-CoV-2, the studies are from China (Shenzhen, Tianjin and Hong Kong), while those for influenza are from Switzerland. Could cultural differences be relevant? What about seasonal differences, as the time during which the early SARS-CoV-2 studies occurred was necessarily restricted?

      Furthermore, the explanation for the difference between influenza and COVID is based primarily on differences in contact patterns. However, the discussion (L. 511-523) clarifies this to be based on the efficiency with which exposures lead to infections (and pre-symptomatic transmission), which sounds like a viral parameter rather than a social one. These viral factors do seem more believable than having to explain why the patterns of social contact exhibited by influenza patients would differ from those of SARS-CoV-2 patients. More focus on possible mechanistic explanations is warranted.
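
      To illustrate the more formal ABC analysis suggested in point 1, here is a minimal rejection-ABC sketch. It is not the authors' model: `simulate_summary` is a hypothetical stand-in for the multi-scale simulation, and the uniform prior, the tolerance, and the single summary statistic (a mean) are all assumptions chosen for illustration. The point is only that the set of accepted parameter values gives an approximate posterior, and its spread shows how tightly the data constrain the parameter.

```python
import random
import statistics


def simulate_summary(theta, n=200, rng=random):
    """Hypothetical stand-in for the model: mean of n simulated draws.

    Draws are exponential with mean `theta`; a real analysis would run
    the transmission model and return its summary statistic(s) instead.
    """
    draws = [rng.expovariate(1.0 / theta) for _ in range(n)]
    return statistics.mean(draws)


def abc_rejection(observed, prior_low, prior_high, tolerance,
                  n_proposals, seed=0):
    """Rejection ABC with a uniform prior on [prior_low, prior_high].

    A proposed theta is kept when the simulated summary lies within
    `tolerance` of the observed summary.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_proposals):
        theta = rng.uniform(prior_low, prior_high)
        if abs(simulate_summary(theta, rng=rng) - observed) <= tolerance:
            accepted.append(theta)
    return accepted


# Illustrative values only (assumed, not taken from the manuscript).
posterior = abc_rejection(observed=2.5, prior_low=0.5, prior_high=5.0,
                          tolerance=0.2, n_proposals=5000)
posterior.sort()
lower = posterior[int(0.025 * len(posterior))]
upper = posterior[int(0.975 * len(posterior))]
print(f"accepted {len(posterior)} proposals; "
      f"approx. 95% interval: [{lower:.2f}, {upper:.2f}]")
```

      Unlike a grid search that reports only the best-fitting point, the interval printed here summarizes which parameter values remain compatible with the data at the chosen tolerance.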

    1. Reviewer #3:

      The manuscript describes interesting experimental and modelling results of a novel study of human navigation in virtual space, where participants had to move towards a briefly flashed target using optic flow and/or vestibular cues to infer their trajectory via path integration. To investigate whether control dynamics influence performance, the transfer function between joystick deflection and self-motion velocity was modified trial-by-trial in a clever way. To explain the main result that navigation error depends on control dynamics, the authors propose a probabilistic model in which an internal estimate of dynamics is biased by a strong prior. Even though the paper is clearly written and contains most of the necessary information, the study has several shortcomings, as outlined below, and an important alternative hypothesis has not been considered, so that some of the conclusions are not fully supported by results and modelling.

      Substantive concerns

      1) The main idea of the paper for explaining the influence of control dynamics is that for accurate path integration performance participants have to estimate dynamics. This idea is apparently inspired by studies on limb motor control. However, tasks in these studies are often ballistic, because durations are short compared to feedback delays. In navigation, this is not the case and participants can therefore rely on feedback control (for another reason why reliance on sensory feedback in the present study is a good idea, see point 2 below). This means that the task can be solved, even though not perfectly, without actually knowing the control dynamics. Thus, an alternative hypothesis for explaining the results that has not been considered is that the dependence of the error on control dynamics is a direct consequence of feedback control. Feedback control models have previously been suggested for goal-directed path integration (e.g., Grasso et al. 1999; Glasauer et al. 2007).

      To test this assumption, I modelled the experiment assuming a simple bang-bang feedback control that switches at a predefined and constant perceived distance from the target from +1 to -1 and stops when perceived velocity is smaller than an epsilon. Sensory feedback is perceived position, which is assumed to be computed via integration of optic flow. This model predicts a response gain of unity, a strong dependence of error on time constant (slope similar to Fig. 3) or of response gain on time constant (Eqn. 4.1) with regression coefficients of 0.8 and 0.05 (cf. Fig. 3D), and a modest correlation between movement duration and time constant (r approximately 0.2, similar to Fig. 3A). Thus, a feedback model uninformed about actual motion dynamics and without any attempt to estimate them can explain most features of the data. Modifications (velocity uncertainty, delayed perception, noise on the stopping criterion, etc.) do not change the main features of the simulation results.

      Accordingly, since simple feedback control seems to be an alternative to estimating control dynamics in this experiment, the authors' conclusion in the abstract "that people need an accurate internal model of control dynamics when navigating in volatile environments" is not supported by the current results.

      2) Modelling: the main rationale of the model (line 173 ff: "From a normative standpoint, ...") is correct, but an accurate estimate of the dynamics is only required if the uncertainty of the velocity estimate based on the efference copy is not too large. Otherwise, velocity estimation should rely predominantly on sensory input. In my opinion that's what happens here: due to the trial-by-trial variation in dynamics, estimates based on efference copy are very unreliable (the same command generates a different sensory feedback in each trial), and participants resort to sensory input for velocity estimation. This results in feedback control, which, as mentioned above, seems to be compatible with the results.

      3) Motion cueing: Motion cueing can, in the best case, approximate the vestibular cues that would be present during real motion. Furthermore, it is not clear whether the applied tilt is really perceived as linear acceleration, or whether the induced semicircular canal stimulus is too strong so that subjects experience tilt. Participants might have used the tilt as indicator for onset or offset of translational motion, specifically because it is self-generated, but the contribution of the vestibular cues found in the present experiment might be completely different from what would happen during real movement. Therefore, conclusions about vestibular contributions are not warranted here and cannot solve the questions around "conflicting findings" mentioned in the introduction.

      4) Methods: I was not able to find an important piece of information: how many trials were performed in each condition? Without this information, the statistical results are incomplete. It was also not possible to compute the maximal velocity allowed by joystick control, since for Eqn. 1.9 not just the displacement x and the time constant are required, but also the trial duration T, which is not reported. One can only guess from Fig. 1D that vmax is about 50 cm/s for tau=0.6 s and therefore the average T is assumed to be around 8.5 s.

      5) Results: information that would be useful is not reported. On page 6 it is mentioned that the "effect of control dynamics must be due to either differences in travel duration or velocity profiles", it is then stated that both are "unlikely", but no results are given. It turns out that in the supplementary Figure 4A the correlation between time constant and duration/velocity is shown, and apparently the correlation with duration is significant (but small) in the majority of cases. Why is that not discussed in the results section? Other results are also not reported, for example, what was the slope of the dependence between time constant and error? Why is the actual control signal, the joystick command, not shown and analyzed?
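
      The kind of bang-bang feedback controller described in point 1 can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the simulation actually run for this review: velocity is assumed to follow the joystick command through a first-order lag with time constant tau, perceived position is assumed to be a perfect integral of velocity, and all numerical values (target distance, `v_max`, `switch_distance`, `epsilon`) are hypothetical.

```python
def simulate_bang_bang(tau, target=4.0, v_max=0.5, switch_distance=0.2,
                       epsilon=0.01, dt=0.001, max_time=60.0):
    """Return the stopping position of a bang-bang feedback controller.

    The command u is +1 until the perceived remaining distance falls to
    `switch_distance`, then -1 until the velocity drops below `epsilon`.
    Assumed plant: dv/dt = (v_max * u - v) / tau  (first-order lag).
    The controller never uses tau; it only reads position and velocity.
    """
    x, v, t = 0.0, 0.0, 0.0
    braking = False
    while t < max_time:
        if not braking and (target - x) <= switch_distance:
            braking = True          # switch command from +1 to -1
        if braking and v < epsilon:
            break                   # stopping criterion reached
        u = -1.0 if braking else 1.0
        v += dt * (v_max * u - v) / tau   # Euler step of the lag dynamics
        x += dt * v                       # position = integrated velocity
        t += dt
    return x


# Stopping position for several (assumed) time constants, in seconds.
stops = {tau: simulate_bang_bang(tau) for tau in (0.5, 1.0, 2.0)}
for tau, x_stop in stops.items():
    print(f"tau = {tau:.1f} s -> stopped at {x_stop:.3f} m (target 4.0 m)")
```

      In this sketch the stopping position increases with tau because braking takes longer when the dynamics are sluggish, so the error depends on the time constant even though the controller has no estimate of it.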

    2. Reviewer #2:

      The authors asked how the brain uses different sensory signals to estimate self-motion for path integration in the presence of different movement dynamics. They used a new paradigm to show that path integration based on vision was mostly accurate, but vestibular signals alone led to systematic errors particularly for velocity-based control.

      While I really like the general idea and approach, the conclusions of this study hinge on a number of assumptions for which it would be helpful if the authors could provide better justifications. I also have some clarification questions for certain parts of the manuscript.

      1) Lines 26-7: "performance in all conditions was highly sensitive to the underlying control dynamics". This is hard to really appreciate from the residual error regressions in Fig 3 and seems to be contradicting Fig 5A (for vestibular condition). A more explicit demonstration of how tau affects performance would be helpful.

      2) One of the main potential caveats I see in the study design is the fact that trial types (vest, visual, combined) were randomly interleaved. In the combined condition, this could potentially result in a form of calibration of the vestibular signal and/or a better estimate of tau that then is used for a subsequent vestibular-only trial. As such, you'd expect a history effect based on trial type more so than (or in addition to) simple sequence effects. This is particularly true since you have a random walk design for across-trial changes of tau. In other words, my question is whether in the vestibular condition participants simply use their previous estimate of tau, since that would be on average close enough to the real tau?

      3) I thought the experimental design was very clever, but I was missing some crucial information regarding the design choices and their consequences. First, has there been a psychophysical validation of GIA vs pure inertial acceleration? Second, were GIAs always well above the vestibular motion detection threshold? In other words could the worse performance in the vestibular condition be simply related to signal detection limitations? Third, how often did the motion platform enter the platform motion range limit regime (non-linear portion of sigmoid)?

      4) Lines 331-345: it's unclear to me why you did not propose a more normative framework as outlined here. Especially, a model that would "constrain the hypothesized brain computation and their neurophysiological correlates" would be highly desirable and really strengthen the future impact of this study.

      5) I would highly recommend all data to be made available online in the same way as the analysis code has been made available.

    3. Reviewer #1:

      The authors investigated the importance of visual and vestibular sensory cues and the underlying motion dynamics to the accuracy of spatial navigation by human subjects. A virtual environment coupled with a 6-degrees-of-freedom motion platform, as described in prior studies, allowed precise control over sensory cues and motion dynamics. The research builds on previous work in several important ways: 1) the authors demonstrate that reliance on vestibular cues leads to an undershooting of trajectories to hidden goal locations, 2) manipulation of the underlying motion dynamics (the time constant) during navigation alters the accuracy of trajectories particularly when subjects are reliant on vestibular cues, 3) probabilistic models were used to demonstrate that path integration errors can be explained by mis-estimates of the underlying motion time constants, and 4) time constant estimates were improved when visual cues were available. Overall, the analyses are appropriate, the conclusions are judicious, and the authors provide an important contribution to understanding the sensory mechanisms underlying human spatial navigation.

      1) Some minor methodological clarifications: how many trials were performed per subject? How many of the trials were performed in each condition (visual, vestibular, combined)?

      2) The study tested performance by both male and female subjects. Could the authors comment as to whether sex differences were observed across performance measures? Perhaps sex can be indicated in some of the scatter plots.

      3) Figure 2A. It would be helpful if the authors identified the start-point of the trajectory and also provided more explanation of the schematic in the caption.

      4) Figure 2B-C. It would be helpful if the authors could expand this section to show some example trajectories and the relationship between examples and plotted data points. This could be done by presenting measures (radial distance, angular eccentricity, gain) for each example trajectory.

      5) Because the range of sampled time-constants can vary across subjects, it would be nice to show plots as in Figure 3B for each subject (i.e., in supplementary material).

      6) Discussion. The broader implications of the findings from the models are not sufficiently discussed. In addition, some comparison could also be made to other recent efforts to model path integration error (e.g., PMC7250899).

    4. Summary: In this manuscript, the authors investigated the importance of visual and vestibular sensory cues and the underlying motion dynamics to the accuracy of spatial navigation by human subjects. A virtual environment coupled with a 6-degrees-of-freedom motion platform, as described in prior studies, allowed precise control over sensory cues and motion dynamics. To investigate whether control dynamics influence performance, the transfer function between joystick deflection and self-motion velocity was modified at each trial, resulting in subjects relying more on velocity or acceleration to find their way. To explain the main result that navigation error depends on control dynamics, the authors propose a probabilistic model in which an internal estimate of dynamics is biased by a strong prior. Overall, the three reviewers agree that additional data are not necessary. However, the analyses need to be clarified and the conclusion better justified.

      Reviewer #1, Reviewer #2 and Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      This is an outstanding work from the lab of Dr. Stains establishing rapid post-translational regulation of sclerostin, a robust inhibitor of bone formation. They carefully and clearly establish that sclerostin is rapidly degraded by lysosomes in response to mechanical loading, and further link lysosomal abnormalities, using Gaucher iPSCs, to sclerostin levels.

    2. Reviewer #2:

      The article by Gould et al breaks new ground by demonstrating a role for lysosomal-mediated degradation in the mechanosensitive repression of Sclerostin levels in bone. Though the post-translational repression of Sclerostin has long been apparent, no one has yet unraveled the mechanisms. Therefore, this discovery is important to the skeletal biology community - both because of the findings themselves, and because the conditions/models used by this team to make these discoveries will be useful for other investigators, including their ability to manipulate and observe the rapid lysosome-dependent control of Sclerostin levels in vitro and in vivo in response to PTH or mechanical stimulation. In addition to the importance within this field, the work has broad impact on multiple levels including a) the clinical relevance for understanding and potentially treating osteoporosis and the skeletal phenotypes in individuals with lysosomal disease, and b) the mechanoregulation of lysosomal function and its relationships to crinophagy, which has implications not only for the regulation of Sclerostin, but also for other factors in and beyond the skeleton (RANKL, insulin).

      The study is elegantly designed, clearly communicated, and rigorously conducted. The conclusions drawn in the manuscript are mostly supported by the data provided. In general, it is important to elaborate on what gives the authors confidence that the inhibitors were effective and act as expected throughout the study - but especially Bafilomycin A1 and Apocynin in vivo. If BafA1 and Apocynin treatment in vivo work as expected, they should prevent the rapid load-dependent repression of Sclerostin levels (shown in Figure 1D).

      Other revisions or additions, described below, would improve the quality of the study:

      1) Are Sclerostin levels insensitive to FSS or PTH in Gaucher cells (though it understandably may not be feasible to differentiate these cells in microfluidic devices)?

      2) Since a sex-specific effect of exercise on bone anabolism has previously been described, and TRPV4 also has a sexually dimorphic effect on bone, were any differences observed between male and female animals here?

      3) Can the authors discuss where the pathway used by PTH diverges from that activated by FSS/load?

      4) Is it possible to detect load dependent changes in sclerostin localization in lysosomes in vivo?

      5) Given the non-specific effects of hydrogen peroxide, Figure 6D may not add a great deal in light of the other data that was gathered with more rigorous approaches. Additional controls would give more confidence in the efficacy/specificity of this approach.

      6) Please include how long the OCY454 cells were differentiated prior to the treatments applied.

      7) Please identify the route by which inhibitory agents were administered to the mice (i.e. subcutaneous, intraperitoneal).

      8) Please increase the N for experiments in Figures 4A and 5D, or remove these data and the corresponding conclusions.

    3. Reviewer #1:

      This manuscript by Gould et al presents highly novel data which is logically presented and is likely to have both clinical and fundamental implications. Of relevance to the bone field, it defines a new mechanism by which one of the most important clinical targets for the treatment of osteoporosis is endogenously regulated. Beyond bone, I am not aware of any other examples of stimulus-directed acute lysosomal degradation of a secreted canonical Wnt antagonist as a mechanism to provide rapid de-repression. What seems lacking is a careful analysis of the physiological consequences of the acute degradation of sclerostin.

      1) A landmark paper which convinced many in the field that sclerostin down-regulation is necessary for osteoanabolic responses to loading was based on a transgenic model from the Bellido lab (Tu et al, Bone, 2012). In that study, expression of Sost from the DMP-1 promoter precluded its transcriptional and protein-level down-regulation at late time points. That was sufficient to largely prevent bone gain following loading. Several other groups interpreted this as indicating Sost transcript regulation is required for bone's adaptation to loading, calling into question the physiological relevance of transient post-translational degradation described here. Can the authors reconcile that study with their own?

      2) One way the authors attempt to demonstrate in vivo relevance is through western blotting of mechanically loaded mouse ulnas, showing previously-undocumented acute reductions in lysate sclerostin levels. It is standard practice in the field to quantify sclerostin positive osteocytes histologically, rather than by western blotting. This is because mechanical loading can rapidly increase blood flow to the limb (even in this study, the authors implicate the vasodilator NO) as well as having inflammatory effects, diluting the proportion of osteocyte-specific proteins in the lysate. Demonstrating protein-level sclerostin down-regulation specifically in osteocytes rapidly following loading would be a major addition to this study.

      3) A long-standing, reproducible finding which has always been very perplexing is that the largest transcriptomic responses to osteogenic mechanical loading occur very quickly, within an hour of loading, before Sost is down-regulated. Even in UMR106 cells in vitro, β-catenin is stabilised before Sost is down-regulated following exposure to substrate strain. The current findings may explain this temporal discrepancy. The authors should assess responses to sclerostin degradation, for example by quantifying Wnt target genes, to provide physiologically-relevant readouts of their findings.

      4) Figure 3 shows co-localisation of endogenous or over-expressed sclerostin with lysosomal markers. Does this co-localisation change following FSS or PTH?

      5) It is not clear whether early lysosomal degradation which transiently decreases sclerostin is triggered by the same mechanoresponsive pathways which subsequently down-regulate its RNA levels, or whether the two responses are distinct. Can the authors clarify this? For example, does Sost decrease in the BafA1-treated cells 8 hours after FSS or PTH treatment?

      6) Discussion "that the rapid and transient nature of sclerostin degradation may be critical to the precise anatomical positioning of new bone formation following an anabolic stimulus" is very unclear. How do the authors propose that lysosomal sclerostin degradation produces regionalised responses to a greater degree than the previously-reported transcriptional mechanisms?

      7) The evidence of lysosomal involvement in sclerostin down-regulation is largely based on pharmacological compounds of limited selectivity. A degree of genetic evidence is indirectly provided by the Gaucher cell line, but this is based on a single patient line. Can the authors provide direct genetic evidence that lysosomal function is necessary for sclerostin down-regulation, and ideally for bone formation?

      8) References to previous studies which described mechanisms and relevance of Sost down-regulation are sparse. For example, see previous implications of NO signalling from the Vanderschueren lab (Callewaert et al, JBMR, 2010), protein-level down-regulation of sclerostin in the context of ageing from the Price lab (Meakin et al, JBMR, 2014) relevant to the discussion in the current manuscript, as well as work from the Ferrari lab on sclerostin regulation following both PTH and mechanical loading (e.g. Bonnet et al, JBC 2009; Bonnet et al, PNAS 2012).

    4. Summary: The article by Gould et al breaks new ground by demonstrating a role for lysosomal-mediated degradation in the mechanosensitive repression of Sclerostin levels in bone. Though the post-translational repression of Sclerostin has long been apparent, no one has yet unraveled the mechanisms. Therefore, this discovery is important to the skeletal biology community - both because of the findings themselves, and because the conditions/models used by this team to make these discoveries will be useful for other investigators, including their ability to manipulate and observe the rapid lysosome-dependent control of Sclerostin levels in vitro and in vivo in response to PTH or mechanical stimulation. In addition to the importance within this field, the work has broad impact on multiple levels including a) the clinical relevance for understanding and potentially treating osteoporosis and the skeletal phenotypes in individuals with lysosomal disease, and b) the mechanoregulation of lysosomal function and its relationships to crinophagy, which has implications not only for the regulation of Sclerostin, but also for other factors in and beyond the skeleton (RANKL, insulin).

      Essential revisions:

      The study is elegantly designed, clearly communicated, and rigorously conducted. However, the reviewers require additional data to support the overall conclusion on the significance of lysosome-mediated degradation of sclerostin in skeletal biology. First, it is important to elaborate on what gives the authors confidence that the inhibitors were effective and acted as expected throughout the study, especially Bafilomycin A1 and Apocynin in vivo. If BafA1 and Apocynin treatment in vivo work as expected, they should prevent the rapid load-dependent repression of Sclerostin levels (shown in Figure 1D). Second, a demonstration by the authors of mechanical load-dependent changes in sclerostin localization in osteocyte lysosomes in vivo by immunohistochemistry would be important to support the in vivo relevance of this pathway in the acute regulation of sclerostin levels. While the western blotting of mechanically loaded mouse ulnas showing previously-undocumented acute reductions in lysate sclerostin levels is interesting, it is unclear if these changes are caused by mechanical loading-induced lysosomal function.

      Reviewer #1 and Reviewer #2 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      In this work, Feilong and colleagues use Human Connectome Project fMRI data to investigate the degree to which the strength of functional connectivity is predictive of general intelligence, and the degree to which that predictive power is improved using the hyperalignment procedures their lab has previously developed. I am broadly very supportive of the goals of improving prediction of individual behavioral differences via improved, functionally-based cross-subject registration, and I have always felt that the hyperalignment procedure is one of the most promising approaches for improving cross-subject functional registration. Overall I feel that this paper is an important next step in the development and maturation of the hyperalignment technique.

      However, I do have two significant concerns with the predictive modeling presented in this work. I note that I am not an expert in these techniques, so these concerns may be due to my own ignorance; however, I would like to see the authors at least better explain these issues to non-experts like myself.

      First, the authors employed a leave-one-family-out cross-validation scheme for their predictive modeling. My understanding is that the field has generally moved away from leave-one-out or leave-few-out cross-validation, as that approach consistently overestimates the predictive power of generated models. The HCP is a large dataset. Can the authors employ a more robust approach of using fully split halves?
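
      To make the request concrete, a split-half scheme can still respect the family structure by assigning whole families, rather than individual subjects, to each half. The following is a minimal sketch in plain Python, not the authors' pipeline; the function name and the toy family labels are hypothetical, and the halves are balanced by number of families rather than number of subjects:

```python
import random

def split_half_by_family(family_ids, seed=0):
    """Assign whole families to one of two halves so relatives never straddle the split."""
    families = sorted(set(family_ids))
    rng = random.Random(seed)
    rng.shuffle(families)
    # First half of the shuffled family list goes to training, the rest to testing.
    first_half = set(families[: len(families) // 2])
    train = [i for i, fam in enumerate(family_ids) if fam in first_half]
    test = [i for i, fam in enumerate(family_ids) if fam not in first_half]
    return train, test

# Toy example: ten subjects drawn from six families (labels are made up).
family_ids = ["A", "A", "B", "C", "C", "C", "D", "E", "E", "F"]
train, test = split_half_by_family(family_ids, seed=1)
```

      Repeating the split over many seeds would give a distribution of out-of-sample accuracies rather than a single estimate.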

      Second, the authors make the claim that fine-grained (vertex-wise) connectivity has substantially better predictive power than coarse-grained (parcel-wise) connectivity, based on the variance in intelligence explained by the predictive models. However, the models based on fine-grained connectivity also have many, many more variables being used to make the prediction. Is this not a confound?

    2. Reviewer #2:

      Summary:

      This paper predicts intelligence using either coarse-grained functional connectivity (based on 360 ROIs) or fine-grained functional connectivity (vertex-wise) after hyperalignment. The results show a two-fold increase of variance explained in general intelligence between coarse-grained and fine-grained connectivity.

      General:

      This is a very clearly-written paper that presents an important result, which has the potential of great impact on the field of behavioral prediction. My comments below are relatively minor and primarily aimed at clarifying a few details in the article. Please find my detailed comments below, approximately in order of importance.

      Major comments:

      1) The fine-grained functional connectivity has richer features than coarse-grained, leading to higher dimensionality in the PCA step (supplementary figure S5). I wonder if this might contribute to improved prediction accuracy. Related to this, it appears that there may also be a relationship between PCA dimensionality and regularization parameter, such that more regularization may be needed when more PCs are used in the model. It would be interesting to test the effect of fixing the PCA dimensionality (and perhaps also the regularization) across all models to control model complexity.

      2) The Glasser 360 parcellation was used throughout this work. There are subject-specific parcels and group-level parcels available for this parcellation. Please clarify which of these were used. If the group-level parcels were used, it might be interesting to see how the coarse-grained prediction accuracies might improve when using subject-specific parcels.

      3) The residuals of fine-grained connectivity profiles were obtained after subtracting coarse-grained connectivity. Why was subtraction used here, rather than regressing out (i.e., orthogonalizing with respect to) the coarse-grained connectivity?
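
      The distinction raised in comment 3 can be illustrated with a toy example. This is a minimal sketch in plain Python under simplifying assumptions (a single coarse-grained regressor, no intercept, made-up numbers); the function names are hypothetical and it is not the authors' code. Subtraction leaves a residual that can still correlate with the coarse-grained profile, whereas regressing out yields a residual orthogonal to it:

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def subtract(fine, coarse):
    """Residual obtained by plain element-wise subtraction."""
    return [f - c for f, c in zip(fine, coarse)]

def regress_out(fine, coarse):
    """Residual after least-squares projection of fine onto coarse (no intercept)."""
    beta = dot(fine, coarse) / dot(coarse, coarse)
    return [f - beta * c for f, c in zip(fine, coarse)]

coarse = [1.0, 2.0, 3.0]
fine = [2.5, 4.0, 6.5]  # roughly twice the coarse profile plus fine-scale detail

sub = subtract(fine, coarse)
reg = regress_out(fine, coarse)

# Overlap with the coarse profile that remains in each residual.
overlap_sub = dot(sub, coarse)
overlap_reg = dot(reg, coarse)
```

      Here `overlap_reg` is zero up to floating-point error, while `overlap_sub` is not, because the fine-grained values scale with the coarse-grained ones by a factor other than one.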

    3. Reviewer #1:

      In this study, Feilong and colleagues showed that hyper-aligned fine-grained cortical connectivity profiles can be used to strongly predict general intelligence in individual participants. This is an important study demonstrating the utility of previously developed connectivity hyperalignment and highlighting the behavioral importance of fine-grained connectivity which is typically ignored in more standard functional connectivity analysis.

      1) How does the bootstrapping handle the family structure in the data? More details are needed.

      2) The authors mentioned that "the code for performing hyperalignment and nuisance regression was adapted from PyMVPA". One of the most important contributions of this study is the impressive demonstration of prediction performance improvement using hyperalignment and fine-grained connectivity profiles. Therefore, it is important that the adapted code and code utilized for the current study be made publicly available. While connectivity hyperalignment code from the previous study is available in PyMVPA, my experience is that it is not easy to use. If no code from the current study is made available, I believe it will be very difficult to replicate this study.
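
      On comment 1, one way the family structure could be respected is a cluster bootstrap that resamples whole families rather than individual subjects. A minimal sketch in plain Python, not the authors' procedure; the function name and toy family labels are hypothetical:

```python
import random

def family_bootstrap(family_ids, seed=0):
    """Resample families with replacement; return subject indices of the drawn families."""
    members = {}
    for i, fam in enumerate(family_ids):
        members.setdefault(fam, []).append(i)
    families = sorted(members)
    rng = random.Random(seed)
    # Draw as many families as there are in the data, with replacement.
    drawn = [rng.choice(families) for _ in families]
    # Every member of a drawn family enters the sample together.
    return [i for fam in drawn for i in members[fam]]

family_ids = ["A", "A", "B", "C", "C", "C", "D"]
sample = family_bootstrap(family_ids, seed=42)
```

      Because relatives are always drawn together, the resulting confidence intervals account for within-family dependence.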

    4. Summary: In this work, Feilong and colleagues use the Human Connectome Project fMRI data to investigate the degree to which the strength of functional connectivity is predictive of general intelligence, and the degree to which that predictive power is improved using the hyperalignment procedures their lab has developed. More specifically, the authors predict general intelligence using either coarse-grained functional connectivity (based on 360 ROIs) or fine-grained functional connectivity (vertex-wise) after hyperalignment. The results show a two-fold increase in variance explained in general intelligence between coarse-grained and fine-grained connectivity. This is a very clearly-written paper that presents an important result, which has the potential of great impact on the field of behavioral prediction. However, the reviewers and editors do have some significant concerns with the predictive modeling presented in this work.

      Reviewer #2 and Reviewer #3 opted to reveal their name to the authors in the decision letter after review.

    1. Reviewer #3:

      The manuscript "High-quality carnivore genomes from roadkill samples enable species delimitation in aardwolf and bat-eared fox" is mostly well written and demonstrates an interesting and useful method for sequencing genomes from low-quality samples. The authors also provide a comprehensive overview of the state of genomics across the Carnivora clade, with some improved species/subspecies designations. I think the work is of broad interest. The analyses are mostly clear and I think a few additional analyses and small improvements could be made prior to publication, but otherwise I have no issues.

      The additional analyses/clarification I would recommend regards the genetic differentiation estimate. This is a really interesting statistic! For some of the species, it seems you have multiple individuals? Can you explain this a little more in the text? I am not entirely convinced that the statistic is robust, but I think it would be with a few more analyses. My concern is primarily due to having only two individuals in some of your comparisons: because of population structure/relatedness, the random regions you sample could have correlated histories. I think this could be addressed by varying window sizes and replicates across comparisons where you have multiple individuals for both the intraspecific and interspecific calculations.

    2. Reviewer #2:

      This manuscript from Allio et al. is an interesting mix of approach demonstration (population genomic sampling via roadkill) and application (demographic analyses, questions about taxonomic status, and phylogenomics). There are some valuable results from the application component of the paper. In particular, I appreciate the comparative approach for studying patterns of intra- vs. inter-species genetic diversity. However, I feel that some rework is required to fully normalize those comparisons.

      I would suggest the authors be more immediately forthcoming about the sizes of their samples, and perhaps consider changes to the introductory text to avoid giving any mis-impressions to readers about what data are ultimately presented in the manuscript. I had envisioned more of a landscape genetics-level sample and analyses, rather than n=3 individuals per each of the two species. Furthermore, while I think the reporting of the genome assembly qualities is important from confirmatory and quality control perspectives, and while presenting the new assemblies, in my view this shouldn't be set up to be a surprising result. These are very high-quality DNA samples, so we expect to be able to achieve DNA assembly qualities to whatever the invested level using current best-practices data generation and analytical methods.

      On the general genetic diversity and taxonomic questions, from my own experience I know that genetic differentiation metrics are not necessarily precisely comparable between a new study based on genome sequence data and an existing published dataset. Sample size affects false positive and false negative SNP calling error rates and sequence coverage and the variation among samples can also make a difference. Thus, especially since this leads to a key result/conclusion (i.e. that the two subspecies of aardwolf may deserve species status), it isn't sufficient that "similar individual sampling was available" for the carnivoran comparative datasets. The datasets should be equalized with sample number and individual sequence coverage (using downsampling) and then SNPs re-called using the same approach, before making the comparison. From the methods it wasn't clear to me the extent to which this all was done. It does appear that the same number of samples were used, and that the SNP calling approach was likely re-done from the read data (although please be more explicit about this, in the description). However, it doesn't appear that the sequence read data were subsampled for equivalency across the samples, which should be incorporated. Hopefully the results are similar, but there can be big changes that affect interpretation, so a careful approach is required.
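
      As a sketch of the equalization step requested above, the fraction of reads to retain per sample can be derived from each sample's mean coverage and a common target depth. This is an illustrative example in plain Python with made-up coverage values and hypothetical sample names; it is not taken from the manuscript:

```python
def downsample_fractions(coverage, target=None):
    """Fraction of reads to keep per sample so that all samples reach the same depth."""
    shallowest = min(coverage.values())
    if target is None:
        # Default to the shallowest sample, which is then kept in full.
        target = shallowest
    if target > shallowest:
        raise ValueError("target exceeds the shallowest sample's coverage")
    return {name: target / depth for name, depth in coverage.items()}

# Hypothetical mean sequencing depths (x-fold coverage).
coverage = {"aardwolf_1": 45.0, "aardwolf_2": 30.0, "lion_1": 60.0}
fractions = downsample_fractions(coverage)
```

      The resulting fractions would then be passed to whichever read-subsampling tool is used, and SNPs re-called on the subsampled data with a single pipeline.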

      The study design and proposed expanded use of roadkill samples in population genomics led me to think of this study, one of my all-time favorites: Brown & Bomberger Brown (2013), Where has all the road kill gone?, Current Biology. For the present study, the question of potential biases in the sample for similar or related reasons is beyond the scope of investigation; this is not relevant for the sample sizes collected and analyses conducted. However, the importance of keeping this possibility in mind should at least be noted given the more expansive promotion of the wider inclusion of roadkill samples in population genomic studies. For example, could the sample be biased towards individuals with genetically-mediated and/or culturally learned behavioral tolerance of human-disturbed habitats, rather than being a truly random sample representative of the overall landscape?

      In the methods section, the collection process and permits for the four samples from South Africa are described in detail. (Could you preemptively explain that the IUCN status for these two species is Least Concern, and thus that CITES permits are not required for the international transport of the samples?). However, the same information is not provided for the two East African samples that were included in the study (also, I think that there should not be two separate sampling sections in the manuscript). Please provide these details or expanded explanations.

    3. Reviewer #1:

      The manuscript by Allio et al. tries to justify that roadkill can be a useful source for genomic sequencing and even genome assembly level data. The authors cover all aspects of using this resource, from a new protocol to extract DNA, through generating a hybrid short- and long-read genome assembly and to various applications, showing that this data can be used in phylogenomic and population genomic analyses. Although I think that the manuscript is useful in highlighting how this resource can be analysed, it covers a lot of different topics and covers them in varying depth, which makes it difficult to follow and understand the real importance of the different sections.

      Major comments:

      -Overall, this manuscript left me a bit confused about what the main scope is. It covers a lot of different topics, from the laboratory end of the spectrum (e.g. the protocol used to get good DNA out of roadkill and how to assemble these genomes with a hybrid genome assembly), through phylogenomic analyses making taxonomic suggestions and an analysis of the complete Carnivora group, to a demographic analysis showing the changes of population size over time.

      I was left with an impression that the authors tried to cover a lot of different topics but did not go deep enough in any of those. As a consequence, the results section ends up sounding somewhat shallow, while the discussion takes up a lot of space.

      What I would suggest, if this manuscript is indeed to serve as a roadmap to roadkill genomics, is to add a figure showing the pipeline and then adjust the structure of the manuscript accordingly. For example, one box in the figure would correspond to one heading in the results/methods, where the DNA analysis is explained: the reasoning for why a special protocol is needed, what the main difference from existing methods is, how the yield compares to other methods, etc.

      The different topics explored in this manuscript could then be shown as different examples of the application: taxonomic questions at the intra-/inter-species level, higher-level taxonomic analyses (of the whole Carnivora), population genomic analyses, etc. Highlighting these as examples of the potential use of the roadkill genomes would make it understandable why this paper is trying to cover aardwolf and bat-eared fox genomics from so many ends.

      -Although the manuscript shows that roadkill samples can be useful for analysing particular species for which samples are difficult to obtain in other ways, I'm missing a discussion of how difficult it is to obtain roadkill samples and what the ramifications are. Can this approach be generally applied given legislation, do you need permits, do you find enough roadkill to rely on this source or do you only see it as an opportunistic sampling scheme?

      -Genome assembly is not exactly my field of expertise; therefore, I would like the authors to better explain how their hybrid, short- and long-read genome assembly approach is novel. My impression was that such hybrid assemblies are now a rather common and well-established practice. But a lot of space is dedicated to explaining this topic in the introduction and again in the discussion, which to me is something obvious and reads more like a review than a research article. But maybe I'm missing something obvious here, in which case I'd like the authors to make it clearer.

    4. Summary: Collectively, we liked a lot about your paper and we would accordingly like to encourage its continued evolution. However, we felt that the approximately equal balance at present between the roadkill genomics assembly pipeline and the phylogenetic and genetic diversity results was not justified, and we requested a shift accordingly as described below. Second, we require several analytical updates to the manuscript to ensure robustness of the main genetic diversity results.

    1. Reviewer #3:

      In this manuscript, Kim et al. show that, beyond its role in preventing somatic differentiation in the germline of embryos, the Zn-finger protein PIE-1 also functions in the adult germline, where it is both SUMOylated and interacts with the SUMO-conjugating machinery and promotes SUMOylation of protein targets. They identify HDA-1 as a target of PIE-1-induced SUMOylation. Here too, I find the claims interesting; however, data is sometimes missing or does not fully support the claims.

      Main concerns:

      1) A key claim of novelty over previously proposed "glue" functions of SUMO is based on the fact that they find that temporally regulated SUMOylation of a very specific residue in a specific protein is affecting protein activity: The observation that "SUMOylation of HDA-1 only appears to regulate its functions in the adult germline" and not in the embryo together with the finding that "other co-factors such as MEP-1 are SUMOylated more broadly, these findings imply that SUMOylation in the context of these chromatin remodeling complexes, does not merely function as a SUMO-glue (Matunis et al., 2006) but rather has specificity depending on which components of the complex are modified and/or when."

      I find this claim poorly supported by the data. In fact, I find that the data supports that multiple SUMOylations contribute to formation of larger complexes: The His-SUMO IP (Fig 2B) brings down far more un-SUMOylated HDA-1 than SUMOylated. This argues for the presence of large complexes with different factors being SUMOylated and many bringing down unmodified HDA-1. The chromatography experiments (Fig 3B-C) also provide hits that are in complex and not direct interactors. Finally, HDA-1 SUMOylation is indicated to regulate MEP-1 interaction with numerous factors (Fig 3D). If all these factors are in one complex, it is hard to imagine how a single SUMO residue would mediate all of these simultaneously. It is quite likely (and not tested) that loss of HDA-1 SUMOylation leads to (partial?) dissociation of a large complex, rather than loss of individual interactions with the SUMO residue of HDA-1. Contrary to the authors' claim, there is no evidence that the "activity" of HDA-1 is regulated by SUMO modification.

      2) Based on loss of MEP-1/HDA-1 interaction upon pie-1 RNAi and smo-1 RNAi (Fig 4B), the authors conclude that "SUMOylation of PIE-1 promotes the interaction of HDA-1 with MEP-1 in the adult germline".

      The evidence that it is PIE-1 SUMOylation that is affecting MEP-1/HDA-1 interaction is fairly weak. In fact, based on Fig 4A, MEP-1 and HDA-1 interact without expression of PIE-1, and in PIE-1 K68R (sumoylation-deficient), although due to poor labeling of the panel it is not clear whether lanes 1 and 4 refer to the WT pie-1 locus without tag or lack of pie-1.

      In 4B the HDA-1 band that is present in L4440 but not in pie-1 or smo-1 RNAi is very faint, and in our experience such weak signal is not linear i.e., bands can disappear or appear depending on the exposure. Importantly, according to the data, seemingly unmodified HDA-1 immunoprecipitated with MEP-1 (Fig 4B). This data contradicts the authors' claim that "These findings suggest that in the adult germline only a small fraction of the HDA-1 protein pool, likely only those molecules that are SUMOylated, can be recruited by MEP-1 for the assembly of a functional NURD complex".

      Furthermore, the fact that pie-1 and smo-1 depletion eliminate the interaction between HDA-1 and MEP-1 doesn't mean that the SUMOylation of PIE-1 specifically is required for the interaction: perhaps un-SUMOylated PIE-1, and SUMOylation of something else, are both necessary for the interaction. The authors show that MEP-1 is also SUMOylated (Fig 3C). When IP-ing GFP-MEP-1, they precipitate all its modified forms and associated factors. One alternative possibility for why smo-1 RNAi abolishes MEP-1/HDA-1 interaction is that MEP-1 SUMOylation is needed for interaction with HDA-1 (independently of PIE-1). (On a side note, why are the authors not including MEP-1 SUMOylation in the model?)

      3) On page 13 the authors write: "These findings suggest that SUMOylation of PIE-1 on K68 enhances its ability to activate HDA-1 in the adult germline" and "We have shown that PIE-1 is also expressed in the adult germline where it engages the Krüppel-type zinc finger protein MEP-1 and the SUMO-conjugating machinery and functions to promote the SUMOylation and activation of the type 1 HDAC, HDA-1 (Figure 6)". Activation of HDA-1 is misleading and was never tested. If not performing in vitro assays for HDAC activity, the authors at least need to look at whether pie-1 loss (degron) leads to acetylation of genomic HDA-1 targets and whether it affects HDA-1 (and/or MEP-1) recruitment to these sites. This could be done by ChIP-seq of HDA-1 and H3K9ac in WT and pie-1 degron animals.

    2. Reviewer #2:

      In their manuscript, Kim et al address the role of PIE-1 sumoylation during C. elegans oogenesis. The authors favour a model in which sumoylated PIE-1 acts as a sort of E3-like factor 'enhancing' HDA-1 sumoylation. While the results are indeed very interesting, it is unclear to me whether there is enough data to support the authors' model. I have a list of comments, suggestions, questions, and concerns below, which I hope will help the authors strengthen the manuscript:

      Figure 1)

      I) As with the accompanying manuscript, the extremely low level of SUMO modification should be factored in the model.

      II) Is sumoylation also observed in untagged pie-1? As judged by figure 3A, the authors have a very good antibody to test this.

      III) While the authors claim that PIE-1 sumoylation is not observed in embryos, that panel shows a lower exposure than the corresponding one in Adult (as judged by the co-purified unmodified PIE-1::FLAG). A longer exposure and/or more loading would be helpful.

      IV) Their strategy and optimisation for purification of sumoylated proteins is excellent and will be useful for future research (along with other reagents the authors developed here). Is the 10xHis::smo-1 functional? Could this be tested in vitro and/or in vivo?

      V) In vitro PIE-1 sumoylation would be a desirable addition to this figure.

      VI) In addition to germline PIE-1 localisation, it would be interesting to see embryos and PIE-1(K68R).

      VII) MW markers are missing in the blots.

      Figure 2)

      I) The generation of the ubc-9 ts allele is an exceptional tool. Could the authors show SUMO conjugation levels at permissive vs restrictive temperature? Just out of curiosity, is this a fast-acting allele?

      II) The authors mention that gei-17 alleles are viable, could the authors mention any thoughts on why the tm2723 allele is lethal/sterile?

      Figure 3)

      I) Panel C is mentioned in the text in the wrong place. Also in C, what do the authors think about the big increase in MEP-1 sumoylation in the PIE-1(K68R) background?

      II) I have the same comment for panel D as I had for figure 1 comment III: the exposure/loading for the embryo WB seems lower, as judged by the co-purifying, unmodified HDA-1. A positive control for sumoylated protein coming from embryos would be nice.

      III) In general, the model of PIE-1 acting as a SUMO machinery recruiter should be tested with recombinant proteins. Even if compatible with some results in vivo, showing that this is a plausible mechanism in vitro would be extremely helpful and greatly support the authors' claim.

      Figure 4)

      I) The authors make a quantitative comparison of the HDA-1/MEP-1 interaction in the text. I think this is not correct. Even if these have been run in the same gel, this could just be a lower exposure. In this line, the HDA-1 blot in the 'Adult' IP would benefit from a longer exposure to better appreciate what seems a rather small difference between PIE-1 and PIE-1(K68R).

      II) Since there still seems to be interaction between MEP-1 and HDA-1 in the PIE-1(K68R) background, does smo-1(RNAi) or ubc-9(G56R) reduce this further?

      III) In panel B, the LET-418 blot on the right is massively overexposed.

      IV) Once again, in vitro binding experiments to get some indication that the authors' model is plausible would be a great addition.

      Figure 5)

      I) Could the authors make some quantitation of the immunofluorescence data?

      Overall, I think this manuscript proposes a very interesting model and the results support this model, although I am not convinced these are sufficient to strongly back the authors' claims. I would very much like to see a revised version with some in vitro data backing the authors' model.

    3. Reviewer #1:

      The evidence that sumoylation of K68 in the PIE-1 zinc finger protein is important for HDA-1 type 1 histone deacetylase association and sumoylation seems reasonable. It is important because, as shown in the co-submitted paper, HDA-1 sumoylation leads to its association with MEP-1 and the LET-418/NuRD complex, thus accelerating H3K9ac deacetylation and silencing gene expression.

      The evidence that PIE-1 is needed for sumoylation of HDA-1, presumably through association of PIE-1 with the UBC-9 SUMO E2, is reasonable. However, several aspects of the authors' model remain unclear, and there is an absence of biochemical assays to establish the role of sumoylated PIE-1 in HDA-1 sumoylation, and the effects of sumoylation on HDA-1 HDAC activity.

      1) How sumoylation of K68 in PIE-1 affects its function was not worked out. Can the deleterious effect of the K68R mutation on PIE-1 function be reversed by generating a SUMO-PIE-1 fusion, as was done for HDA-1 in the co-submitted paper? K68 maps to the N-terminal side of ZF1 in the PIE-1 protein in what appears to be an unstructured region. Does the SUMO residue play a role in the interaction of PIE-1 with HDA-1? Are the zinc fingers required for PIE-1 interaction with HDA-1 or UBC-9? No zinc finger mutations were tested. Does HDA-1 have a SIM that would allow it to interact selectively with sumoylated PIE-1? Another possibility is that the PIE-1 SUMO moiety is important because it interacts with the non-covalent SUMO-binding site on the backside of UBC-9 (Capili and Lima, JMB 369:606, 2007), which might stabilize the interaction. The backside interaction of SUMO with UBC-9 is proposed to promote UBC-9-mediated sumoylation of target proteins with SUMO consensus sites that are directly recognized by UBC-9. In this scenario, SUMO-PIE-1 would in effect be acting as an E3 SUMO ligase for HDA-1 by serving as a recruitment "factor". In this regard, the authors could test biochemically whether recombinant PIE-1 or K68SUMO-PIE-1 stimulates sumoylation of HDA-1 by UBC-9, using recombinant WT and KKRR mutant HDA-1 as substrates. These issues deserve discussion.

      2) What is the SUMO E3 ligase that sumoylates PIE-1? Is it possible that through association with UBC-9, perhaps through its zinc fingers, PIE-1 is sumoylated in cis within a PIE-1/UBC-9 complex?

      3) In many places, including the title, the authors make the claim that PIE-1 promotes sumoylation and activation of HDA-1. While it is clear that PIE-1 does increase sumoylation of HDA-1, in a manner requiring K68, and that H3K9ac levels are decreased as a result, the authors do not provide any direct evidence that this process increases HDA-1 catalytic activity, as is implied in the title and elsewhere. As indicated in the review of the co-submitted paper, this would need to be established by carrying out an HDAC assay on control and sumoylated HDA-1 in vitro. Instead of enzymatic activation, it is possible that the PIE-1 interaction and HDA-1 sumoylation results in relocalization of HDA-1 within the nucleus to facilitate more efficient H3K9ac deacetylation.

    4. Summary: In this paper you describe experiments showing that PIE-1 is sumoylated at K68, and that K68 sumoylation plays a role in PIE-1 interaction with HDA-1 and its sumoylation, which leads to its activation. The reviewers found the sumoylation dependence of PIE-1 function in piRNA silencing to be of interest, but raised major issues that need to be addressed. In particular, more mechanistic insights into how sumoylation of PIE-1 at K68 enhances HDA-1 sumoylation and regulation are required.

      This is a co-submission with the manuscript https://www.biorxiv.org/content/10.1101/2020.08.17.254466v2

    1. Reviewer #3:

      This manuscript by Kim et al. describes a role of SUMOylation in Argonaute-directed transcriptional silencing in C. elegans. The authors found that SUMOylation of the histone deacetylase HDA-1 promotes its interaction with both the Argonaute target recognition complex as well as the chromatin remodeling NuRD complex. This enables initiation of target silencing. Impaired SUMOylation of HDA-1 leads to loss of interactions with several protein complexes, reduced silencing of piRNA targets, and reduced brood size. While the findings and claims are interesting, some of the novelty is overemphasized and some of the claims are not fully supported by the data.

      Main concerns:

      1) The importance of HDA-1 SUMOylation for transcriptional repression. The title "HDAC1 SUMOylation promotes Argonaute directed transcriptional silencing in C. elegans" implies a central role of SUMOylation in piRNA-mediated transcriptional silencing. The Argonaute HRDE-1/WAGO-9 targets countless transposons as shown previously and also in this manuscript (Fig S3), and so do the HDA-1 degron and Ubc9 mutant, indicating that histone deacetylation and protein SUMOylation are essential processes in TE silencing. However, the HDA-1 SUMOylation mutant (KKRR) only slightly affects 6 TE families (Fig S3), indicating that SUMOylation of HDA-1 might not be a key mediator of this process. Furthermore, the authors write that "Our findings suggest how SUMOylation of HDAC1 promotes the recruitment and assembly of an Argonaute-guided chromatin remodeling complex to orchestrate de novo gene silencing in the C. elegans germline.", but then they also state that "Comparison with mRNA sequencing data from auxin-treated degron::hda-1 animals revealed an even more extensive overlap with Piwi pathway mutants (Figure S2B), indicating that HDA-1 also promotes target silencing independently of HDA-1 SUMOylation." Based on their results and their own interpretations, I find that the importance of HDA-1 SUMOylation in piRNA-dependent transcriptional silencing is overemphasized.

      Additionally, the model (Fig 7) implies that for initiation of silencing WAGO recruits HDA-1 to targets. This should be tested by analyzing HDA-1 distribution over WAGO targets in WT and upon loss of WAGO.

      2) The mechanistic role of HDA-1 SUMOylation. On page 17 (amongst other places) the authors claim that "The SUMOylation of HDA-1 promotes its activity, while also promoting physical interactions with other components of a germline nucleosome-remodeling histone deacetylase (NuRD) complex, as well as the nuclear Argonaute HRDE-1/WAGO-9 and the heterochromatin protein HPL-2 (HP1)".

      -Regarding activity: Loss of deacetylation/silencing in the SUMO mutant might be due to loss of enzymatic activity, but it might also be due to defects in recruitment/complex formation. There is no data that proves altered enzymatic activity. In fact, Fig 6 indicates SUMO-dependent interaction of WAGO-9 with HDA-1, implying that recruitment is affected. To distinguish between activity and recruitment, at the very least, the authors would need to show that HDA-1 localization to its genomic targets is unaltered upon mutating its SUMOylation site (ChIP-seq of wt and KKRR mutant), while H3K9ac is increased (K9ac ChIP-seq in wt and KKRR mutant) in the mutant. This, in combination with HDA-1 localization in wt and WAGO-9 loss would imply whether complex formation to recruit HDA-1 or HDA-1 enzymatic activity is mostly affected by SUMOylation.

      -Regarding physical interactions: Fig 3D shows that if we fuse a SUMO residue to HDA-1, it will interact with MEP-1, while SUMOylation deficient HDA-1 mutant doesn't interact. However, for the WT HDA-1 control, we only see unSUMOylated protein interacting with MEP-1. Furthermore, in the MEP-1 IPs of samples that should contain SUMO-fused HDA-1, the authors detect a lot of "cleaved", unSUMOylated HDA-1. Unless cleavage happened after IP, during elution (unlikely, and there is "cleaved" HDA-1 in the inputs), these findings argue that the interaction with MEP-1 is not mediated by HDA-1 SUMOylation. An interaction between MEP-1 and unmodified HDA-1 is also shown in the accompanying manuscript, which appears to be dependent on Pie-1 SUMOylation. Thus, SUMOylation of HDA-1 alone seems unlikely to be the major factor necessary for silencing complex assembly. (as a side question: Does the protease inhibitor cocktail used inhibit de-SUMOylation enzymes? I am concerned that deSUMOylating enzymes might compromise some result interpretations.)

      -Regarding functional relevance of HDA-1 acetylation: On pages 12/13 authors claim that because "HDA-1(KKRR) animals and mep-1-depleted worms revealed dramatically higher levels of H3K9Ac compared to wild-type" and "HDA-1, LET-418/Mi-2, and MEP-1 bind heterochromatic", "SUMOylation of HDA-1 appears to drive formation or maintenance of germline heterochromatin regions of the genome." These correlations do not prove function. The authors have performed H3K9me2 (although not H3K9-ac) ChIP-seq in WT, KKRR mutant and HDA-1 degron worms, yet do not analyze globally whether acetylation is lost on genes that are affected (change in RNA-seq vs. change in K9me2 or acetyl). To support the claim that SUMOylation of HDA-1 drives deacetylation and heterochromatin formation, it would be important to show changes in H3K9Ac levels (or other acetyl marks) and potentially NuRD component occupancy between control and HDA-1 SUMOylation-deficient animals at specific targets (i.e. genes derepressed upon loss of SUMOylation identified in RNA-seq, and the reporter locus).

      3) The authors claim (p17) that "initiation of transcriptional silencing requires SUMOylation of conserved C-terminal lysine residues in the type-1 histone deacetylase HDA-1". I do not see any supporting data that has separately looked at formation/initiation and maintenance of silencing (a technically challenging experiment).

      4) The authors repeatedly claim that gei-17 does not play a role in piRNA target silencing, based on loss of gei-17 not affecting the piRNA reporter (Fig 1B). At the same time, they claim that pie-1 plays a role, even though it likewise does not affect the piRNA reporter (it affects the reporter only in F3; data on gei-17 effect in F3 is not present). In the accompanying paper, the authors show that while gei-17 loss by itself causes only moderate effect on extra intestine cells, combined with Pie-1 loss the effect is more severe than when Pie-1 loss is combined with Ubc9 or smo loss. This to me indicates an important role of gei-17 in inhibiting differentiation of germline stem cells to somatic tissues, but these effects are likely synergistic and thus masked by Pie-1. Individually neither Gei-17 nor Pie-1 show an effect on piRNA reporter in P0, but to confirm lack of synergy, their effects should be tested together. Although possible, the present data is insufficient to rule out gei-17 involvement.

    2. Reviewer #2:

      In their manuscript, Kim et al describe a role for HDAC1 (HDA-1) sumoylation in Argonaute-directed transcriptional silencing. The authors suggest that sumoylation of HDA-1 is important for proper assembly of the NuRD deacetylase complex. The role of SUMO modification in heterochromatin has been extensively documented and it is a very interesting topic. The current manuscript provides a very interesting set of results on this topic. I have a list of comments, suggestions, questions, and concerns, which are listed below, especially related to the first half of the results:

      1) A general question would be how can HDA-1 sumoylation, which is barely detectable, account for such a big 'positive' effect on complex assembly? HDA-1 SUMO modification seems around 10% after enriching for SUMO-modified proteins, which means that stoichiometry will be way lower than this. While this is common for SUMO-modified proteins, it does make it difficult to associate with a 'simple' model.

      2) In Figure 1, a schematic of the sensor used throughout the study would benefit the reader.

      3) In Figure 1, have the authors checked if the 10xHis::tagged smo-1 has the same effect as the 3xflag::smo-1 (i.e. is it also a partial loss of function allele)?

      4) In Figure 1 it would be nice to see the global SUMO conjugation levels in the different conditions, particularly in the smo-1(RNAi), 3xflag::smo-1, and ubc-9(G56R).

      5) Also in Figure 1, was gei-17 depletion/deletion checked in any way (i.e. WB)? Did the authors consider other SUMO E3 ligases, such as the mms-21 orthologue?

      6) While I am not a big fan of fusing SUMO to proteins, in this case it seems like a very reasonable thing to do, considering the modification sites are located very close to the C-terminal end of the protein. Did the authors check an N-terminal fusion?

      7) In Figure 2B, it becomes very clear that the level of SUMO modification of HDA-1 is extremely small, barely detectable after an enrichment method. I also wonder why the gels were cropped so tightly, especially considering that in Figure 3 there is an additional band corresponding to ubiquitylated, sumoylated HDA-1. In vitro modification assays would be helpful. HDA-1 alongside a known and characterised SUMO substrate would indicate how good a substrate HDA-1 is.

      8) In Figure 2D, is the difference between HDA-1(KKRR)::SUMO and HDA-1::SUMO significant?

      9) In Figure 3A-C, it would be useful to control whether the GFP::HDA-1 fusion behaves as the untagged one in the sensor assay (wt vs. KKRR).

      10) I have a few questions regarding Figure 3D:

      I. Considering the extremely low level of HDA-1 sumoylation, did the authors detect SUMO and ub conjugated HDA-1 (not the SUMO fusion)?

      II. Is ub conjugated to SUMO or to HDA-1?

      III. Does MEP-1 contain any obvious SIMs and/or UIMs?

      IV. To make a stronger case for the SUMO-dependent interaction model, in vitro interaction assays with recombinant proteins would be extremely useful.

      11) In the discussion, the authors compare the lack of requirement for GEI-17 in their manuscript with the requirement for Su(var)2-10 in flies. To back this claim, it is very important that the authors control for GEI-17 depletion (as pointed out in 5).

      Overall, I think this manuscript provides a very interesting set of results and I believe that, with the addition of some simple biochemical experiments, the quality and impact of the overall work would be much greater.

    3. Reviewer #1:

      The evidence that sumoylation of HDA-1, a type 1 HDAC, plays a key role in establishing transcriptional silencing of piRNA-regulated genes in C. elegans is quite convincing. The genetic analysis demonstrating that the SUMO pathway is involved in piRNA silencing is strong, and the mutational evidence that this involves sumoylation of two Lys in the tail of HDA-1 is reasonable. Likewise, the finding that HDA-1 sumoylation promotes association with NuRD complex components and association of MEP-1, an HDA-1 interactor, with chromatin regulators is convincing. In addition, the evidence that HDA-1 sumoylation increases H3K9ac deacetylation in vivo, leading to negative regulation of hundreds of target genes, and plays a role in the inherited RNAi pathway is solid.

      While the overall conclusion provides an interesting advance in understanding mechanisms of piRNA-mediated gene silencing in C. elegans, the paper is lacking any biochemical analysis of the effects of sumoylation on HDA-1 activity and its association with other transcriptional regulators.

      1) The authors mapped two sumoylation sites close to the C terminus of HDA-1, K444 and K459, based on extremely weak homology with two established sumoylation sites in human HDAC1 that are reported to be important for transcriptional repression (N.B. the authors should indicate here that David et al. reported that K444/476R HDAC1 had reduced transcriptional repression activity in reporter assays.). While the two human sites conform to the sumoylation site consensus, ψKXE, neither K444 nor K459 in HDA-1 fits this consensus (possibly one could argue that K444 is in an inverted motif). The fact that the KKRR mutant HDA-1 is no longer sumoylated is consistent with these two Lys being sumoylated, but it would be reassuring to have direct MS evidence that K444 and K459 are indeed sumoylated, which could be achieved using a SUMO Thr91Arg mutant that generates a GlyGly stub upon trypsin digestion, among other methods.

      2) It remains unclear how sumoylated HDA-1 is recognized by MEP-1 for assembly into the NuRD complex. Does MEP-1, or another NuRD subunit, have a SIM that could facilitate direct interaction of MEP-1 and sumoylated HDA-1?

      3) As the authors discuss, it is surprising that the HDA-1(KKRR)::SUMO protein, which in effect is a constitutively sumoylated form of HDA-1 that will interact constitutively with MEP-1/NuRD, does not have more deleterious effects on the organism, since according to the data in Figure 2B, the stoichiometry of endogenous HDA-1 sumoylation was extremely low. Of course, low sumoylation stoichiometry, which is a general issue with sumoylation studies, means that only a very small fraction of the HDA-1 endogenous population will be able to engage with the silencing complexes at any one time. This point is also worth discussion.

      4) Page 5: Here, and elsewhere, the authors claim that sumoylation of the two C-terminal Lys activates HDA-1 histone deacetylase activity, but provide no direct evidence for this statement. There are no HDAC assays, and it is unclear how C-terminal SUMO residues distant from the catalytic domain would alter its enzymatic activity, unless there is a SIM motif in HDA-1 that might allow for intramolecular interaction with SUMO residues at the tail leading to a conformation change. Did the authors check for a SIM motif in HDA-1? The fact that SUMO was added to the C-terminus rather than to one or both of the two Lys would also have to be taken into account in determining how sumoylation might "activate" HDA-1. To demonstrate that sumoylation activates HDA-1, in vitro deacetylation assays would need to be carried out comparing the activities of unmodified and sumoylated HDA-1. Instead of enzymatic activation, it is possible that the PIE-1 interaction and HDA-1 sumoylation result in relocalization of HDA-1 within the nucleus to facilitate more efficient H3K9ac deacetylation.

    4. Summary: In this paper, your studies showed that sumoylation of HDA-1, a type 1 HDAC, at two C-terminal Lys residues plays a role in establishing transcriptional silencing of piRNA-regulated genes in C. elegans through enhanced NuRD complex interaction and histone H3 deacetylation. The reviewers all found the link between HDA-1 sumoylation and silencing to be interesting, but raised a number of issues that need to be addressed.

      This is a co-submission with the manuscript https://www.biorxiv.org/content/10.1101/2020.08.17.254466v2

    1. Reviewer #3:

      In this manuscript, Soucy et al. describe a new technique that involves a 3D co-culture system that allows the analysis of the regulation of the sympathetic adrenomedullary system. The data demonstrate the advantage of such compartmentalized 3D systems relative to the 2 D system for long-term studies. The findings also show the usefulness of this system to understand the control by preganglionic sympathetic neurons of catecholamines released by the adrenal gland cells.

      The main concern with the work relates to the uncertain physiological relevance of the co-culture system developed by the authors. Although I appreciate the utility of such reductionist techniques to understand how preganglionic sympathetic neurons regulate catecholamines released by the adrenal gland cells, this is too removed from a physiological setting.

      1) It is difficult to judge the level of novelty of the MPS technique reported in this manuscript relative to what is in the previous paper (Ref 36) which is not available.

      2) The innervation of tissues including heart and adrenal gland is highly specific. In addition to the circulating catecholamines secreted by the adrenal glands, cardiomyocytes are tightly controlled by direct innervation. Thus, whether co-culturing PNS with other cells mimics what happens in vivo is not clear.

      3) The number of AMMCs displayed in figure 2B seems minimal as only very few cells were stained with cardiomyocyte markers. It would be interesting to know how many of these AMMCs receive innervation (Fig. 3E).

      4) It is not clear how primary cardiomyocytes were exposed to the catecholamines emanating from the AMMCs. Were these co-cultured or were the cardiomyocytes exposed to the media of AMMCs?

      5) Does the "n" in each figure represent cells or experiments (repeats)?

      6) There is no description of the method used to quantify the immunofluorescent signal.

      7) The Introduction is too long. It can easily be shortened to focus on the literature related to the topic.

    2. Reviewer #2:

      Soucy and colleagues developed a thermoplastic microphysiological system (MPS) to investigate the mechanisms regulating adrenomedullary innervation. This system consists of 3D cultures of adrenal chromaffin cells and preganglionic sympathetic neurons within a contiguous bioengineered microtissue. Using this model, they report that adrenal chromaffin innervation is critical for hypoxia-induced catecholamine release. They also show that opioids and nicotine affect adrenal chromaffin cell response to hypoxia without impairing neurogenic control mechanisms. In addition to providing mechanistic insights on adrenomedullary catecholamine release, this study represents an elegant proof-of-concept that the MPS have the potential to become useful tools to study organ innervation.

      Disclosure: I do not have the expertise to review the engineering aspects of this manuscript and will therefore share some concerns I have regarding the accuracy of this technology to mimic native tissues.

      I understand that one advantage of the MPS over microfluidic devices using micro-posts or micro-tunnels is the presence of an unobstructed interface between the compartments that is similar to tissue interfaces. However, how much better is it compared to other organs-on-chips constructs for reaching the biological complexity of an intact organ?

      The system consists of adrenal chromaffin cells and preganglionic sympathetic neurons. I wonder if in this format it could lack the normal cellular heterogeneity of the adrenals. Can the absence of adrenal cortex cells producing aldosterone, androgens and glucocorticoids with important autocrine functions on chromaffin cells interfere with the ability of chromaffin cells to respond normally to a stimulus? The authors discuss that future efforts will incorporate additional adrenal cortical cell populations to better mimic the native physiology. Could they extend this discussion by highlighting the potential weaknesses of the model in its current format? Were any observations made that would suggest caveats?

      In vivo, do all fibers innervating the medulla target the chromaffin cells or do some/most innervate the blood vessels or pericytes? If a majority of the innervation is to blood vessels, how does this system take into account potential changes in blood flow and perfusion of the adrenals that could occur and affect the oxygenation?

      Early work suggests that adrenergic terminals innervate chromaffin cells and that the adrenal medulla receives a sympathetic and parasympathetic efferent and an afferent innervation (J Anat. 1993 Oct; 183: 265-276). How would this system allow the study of such complex innervation? Is it possible to add additional neuronal types to this MPS?

      In addition to the nicotinic cholinergic receptors, chromaffin cells express muscarinic receptors that may also be involved in catecholamine release. A quick profiling and comparison of the expression of the different receptors could reinforce the representative nature of the technology to model a biological system.

      One important caveat of MPS is the challenge of delivering a drug in a physiologically realistic manner. Could the authors comment on the doses of the different drugs used and how they are representative of what a chromaffin cell would normally "see" in vivo?

      Could the authors comment on the culture media/conditions and how they are representative of a biological system? Would the use of blood or blood components be a better alternative to the system?

    3. Reviewer #1:

      In this manuscript, Soucy and colleagues present a novel innervated system which they use to model the effects of prenatal nicotine and opioid exposure. Using the system, they provide potentially interesting insights on how prenatal nicotine and opioid exposure could impact release of catecholamines. However, following careful review of the manuscript, I recommend that the authors provide substantial additional data and evidence to support the biological relevance of their findings.

      Major points:

      1) A main pillar of this manuscript is the assumption that the adrenal medulla is innervated. To substantiate their claims the authors cite books/book chapters, rather than citing convincing primary evidence. In fact, other than old EM images showing vesicular densities akin to synapses, I have not found published images of convincing axonal arborization in the adrenal medulla - if such images exist the authors should at least try to reproduce them for internal consistency of their study. This is particularly relevant if they wish to draw parallels between in vitro and in vivo systems. As this is a major pillar upon which this research stands, the lack of supporting histological evidence, which could be easily obtained, undermines the validity of this manuscript. Presenting primary evidence (i.e. not a textbook diagram) is essential.

      2) Multiple experiments lack appropriate controls. See comments on Figure 2B, 2D, Supplementary Figure 2.

    4. Summary: None of the three reviewers was convinced that this screening platform has been properly validated vis-à-vis the neurobiology of the adrenal gland, or that it has physiologic relevance for the understanding of living processes and organs.

  4. Dec 2020
    1. Author Response

      Response to reviews:

      We appreciate the relevant comments sent to us in this review. We have already revised the paper and we addressed those points in our revised manuscript. There is a particular point, which we had not explained in enough detail in our original version of the paper, and which we believe has led the reviewers to not appreciate a central aspect of our study. We wish to clarify this below:

      The reviewers stated that the idea proposed in our study that the "drift rate corresponds to signal-to-noise ratio" is a quite accepted one in DDM research, which typically assumes that the "within-trial noise" magnitude is fixed (and does not vary with condition), while drifts do. From this, it also followed that one of the models we examined (and rejected; our Model 2) appears to be a 'strawman', which one would NOT seriously consider.

      REPLY:

      This statement could be correct with regard to the DDM framework, within the domain of perceptual choice. However, we focused here on the DDM extension to value-based decisions, and we believe that the statement above is no longer accurate.

      1. Noise magnitude and within-trial sampling variability in value-based DDM

      Whereas perceptual choices are usually brief (often between 0.5 and 1 sec) and the stimuli they present are often static (lines or strings of letters in a lexical decision), value-based decisions take longer (typically around 2-5 sec), and the values for which they accumulate evidence are not “given” but rather need to be generated (sampled) during the decision itself. While all versions of the DDM include accumulation noise, the difference pointed out above has made its application to perceptual decisions assume that the "accumulation" term is constant and does not vary with task difficulty (this was also motivated by the attempt to minimize model parameters, so it was thought one could keep this parameter fixed). While this practice has been criticized (Donkin, Brown & Heathcote, 2009), the fact that the tasks involve short and roughly static stimuli (so the accumulation noise may be small compared with noise that appears between trials, when the same stimulus is presented again) has led most researchers to either assume the accumulation noise is fixed or to neglect it altogether (in favor of between-trial noise; LBA model).

      The first application of a sequential sampling model to value-based decisions was the decision-field theory (DFT) model (e.g., Busemeyer & Diederich, 2002; Hotaling & Busemeyer, 2012). In this model, accumulation is driven by attentional switches between dimensions that are relevant to distinguish the stimuli, resulting in an explicitly noisy accumulation. More recent applications of the DDM to value-based decisions (e.g., Krajbich et al., 2010; Tajima et al., 2016) are consistent with this idea. For example, as described by Tajima et al., the values of the alternatives are not "known" by the subject (even if the alternatives are in full view), but rather they are sampled from a distribution whose width corresponds to their previous experience or knowledge of the alternatives. Thus, in this framework, the within-trial accumulation becomes an intrinsically noisy process. Moreover, as mathematically proved in the DFT model, the within-trial accumulation noise is determined by the variance of the sampled values. As long as it was assumed that the distributions of rewards associated with each alternative had equal width, it was possible to assume that the noise term was constant. Since we now know that alternatives vary not only in their attractiveness rating, but also in the certainty about such ratings, the most natural assumption is that subjects accumulate value “evidence” by sampling from Gaussian distributions whose means correspond to the options’ value ratings and whose variances correspond to the options’ value uncertainties. This leads directly to our Model 2.

      We understand that this model cannot account for the observed data, and in this sense it is not a true contender. However, given the theoretical rationale above, we believe that showing this explicitly should have (at least) a didactical value for the readers of this literature, who want to understand how certainty should be addressed. Obviously, our results support an alternative model in which the drift of the accumulation process (and not the noise) is affected by the certainty of the alternatives. While this is consistent with what the reviewers believe to be expected, in our reading of the value-based decision literature, we did not find any model in which this was explicitly stated or tested. We believe that these results will motivate further investigation into the mechanism that generates this "normalization" (we aim to discuss a few options in our Discussion section).
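
      The contrast between the two accounts can be illustrated with a small simulation. The sketch below is not the fitted model from the manuscript: the function names, the threshold, the time step, and both parameter mappings are assumptions chosen only for illustration. In the noise variant, the options' uncertainties set the accumulation noise (in the spirit of Model 2); in the drift variant, the same uncertainties scale down the drift while the noise stays fixed.

```python
import math
import random


def simulate_trial(drift, noise_sd, threshold=1.0, dt=0.01, max_time=10.0, rng=random):
    """Simulate one two-boundary diffusion trial with Euler-Maruyama steps.

    Returns (choice, rt): choice is +1 or -1 for the upper or lower bound,
    or 0 if neither bound is reached before max_time.
    """
    x, t = 0.0, 0.0
    step_sd = noise_sd * math.sqrt(dt)  # diffusion noise scales with sqrt(dt)
    while t < max_time:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
        if x >= threshold:
            return 1, t
        if x <= -threshold:
            return -1, t
    return 0, max_time


def noise_variant(v1, v2, u1, u2, k=1.0):
    """Assumed mapping: uncertainties u1, u2 set the accumulation noise."""
    # Standard deviation of a difference of two independent Gaussian samples.
    return k * (v1 - v2), math.sqrt(u1 ** 2 + u2 ** 2)


def drift_variant(v1, v2, u1, u2, k=1.0, noise_sd=1.0):
    """Assumed mapping: uncertainties shrink the drift; noise stays fixed."""
    return k * (v1 - v2) / math.sqrt(u1 ** 2 + u2 ** 2), noise_sd


def mean_rt(drift, noise_sd, n_trials=1000, rng=random):
    """Average decision time over n_trials simulated trials."""
    total = sum(simulate_trial(drift, noise_sd, rng=rng)[1] for _ in range(n_trials))
    return total / n_trials
```

      Under these assumed settings, raising the uncertainty of both options tends to shorten mean decision time in the noise variant (the bound is hit sooner by chance) and to lengthen it in the drift variant, which is one way the two accounts can be told apart in response-time data.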

      2. More detailed DDM explorations for the certainty effect

      We agree that a more detailed investigation of variants of our Model 4 would be informative. Both reviewers have provided very helpful and relevant suggestions, which we have addressed in our revised manuscript.

      For example, we examined a variant of Model 4 in which the drift decrement with uncertainty is non-linear (we introduced an exponent to characterize this). The model fitting results show that, indeed, this model flexibility is beneficial, resulting in better fits (even after accounting for the cost of the added flexibility). While the average exponent is close to 1 (the average across the group is 0.85), there is significant variability between subjects, resulting in improved data fits. We also carried out a median-split analysis based on the certainty of the options, in which we allowed both the drift and the accumulation noise to vary with certainty. The results were consistent with our previous conclusions, showing that certainty affects the drift but not the accumulation variability. While this may go beyond the scope of the present paper, we will discuss potential mechanisms that might cause these results.
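
      One possible way to write such a non-linear dependence is sketched below. The functional form, the function name `drift_rate`, the scaling constant `k`, and the convention that certainty lies between 0 and 1 are all assumptions made for illustration; the parameterization actually fitted in the revised manuscript may differ.

```python
def drift_rate(value_diff, certainty, k=1.0, gamma=1.0):
    """Hypothetical drift: value difference scaled by certainty ** gamma.

    gamma = 1 gives a linear dependence on certainty; gamma < 1 makes the
    drift fall off less steeply as certainty drops below 1.
    """
    if certainty < 0:
        raise ValueError("certainty must be non-negative")
    return k * value_diff * certainty ** gamma
```

      With this form, an exponent of 0.85 yields a slightly larger drift than the linear case for any certainty strictly between 0 and 1, and the two coincide when certainty equals 1.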

      References:

      Busemeyer, J. R., & Diederich, A. (2002). Survey of decision field theory. Mathematical Social Sciences, 43(3), 345-370.

      Donkin, C., Brown, S. D., & Heathcote, A. (2009). The overconstraint of response time models: Rethinking the scaling problem. Psychonomic Bulletin & Review, 16(6), 1129-1135.

      Hotaling, J. M., & Busemeyer, J. R. (2012). DFT-D: A cognitive-dynamical model of dynamic decision making. Synthese, 189(1), 67-80.

      Krajbich, I., Armel, C., & Rangel, A. (2010). Visual fixations and the computation and comparison of value in simple choice. Nature neuroscience, 13(10), 1292-1298.

      Tajima, S., Drugowitsch, J., & Pouget, A. (2016). Optimal policy for value-based decision-making. Nature communications, 7(1), 1-12.

    1. Author Response

      Author Response refers to a revised version of the manuscript, Version 2, which was posted on December 17, 2020 (https://doi.org/10.1101/2020.08.28.271643).

      Summary: This is a very interesting study addressing the question of microtubule cytoskeleton reorganization in the immunological synapse. Specifically, the work demonstrates the contribution of KIF21B to the control of the T cell microtubule (MT) network required for T cell polarization during immunological synapse formation. The authors use a variety of microscopy techniques, including expansion microscopy, controlled perturbations of the cell, and computer simulations to generate their results. The authors show that knockout of KIF21B results in longer MTs that result in an inability to polarise the MT network by a mechanism consistent with dynein motor function at the immunological synapse to capture long MTs and center the MT aster at the synapse. They use the Jurkat cell line, which is a classical model for this step in immune synapse function and fully appropriate. They show that KIF21B-GFP can rescue the knockout phenotype and then use this as a way to follow KIF21B dynamics in the Jurkat cells. KIF21B works by inducing pausing and catastrophe; thus, MTs are shorter when it is present. They also rescue the defect in the KIF21B KOs with 0.5 nM vinblastine, which directly increases catastrophes, shortens the MTs and restores MT network polarization to the synapse. As a functional surrogate they investigate lysosome positioning at the synapse, which is one of the proposed functions of this cytoskeletal polarization. The use of expansion microscopy in this system is relatively new and clearly very powerful. The modelling component adds to the story and supports the sliding model proposed by Poenie and colleagues in 2006, but cannot rule out a component of end capture and shrinkage as proposed by Hammer and colleagues more recently. Experiments and modelling are performed to a high standard and the results advance the field.

      We thank the reviewers for their thoughtful and constructive suggestions, and for the positive feedback.

      Reviewer #1:

      This is an excellent study of centrosome polarization in the process of establishing immunological synapse and the effect of kinesin-4 on this process. The authors use a variety of microscopy techniques and controlled perturbations of the cell to obtain beautiful images that clearly suggest that kinesin-4, by increasing frequency of pauses and subsequent MT catastrophes, limits MT length, which assists dynein pulling in polarizing the centrosome. They complement the experiments with modeling based on Cytosim; the model supports the conclusions from the data, and suggests some interesting ideas.

      I am not an expert in experimental techniques, though I understand what's been done, and in my limited opinion, the results are first-rate. The paper is well written and accurate. Modeling, which I know intimately, is done very well. I have just a few minor comments:

      1) I was not quite clear what the modeling says about the centrosome sometimes being in apical position, and sometimes half-way between apical and basal positions.

      The model predicts the centrosome to be either in an apical or a basal position, while in the experimental data from the KIF21B knockout cells, it can be polarized halfway. Our results indicate that in the knockout cells, the MT network is under a constant force pointed towards the synapse. This force can lead to major deformations of the nucleus and the centrosome can indent the nucleus. This indentation allows the centrosome to be located at a position half-way between an apical and a basal position. In our simulations, we assume that the nucleus is relatively stiff and cannot change size or shape. Therefore, we only find centrosomes at the apical or basal side. To clarify this point, we added a text to the 6th paragraph of the Discussion:

      “Our simulations suggest that when centrosome translocation is impaired, the MT network is experiencing balanced forces. As a consequence, we predict that in these situations one would observe major deformations of the nucleus because it is trapped in a contracting cage of MTs spanning between the centrosome and the synapse. These deformations could also allow the centrosome to be located half-way between an apical and a basal position of the cell (Figure 4H). In our simulations, we assume a relatively stiff nucleus and therefore we only find the centrosome in an apical or basal position. It could be also possible that nuclear deformations push MTs towards the synapse, where they form dense peripheral MT bundles to accommodate the least curvature (Figure 2A and B).”

      2) I understand that 2d modeling cannot address this issue explicitly, but can the authors speculate about the apparent ring of MTs along the periphery of the synapse in the non-polarized case?

      The MTs in the non-polarized case of some of the panels in Figure 2 and S2B are densely located along the periphery of the synapse. This could indicate that dynein-mediated force generation actively binds these MTs to the synapse plane through multiple motors. Another option could be that these systems are force-balanced, and thus the nucleus is experiencing a downward force. The deformable nucleus would then push all surrounding MTs down into the synapse plane as well, creating this phenomenon of MT alignment along the synapse plane. From our current data, we cannot distinguish the two processes. However, we added a text on the deformability of the nucleus to the 6th paragraph of the Discussion (page 19 of the revised paper):

      “Our simulations suggest that when centrosome translocation is impaired, the MT network is experiencing balanced forces. As a consequence, we predict that in these situations one would observe major deformations of the nucleus because it is trapped in a contracting cage of MTs spanning between the centrosome and the synapse. These deformations could also allow the centrosome to be located half-way between an apical and a basal position of the cell (Figure 4H). In our simulations, we assume a relatively stiff nucleus and therefore we only find the centrosome in an apical or basal position. It could be also possible that nuclear deformations push MTs towards the synapse, where they form dense peripheral MT bundles to accommodate the least curvature (Figure 2A and B).”

      3) My perhaps most significant comment: the model nicely integrates and explains the data, but is it predictive? A detailed model like that clearly can generate some nontrivial prediction that could be experimentally tested.

      As recognized by the reviewer, the main focus of our model was to “integrate and explain the data”. Nonetheless, we can draw at least two nontrivial predictions from the model. A strong prediction with important consequences is the length regulation of MTs by only a small number of KIF21B molecules. This length regulation mechanism could be tested in a reconstituted in vitro system in which the dependence on the number of KIF21B molecules can be systematically changed, or by exact quantification of KIF21B units through fluorescent labeling. This prediction could also potentially be tested in vivo, by the rescue of KIF21B knockout with KIF21B-GFP at different expression levels. However, these experimental validations of the small number of involved KIF21B molecules are very laborious and beyond the scope of this study. The second prediction is related to the KIF21B knockout system. In such a system the centrosome is not repositioned to the synapse. Our simulations suggest that in this case, the MT network is under constant force, but not able to rearrange. Therefore, we predict strong deformations of the nucleus by the MT network. However, we did not directly investigate such deformations in our simulations in which the nucleus is a rather stiff object. To emphasize the predictions from our model, we added the following text in the 4th paragraph of the Discussion (see above).

      4) "Interestingly, in our simulations, a small number of KIF21B motors was sufficient to prevent the overgrowth of the MT network." - this is a bit counter-intuitive: if the motor number is less than MT number, how would this work? Or, by a "small number of KIF21B motors" you mean still greater than ~ 100?

      We agree with the referee that at first sight, it may seem counterintuitive that 10 KIF21B motors can regulate 100 MTs. The key is to realize that length regulation by KIF21B is a very dynamic process. The motor binds to an MT, induces its shrinkage, detaches, and is ready to bind to a different MT. If this cycle takes about 10 s, 10 motors can induce shrinkage of 100 MTs in about 100 s. A single motor molecule can thus initiate shrinkage of several different MTs within a short time. To clarify this point, we added text as explained above in the answer to the second major concern raised by the reviewers.
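
      The estimate above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the assumptions that each motor handles one MT per cycle, is never idle, and visits each MT once are ours and are not part of the simulations.

```python
import math


def time_to_cover(n_mts: int, n_motors: int, cycle_s: float) -> float:
    """Rough time (in seconds) for a pool of motors to visit every MT once.

    Hypothetical helper: assumes each motor engages a single MT per cycle
    (bind, induce shrinkage, detach) and starts on a new MT immediately.
    """
    rounds = math.ceil(n_mts / n_motors)  # cycles each motor must complete
    return rounds * cycle_s


# Numbers quoted in the response: 10 motors, 100 MTs, about 10 s per cycle.
print(time_to_cover(n_mts=100, n_motors=10, cycle_s=10.0))  # 100.0
```

      With these numbers, each motor completes 10 cycles, which gives the quoted figure of about 100 s.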

    1. Reviewer #3:

      General assessment:

      This paper applies a sophisticated psychophysical paradigm to assess the effect of prior choices on perceptual decisions in a group of 17 high-functioning (but not mild cases) children and teenagers (8-17 years) with ASD. Using a model that is assumed to dissociate the contribution of prior stimuli and choices, the study found a strong effect of prior choices, not stimuli, which is stronger in ASD than in controls. Similar results from another data set are also reported. No convincing evidence was found for a correlation between the effect of the priors and ASD severity.

      Overall, this is an impressive study with a sophisticated paradigm, elaborate data analysis, ASD participants who were tested on a large battery, in-depth analysis of the literature with interesting insights, convincing results (but see below) and a well written manuscript.

      Major issues:

      1) The finding from the model that the prior stimuli did not have a positive impact (and even negative) on the decision bias is counter-intuitive and needs explanation (I apologize if there is one and I missed it). There were typically 5 prior trials, ~4 of them on one side, e.g. right, resulting in a higher rate of right presses on the test (because the test was unbiased, and the results showed a bias). Assuming the prior trials were mostly replied correctly, there should be a correlation between the stimuli and the choices. I see 2 possible reasons why the model produced negative weights - one is that indeed the choices were different from the stimuli, in which case we need to know the performance of the participants on the prior trials (which would be useful anyway). The other possibility is that the choices for the model were binary and the stimuli were continuous. If the stimuli had been coded as binary, it would have been difficult to dissociate between the stimuli and the choices. In this case, the conclusion should be that the prior stimulus laterality could have impacted the test choices, but not their magnitude. This issue should be explained in the text.

      2) The performance on the test trials staircase procedure is not reported, only the PSE difference. It would be useful to know if the groups differed on this, as the example psychometric curves shown seem shallower in ASD. Biases are likely to push the staircase procedure to higher laterality discrimination thresholds. I suspect (but without proof) that worse performance (more errors) on the staircase procedure may amplify (but not create) the bias. It would be useful to show the performance data and discuss this issue.

      3) The paradigm used is quite complex and complex paradigms are more difficult to fully understand, so I wonder about the justification for it. Why is it better or different from testing SDT shift of criterion by change in target probability? For example, in a Yes/No experiment for contrast detection set around 70% correct, the criterion may shift when there are more Yes or No trials. What would the authors expect in such an experiment? It would be useful to discuss this for the wondering reader.

      4) About the interpretation: the word "perseveration", i.e. a tendency to repeat the last key or recent keys, is not mentioned. The authors conducted a "response invariant" experiment which showed significant but much smaller biases (Figure 7). Are these significantly smaller than in the 1st experiment (as it seems from the plots)? If so, one cannot rule out a major contribution of repeating the recent keys, i.e. perseveration. It would be useful to see the raw data in this case, e.g. the percentage of trials with a right press when the priors were biased to the right. My understanding is that it must be high given that the staircase was symmetric (50/50 trials on left and right) and that a bias emerged from the data.

      5) I wonder if the data could be analyzed to reveal the different contributions of preceding trials, i.e. the details of the serial dependency. Currently, all previous trials are treated equally in the model, but their contributions are not necessarily equal.
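
      To make the comparison raised in issue 3 concrete, the criterion shift in a Yes/No detection task can be quantified with the standard equal-variance SDT measures, d' = z(H) - z(F) and c = -(z(H) + z(F))/2. The sketch below is only an illustration: the function name and the hit and false-alarm rates are made-up values, not data from the manuscript.

```python
from statistics import NormalDist


def sdt_measures(hit_rate: float, fa_rate: float) -> tuple[float, float]:
    """Return (d_prime, criterion) under the equal-variance Gaussian SDT model."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Illustrative (assumed) rates: a balanced block vs. a block with more "Yes" trials.
balanced = sdt_measures(hit_rate=0.70, fa_rate=0.30)
yes_heavy = sdt_measures(hit_rate=0.80, fa_rate=0.40)
print("balanced  d', c:", balanced)
print("yes-heavy d', c:", yes_heavy)
```

      In this illustration, a criterion that moves below zero while d' stays roughly constant indicates a bias toward "Yes" responses without a change in sensitivity.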

    2. Reviewer #2:

      General assessment and major comments:

      The study addresses a timely and important question of the role of potential modulations in perceptual decision-making in the atypicalities observed in perceptual processing of individuals diagnosed with autism. The manuscript is important, and the methods used are sound.

      There are however some issues to consider:

      Thresholds, or other indications of sensitivity and precision of performance in the task, are not detailed (although judging by the individual psychometric functions presented in the figures, slopes seem less steep in ASD). Was sensitivity considered in any way in the analysis? I wonder what the model fitting would look like and how it would interact with the biases. Bias magnitude could vary as a function of noise or sensitivity.

      Also, could larger consistency bias in the ASD group result from weaker performance, more lapses of attention etc.?

      Age range is quite large. Did you check for age-related differences? I understand the sample size is not big enough to analyze data across different age groups but maybe as a covariate? (there is also the problematic issue of determining sample size of children based on the study in young adults).

      Not sure why the effects of prior stimuli are considered adaptation effects, particularly in the first experiment where stimuli were briefly presented. Also, regarding the argument in the Introduction about Bayesian priors producing positive effects -- there are other prior effects that may cause 'negative effects' in relation to prior expectations (for example, in perceptual illusions such as the weight-brightness illusion).

      Can you think of a reason why controls did not show significant consistency bias in their responses in the heading discrimination?

      There is some wording in the reports of the statistics such as 'more significant' or 'more marginally' that needs to be rephrased.

      Were the analyses corrected for multiple comparisons?

      Usually, RTs in this sort of perceptual task are longer in ASD. I wonder why this is not the case here, although the instructions for the subjects emphasize speed and accuracy.

      I agree with the authors. It is interesting to look at correlations between the effects of prior choices and clinical scores of repetitiveness and flexibility in ASD. Did you look at the correlation between the effects of prior choices and SCQ scores across the two groups? Previous work documenting correlation between autistic traits (AQ) and modulated perception provided important information about the generalization of the findings to the broader spectrum of autism in the wider, nonclinical population (see Lawson, Mathys, & Rees, 2017; Hadad, Scwartz, & Binur, 2019).

    3. Reviewer #1:

      General Assessment:

      I found the studies to be well motivated and thoughtfully designed to disentangle competing interpretations in the extant literature on visual perception in ASD. The first two experiments provided compelling evidence that prior choices affect perceptual decision making in ASD, but the outcome of the response invariant condition suggests that the authors' interpretation goes beyond the data.

      Substantive Concerns:

      "In summary, we found here that individuals with ASD demonstrated an increased influence of recent prior choices on perceptual decisions (vs. controls),..." is the major finding in the paper, quoted here from the concluding paragraph. It seems, however, that the data support a narrower (and potentially less interesting) conclusion: that individuals with ASD demonstrated an increased influence of recent button presses/motor responses, as the finding which forms the basis of the summary went away when different keys were used to report prior vs. test responses (i.e., in the response invariant condition). I understand that the authors present these data as challenges to theories of attenuated priors in ASD, but they seem to sidestep the issue that these data make their general conclusion more complicated.

      For completeness, it would be helpful to present some information on the stimulus values for the test stimuli, as these were set individually using a staircase. Where did these staircases converge? Were there group differences?

    4. Summary: The study addresses a timely and important question of the role of prior choices in perceptual decisions in individuals diagnosed with autism, testing 17 high-functioning (but not mild cases) children and teenagers (8-17 years) with ASD. The experiments are well motivated and thoughtfully designed. Using a model to dissociate the contribution of prior stimuli and choices, the authors found a strong effect of prior choices, not stimuli, which is stronger in ASD than in controls. Similar results from another data set are also reported.

      Overall, this is a strong study with a sophisticated protocol, elaborate data analysis, ASD participants who were tested on a large battery, in-depth analysis of the literature with interesting insights, interesting results and a well written manuscript.

      The first two experiments provided compelling evidence that prior choices affect perceptual decision making in ASD, but the outcome of the response invariant condition suggests that the authors' interpretation goes beyond the data. This has serious implications for the interpretations of the findings. Also, the bias interpretation should be informed by measures of performance.

    1. Reviewer #2:

      The topic of this manuscript is the basis of continuous and episodic bursting electrical activity in developing spinal cords. The approach used is to employ a simple mathematical model as a representation of the central pattern generator underlying the bursting pattern, and examine how the properties of bursting change with variation in three key system parameters. Some of the model predictions are tested in an actual in vitro spinal cord preparation. Although I enjoyed reading the manuscript, I have some serious concerns about the model that is employed, which I discuss below.

      Major concerns:

      1) The model is a half-center oscillator (HCO) in which one cell inhibits the other, resulting in anti-phasic electrical activity of the two cells. (Each "cell" actually represents a cell population, so the model is a mean field model.) This is certainly one way to get electrical bursts. However, it is not at all clear that such a HCO structure exists in the developing spinal cord, or that there are neural populations with this anti-phasic activity. If such data exists, it is not mentioned in the paper or cited. Indeed, the recordings in Supp. Fig. 1 show extracellular neurogram recordings from ventral roots in different lumbar segments and in which the bursting appears to be synchronous. So I see no evidence that the HCO model reflects the actual neural circuit, other than the fact that it can produce bursting and episodic bursting. This does not mean that such a phenomenological model is without value, but it should be made clear to the reader that that is what the model is. Also, the next two points below do appear to cast doubt on the utility of this model.

      2) In Fig. 3 it is shown that the inter-episode interval (IEI) is increased in the model when the conductance g_h is reduced. Because of this, the episode period (EP) also increases. The data, also in Fig. 3, show the opposite. They show that blocking the h-type current decreases the EP. This seems like a flaw in the model, since it is the h-type current that is responsible for episode production (at least I think it is, see point 4 below). The discrepancy is mentioned in the manuscript, but only briefly and it should be fully addressed.

      3) In Fig. 5 it is shown that, in the model, there is a very small interval of g_NaP where episodic bursting is produced. Otherwise, the model produces continuous bursting (for larger g_NaP values) or silent cells (for smaller g_NaP values). However, the data that is also shown in the figure indicates that blocking the NaP channels has little effect on episodic bursting. This is another serious discrepancy between the model and the experimental data.

      Points for clarification:

      4) It appears from Fig. 1 that episodes stop when h-type current activation slowly moves to an insufficient level to kick off a new burst. Logically, a new episode would start once that activation grows back to a sufficiently large value. Is this right? The mechanism for episode production is never discussed, and it should be.

      5) The model is deterministic, yet there is variation in burst duration and episode duration (see Fig. 3). What is the source of the variation? Does this mean that the episodes are not periodic?

      6) The model has a multistable region in parameter space, and much is made of this in the Results and the Discussion. In Fig. 6, it was demonstrated that hyperpolarizing pulses could switch the system from one behavior to another. Can this be done experimentally in the in vitro prep? If so, was it tried?

      Other:

      7) Discussion is too long and touches on things that were far from the focus of the manuscript. For example, there is about a page and a half of text discussing short term motor memory (STMM) although the Results section did not focus at all on homeostatic functions of the circuit or STMM. Furthermore, some points were made several times during the Discussion, where one time would have been sufficient.

      8) Almost two pages of the Discussion were dedicated to multistable zones, yet in the model the multistable zone was tiny, and there was no evidence that the experimental prep lies in or near that zone. The authors state that in actual neural circuitry there could be a much larger multistable zone, which is true, but there also may be none at all. This discussion appears irrelevant.

    2. Reviewer #1:

      The present paper addresses the very topical problem of understanding of dynamic switching in central pattern generators. The paper investigates switching between bursting and spiking modes in spinal cord neurons. This is modelled using a multichannel HCO that identifies narrow regions in parameters where the system is bistable. It is argued that neurotransmitters drive invertebrate CPGs to favourable bistable regimes that allow rapid switching from one oscillatory state to another (e.g. foraging to escape) to be enacted by fast electrical stimuli. The paper is generally well-written and does a good job at interpreting observations.

      I have two major comments:

      1) The authors seem to ignore the switching between phasic and antiphasic oscillatory states, even though this is shown in Fig.1, and more generally between the polyrhythms that would occur in larger inhibitory networks. The latter switching may be at least as relevant to gait generation as the switching from bursting to spiking. Polyrhythms have also been shown experimentally and theoretically to produce robust multistable states that overlap over a wide parameter space. It would therefore be useful if the authors could comment on the relative robustness of spiking/bursting multistability vs polyrhythm multistability.

      2) It is argued that a hyperpolarizing Ip pulse will induce a transition from continuous spiking to bursting and, conversely, that a depolarizing pulse induces the reverse transition from bursting to continuous spiking. Transitions are a dynamic process which will depend, among other things, on the timing at which the pulse is applied during the heteroclinic cycle. In the absence of more information on the dynamics of the system, such claims look over-simplistic.

    3. Summary: Both reviewers found that the analysis of data was too shallow and that the HCO model was insufficiently justified in the context of spinal cord CPGs. The reviewers argue that a more robust analysis including a discussion of the dynamic properties of the model (in the context of dynamic switching) was needed to support conclusions.

    1. Reviewer #3:

      General Assessment:

      The manuscript is well written and the methods are sound. The strengths of this manuscript are that this study is the first to systematically perform detailed electrophysiological measurements on inhibitory interneurons (INTs), in particular RC and non-RC INTs using the SOD1 mouse model for ALS. It is very interesting that they showed a dichotomy between reduced excitability in RC neurons (which could lead to an indirect increase in overall excitability of MNs) and non-RC INTs, which actually showed an increase in excitability which would have the opposite effect on MNs.

      Main comments:

      1) Most electrophysiological studies have focused on motor neurons and showed that they become hyperexcitable at very young ages, although there is controversy as to whether the hyperexcitability persists and is causative or compensatory to disease progression.

      2) The dichotomy observed between RC and non-RC inhibitory neurons is interesting. Given that many of the glycinergic non-RC interneurons are Ia-inhibitory interneurons responsible for reciprocal inhibition, their effects on the target motor neurons have opposite effects on MN excitability. At this point it is mere speculation as to how these changes actually exacerbate the progression of the disease and affect circuit function.

      3) This paper is mainly descriptive, with no specific hypothesis other than what has often been discussed in the literature: motor neuron hyperexcitability arises from intrinsic alterations in MN ion channels, increased excitatory synaptic activity, a decrease in inhibitory activity, or all of the above. Although the authors are most likely the first to demonstrate changes in inhibitory interneuron excitability with direct electrophysiological recordings, it is unlikely that these findings will significantly move the field forward at present. The authors suggest that biomarkers could be developed; this is just a broad statement without a concrete proposal for implementation. It would be useful to show a specific target that could be modified pharmacologically in animals over time to see if this changes the progression/survivability of the ALS animals.

      4) Furthermore, the functional significance of early hyperexcitability as either a cause or a compensation of ALS is controversial at present. Numerous studies have addressed hyperexcitability, yet we are still far from understanding the bases for this disease, and one cannot help but question whether this avenue of investigation is fruitful.

      5) Does this change in interneuron excitability and the dichotomy between RC and non-RC demonstrated persist over the course of the disease? How relevant are these changes to disease progression?

      6) It will be necessary to use other available animal models for comparison, since SOD1, although historically a well-studied mouse model, is an ectopic overexpresser and does not represent the predominant mechanism for ALS in humans. There are other, probably more pertinent, models, e.g. C9ORF72. Whether such changes in inhibitory interneurons occur in those other models and in humans remains to be determined.

    2. Reviewer #2:

      Amyotrophic lateral sclerosis (ALS) used to be considered primarily a disease of motoneurons. Recent work using mouse models of ALS has revealed that pathological changes can also be detected in spinal interneurons, particularly inhibitory interneurons, and that some of these changes can be detected before birth. The present paper is the first to directly examine the electrical properties of spinal inhibitory interneurons in a mouse model of ALS and show that some of these are altered in the neonatal period well before the mice start to exhibit symptoms. The authors show that SOD1 Lamina IX neurons are smaller than the Lamina IX WT neurons whereas no differences were found between WT and SOD1 neurons outside Lamina IX. They also use whole cell recordings to reveal that putative 'Renshaw cells' are less excitable in SOD1 mice than wild type animals whereas non-Renshaw inhibitory SOD1 neurons are more excitable.

      Major Comments

      1) The authors claim that Renshaw cells are in lamina IX, when they have been shown to be located mostly in the ventral part of lamina VII, ventromedial to the motor nucleus (Alvarez and Fyffe, 2007). In addition, not all calbindin+ neurons in lamina VII are Renshaw cells. From the location of the whole cell recordings shown in fig.2, it seems likely that most of the recorded neurons are not Renshaw cells because they are outside the classical 'Renshaw' area. It is not clear why the authors are focusing on glycinergic neurons in lamina IX, as there is no evidence that they belong to a unique class or that they are presynaptic to motoneurons.

      2) The concern about the identity of the Renshaw cells obviously undermines the statistical modeling to segregate Renshaw versus non-Renshaw cells. Furthermore, it was not clear from the text whether the model used both WT and SOD1 calbindin-positive neurons to define 'Renshaw cells'. Assuming it did, and given that there were changes in the electrical and morphological properties of the calbindin+ SOD1 neurons, is it not surprising that they could be grouped with the WT 'Renshaw cells'?

      In addition, the characteristics of the 'Renshaw cell' population used for the model are not clear. Line 186 states that 15/23 of the whole cell recorded interneurons were positive for calbindin. Does this refer to 15 WT and 23 SOD1 neurons? If so, 38 neurons were calbindin positive. Of the remaining 21 neurons, how many were calbindin-negative and how many were not tested? How many of the 38 calbindin-positive neurons had their dendrites reconstructed sufficiently from the intracellular fill to be used in the model? The model predicted that 80% of the 59 patched interneurons were Renshaw cells. How many of these were in the calbindin-negative group and how many were in the not-tested group? The spatial distribution of these groups should also be plotted. However, it seems very unlikely that 80% of the recorded cells are Renshaw cells based on their location as shown in fig.2B.

      Second, it would have been useful to apply the model to known non-Renshaw cells, to establish that it was not generating too many false positives. Another way the authors could test the model is to establish whether it could distinguish WT and SOD1 neurons based on their morphology.

      3) The authors suggest that the reduced excitability of 'Renshaw cells' might contribute to the excitability changes seen in motoneurons. However, based on their own data, this is not a straightforward conclusion. They find that 'non-Renshaw cells' are hyperexcitable and since this population would include 1a inhibitory interneurons and other premotor inhibitory interneurons, it is not clear what the overall effect on motoneuron excitability would be. Additionally, because the authors suggest that 'Renshaw cells' are less excitable this would presumably lead to reduced inhibition of 1a inhibitory interneurons counteracting a potential loss of inhibition onto motoneurons from Renshaw cells.

    3. Reviewer #1:

      In this study, the authors investigated whether the morphological and electrophysiological properties of glycinergic interneurons in the spinal ventral horn of GlyT2eGFP SOD1 G93A mice are altered compared with GlyT2eGFP WT mice at P6-P10 (the SOD1 G93A mouse is the classic mouse model of amyotrophic lateral sclerosis). Such an investigation has never previously been done. The main body of results relies on a sample of 34 WT and 25 SOD1 patched interneurons located throughout the ventral horn. The authors found that soma sizes of patched interneurons do not differ significantly between SOD1 and WT animals but that their dendrites are larger in SOD1 animals. The onset and the peak of persistent inward currents (PICs) are more depolarized in SOD1 interneurons, suggesting that they are less excitable than in WT. Immunohistochemistry for calbindin was performed in a subset of the patched interneurons to identify Renshaw cells (7 cells in WT animals and 6 cells in SOD1 animals). Calbindin-positive cells display more depolarized PIC onset and peak in SOD1 than in WT animals. A predictive statistical analysis was then performed in order to include in the Renshaw cell sample cells that were not tested for calbindin. This analysis suggested that the predicted Renshaw cells are less excitable in SOD1 mice than in WT mice whereas the predicted non-Renshaw cells are more excitable. The implications of these findings for ALS pathophysiology are discussed.

      However, a number of major concerns substantially weaken the findings:

      1) Morphological properties. Texas red allowed the authors to localize the patched cells in the ventral horn, to measure the soma and the dendrites and to investigate whether the patched cells were immunopositive for calbindin. It appears that the soma volumes of the patched neurons are on average 2-3 times larger than the soma volumes of the general population of GlyT2-GFP neurons in the ventral horn or in lamina IX (Table 1). No explanation is provided for this discrepancy. Does it mean that there is a systematic recording bias towards the largest interneurons? Alternatively, is there a systematic swelling of the patched cells or a shrinking in the fixed spinal sections? Also, it is not clear what the dendritic parameters are. It is necessary to show a reconstruction of dendrites in Figure 2 in order to clarify which dendritic length, surface and volume are reported in Table 1.

      2) Electrophysiological properties. The shift in the onset of the persistent inward currents (PICs) is taken as an important indicator of reduced excitability in SOD1 interneurons. However, the measurement of the PIC onset is problematic. It is claimed in the Material and Methods section that "PIC onset was defined as the voltage at which the current began to deviate from the horizontal, leak substracted trace" (lines 374-375), which seems reasonable. However, in Figure 3A, the arrow for the PIC in the SOD1 motor neuron (red trace) does not point to the initial deviation from the horizontal, which actually occurred at about -60 mV, i.e. close to the PIC onset for the WT motoneuron (blue trace), in contradiction with the authors' claim. The arrow points to a second component whose onset appears at a more depolarized voltage. The net current is therefore likely to be complex, and a pharmacological dissection of the currents at work is required in both WT and SOD1 neurons. Indeed, the net inward current might result from the summation of inward and outward currents. Are there outward currents at work? Are the inward currents Na+ or Ca++ currents? In the absence of such a pharmacological "dissection" it is difficult to fully interpret the data.

      3) Identification of the Renshaw cells. The authors identified a subset of GlyT2 neurons as Renshaw cells because they expressed Calbindin-D-28K. This sole criterion does not allow a proper identification of Renshaw cells, particularly in P6-P10 mice. Indeed, in addition to the Renshaw cells, many non-Renshaw cells in the ventral horn are calbindin-immunopositive during this post-natal maturation period (Siembab et al, J Comp Neurol, 2010). One distinguishing feature of Renshaw cells is that they are excited by recurrent motor axon collaterals. The presence of VAChT boutons on the GlyT2 cells would therefore have been an interesting additional identification criterion. However, motor axon terminals are not the only source of VAChT boutons in the spinal cord (Zagoraiou et al, Neuron 2009). Since this is an electrophysiological work, the authors had the possibility to unambiguously identify Renshaw cells by the presence of synaptic excitations in response to the stimulation of motor axons in a ventral rootlet (using oblique spinal cord slices, see for instance: Lamotte d'Incamps and Ascher, J Neurosci 2008; Bhumbra et al, J Neurosci 2014). The authors are advised to perform such an electrophysiological identification of Renshaw cells.

      4) Statistics and predictive model. The number of patched cells identified as "Renshaw cells" on the basis of their calbindin immunopositivity is low (7 WT and 6 SOD1). Indeed, I do not see any reason why the authors did not repeat the experiments in order to gather a more reasonable number of cells. Statistical analysis was performed on these small cell samples in order to investigate whether each property under investigation differs between WT and SOD1 animals, as reported in Table 3 (normality of the distribution was tested for each property and either ANOVA or Kruskal-Wallis analysis was performed). The validity of statistics on such small cell samples is questionable. The analysis was then extended to all patched cells using sophisticated random forest and principal components analysis in order to check whether some cells among those not tested for calbindin display enough similarities with the calbindin-positive cells to be considered as putative Renshaw cells. The model predicted that 80% of the 59 patched cells were "Renshaw cells", a percentage astonishingly larger than the percentage of calbindin-positive cells in the ventral horn (65%). This prediction is doubtful since the number of calbindin-positive cells is already higher at P6-P10 than the number of Renshaw cells (see point 3). Nevertheless, the authors performed statistics (not shown in the paper) on the basis of this prediction, and they found that the predicted Renshaw cells are less excitable in SOD1 mice than in WT mice whereas the predicted non-Renshaw cells are more excitable.

    4. Summary: In the present study, the authors searched for early signs (during the neonatal period) of amyotrophic lateral sclerosis (ALS), focusing on a specific class of spinal interneurons, i.e., glycinergic interneurons. In SOD1 mice, they aimed to test whether these inhibitory neurons exhibit measurable changes at a young age that could then contribute to the MN pathology known to develop later. The originality of this study is that, for the first time, it specifically examines inhibitory neurons. The authors investigated the morphological and electrophysiological properties of lumbar glycinergic interneurons in the spinal ventral horn in one SOD1 mouse model compared to WT mice at P6-P10. In addition, the authors more specifically considered Renshaw cells in this process and found that these cells were less excitable in SOD1 mice. Based on these experimental data, they created a statistical model to make predictions about Renshaw cells (and non-Renshaw cells, found to be more excitable in SOD1 mice) to further demonstrate that early changes in their excitability could account for the disease.

      Although this paper addresses the potential role in ALS of a previously uninvestigated class of neurons (inhibitory ones), the reviewers raised several concerns. First, there is a major problem with the identification of the Renshaw cells. The arguments based on localization within the ventral horn of the spinal cord, calbindin expression, cell size and cell number are questionable as applied here. In addition, because the characteristics of this cell type were later used for the predictive statistical model, this substantially weakens the validity of the model and the credibility of the conclusions reached. Finally, because of the problems described above, and because this paper is mainly descriptive and does not bring genuinely new hypotheses, it might not move the field of ALS significantly forward. Thus, the three reviewers and I agree that the paper would be better suited for a specialised audience, assuming the detailed comments about the methodology are addressed.

    1. Reviewer #4:

      General assessment of the work

      Gene drives can be used for sustainable control of disease vectors, and there is a need for different gene drive strategies that can be tailored to the particular species, timescale, and desired spatial spread. Kandul and colleagues present a welcome new addition to the growing number of strategies for gene drive, called HomeR, that combines elements of killer-rescue and homing-based drive to exert spatiotemporal control over its spread, whilst counteracting the rise of resistant mutations. Whilst it is extremely promising, some major claims of this manuscript are inaccurate or unsupported by the evidence. The authors could easily address the most important concerns by expanding their sequencing analysis to better detect and quantify resistant mutations, paying careful attention not to overstress the potential of this drive to mitigate resistance, and by comparing the relative strengths of different drive strategies instead of focussing only on features that are most flattering to the HomeR strategy.

      Numbered summary of any substantive concerns

      1) The drive release strategy of Fig 4A + 4C is primed to underestimate and potentially mask resistance. In Fig 4A, where the authors search for signs of resistance, the population was seeded with males that were all homozygous for the drive, meaning that 100% of their G0 progeny will inherit it. As the rate of homing is close to 99%, only a small fraction of their G1 could have inherited a non-drive (potentially resistant) allele. In a realistic release scenario, resistant alleles will have ample opportunity to be generated and subsequently selected. Though still far from adequate, resistance testing would have been better performed on samples collected from the lower frequency releases in panel C. This experiment should not be used to draw strong conclusions about resistance to pHomeR, but should be used to make broader observations regarding the spread and stability of the construct.
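
      The arithmetic behind this concern can be sketched as follows. This is a back-of-the-envelope illustration, not the authors' model: it assumes every G0 fly is a drive/wild-type heterozygote, that homing converts the wild-type allele at the same rate in both sexes (0.99, following the approximate figure above), random mating, and no fitness differences. The function and variable names are hypothetical.

```python
def non_drive_gamete_fraction(homing_rate):
    """Fraction of gametes from a drive/WT heterozygote that lack the drive.

    Half of the gametes start from the WT chromosome, and a fraction
    `homing_rate` of those is converted to the drive.
    """
    return 0.5 * (1.0 - homing_rate)

homing_rate = 0.99  # assumed value, "close to 99%" in the text
q = non_drive_gamete_fraction(homing_rate)

# Under random mating, G1 genotype frequencies follow from gamete frequencies.
g1_one_non_drive = 2 * q * (1 - q)  # one non-drive allele, still carries the drive
g1_no_drive = q * q                 # two non-drive alleles, no drive at all

print(f"non-drive gametes: {q:.4f}")
print(f"G1 with one non-drive allele: {g1_one_non_drive:.5f}")
print(f"G1 lacking the drive entirely: {g1_no_drive:.6f}")
```

      Under these assumptions, nearly all non-drive alleles in G1 sit in flies that also carry the drive, and flies with two non-drive alleles are rarer by more than two orders of magnitude.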

      2) The strategy for sampling resistance will obscure almost all resistance in the population, and would fail to detect even strong selection for it. Flies were only selected for resistance genotyping if they lacked GFP, meaning they carry two non-HomeR alleles (i.e. homozygous for the R1 allele or transheterozygous with another R1/R2/WT). One would expect most resistant alleles to be heterozygous in a population that was seeded with almost complete drive homozygosity. The authors could, and should, have done more to identify and quantify these. Amplicon sequencing was used to sample the full diversity of alleles in a larger pool of individuals (including GFP+ flies) collected at G10; why was this approach not used throughout? By adopting the approach earlier, they would have been able to track the changing frequencies of R1 and R2 alleles over time.

      3) The impression given in the figure and main text is that R1 alleles were rare (or entirely absent), when they were not. In spite of the incredible advantage given to the drive, and a bias in the sampling method that would mask the presence of resistant alleles, resistance was observed in every generation tested (G2, G3 and G10). The authors claim that because GFP-negative individuals were not observed in later generations, the resistant alleles had not come under positive selection. This logic is flawed, and indeed their own amplicon sequencing analysis performed on G10 flies revealed several resistant alleles, including an R1 present in 80% of non-drive alleles. The two most frequent mutant alleles detected were in frame, and I do not agree that these are likely to be deleterious recessive (as the authors speculated). These could be functionally resistant mutations. I believe there were many more R1 alleles in heterozygosity with the HomeR allele; these alleles could have been spreading, but were excluded from the genotyping analysis. Could individuals carrying these putative R1 alleles not have been specifically tested to see whether or not the alleles confer resistance?

      4) The modelling takes a very limited approach to comparing different drive strategies, and by comparing proof-of-principle designs, important differences are obscured. For example, simple modifications that would mitigate resistance are likely to be included in many designs - such as multiplexing gRNAs. The nuances of each design are lost in a discussion focused on the rate of spread, which is largely irrelevant now because all of the drives are predicted to spread well.

      5) The authors did not discuss the relevance of having performed releases in a population that was already homozygous for Cas9. Do the release experiments and model really suggest the drive could spread if released into an otherwise WT population? I'm not sure the data presented in this manuscript can support that claim.

    2. Reviewer #3:

      The authors are to be commended for the effort put into careful experimental design and clear presentation of methods and results.

      My main concern with the manuscript is that the claim about their specific polymerase gene being "ultraconserved" is not backed up with their own data or by citations from the literature. If the gene sequence were ultraconserved, I wouldn't have expected the authors to be able to do so much recoding of the gene without fitness consequences. Furthermore, it is clear that homozygous-viable NHEJ mutations did develop in the experiment. Without explanation, this seems to be a fatal flaw in the design.

      This manuscript describes a modification of the general homing gene drive concept by use of a split drive system that increases the frequency of a recoded polymerase gene that replaces a cleavage susceptible, naturally occurring, haplosufficient, conserved polymerase gene. This approach is taken in order to limit the evolution of cleavage resistance in the naturally occurring gene.

      I am not convinced that the research presented achieves the intended goals. I did a quick search for literature on the "ultraconserved" polymerase pol-y35 gene and could find none. I am not sure if the conservation is at the DNA sequence level or at the amino acid level. If at the amino acid level, then it makes sense that resistance alleles can form at the DNA level that don't impact the protein at all. Figure 2a shows the 22 and 27 recoded nucleotides for the two guide RNA sites. The authors say that these changes to the sequences didn't seem to impede fitness. Did the authors try many other recodings and finally decide on these because all others caused loss of fitness, or is it just that this gene is robust to substitutions even though the protein is conserved?

      Figure 4C shows that the frequency of flies with at least one copy of the pol-y35home R1 increased from about 25% to about 50% between the parental and F0 generation when there was no Cas9 present. As long as the transgenic males were competitive with the wild flies this makes sense, because the released flies were homozygous for that allele and the offspring should all have inherited one copy of the gene. What doesn't make sense is that, when the work was done with all flies harboring Cas9, the frequency of flies with the pol-y35home R1 increased less from the parental to the F0 generation than in the former case. In some replicates the frequency of such flies didn't increase at all. It should be noted that the parents were always homozygous. This certainly indicates a fitness cost to the flies with a combination of Cas9 and the homing construct.

      In this same figure, results from the model are plotted. It seems like the model assumes no fitness cost because it shows an exact increase from 25% to 50% flies carrying at least one copy of the pol-y35home R1 theoretical construct. In later generations the experimental results outperform the model. Presumably, this model is used to construct figure 6. This mismatch needs to be addressed in the manuscript.
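
      The no-fitness-cost expectation that the model appears to reproduce can be written out explicitly. The sketch below is illustrative only and is not the authors' model: it assumes an equal sex ratio, that homozygous transgenic males make up half of all males (so about 25% of the parental flies carry the allele), no homing (the no-Cas9 case), and a single hypothetical parameter, `relative_mating_success`, for the competitiveness of transgenic relative to wild-type males.

```python
def expected_f0_carrier_fraction(transgenic_male_fraction, relative_mating_success=1.0):
    """Expected fraction of F0 offspring carrying one copy of the allele.

    Every offspring of a homozygous transgenic father inherits one copy, so the
    carrier fraction equals the share of offspring sired by transgenic males.
    """
    weighted = transgenic_male_fraction * relative_mating_success
    return weighted / (weighted + (1.0 - transgenic_male_fraction))

def implied_mating_success(transgenic_male_fraction, observed_carrier_fraction):
    """Invert the expression above: competitiveness implied by an observed fraction."""
    f, p = transgenic_male_fraction, observed_carrier_fraction
    return p * (1.0 - f) / (f * (1.0 - p))

# Equal competitiveness: half of the males are transgenic, half of F0 are carriers.
print(expected_f0_carrier_fraction(0.5))
# Hypothetical observation of 60% carriers in F0, chosen only for illustration.
print(implied_mating_success(0.5, 0.6))
```

      Under these assumptions, any F0 carrier fraction that departs from the transgenic share of males implies unequal siring success or offspring survival, which is the kind of fitness term the comparison between model and data would need to account for.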

      The fact that in all three replicates of the experiment without Cas9, the F0 is above 50% indicates that something else may be going on that is unrelated to gene drive. It could be due to heterosis between the two slightly different strains of flies. When wildtype males mate with wildtype females, the offspring are more inbred than when a transgenic male mates with a wildtype female. Just a hypothesis.

    3. Reviewer #2:

      Kandul et al. present an interesting study that could lead to important improvements in the use of homing-based gene drives. However, there are a number of things that should be addressed to improve the manuscript for better comprehension by readers.

      Overall, the manuscript presents a large amount of data, but these data could be presented in a more digestible way. The authors should go over their manuscript with a reader in mind who is interested but does not necessarily know all the relevant literature in detail.

      Abstract (line 18): Please remove "inherently confinable" from the abstract. The drive is indeed built as a split drive; however, all the experiments were done in a homozygous Cas9 background. Therefore, no experimental data for a split drive are provided in this manuscript. The split arrangement seems to have been chosen more for the practical reason of being allowed to do the experiments in a less stringent laboratory environment. Thus there are no experimental data that would support the confinable nature of this drive, and there are not even modelling data on this. Such a statement should therefore not be put in the abstract. This manuscript is not a demonstration of a confinable drive.

      Results (line 124): How was Pol-gamma35 identified? It would be interesting for the reader to learn the exact reasoning behind the choice of this gene. Or were several genes tested, with this one turning out to work best or to be the easiest to design? These considerations could be very interesting and important to the field.

      Results (lines 147-148 Fig. 1B; lines 155-156 Fig. 1C) and Methods (lines 698 and 706) and Figure 1 (both Figure and legend): The references to the Figure panels in the text do not match the Figure! Has the Figure been rearranged without the text being updated? When referring to "B", the text is still about Act5C-Cas9, and the nos-Cas9 data are referred to in the text as Fig1C! But Fig1C is BLM! In the current panel Fig1B, what does "all" mean below the X-axis? This is not comprehensible. Panel C is not really described in the Figure legend!

      Results (line 253), Discussion (lines 526-527), and supplementary Figure 1 (line 1101). "converting recessive non-functional resistant alleles into dominant deleterious /lethal mutations" is completely misleading! There is no "conversion", and how would that be done molecularly? There is a continuous removal of such alleles from the population because of lethal transheterozygous conditions caused in the drive. However, there is no active conversion of such alleles into dominant lethal ones. This needs to be clearly rewritten to avoid the misleading idea. Supplementary Figure 1 also seems to have a slight conceptual problem. What are "cells" (rectangles) with a red frame and a green core? Green means at least one wt allele (this must include the recoded rescue allele!). Red means biallelic knock-out: thus a red cell cannot have a wt allele. Thus what is a red-framed green core cell? To explain the removal of R2 alleles, a depiction of yellow framed red core cells in the germ line would be helpful, since this would explain how R2 alleles are selected against and might be continuously removed from the population!

      Results (from line 424 to end of results): Before going into the modelling, the reader should be clearly informed about all the different approaches that are now to be compared. This is currently not done well, if at all! Thus moving current Fig 6 before current Fig. 5 might clearly help! Also a better explanation of the panels in Figure 6 is necessary as well as a correction of Fig6 Panel E! A comparison of a great number of the currently approached toxin-antidote (gene destruction - rescue, but not killer-rescue!) systems is greatly appreciated. However, the authors cannot expect the general reader to know about the small detailed differences between the systems that are compared here. Thus the authors need to provide some explanations and categorization of the different approaches here and also cite all the respective literature.

      -First subdivision: Non-homing (interference-based drives) VERSUS Homing (thus overreplication-based drives). This will also help them to better understand why the interference-based drives (TARE and ClvR) are more sensitive to fitness parameters than overreplication drives.

      -Second subdivision: same-site VERSUS distant site. This is important to understand the difference between the TARE modelled here and the ClvR. Actually ClvR is a TARE, but you use TARE here more specifically, as the results in the respective paper demonstrate only a same-site TARE! But this needs to be clearly stated here!

      -Third subdivision: viable VERSUS haplosufficient VERSUS haploinsufficient. This also needs to be clearly depicted in the labelling of panels C to F of Figure 6, where the essential differences are currently hard to grasp before looking at the panels in detail: C: HGD of viable gene (HGD) D: HGD of viable gene with rescue (HGD+R) E: HGD of haploinsufficient gene with rescue (HGD-hi+R). THIS PANEL NEEDS MAJOR CORRECTION!! F: HGD of haplosufficient (essential) gene with rescue (HomeR)

      -Fourth subdivision: split VERSUS non-split. Here, for the split HGD situation, the respective papers of which the current authors are co-authors should be cited: Kandul et al. 2020 (actually published at the end of 2019 and still cited as a bioRxiv article, 2019a!?) and Li et al. 2020, eLife. In addition, it is also important to state clearly that "split or two locus" is completely independent of the "distant site" concept! The reader needs to understand the differences between the systems that are compared here without having to go to the respective publications and work out what the differences really are. This is not so obvious, and the current authors have a clear chance here to do that and help the reader in the midst of all these similar but still distinct approaches.

      Figure 6 Panel E: This depiction is not consistent within itself, not consistent with the legend, and not consistent with the cited literature!

      -Why should the rescuing drive construct over the wt allele be lethal as indicated in the right two boxes?

      -The cited paper Champer et al. 2020b, which is by now also published in PNAS, clearly states that there is maternal carryover, which actually makes it so hard to use and means it is probably only working via male propagation. In the Figure legend, it is said that "maternal carryover and somatic expression ... are empirically unavoidable", which is in contrast with the depiction! The legend then also states that this is "unachievable". This would be better replaced by "hard to achieve", since the approach is published and seems to drive, even though probably just via the males! Thus the depiction of panel E needs to be thoroughly revised.

      Discussion (lines 499-500): The haplolethal HGD works (admittedly poorly) despite the maternal carryover (Champer et al 2020b). Therefore, your statement needs to be refined or deleted: "requires germline-specific promoter that lacks maternal carryover" is not consistent with the published paper! The drive could go via the males because then you do not have maternal carryover! And homing-based drives can go via males and do not necessarily have to be promoted through females, see also KaramiNejadRanjbar et al. 2018.

      Discussion (lines 540 to 543). This sentence is based on an old but clearly overruled idea! NHEJ repair is not restricted to a time before the fusion of the paternal and maternal genetic material. It has been clearly demonstrated that R1 and R2 alleles are also generated in the early embryo after the zygote stage (Champer et al 2017, KaramiNejadRanjbar et al. 2018). Actually, all of the authors' Figure 1C and Supplementary Figure 1 are about NHEJ mutation in the early embryo causing "BLM". Thus this sentence is inconsistent with the current understanding and also with the authors' own writing!

      Figure 4: Panel C graph: Why, in the controls, is the transgene consistently and significantly more frequent than expected in the next generation (0)? About 75% of the progeny seem to have been sired by the transgenic fathers rather than the wild-type fathers. Was there an age advantage of the transgenic ones, or some other fitness factor? This is surprising and no explanation is given at all! In contrast, in the Cas9 background in generation 0, less than 50% carry the drive allele, which is probably due to induced lethality. But this should also be commented upon. In the legend it is stated that 7 of 9 flies carried an R1 allele heterozygous to an R2 allele. What about the other two?