  1. Aug 2023
    1. Author Response:

      We thank eLife for carrying out the peer review of our preprint. In this letter, we will provide a response to the eLife assessment, and the editor’s public review, and will also address the major points raised in the peer-review of our study.

      First, we wish to inform the readers that, including this review, our manuscript has now been reviewed 5 times. These have included three reviews at an earlier journal, a review at eLife under the older model, and the current review at eLife under the new model. In an effort to provide transparency and increase the reader’s confidence in our study, all the prior reviews and our rebuttals to them have been uploaded to bioRxiv and are publicly available for all readers to peruse [1]. These reviews will show that we have responded comprehensively with additional data and analyses over the last 3 years. Of the current reviewers, Reviewer #1 (who was also Reviewer #1 at the earlier journal) has reviewed our manuscript all 5 times. At the prior journal, an additional Reviewer (#2) carried out 3 cycles of review, and we responded fully and comprehensively to all the issues and comments of that Reviewer. It is our understanding that the prior Reviewer #2 did not respond to the review request from eLife, after which eLife recruited two new Reviewers (current Reviewers #2 and #3), who have now reviewed our work twice: once under the older model and now again under the newer model.

      Next, to ease readability, we will respond to the review in three parts. Part A will be dedicated to the editors’ public review. Part B will be dedicated to the response to eLife assessment, and we will respond to the reviewers’ comments in Part C.

      Part A: Response to editor’s public review: We thank the editor for his nuanced and fair read of our data and our inferences, and of the multiple back-and-forth cycles of reviews and rebuttals. The editor’s public review highlights key points put forth in our data, and succinctly discusses the evidence provided for our claims. Here, we respond to each of these highlights.

      (i) The editor agrees that subject to the broader limits of lineage fate-mapping experiments, which are universal for every prior and current study of vertebrate development, we have provided sufficient evidence for the presence of a population of cells within the myenteric ganglia, which shows mesodermal and not neural crest derivation, and which expresses the pan-neuronal marker Hu among other neuronal and mesenchymal/mesodermal markers.

      Given that the current accepted annotation for enteric neurons depends on their expression of pan-neuronal markers (which we show are expressed by MENs), expression of neurotransmitter-encoding genes and proteins (such as CGRP, NOS1, ChAT, etc., which we show are expressed by MENs), and their localization within the enteric plexuses (we show evidence of intra-ganglionic localization of MENs in the myenteric plexus), our data suggest that ours is the first report describing the presence of a mesoderm-derived neuronal population in a significant neural tissue. By virtue of the continual expansion of the MENs population with maturation and aging, we show evidence that MENs contribute to the post-natal maturation and aging of the enteric nervous system (ENS), and that by reducing the proportions of MENs in aging tissue, we can rejuvenate the ENS to normalize gut function in aging mice.

      (ii) The editor comments on whether beyond the accepted norm of their intraganglionic localization and expression of pan-neuronal markers, MENs can be described as functional neurons. We agree that in our manuscript, we did not test how MENs function. This is expressly because the current report is the first step in the study of MENs and does not aim to understand how MENs regulate various gut functions. In this response however, we wish to put forth a few arguments that would clarify some of the existing evidence on the functional nature of MENs as well as the current state of knowledge on ENS functions. These would help the readers understand the current evidence on the functional nature of MENs, and in addition, why it would be premature to expect MENs to exhibit canonical neuronal behavior.

      a. MENs generate neurotransmitters and neuropeptides: Enteric neurons release various neurotransmitters, and their ability to generate important neurotransmitters such as nitric oxide (NO) and acetylcholine depends on their expression of enzymes Nitric Oxide Synthase 1 (NOS1) and choline acetyltransferase (ChAT). Our work shows that sub-populations of MENs express these important neurotransmitter-generating enzymes (Fig 3). Further, our data also shows that MENs express CGRP, which is an important neuropeptide for regulating various gut functions (Fig 3). These important data show that at the protein level, many MENs have the same cellular machinery as that of NENs that can help carry out regulation of important gut functions.

      b. MENs have been shown to be functional in a prior study: Recently, enteric neurons have been shown to carry out significant immunomodulatory functions. These have included the expression of cytokines such as IL-18, which regulates the intestinal barrier (as shown by Jarret et al. [2]), and CSF1, which regulates macrophage recruitment [3]. Jarret et al. show that enteric neuron-derived IL-18 regulates immunity at the mucosal barrier. We show that the IL-18-expressing enteric neurons are MENs (Fig 4), and thus, the data from Jarret et al. [2] provide evidence that MENs are indeed functional in the in vivo environment.

      c. We do not quite know how many enteric neurons work at the electrophysiological level: Canonical vertebrate neurons exhibit resting membrane potentials (RMP) in the range of -70 to -80 mV, and during neuronal activation, an increase in membrane potential beyond the threshold of -55 mV activates their action potential [4]. By contrast, past and recent studies have shown that the average RMP of rodent and human enteric neurons is significantly more positive than -70 mV (for human ENS: -48 ± 8 mV, for mouse ENS: -46 ± 6 mV for S neurons, -56 ± 5 mV for AH neurons) [5, 6]. These data suggest that enteric neurons show significant departures from canonical neuronal behaviors and thus, expecting MENs to adhere to canonical neuronal behavior – when most of the ENS does not adhere to expected norms - would be incorrect.

      d. A neuron is not defined by its ability to generate an action potential: Neuronal behavior does not require the presence of action potentials, as observed in the neurons in C. elegans [7], much in the same way that the presence of action potentials is not restricted to neurons as it occurs in non-neuronal cells, including in enteroendocrine cells of the mammalian gut [8]. Thus, the presence or absence of action potentials cannot be the basis for adjudicating whether or not a neurotransmitter-expressing cell in a neural tissue is a functional neuron.

      (iii) The Editor, after reading the extensive prior and recent correspondence between the authors and the reviewers on whether the cells analyzed in the transcriptomic experiments are the same as those observed in tissues (called tissue MENs by a reviewer), opined that he found “the authors' assertions that they have described a cluster of cells that express both neuronal and mesodermal genes, and that this cluster corresponds to the tissue MENs described in lineage tracing, to be broadly sound”.

      We are enthused by the Editor’s opinion, as we had previously argued that our data connecting the transcriptomic data to tissue MENs is robust on the basis of extensive immunohistochemical validations of marker genes found in our single cell transcriptomic analyses. The Editor notes some confusion on why some marker genes not specific to MENs were used for the analyses and further points to the prior rebuttals we have posted on bioRxiv [1], where detailed clarifications on the choice of marker genes have been made. In the interest of readability, we direct the readers to these prior rebuttals at bioRxiv for more details. Succinctly, we initially tested canonical neuronal genes by immunolabeling (such as NOS1, ChAT, CGRP, etc.) in NENs and MENs before performing single cell transcriptomic experiments. After performing the transcriptomic experiment, we next chose to validate neuronal and mesenchymal genes that were found expressed in the MENs cluster (such as DCN, SLPI, IL-18, NT-3, etc.). Finally, in previous cycles of review, on the reviewer’s insistence, we included data on the expression of a host of neuronal genes and their encoded proteins (including Vsnl1, Pde10a, etc.) to provide further evidence of neuronal identity of MENs.

      Without a significantly large cluster of NENs, it is impossible to know from our transcriptomic data whether a gene expressed by MENs would be similarly expressed by NENs. It is important to note, however, that lack of detection of a gene in the single cell experiments cannot be inferred as lack of its expression in those cells; hence, our inferences on whether any marker gene was exclusively expressed by neurons of a particular lineage were determined by immunohistochemistry. Additionally, we wish to reiterate and inform the readers that our study provides detailed analysis of prior work by May-Zhang et al. [9], where they have described a small cluster of Phox2b-expressing cells from the murine myenteric plexus that shows the expression of neuronal and mesenchymal markers. Our analyses show that the transcriptomic profile of MENs matches the molecular signature of these cells. In the longitudinal muscle – myenteric plexus layer, only glial cells and neurons express Phox2b [10], suggesting that the cells in this cluster sequenced by May-Zhang et al. are cells of the myenteric plexus. We provide evidence that the majority of the MENs were left unsequenced by May-Zhang et al. and that this minimized the representation of MENs in their data (Fig 5). These data together provide important confirmation of our argument that the transcriptomic MENs point to no other cell type but the tissue MENs.

      (iv) The Editor opines that a weakness in our current data is the significant overrepresentation of MENs in the single cell experiment, while also noting that our “explanation - that some cells are more sensitive to manipulations required to prepare cells for sequencing - is certainly well-represented in the literature and is therefore plausible….But it isn't fully satisfactory”. In our prior arguments (as well as in Part C), we have provided explanations based on prior observations that the issues of disproportionate representation of cell types are a technical limitation of the single cell transcriptomic methodology, which is prevalent in other experimental conditions for ENS (including the gut cell atlas study by Elmentaite et al. [11]), and for other cell types in various organs. Due to this limitation, proportions of cells in the single cell space should not be inferred as their proportions in tissues. We also agree with the Editor that owing to the low representation of NENs, our data does not allow for a detailed comparison of the similarities and differences between the neurons of the two lineages, and that “an ideal analysis would have more cells, deeper sequencing, and comprehensive validation of the identity of each cluster of cells.” While in this study our aim was to describe the existence of MENs and not to perform an in-depth characterization of their sub-populations, we agree that this is the logical next step in creating a better understanding of the true diversity of ENS neurons. To that end, we are currently developing methodologies that allow for a deeper and more comprehensive analysis and validation of the various MENs populations, and for studying how they differ from NENs. We aim to publish these data in our next study.

      (v) We agree with the Editor’s assessment on our transcriptomic data that “these data and analyses bolster the authors' claims, without conclusively establishing them. That is, these data should neither be dismissed nor, on their own, considered definitive.” We have only used our single cell transcriptomic data to provide additional support for our claims (which are based on extensive lineage fate mapping and immunohistochemical analyses) and are not using these as stand-alone definitive proof of a mesodermal origin. The data from the transcriptomic experiments were used to learn additional molecular markers, whose expression in MENs in tissue could be tested by immunohistochemistry. With this methodology, we provide data on the co-expression of neuronal and mesenchymal markers by MENs, and test by computational analyses whether a similar neuronal population exists in other murine and human transcriptomic datasets.

      In addition, we completely agree with the Editor that “at this stage in the history of single-cell analysis, the criteria for using single cell sequencing data to establish cell type and cell origin is are not well established, and that neither the presence nor absence of specific sets of genes in single cells should not, for both technical and biological reasons, be considered dispositive as to identity.” We are very mindful of this limitation of these analyses and hence have continually ensured that our study only uses transcriptomic data of postnatal MENs to define a preliminary molecular signature of MENs, and not to infer developmental origins of MENs.

      (vi) We thank the Editor for his summary and for highlighting that despite using multiple lines of evidence to support our hypothesis, the current reviewers are not yet convinced of the mesodermal origin of MENs. Our study utilizes well established tools for lineage fate-mapping (which are the only tools that currently are widely disseminated and accepted in the field of developmental biology) to show that MENs are not derived from the (Wnt1-cre, Pax3-cre -expressing) neural crest and instead are derived from the (Mesp1-cre, Tek-cre -expressing) mesoderm. The reviewers agree that by using multiple lines of evidence, we have established that our results of lineage fate-mapping are real and not due to any artifact. With this rationale, the reviewers would agree that MENs observed in tissue do not show evidence of derivation from neural crest while showing evidence of derivation from the mesoderm. Despite this, we cannot ascertain the scientific rationale for why despite agreeing with our lineage fate-mapping methods and analyses, the reviewers remain unconvinced as to the developmental origins of MENs. We do not know what other experiment would pass the reviewers’ muster to definitively annotate the mesodermal origins of MENs.

      We wish to highlight that a recent study in ctenophores, where the investigators show evidence of a syncytial neural net [12], shows that much of the dogmatic view of how neurons are supposed to work is being overturned and newer paradigms that support broader interpretations for the definitions of neurons and how they regulate functions are being established. Our work on the developmental origins of a large population of neurons of the ENS, which is regarded as a primordial and conserved neural tissue, should be viewed in a similar vein.

      Part B: Response to eLife assessment: Ours is the first report on the mesodermal derivation of a large population of neurons in a significant nervous system in mammals. We show that this population of neurons, called MENs, is molecularly distinct from the canonical neural crest-derived lineage of neurons, and that the post-natal ENS shows evidence of increasing presence of MENs in the maturing and aging ENS. We show that the two neuronal lineages are sensitive to their own growth factors, which can be used to manipulate their proportions in tissue, and thereby provide a potential rejuvenating therapy for age-associated intestinal dysmotility. We also show that on the basis of MENs’ marker expression, MENs may be present in the human ENS, and that disproportionate changes in their proportions are associated with chronic gut dysmotility disorders. Our work has profound implications in multiple fields, including those of enteric and peripheral neurobiology, developmental biology, medicine, and aging. We are thankful that the eLife assessment found that we provide sufficient evidence for this important work.

      Part C: Response to Reviewers: Here, we wish to note that all the comments of the reviewers have been sufficiently addressed in prior reviews. All prior reviews, and our extensive rebuttals are available at our preprint for the readers’ perusal [1]. In this response, we wish to succinctly address some comments that have continued to emerge in this round of peer-review.

      (i) We wish to highlight that Reviewers 1 and 2 agree that our lineage fate-mapping experiments are correct and that the results are not due to any artifact. Together with the additional reviewer in the prior reviews at an earlier journal, whose comments were addressed in full, we have a total of three reviewers who agree that our results on lineage fate-mapping are robust. Reviewer 3 comments on the possibility of ‘cre mosaicism’ or the deleterious issues with long-term expression of cre. Our prior rebuttals have dealt with this comment at length, but succinctly, our results are (a) based on extensive cre and floxed reporter controls for both the lineages, and (b) replicate observations made by other labs, including the Pachnis, the Heuckeroth, and the Southard-Smith labs, which provides confidence that these are not due to any artifacts in cre or reporter gene expression. Finally, cre in the two lineage fate mapping systems (Wnt1-cre and Mesp1-cre) is only developmentally expressed and thus, there is no reasonable possibility that our results would be impacted by long-term expression of cre. Thus, our results and inferences on lineage fate mapping, which is central to our annotation of the two distinct developmental lineages, correctly describe the developmental origin of MENs.

      (ii) By using extensive immunolabeling for (~21) markers that were learnt from our transcriptomic experiments, we provide evidence of the firm connection between the cluster of cells we annotated as MENs in the single cell transcriptomic experiments and the MENs we observe in tissues. Thus, we have performed more validation for these neurons than any other studies that have traditionally used 2 - 3 markers to validate a cell cluster in the ENS.

      In addition, by providing evidence of the expression of pan-neuronal marker Hu and other ENS markers that include NOS1, ChAT, CGRP, etc and ~40 neuronally significant genes, we have established the neuronal nature of MENs. With regards to annotation of MENs as neurons, we expected and understand the confusion in the field with our discovery of mesoderm-derived neurons that coexpress neuronal and mesenchymal markers. We wish to put forth the following arguments for the readers to consider.

      a. The annotation of Hu-expressing cells within the myenteric ganglia has been traditionally accepted as an enteric neuron. In those terms, by virtue of their intra-ganglionic presence and expression of Hu (and our data shows that Hu antibodies do not discriminate between the three neuronal isoforms of Hu) and other neuronal markers such as NOS1, ChAT, and CGRP, MENs should be annotated as neurons. We had addressed the semantic nature of this question in our last rebuttal (review #3, reviewer 1), which is available on the preprint [1].

      b. As the molecular data on MENs suggests that they have significantly different biology, it would not be unreasonable to expect that their neuronal behavior may be quite different. This is underscored by the fact that we observe many MENs to lack expression of the protein SNAP-25, whose presence is thought to be central to canonical neuronal behavior. We also cite evidence that neurons without SNAP-25 expression occur in the CNS as well. In light of these discoveries, gauging the biology and neuronal behavior of MENs is a significant undertaking as it cannot be assumed that the behavior of MENs will be similar to that of NENs.

      c. It is not logical to say that “Expressing one of the Hu proteins (Elavl2) probably isn't enough to call these "neurons" especially when neurons usually express Elavl3-4 (HuC/D)” especially when there are currently no antibodies to discriminate between the three neuronal gene products.

      d. While at the outset it may be an easy proposition to suggest that we provide evidence of neuronal activity in MENs by calcium flux or by electrophysiological means, it is important to know that calcium flux exists in all cells of the gut wall, including in smooth muscle cells, enteric glia, and neurons, and thus studying calcium flux will not provide definitive proof of neuronal behavior in MENs. Further, we reiterate from Part A of this response letter that “neuronal behavior does not require the presence of action potentials, as observed in the neurons in C. elegans [7], much in the same way that the presence of action potentials is not restricted to neurons as it occurs in non-neuronal cells, including in enteroendocrine cells of the mammalian gut [8]. Thus, the presence or absence of action potentials cannot be the basis for adjudicating whether or not a neurotransmitter-expressing cell in a neural tissue is a functional neuron.”

      (iii) Our identification and validation of the molecular identity of MENs using single cell transcriptomic experiments helps us establish the congruency of our cell cluster with a similar cluster of enteric neurons previously observed by the Southard-Smith lab in their analyses. Thus, similar to our observations on the lineage fate-mapping models, observations on our transcriptomic data are also in line with the observations made by other labs in the field.

      (iv) To address any remaining confusion in the minds of the reviewers and of the readers about the correct methodology for interpreting single cell transcriptomic data and the limitations of this technique, we wish to reiterate that:

      a. Single cell or nucleus RNA sequencing methods are biased towards sequencing transcripts that are abundant relative to all other transcripts for that individual cell (detection and amplification bias). Thus, while the same transcript may be equally expressed at an absolute level in two different cells, it will be more readily sequenced and detected in the cell where the transcript is relatively more abundant.

      b. Correct interpretation of single cell/nucleus transcriptomic data relies on an understanding that not all transcripts of a cell can be sequenced and detected, and thus a lack of detected transcripts in a cell does not imply absent gene expression. Together, this shows the fallacy of an argument often put forth by the reviewers that a lack of detection of a gene transcript (e.g., Phox2b) in MENs in a scRNAseq experiment should be inferred as a lack of expression of this transcript, even though we provide evidence of the expression of PHOX2B protein in MENs, and of the expression of this transcript in the MENs in the data from the Southard-Smith lab.

      c. scRNAseq is not a technique where annotation of a previously unknown cluster should be biased by the detection of expression of one or two genes, and instead establishing identity or conferring novel annotation of that cluster is defined by co-expression of several genes which must be validated in tissue.

      d. It is well known that enzyme-based dissociation methods are unequally tolerated by diverse cell types, which is known to cause over- or underrepresentation of several cell types in scRNAseq (Uniken Venema et al.[13], who showed that dissociation method drives detection and abundance of cells sequenced; Wu et al.[14], showed the existence of similar dissociation bias in the kidney; Tiklova et al.[15] showed that specific subpopulations of Dat-expressing neurons in the developing mammalian brain were underrepresented in scRNAseq). The Gut Cell Atlas study (Elmentaite et al.[16]) was not able to detect NENs in the adult intestinal tissue. The lack of detectable canonical enteric neurons (NENs) in the adult tissue in their study should not be viewed as an absence of NENs in those tissues, and with the same logic, a restricted abundance of NENs and a larger abundance of MENs in our dataset cannot and should not be viewed as a reliable indicator of their actual proportions in tissues. The aim of our study is not to provide a comprehensive molecular atlas for all cells that reside in the LM-MP tissue layer, but to use the information in this atlas to identify a cell cluster that best describes MENs, and then use additional tools to validate this information.

      e. Without extensive validation by immunohistochemical or other means, detection of transcripts of a particular gene ‘Z’ (which is known to be expressed in cell type ‘X’) in a particular cell cluster ‘A’ of a single cell transcriptomic dataset does not directly imply that cell cluster ‘A’ points to cell type ‘X’. Thus, the detection of transcripts of the gene Wt1 (which is known to be expressed in mesothelial cells) in MENs does not in itself mean that the MENs cluster comprises mesothelial cells. It simply suggests that in addition to its expression in mesothelial cells, the Wt1 gene is also expressed by MENs, an inference which is supported by data that show the expression of LacZ in myenteric ganglia cells in the WT1-cre transgenic mouse (Wilm et al. 2005 [17]).
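
      The detection argument in points (a) and (b) above can be illustrated with a toy sampling model. This is only a sketch under stated assumptions, not the analysis used in the study: it assumes that a fixed number of reads is drawn per cell, independently and in proportion to each transcript's relative abundance, and the function name and all numbers are hypothetical. Under those assumptions, a gene present at the same absolute copy number in two cells is detected far less often in the cell where it is relatively rarer, and a zero count remains likely even though the gene is expressed.

```python
def detection_probability(gene_copies: int, total_transcripts: int, reads_sampled: int) -> float:
    """Probability that at least one sampled read comes from the gene.

    Toy assumption: each of `reads_sampled` reads is drawn independently,
    hitting the gene with probability gene_copies / total_transcripts.
    """
    if total_transcripts <= 0:
        raise ValueError("total_transcripts must be positive")
    if not 0 <= gene_copies <= total_transcripts:
        raise ValueError("gene_copies must lie between 0 and total_transcripts")
    relative_abundance = gene_copies / total_transcripts
    # P(no read hits the gene) = (1 - relative_abundance) ** reads_sampled
    return 1.0 - (1.0 - relative_abundance) ** reads_sampled


# Hypothetical numbers: the same 5 copies of a gene in two cells that differ
# only in their total transcript content, sequenced at the same depth.
GENE_COPIES = 5
READS_SAMPLED = 1_000

p_small_cell = detection_probability(GENE_COPIES, 10_000, READS_SAMPLED)
p_large_cell = detection_probability(GENE_COPIES, 200_000, READS_SAMPLED)

print(f"small cell: P(detected) = {p_small_cell:.3f}")
print(f"large cell: P(detected) = {p_large_cell:.3f}")
print(f"large cell: P(zero count despite expression) = {1 - p_large_cell:.3f}")
```

      Real protocols add further sources of loss (capture efficiency, amplification), so this sketch, if anything, understates how often an expressed transcript goes undetected.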

      (v) Our study has performed two scRNAseq studies, first to establish the distinct molecular signature of MENs, and second to provide transcriptomic evidence of MENs-genesis. In the last and current review, Reviewer 2 opines that we should perform an additional single cell RNA sequencing experiment just to show that the MENs cluster is represented in the mesoderm-enriched transcriptomic data. There is no doubt that owing to the expression of various mesodermal markers that we show are expressed by MENs (both transcriptomically in scRNAseq and at the level of proteins in tissues), the cluster of MENs is mesodermal in origin. Thus, we have already provided evidence and met a higher burden of proof on the mesodermal identity of MENs, and we do not consider the costly scRNAseq experiment proposed by the reviewer a definitive experiment that would justify the time or the cost.

      (vi) Our prior rebuttals have provided the reviewers with evidence that shows that our study has used standard bioinformatic pipelines to analyze our data, and our inferences of the transcriptomic data are sound and well validated by additional methods.

      (vii) Many comments of the reviewers that required textual edits were already carried out after the prior review at eLife. While a revised version of our manuscript was submitted to eLife for the current review, it is unfortunate that the reviewers have not updated many of their comments. For the sake of brevity, we will not be responding further to the comments that we have already addressed at length in prior rebuttals or in the form of textual edits.

      References

      1. Kulkarni, S., et al., Age-associated changes in lineage composition of the enteric nervous system regulate gut health and disease. bioRxiv, 2022: p. 2020.08.25.262832.
      2. Jarret, A., et al., Enteric Nervous System-Derived IL-18 Orchestrates Mucosal Barrier Immunity. Cell, 2020. 180(1): p. 50-63 e12.
      3. Muller, P.A., et al., Crosstalk between muscularis macrophages and enteric neurons regulates gastrointestinal motility. Cell, 2014. 158(2): p. 300-13.
      4. Chrysafides, S.M., S.J. Bordes, and S. Sharma, Physiology, Resting Potential, in StatPearls. 2023: Treasure Island (FL).
      5. Yew, W.P., et al., Electrophysiological and morphological features of myenteric neurons of human colon revealed by intracellular recording and dye fills. Neurogastroenterol Motil, 2023. 35(4): p. e14538.
      6. Furukawa, K., G.S. Taylor, and R.A. Bywater, An intracellular study of myenteric neurons in the mouse colon. J Neurophysiol, 1986. 55(6): p. 1395-406.
      7. Liu, Q., G. Hollopeter, and E.M. Jorgensen, Graded synaptic transmission at the Caenorhabditis elegans neuromuscular junction. Proc Natl Acad Sci U S A, 2009. 106(26): p. 10823-8.
      8. Gribble, F.M. and F. Reimann, Enteroendocrine Cells: Chemosensors in the Intestinal Epithelium. Annu Rev Physiol, 2016. 78: p. 277-99.
      9. May-Zhang, A.A., et al., Combinatorial Transcriptional Profiling of Mouse and Human Enteric Neurons Identifies Shared and Disparate Subtypes In Situ. Gastroenterology, 2021. 160(3): p. 755-770 e26.
      10. Corpening, J.C., et al., A Histone2BCerulean BAC transgene identifies differential expression of Phox2b in migrating enteric neural crest derivatives and enteric glia. Dev Dyn, 2008. 237(4): p. 1119-32.
      11. Elmentaite, R., et al., Cells of the human intestinal tract mapped across space and time. Nature, 2021. 597(7875): p. 250-255.
      12. Burkhardt, P., et al., Syncytial nerve net in a ctenophore adds insights on the evolution of nervous systems. Science, 2023. 380(6642): p. 293-297.
      13. Uniken Venema, W.T.C., et al., Gut mucosa dissociation protocols influence cell type proportions and single-cell gene expression levels. Sci Rep, 2022. 12(1): p. 9897.
      14. Wu, H., et al., Comparative Analysis and Refinement of Human PSC-Derived Kidney Organoid Differentiation with Single-Cell Transcriptomics. Cell Stem Cell, 2018. 23(6): p. 869-881 e8.
      15. Tiklova, K., et al., Single-cell RNA sequencing reveals midbrain dopamine neuron diversity emerging during mouse brain development. Nat Commun, 2019. 10(1): p. 581.
      16. Elmentaite, R., et al., Single-Cell Sequencing of Developing Human Gut Reveals Transcriptional Links to Childhood Crohn's Disease. Dev Cell, 2020. 55(6): p. 771-783 e5.
      17. Wilm, B., et al., The serosal mesothelium is a major source of smooth muscle cells of the gut vasculature. Development, 2005. 132(23): p. 5317-28.
    2. Reviewer #1 (Public Review):

      The manuscript by Kulkarni et al proposes a new cellular origin of the ENS, which is increased with age and therefore may be associated with the gradual decline of gut function. The study is based on an initial observation that many enteric neurons do not seem to retain tdTomato expression in Wnt1Cre-R26-Tom mice, suggesting a loss of neurons that are replaced by a non-neural crest source. Further detection of reporter expression within the ENS of Tek and Mesp Cre-lines indicated a mesodermal origin of the new enteric neurons. Mesodermally derived neurons (MENs) were associated with Met, while neural crest derived neurons (NENs) expressed Ret. GDNF could decrease occurrence of MENs (defined as tdTomato-negative cells), while HGF had the opposite effect. Age-associated decline in gut transit was alleviated with GDNF treatment, while Ret heterozygote mutants had an increase of MENs. Overall, the study suggests that neural crest derived neurons are replaced by mesodermal-derived neurons that lead to an overall reduction in GI physiology and that manipulation of the balance between the two types of neurons could have beneficial effects on age-associated gut malfunction. Generation of neurons from non-ectodermal sources would be a paradigm shift not only in the ENS, but in the Neuroscience field as a whole. The presence of mesenchymal marker genes in subsets of cells of the ENS in native gut tissue is convincing and the lack of retained fluorescent reporter expression in ENS from the many neural and Cre drivers used is indeed clear.

      The current state of the manuscript is though not conceivable as it has unsound interpretation of data at many places; most importantly, there is no firm connection between the MENs identified in tissue and the scRNA cluster annotated as MENs. "scRNA-seq-MENs" show very little expression of the bona fide neuron markers used to detect "tissue-MENs", including Elavl4, and the overall proportion of "scRNA-seq-MENs" in the tissue is very far from that of "tissue-MENs". Hence, the claim that "tissue-MENs" equal "scRNA-seq-MENs" could be excluded or its interpretation discussed in an unbiased manner. Marker expression of "scRNA-seq-MENs" is suggestive of mesothelial cell identities, not ENS cells. Even the annotation of scRNA-seq profiles denoted as neural-crest derived enteric neurons (NENs) is highly questionable as 25% of the cells display bona fide lymphatic epithelial cell markers and no neuronal markers.

    3. Reviewer #2 (Public Review):

      In this study, the authors propose the possibility that some neurons in the enteric nervous system (ENS) originate postnatally from a non-ectodermal source. This possibility is investigated using a combination of transgenic lines, single cell RNA-sequencing (scRNA-seq), and immunofluorescence. Initially the authors identify a subset of neurons within myenteric enteric ganglia that are not lineage-labeled by canonical neural-crest derived cre-LoxP strategies. In their analysis, the group seeks to show that these neurons have an origin distinct from neural crest-derived progenitors that are known to initially colonize the developing gut. The team uses multiple cre lines (both Wnt1-cre and Pax3-cre) as well as several distinct reporter lines (ROSA-tdTomato, ROSA-EGFP, Hprt-tdTomato) to demonstrate that the lack of labeling by neural crest cre transgenes is consistent across several tools and not due to any transgene or reporter line artifact. Based on prior analysis that suggests some neurons in the ENS might be arising from a mesodermal lineage, the authors evaluate the possibility that mesoderm could contribute neurons to the ENS by evaluating expression of Tek-cre and Mesp1-cre tagged cell types in myenteric ganglia. The work with transgenic lines is convincing that some ENS neurons originate from an alternative source in the postnatal intestine and that this population increases in aging mice.

      The authors apply single cell RNA-sequencing to identify additional markers of these non-neural crest enteric neurons. They rely on dissociation of laminar gut muscle preparations, stripped from the outside of the adult intestine, that contain many cell types including smooth muscle, vasculature, and enteric ganglia. In the analysis of this scRNA-seq data, the authors focus on a cluster of cells in the resulting UMAP plots as being the MENs cluster based on labeling of this cluster with three genes (Calcb (CGRP), Met, and Cdh3). Based on expression of these marker genes there are a very large number of MENs and very few neural crest-derived enteric neurons (NENs) seen in the UMAPs. It is not clear why this difference in cell numbers has occurred. The early lineage tracing data shown with cre transgenes (Figures 1 and 2) shows relatively equal numbers of NENs and MENs in confocal imaging studies, yet in the RNA-seq UMAPs thousands of MENs are displayed while very few NENs are present. There is the possibility that the authors have identified a cell cluster as MENs that does not coincide with the Mesp1-cre or Tek-cre lineage labeled neurons observed within enteric ganglia of the laminar gut muscle preparations. The authors state that they have "used the single cell transcriptomics to both confirm the presence of MENs and identify more MEN-specific markers", however there is not a direct relationship made in this study between the MENs imaged and the cells profiled by single cell RNA-sequencing.

      In their analysis the authors note a difference in the percentage of enteric neurons labeled by the neural crest lineage tracer line, Wnt1-cre, relative to the total neurons labeled by the pan-neuronal marker HuC/D with age of the mice studied. They undertake a temporal analysis of the percentage of Wnt1-cre labeled neurons over total HuC/D neurons over the lifespan and note a decrease of Wnt1-cre labeled neurons with age. Further, the team assessed levels of growth factors that are known to promote proliferation and survival of NENs (GDNF-Ret signaling) versus factors known to promote growth of mesoderm (HGF) with age and document a decrease in GDNF-Ret signaling while HGF levels increase with age. The authors propose that the balance between these two signaling pathways is responsible for the shift in proportions of NENs versus MENs in aging animals.

      Some of the conclusions of this paper are supported, but several additional analyses are needed to reach the outcomes that the authors infer:

      1) Because the scRNA-seq data generated in this study derives from mixed cell populations present in laminar gut muscle preparations, there is a gap between the image data shown for the mesodermal cre lineage tracing and the MENs clusters the authors have selected in their single cell RNA-seq analysis. The absence of direct transcriptional profiling of cells labeled by Mesp1-cre or Tek-cre expression prevents the authors from definitively connecting their in situ lineage labeling with specific clusters in the single cell RNA-seq analysis.

      2) Differential gene expression is the standard approach for identifying markers of a particular cluster and yet this is lacking in this study, and the rationale for why some genes were prioritized as markers of MENs is missing from the manuscript. Reanalysis of the authors' posted single cell RNA-seq data found that genes integral to calling MENs (marker genes) were detectable in the data. Met, Cdh3, Calcb, Elavl2, Hand2, Pde10a, Vsnl1, Tubb2b, Stmn2, Stx3, and Gpr88 were all expressed in very few cells and at low levels. Given this, how were these genes chosen to be marker genes for MENs, especially given the low sequencing depth utilized?

      3) The authors rely on Phox2b as a marker for all ENS cells, including MENs. However, reprocessing of the authors' posted single cell RNA-seq data finds that Phox2b is not detected in any of the cells in the MENs cluster and is expressed in only very few cells of the neuroglia cluster. This discrepancy between the data the authors have generated and what is widely known about Phox2b expression in the ENS field must be explained, as the absence of Phox2b message suggests there is an issue with reliance on low-depth scRNA-seq data for reaching the stated conclusions.

      4) The authors have not considered potential similarities between their MENs and other developing ENS lineages, like enteric mesothelial fibroblasts reported by Zeisel et al. 2018, and further analysis is needed to show that MENs are indeed a distinct cell type. Top marker genes of the authors' MENs clusters were expressed more often in the clusters that were left out of Morarach et al. 2021's E15.5 and E18.5 datasets because those clusters were mostly Phox2b-negative on UMAPs. This lack of Phox2b expression matches the Phox2b-negative status of the MENs clusters in the authors' single cell dataset. It is important to note that the Morarach dataset consists of Wnt1-cre lineage labeled (originating from neural crest) flow sorted cells. This is of import as it implies that Phox2b-negative cells ARE present within the Wnt1-cre lineage labeled population, an aspect that is relevant to this study's data analysis.

      5) Upon reprocessing of the authors' MENs-genesis dataset with integration by sample as the authors describe, Met expression is evident within the cluster of NENs on the resulting UMAP plot and yet the authors rely on this gene as a marker of MENs. Whether Met expression is restricted to MENs should be resolved because the authors state it is exclusive to MENs and they subsequently investigate this gene across lifespan. Because it is not clear that Met is absent from neural crest derived enteric neurons, this caveat complicates the interpretations of the present study.

      6) The authors apply MHCst immunofluorescence to mark MENs, but do not show any RNA expression for the MHCst transcripts in their single cell data. How did the authors come to the conclusion that MHCst IHC would be an appropriate marker for MENs? This rationale is missing from the text.
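
      To make the concern in point 2 concrete, the following is a minimal sketch of the two summary quantities a marker-gene claim is usually checked against: the fraction of cells in a cluster that detect a gene, and the fold change in mean expression relative to all other cells. It is an illustration only, not the reviewers' or the authors' pipeline. The function `marker_stats`, the gene names `GeneA` and `GeneB`, the cluster labels, and the toy counts are all hypothetical, and no statistical test is performed.

```python
import math


def marker_stats(counts, labels, cluster, pseudocount=1.0):
    """Summarise how well each gene marks one cluster.

    counts: dict mapping gene name -> list of per-cell UMI counts
    labels: list of cluster labels, one per cell (same order as the counts)
    cluster: the label of the cluster of interest
    """
    inside_idx = [i for i, lab in enumerate(labels) if lab == cluster]
    outside_idx = [i for i, lab in enumerate(labels) if lab != cluster]
    if not inside_idx or not outside_idx:
        raise ValueError("need at least one cell inside and outside the cluster")

    stats = {}
    for gene, row in counts.items():
        inside = [row[i] for i in inside_idx]
        outside = [row[i] for i in outside_idx]
        mean_in = sum(inside) / len(inside)
        mean_out = sum(outside) / len(outside)
        stats[gene] = {
            # fraction of cells with at least one UMI for this gene
            "pct_in": sum(1 for c in inside if c > 0) / len(inside),
            "pct_out": sum(1 for c in outside if c > 0) / len(outside),
            # pseudocount avoids division by zero for undetected genes
            "log2fc": math.log2((mean_in + pseudocount) / (mean_out + pseudocount)),
        }
    return stats


# Hypothetical toy data: six cells, three in cluster "X" and three in "Y".
toy_labels = ["X", "X", "X", "Y", "Y", "Y"]
toy_counts = {
    "GeneA": [5, 4, 6, 0, 0, 1],  # detected in every "X" cell
    "GeneB": [0, 1, 0, 0, 0, 0],  # detected in a single cell at low level
}
stats = marker_stats(toy_counts, toy_labels, "X")
```

      In this toy case `GeneA` is detected in all cells of cluster "X" with a clear fold change, whereas `GeneB` is detected in one cell only; a gene resembling `GeneB` would be the kind of weakly supported marker the reviewer questions.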

    4. Reviewer #3 (Public Review):

      In this manuscript, the authors challenge the fundamental concept that all neurons are derived from ectoderm. Specifically, they aim to show that while the early ENS arises embryologically from neural crest (NENs), with age it is slowly replaced by mesoderm-derived neurons (MENs). This claim is based on an array of transgenic reporter mice, immunofluorescence, and transcriptomics. They further propose that the transition from NENs to MENs is regulated by a changing balance in GDNF-RET versus HGF-MET signaling, respectively.

      This is a provocative and potentially paradigm-changing proposal, but the data presented and the interpretation of that data fall short of establishing it.

      1) MENs share more common characteristics with fibroblasts. The authors interpret this as representing neurons with fibroblast characteristics. Why not fibroblasts with neuronal characteristics? The ability to express neurotransmitter receptors and calcium channels is common in fibroblasts, but that isn't sufficient to characterize a neuron. For example, many cell types express neurotransmitters (CGRP in ILCs, Penk in fibroblasts). Expressing one of the Hu proteins (Elavl2) probably isn't enough to call these "neurons," especially when neurons usually express Elavl3-4 (HuC/D). Including calcium imaging and showing presence of action potentials would strengthen the argument that these are in fact neurons.

      2) The scRNA-seq is unconvincing. There are several technical issues and the analysis omits important information required to make an unbiased assessment.

      a. One issue in the interpretation is that MENs are shown by IHC to constitute half the neuronal population, with NENs making up the other half. The authors state that they performed an unbiased approach, sequencing all cells in the muscularis. If it were truly unbiased, then why do they detect a 28-fold increase in MENs in the single cell data? This does not reflect the IHC findings and points to an issue in technique that needs to be addressed.

      b. Cell populations annotated by the author are confusing. The "unknown" population expresses many genes that are epithelial markers. This is puzzling because the authors state that they only sequenced the muscularis. This leads to questions regarding the initial samples and whether they were dissected appropriately or contaminated by another population.

      c. The authors report a population of ICCs at P21 which is not identified at 6-months. Closer inspection of their data shows bona fide ICC markers, Ano1 and Kit, in their SMC cluster at 6-months, with failure to identify ICC clusters, raising questions about whether they have identified a new cell type.

      d. While the authors critically examine other scRNA-seq datasets and claim that those groups mislabeled their populations, the above does not instill confidence in their ability to counter the unified literature.

      3) MENs are identified based on genes that could be related to neurons, including calcium channels, neurotransmitter receptors, etc. It is worth noting that mesenchymal cells, ICCs, and smooth muscle also possess these characteristics. Therefore, it is hard to justify why these MENs are considered "neurons." The authors should perform an analysis to examine homology between clusters in order to show which clusters the MENs are more similar to, neurons or otherwise.

      4) Several issues raise questions about the quality of the scRNA-seq data, making interpretations very difficult:

      a. MENs are identified to have higher UMI counts than other cells, which the authors interpret as the cells being bigger than others. If this is the case, why is this only observed in the P21 dataset and not at 6 months? Notably, high UMIs are also a sign of doublet contamination.

      b. Authors include data from RBCs. As they do not have a nucleus, RNA abundance is low as expected. However, markers for RBCs include smooth muscle specific markers, MYH11 (an MEN marker) and Acta2. The presence of these markers can indicate high levels of "ambient RNA" which enters droplets from other cells lysed during digestion. Interestingly, MENs appear to cluster close to RBCs.

      c. In light of the above possible evidence of doublet contamination and high levels of ambient RNA, the markers of MENs need to be reconsidered. MENs are stated to express markers that were previously (up until this manuscript) accepted markers of intestinal mesothelium (Upk3b, Krt19, WT1), smooth muscle cells (Myh11), and fibroblasts (Dcn, C3, Col6a1), raising the possibility that MENs are an erroneous cluster containing RNA from all these cell types.

      5) The MEN population appears to be the largest cell population in the gut, which is unprecedented. The authors compare their scRNA-seq data to several other studies that have not made similar observations. Such analysis of other datasets is used to inform on the new data being generated. In the current manuscript, however, this takes the reverse approach and the authors analyze other data based on the assumption that they all mislabeled the MEN population.

      a. In their assessment of Drokhlyansky et al., the authors claim that their mesothelium annotation is wrong despite expressing known mesothelial markers. This includes the gene Upk3b which is a bona fide mesothelial marker in the gut but is also expressed by "MENs." They proceed to analyze the Elmentaite et al. dataset and state that their "transitional fibroblast" population are actually MENs. That paper also has a population of Upk3b+ mesothelial cells and it is unclear why those are not actually MENs like in the Drokhlyansky et al. study.

      b. The authors often refer to the study of May-Zhang et al. and their cluster annotated as "mesenchymal neurons" in the gut. It should be known that the original authors never made this claim. Rather, they acknowledge that the clusters in their study with poor correlation to neuronal profiles exhibit strong predictions for mesenchymal and vascular/immune cell types. They state: "We considered the possibility that these clusters might be non-neuronal." If these are "mesenchymal neurons" then the same logic would indicate that there are vascular neurons and immune cell neurons, and therefore this does not make a very compelling case.

      6) A weakness of this study is that a lot of the data relies on reporter gene expression. The authors need to acknowledge several weaknesses of this approach. First, Wnt1-tdT recombination may be incomplete or one can have "Cre mosaicism" and therefore the lack of tdT is not sufficient evidence to say that those neurons are not neural crest-derived. Second, one can have off-target or leaky Cre expression, leading to low-level tdT expression, as seen in many of the images in this study. Third, Cre can exhibit toxicity and this may be more problematic in older mice given the long-term continuous expression of Cre (He et al, Am J Pathology, 2014;184:1660; Loonstra et al, PNAS, 2001;98:9209; Forni et al, J Neurosci, 2006;26:9593; Rehmani et al, Molecules, 2019;24:1189; Gillet et al, Sci Rep, 2019;9:19422; Stifter and Greter, Eur J Immunol, 2020;50:338).

    1. eLife assessment

      Yamamoto and Matano provide solid evidence that a G63E/R CD8+ T-cell escape mutation in the accessory viral protein Nef may facilitate the induction of neutralizing antibody (nAb) responses in rhesus macaques infected with SIVmac239. Functional analyses support that this mutation specifically impairs Nef's ability to stimulate PI3K/Akt/mTORC2 signalling. This important study suggests that the accessory viral Nef protein impairs B cell function and effective humoral immune responses and is of interest for researchers and physicians interested in HIV/AIDS and vaccine development.

    2. Reviewer #1 (Public Review):

      This work describes the induction of SIV-specific NAb responses in rhesus macaques infected with SIVmac239, a neutralization-resistant virus. Typically, host NAb responses are not detected in animals infected with SIVmac239. In this work, seventy SIVmac239-infected macaques were retrospectively screened for NAb responses and a subset of nine animals were identified as NAb-inducers. The viral genomes from 7/9 animals that induced NAb responses were found to encode a nonsynonymous mutation in the Nef gene (amino acid G63E). In contrast, the Nef G63E mutation was found in only 2/19 NAb non-inducers, implying that the Nef G63E mutation is selected in NAb inducers. Measurement of Nef G63E frequencies in plasma viruses suggested that Nef G63E selection preceded NAb induction. The Nef G63E mutation was found to mediate escape from Nef-specific CD8+ T-cell responses. To examine the functional phenotype of the Nef G63E mutant, its effect on downmodulation of Nef-interacting host proteins was examined. Infection of rhesus and cynomolgus macaque CD4+ T cell lines with WT or Nef G63E mutant SIV suggested that the Nef mutant reduces S473 phosphorylation of AKT. Using a flow cytometry-based proximity ligation assay, it was shown that the Nef G63E mutation reduced binding of Nef to PI3K p85/p110 and to the mTORC2 GβL/mLST8 and MTOR components, the kinase complexes responsible for AKT-S473 phosphorylation. In vitro B-cell Nef invasion and in vivo imaging/flow cytometry-based assays were employed to suggest that Nef from infected cells can target Env-specific B cells. Lastly, it was determined that NAb inducers have significantly higher Env-specific B-cell responses after Nef G63E selection when compared to NAb non-inducers. Finally, a corollary was drawn between the Nef G63E-associated B-cell/NAb induction phenotype and activated PI3K delta syndrome (APDS), which is caused by activating GOF mutations in PI3K, to suggest that Nef G63E-mediated induction of NAb response is reciprocal to APDS.

      Strengths: This study aims to understand the viral-host interaction that governs NAb induction in SIVmac239-infected macaques - this could enable identification of determinants important for induction of NAb responses against hard-to-neutralize tier-2/3 HIV variants. The finding that SIV-specific B-cell responses are induced following Nef G63E CD8+ T-cell escape mutant selection argues for an evolutionary trade-off between CTL escape and NAb induction. Exploitation of such a cellular-humoral immune axis could be important for HIV/AIDS vaccine efforts.

      Although more validation and mechanistic basis are needed, the corollary between PI3K hyperactive signaling during autoimmune disorders and Nef-mediated abrogated PI3K signaling could help identify novel targets and modalities for targeting immune disorders and viral infections.

      Weaknesses: Although the paper does have strengths in principle, its weakness is that the mechanistic basis of Nef-mediated induction of NAb responses is not directly examined. For example, it remains unclear whether SIVmac239 with an engineered G63E mutation in Nef would induce faster and more potent NAb responses. A macaque challenge study is needed to address this point.

      As presented, the central premise of the paper involves infected cell-generated Nef (WT or G63E mutant) being targeted to adjacent Env-specific B cells. However, it remains unclear how this transfer takes place. Direct evidence demonstrating CD4+ T cell-associated and/or cell-free Nef being transferred to B cells is needed to address this concern.

      The interaction between Nef and PI3K signaling components (p85, p110, GβL/mLST8, and MTOR) has been explored using a PLA assay; however, this requires validation using additional biochemical and/or immunoprecipitation-based approaches. For example, is Nef (WT or mutant form) sufficient to affect PI3K-induced phosphorylation of Akt in an in vitro kinase assay? Moreover, details regarding the binding events of WT vs mutant Nef with PI3K signaling components are lacking in this study. Lastly, it is unclear whether the interaction of Nef with PI3K signaling components is a conserved function of all primate lentiviruses or an SIV-specific phenotype.

      It has been previously reported that the region of Nef encoding glycine at position 63 is not conserved in HIV-1 (Schindler et al, Journal of Virology 2004). Thus, does HIV-1 Nef also function in induction of NAb responses in humans? Or is the observed phenotype specific to SIV?

    1. eLife assessment

      This manuscript describes useful data on the mechanisms underlying the activation of the receptor tyrosine kinase FGFR1 and stimulation of intracellular signaling pathways in response to FGF4, FGF8, or FGF9 binding to the extracellular domain of FGFR1. Solid quantitative binding experiments are presented to demonstrate that FGF4, FGF8, and FGF9 exhibit distinct binding affinities towards FGFRs.

    2. Joint Public Review

      In this manuscript, Karl et al. explore mechanisms underlying the activation of the receptor tyrosine kinase FGFR1 and stimulation of intracellular signaling pathways in response to FGF4, FGF8, or FGF9 binding to the extracellular domain of FGFR1. Quantitative binding experiments presented in the manuscript demonstrate that FGF4, FGF8, and FGF9 exhibit distinct binding affinities towards FGFRs. It is also proposed that FGF8 exhibits "biased ligand" characteristics that are manifested via binding and activation of FGFR1 mediated by "structural differences in the FGF-FGFR1 dimers, which impact the interactions of the FGFR1 trans membrane helices, leading to differential recruitment and activation of the downstream signaling adapter FRS2".

      In the absence of any structural experimental data on the different forms of FGFR dimers stimulated by FGF ligands, the model presented in the manuscript is speculative and misleading.

    1. eLife assessment

      This study of extrachromosomal DNA (ecDNA) aims to identify genes that distinguish ecDNA+ and ecDNA- tumors. This timely study is important in addressing the genes responding to the amplification of the ecDNA. The data presented are for the most part solid, but there were concerns regarding the clarity in the description of the analysis methods and whether the evidence for specific genes required to maintain the ecDNA+ state was entirely conclusive.

    2. Reviewer #2 (Public Review):

      In their manuscript entitled "Transcriptional immune suppression and upregulation of double stranded DNA damage and repair repertoires in ecDNA-containing tumors" Lin et al. describe an important study on the transcriptional programs associated with the presence of extrachromosomal DNA in a cohort of 870 cancers of different origin. The authors find that compared to cancers lacking such amplifications, ecDNA+ cancers express higher levels of DNA damage repair-associated genes, but lower levels of immune-related gene programs.

      This work is very timely and its findings have the potential to be very impactful, as the transcriptional context differences between ecDNA+ and ecDNA- cancers are currently largely unknown. The observation that immune programs are downregulated in ecDNA+ cancers may initiate new preclinical and translational studies that impact the way ecDNA+ cancers are treated in the future. Thus, this study has important theoretical implications that have the potential to substantially advance our understanding of ecDNA+ cancers.

      Strengths: The authors provide compelling evidence for their conclusions based on large patient datasets. The methods they used and analyses are rigorous.

      Weaknesses: The biological interpretation of the data remains observational. The direct implication of these genes in ecDNA(+) tumors is not tested experimentally.

    1. eLife assessment:

      This study introduces a valuable paradigm in the field of adipose tissue biology: blocking triglyceride storage in adipose tissue does not lead to lipodystrophy and impaired glucose homeostasis but instead improves metabolic health. The evidence supporting these claims is convincing, based on a comprehensive metabolic analysis, although mechanistic studies would strengthen the study and its impact. This study will be of high interest to those in the adipose tissue biology and metabolism fields.

    2. Reviewer #1 (Public Review):

      The present study examined the physiological mechanisms through which impaired TG storage capacity in adipose tissues affects systemic energy homeostasis in mice. To accomplish this, the authors deleted DGAT1 and DGAT2, crucial enzymes for TG synthesis, in an adipocyte-specific manner. The authors found that ADGAT DKO mice lost a substantial amount of adipose tissue and developed hypothermia when fasted; however, surprisingly, ADGAT DKO mice were metabolically healthy on a high-fat diet. The authors found that this was accompanied by elevated energy expenditure, enhanced glucose uptake by the BAT, and enhanced browning of white adipose tissues. This unique animal model provided exciting opportunities to identify new mechanisms to maintain systemic energy homeostasis even with a compromised energy storage capacity. Overall, the data are compelling and support the conclusions of the paper. The manuscript is also clearly written.

    3. Reviewer #2 (Public Review):

      Here, Chitraju et al. have studied the phenotype of mice with an adipocyte-specific deletion of the diacylglycerol acyltransferases DGAT1 and DGAT2, the two enzymes catalyzing the last step in triglyceride biosynthesis. These mice display reduced WAT TG stores but, contrary to the authors' expectations, the TG loss in WAT is not complete and the mice are resistant to a high-fat diet intervention and display a metabolically healthier profile compared to control littermates. The mechanisms underlying this are not entirely clear, but the double knockout (DKO) animals have increased EE and a lower RQ, suggesting that enhanced FA oxidation and WAT "browning" may be involved. Moreover, both adiponectin and leptin are expressed in WAT and are detectable in circulation. The authors propose that "the capacity to store energy in adipocytes is somehow sensed and triggers thermogenesis in adipose tissue. This phenotype likely requires an intact adipocyte endocrine system...." Overall, I find this to be an interesting notion.

    4. Reviewer #3 (Public Review):

      In this study, the authors sought to test the hypothesis that blocking triglyceride storage in adipose tissue by knockout of DGAT1 and DGAT2 in adipocytes would lead to ectopic lipid deposition, lipodystrophy, and impaired glucose homeostasis. Surprisingly, the authors found the opposite result, with DGAT1/2 DKO in adipocytes leading to increased energy expenditure, minimal ectopic lipid deposition, and improved glucose homeostasis with HFD feeding. These metabolic improvements were largely attributed to increased beiging of the white fat and increased brown adipose tissue activity. This study provides an interesting new paradigm whereby impairing fat storage, the major function of adipose tissue, does not lead to severe metabolic disease, but rather improves metabolic health. The authors provide a comprehensive assessment of the metabolism of these DKO mice under chow and HFD conditions, which supports their claims. The study lacks mechanistic insight, which would strengthen it, but this does not detract from the authors' major conclusions.

    1. eLife assessment

      This study provides an important, original framework to study locomotion on the ground with physics-based simulations. Through numerical simulations, the authors propose that intermediate numbers of body modules and high body symmetry enhance speed. The evidence that evolution may favour bilateral symmetry and modularity for efficient directed locomotion is still incomplete, however.

    2. Reviewer #1 (Public Review):

      The manuscript presents a framework for studying biomechanical principles and their links to morphology and provides interesting insights into a particular question regarding terrestrial locomotion and speed. The goal of the paper is to derive general principles of directed terrestrial locomotion, speed, and symmetry.

      Major strengths: The manuscript is a unique and creative work that explores performance spaces of a complicated question through computational modeling. Overall, the paper is well written and well crafted and was a pleasure to read.

      The methods presented here (variable agents used to represent ultra-simplified body configurations that are not inherently constrained) are interesting and there's significant potential in them for a properly constrained question. For the data that is present here their hypotheses (while they can be anticipated from first principles) are very well validated and serve as a robust validation of these expectations and can help.

      Of particular interest was the discussion of the transferability of morphologies designed under one system and moving to another. From a deep-time perspective, of particular interest is the transition from subaqueous to terrestrial locomotion, which we know was a major transition in the history of life on Earth. The results of this study show that the best-suited morphologies for subaqueous movement are ill-suited (from a locomotor speed standpoint at least) to fully terrestrial locomotion, which raises the question of whether there is a suite of forms that have balanced performance in both and how that would differ from aquatic morphologies.

      Major weaknesses:

      1. There is a major disagreement between the target and parameters.

      From a biomechanics perspective the target of this study, Directed Locomotion, is a fairly broad behavioral mode. However, what the authors are ultimately evaluating their model organisms on is a single performance parameter (speed, or distance traveled after 30s). Statements such as "bilateral symmetry showed to be a law-like pattern in animal evolution for efficient directed locomotion purposes" (p 12 line 365-366) are problematic for this reason.

      Attaining the highest possible speed is a relevant but limited subset of ways one might interpret performance for directed locomotion. Efficiency, power generation, and limb loading/strain are equally relevant components.

      The focus on speed coupled with selection for only the highest performing morphologies, rather than setting a minimum performance threshold, fundamentally restricts the dynamics of the system in a way that is not representative of their specified target and pulls the simulations toward a specific, anticipatable, result.

      Locomotor efficiency is alluded to later in the manuscript as one of the observed outcomes, but speed is not equivalent to locomotor efficiency (in much the same way that it is not the sole metric for describing performance with respect to directed locomotion). Energy/work/power have not been accounted for in the manuscript so this is not a parameter this study weighs in on.

      The data and analyses the authors present do show an interesting validation of these methods in assessing first-order questions relating the shape of a single performance surface to a theoretical morphology, which has significant potential value.

      2. There is a significant population and/or sample size and biasing.

      Thirty simulations of a population of 101 morphologies seems small for a study of this kind, particularly one looking to investigate such a broad question at an abstract level, and particularly when the top 50% of morphologies are chosen to mutate. It would be very easy for artificial biases to rapidly propagate through this system depending on the parameters bounding the formation of the initial generation.

      This strong selection choosing the best 50 morphologies and mutating them enforces an aggressive effect that simulates an even more potent phylogenetic inertia than one might anticipate for an actual evolutionary history (it's no surprise then that all of the simulations were able to successfully retrieve a suite of morphotypes that recovered the performance peak for this system within 1500 generations).
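
      The selection scheme discussed here (keep the best 50 of 101 and mutate them) can be sketched as a toy truncation-selection loop. This is a minimal illustration under stated assumptions, not the authors' simulation: each "morphology" is reduced to a single number whose value doubles as its fitness, and the function name `evolve`, the mutation scale `mutation_sd`, and the number of generations are all hypothetical choices.

```python
import random


def evolve(pop_size=101, generations=50, keep_fraction=0.5, mutation_sd=0.05, seed=0):
    """Toy truncation selection on a one-dimensional stand-in for morphology."""
    rng = random.Random(seed)
    # Each individual is a single number; its fitness is the number itself
    # (a stand-in for speed, not the voxel model used in the study).
    population = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    best_history, spread_history = [], []
    for _ in range(generations):
        population.sort(reverse=True)
        n_keep = int(pop_size * keep_fraction)  # 50 of 101 with the defaults
        survivors = population[:n_keep]
        # Refill the population with mutated copies of randomly chosen survivors.
        children = [
            rng.choice(survivors) + rng.gauss(0.0, mutation_sd)
            for _ in range(pop_size - n_keep)
        ]
        population = survivors + children
        best_history.append(max(population))
        spread_history.append(max(population) - min(population))
    return population, best_history, spread_history


if __name__ == "__main__":
    _, best, spread = evolve()
    print(f"best fitness: first generation {best[0]:.3f}, last generation {best[-1]:.3f}")
    print(f"population spread: first generation {spread[0]:.3f}, last generation {spread[-1]:.3f}")
```

      Because the survivors are carried over unchanged, the best fitness can never decrease, and the spread history gives a rough view of how quickly diversity narrows under this kind of selection. Varying `seed` or the initial distribution is one way to probe how strongly the starting generation shapes the outcome.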

      Similarly, why was a 4^3 voxel limit chosen? One can imagine that an increase in this voxel limit would allow for the development of more extreme geometries, which might be successful. Computational resource constraints are likely involved; it would be useful for the authors to add context here.

    3. Reviewer #2 (Public Review):

      Acknowledging practical difficulties in teasing out the principles behind animal locomotion from the body's functions and survival needs, the authors embark on a computational experiment to replay the "tape of life." Specifically, the chief objective of the study was to explore the necessity of symmetry and modularity for better-directed locomotion on the ground.

      Towards this important goal, the authors put forward a comprehensive computational study using physics-based simulations of 3D voxel-based robots. Such a simulation environment allows one to capture salient dynamics behind locomotion, including interactions with the environment. The authors undertake simulations for three different gravitational environments, water, Earth, and Mars. The work has several methodological strengths, with respect to the ingenuity of the approach and the elegance of the analysis; I was particularly intrigued by the use of graph theory in the context of modularity. Results point to a complex, rich role of modularity and symmetry in locomotion, modulated by the gravitational environment.

      The association between "locomotion ability" and average speed is, in my view, tenuous, whereby locomotion is a complex phenomenon that should be assessed across a range of intertwined dynamic metrics that include, for example, stability with respect to external perturbations and energy efficiency. I also am not fully convinced of i) the adequacy of the spatial resolution, whereby I failed to see a compelling argument regarding the completeness of 64 voxels; ii) the realism of the oscillatory patterns, whereby all the voxels are set to oscillate at the same, constant, frequency of 2Hz; and iii) the accuracy of simulations in water where added mass effects seem to be neglected. Overall, dynamics and control aspects could be improved in both the methods and the interpretation of the results. Finally, I believe that a stronger connection between the hypotheses of the study and the literature (in animal or robot locomotion) would help frame the narrative better. I would be particularly curious to see some tie with human bipedal locomotion.

      The work bears important implications in the study of locomotion, shedding light on the role of modularity and symmetry, beyond what one could gather from mere observations. Not only do I expect these new insights to stimulate further research in the area of locomotion, but also I envision other communities embracing a similar computational approach to address related questions in life sciences and robotics.

    1. eLife assessment

      This is a valuable addition to the literature as it helps us understand the role of tRNA modifying enzymes in Mycobacterium tuberculosis. By knocking out one of the enzymes, the authors convincingly demonstrate the importance of tRNA-modifying enzymes for intra-host growth of tubercle bacteria. Some of the claims regarding modification as well as the role in virulence could be strengthened through further bioinformatics and phylogenetic analyses as well as experimental approaches. The work will be of interest to microbiologists.

    2. Reviewer #1 (Public Review):

      Tomasi et al. performed a combination of bioinformatic analyses and next-generation tRNA sequencing experiments to predict the set of tRNA modifications and their corresponding genes in the tRNAs of the pathogenic bacterium Mycobacterium tuberculosis. Long known to be important for translation accuracy and efficiency, tRNA modifications are now emerging as having regulatory roles. However, the basic knowledge of the position and nature of the modifications present in a given organism is very sparse beyond a handful of model organisms. Studies that can generate the tRNA modification maps in different organisms along the tree of life are good starting points for further studies. The focus here on a major human pathogen that is studied by a large community raises the general interest of the study. Finally, deletion of the gene mnmA responsible for the insertion of s2U at position 34 revealed defects in growth in macrophages but not in test tubes, suggesting regulatory roles that will warrant further studies. The conclusions of the paper are mostly supported by the data, but the partial nature of the bioinformatic analysis and the absence of mass spectrometry data make it incomplete. The authors do not take advantage of the mass spectrometry data published for Mycobacterium bovis (PMID: 27834374) to discuss what they find.

    3. Reviewer #2 (Public Review):

      In this study, Tomasi et al identify a series of tRNA modifying enzymes from Mtb, show their function in the relevant tRNA modifications and by using at least one deleted strain for MnmA, they show the relevance of tRNA modification in intra-host survival and postulate their potential role in pathogenesis.

      Conceptually it is a wonderful study, given that tRNA modifications are so fundamental to all life forms, showing their role in Mtb growth in the host is significant. However, the authors have not thoroughly analyzed the phenotype. The growth defect aspect or impact on pathogenesis needs to be adequately addressed.

      - The authors show that ΔmnmA grows equally well in the in vitro cultures as the WT. However, they show attenuated growth in the macrophages. Is it because Glu1_TTC and Gln1-TTG tRNAs are not the preferred tRNAs for incorporation of Glu and Gln, respectively? And for some reason, they get preferred over the alternate tRNAs during infection? What dictates this selectivity?

      - As such the growth defect shown in macrophages would be more convincing if the authors also show the phenotype of complementation with WT mnmA.

      An important consideration here is the universal nature of these modifications across the life forms. Any strategy to utilize these enzymes as the potential therapeutic candidate would have to factor in this important aspect.

    1. eLife assessment

      In this important study, authors have integrated genetic and genomic datasets from humans and mice to unveil shared networks and pathways associated with coronary artery disease. Their compelling analysis has led to the identification of novel regulatory genes and pathways in vascular tissues and in the liver, allowing for a more in-depth understanding of CAD pathogenesis.

    2. Reviewer #1 (Public Review):

      This manuscript represents an elegant bioinformatics approach to addressing causal pathways in vascular and liver tissue related to atherosclerosis/coronary artery disease, including those shared by humans and mice and those that are specific to only one of these species. The authors constructed co-expression networks using bulk transcriptome data from human (aorta, coronary) and mouse (aorta) vascular and liver tissue. They mapped human CAD GWAS data onto these modules, mapped GWAS SNPs to putatively causal genes, identified pathways and modules enriched in CAD GWAS hits, assessed those shared between vascular and liver tissues and between humans and mice, determined key driver genes in CAD-associated supersets, and used mouse single-cell transcriptome data to infer the roles of specific vascular and liver cell types. The overall approach used by the authors is rigorous and provides new insights into potentially causal pathways in vascular tissue and liver involved in atherosclerosis/CAD that are shared between humans and mice as well as those that are species-specific. This approach could be applied to a variety of other common complex conditions.

      The conclusions are largely supported by the analyses. Some specific comments:

      1. It appears that GWAS SNPs were mapped to genes solely through the use of eQTLs. Current methods involve a number of other complementary approaches to map GWAS SNPs to effector genes/transcripts and there is the thought that eQTLs may not necessarily be the best way to map causal genes.
      2. Given the critical causal role of circulating apoB lipoproteins in atherosclerosis in both mice and humans and the central role of the liver in regulating their levels, perhaps the authors could use the 'metabolism of lipids and lipoproteins' network in Fig 3B as a kind of 'positive control' to illustrate the overlap between mice and humans and the driver genes for this network.
      3. Is it possible to infer the directionality of effect of key driver genes and pathways from these analyses? For example, ACADM was found to be a KD gene for a human-specific liver CAD superset pathway involving BCAA degradation. Are the authors able to determine or predict the effect of genetically increased expression of ACADM on BCAA metabolism and on CAD? Or the directionality of the effect of the hepatic KD gene OIT3 on hepatic and plasma lipids and atherosclerosis?
      4. While likely beyond the scope of this manuscript, the substantial amount of publicly available plasma proteomic and metabolomic data could be incorporated into this multiomic bioinformatic analysis. Many of the pathways involve secreted proteins or metabolites that would further inform the biology and the understanding of how these pathways relate to atherosclerosis.

      The findings here will motivate the community of atherosclerosis investigators to pursue additional potential causal genes and pathways through computational and experimental approaches. It will also influence the approach around the use of the mouse model to test specific pathways and therapeutic approaches in atherosclerosis. In addition, the computational approach is robust and could (and likely will) be applied to a variety of other common complex conditions.

    3. Reviewer #2 (Public Review):

      Summary:
      Mouse models are widely used to determine key molecular mechanisms of atherosclerosis, the underlying pathology that leads to coronary artery disease. The authors use various systems biology approaches, namely co-expression and Bayesian Network analysis, as well as key driver analysis, to identify co-regulated genes and pathways involved in human and mouse atherosclerosis in artery and liver tissues. They identify species-specific and tissue-specific pathways enriched for the genetic association signals obtained in genome-wide association studies of human and mouse cohorts.

      Strengths:
      The manuscript is well executed with appropriate analysis methods. It also provides a compelling series of results regarding mouse and human atherosclerosis.

      Weaknesses:
      The manuscript has several weaknesses that should be acknowledged in the discussion. First, there are numerous models of mouse atherosclerosis; however, the HMDP atherosclerosis study uses only one model of mouse atherosclerosis, namely hyperlipidemic mice, due to the transgenic expression of human apolipoprotein E-Leiden (APOE-Leiden) and human cholesteryl ester transfer protein (CETP). Therefore, when drawing general conclusions about mouse pathways not being identified in humans, caution is warranted. Other models of mouse atherosclerosis may be able to capture different aspects of human atherosclerosis. Second, the mouse aorta tissue is atherosclerotic, whereas the atherosclerosis status of the GTEx aorta tissues is not known. Therefore, it is possible that some of the human or mouse-specific gene modules/pathways may be due to the difference in the disease status of the tissues from which the gene expression is obtained. Third, it is unclear how the sex of the mice (all female in the HMDP atherosclerosis study and all male in the baseline HMDP study) and the sex of the human donors affected the results. Did the authors regress out the influence of sex on gene expression in the human data before performing the co-expression and preservation studies? If not, this should be acknowledged. Fourth, some of the results are unexpected, and these should be discussed. For example, the authors identify that the leukocyte transendothelial migration pathway and PDGF signaling pathway are human-specific in their vascular tissue analysis. These pathways have been extensively described in mouse studies. Why do the authors think they identified these pathways as human-specific? Similarly, the authors identified gluconeogenesis and branched-chain amino acid catabolism as human and mouse-shared modules in the vascular tissue. Is there evidence of the involvement of these pathways in atherosclerosis in vascular cells?

      Overall, acknowledging these drawbacks and adding points to the discussion will strengthen the manuscript.

    1. eLife assessment

      The manuscript describes a valuable theoretical calculation focusing on the structural changes in the photosynthetic reaction center postulated by others based on time-resolved crystallography using X-ray free-electron laser (XFEL) (Dods et al., Nature, 2021). The authors provide solid arguments that calculated changes in redox potential Em and deformations using the XFEL structures may reflect experimental errors rather than real structural changes.

    2. Reviewer #1 (Public Review):

      First, I agree with the authors of this manuscript that conformational changes in the XFEL structures at 2.8 Å resolution are not reliable enough to demonstrate the subtle changes in the electron transfer events in this bacterial photosynthesis system. In fact, the data statistics in the paper by Dods et al. show that the high-resolution range of some of the XFEL datasets may include fairly high noise (low CC1/2 and high Rsplit), so the comparison of the subtle conformational changes of the structures is problematic.

      The manuscript by Gai Nishikawa investigated time-dependent changes in the energetics of the electron transfer pathway by calculating the redox potentials of the active and inactive branches in the XFEL structures published by Dods et al. (2021), and found no clear link between the time-dependent structural changes and the electron transfer events. This study provided validation for the interpretation of the structures of those electron-transferring proteins.

      The paper was well prepared.

    3. Reviewer #2 (Public Review):

      The manuscript by Nishikawa et al. addresses time-dependent changes in the electron transfer energetics in the photosynthetic reaction center from Blastochloris viridis, whose time-dependent structural changes upon light illumination were recently demonstrated by time-resolved serial femtosecond crystallography (SFX) using X-ray free-electron laser (XFEL) (Dods et al., Nature, 2021). Based on the redox potential Em values of bacteriopheophytin in the electron transfer active branch (BL), obtained by solving the linear Poisson-Boltzmann equation, the authors found that Em(HL) values in the charge-separated 5-ps structure obtained by XFEL are not clearly changed, suggesting that the P+HL- state is not stabilized owing to protein reorganization. Furthermore, chlorin ring deformation upon HL- formation, which was expected from their QM/MM calculation, is not recognized in the 5-ps XFEL structure. The authors then concluded that the structural changes in the XFEL structures are not related to the actual time course of charge separation. They argued that their calculated changes in Em and chlorin ring deformations using the XFEL structures may reflect experimental errors rather than real structural changes; they attributed this problem to the fact that the XFEL structures were obtained at relatively low resolutions (mostly 2.8 Å). I consider that their systematic calculations may suggest a useful theoretical interpretation of the XFEL study. However, the present manuscript as a whole takes a negative stance, insisting that experimental errors may prevent the XFEL structures from revealing the actual structural changes relevant to the electron transfer events.

    1. eLife assessment

      This study presents valuable findings on Legionella pneumophila effector proteins that target host vesicle trafficking GTPases during infection and more specifically modulate ubiquitination of the host GTPase Rab10. The evidence supporting the claims of the authors is solid, although it remains unclear how modification of the GTPase Rab10 with ubiquitin supports Legionella virulence and the impact of ubiquitination during LCV formation. The work will be of interest to colleagues studying animal pathogens as well as cell biologists in general.

    2. Reviewer #1 (Public Review):

      In this manuscript, Kubori and colleagues characterized the manipulation of the host cell GTPase Rab10 by several Legionella effector proteins, specifically members of the SidE and SidC family. They show that Rab10 undergoes both conventional ubiquitination and noncanonical phosphoribose-ubiquitination, and that this posttranslational modification contributes to the retention of Rab10 around Legionella vacuoles.

      Strengths
      Legionella is an emerging pathogen of increasing importance, and dissecting its virulence mechanisms allows us to better prevent and treat infections with this organism. How Legionella and related pathogens exploit the function of host cell vesicle transport GTPases of the Rab family is a topic of great interest to the microbial pathogenesis field. This manuscript investigates the molecular processes underlying Rab10 GTPase manipulation by several Legionella effector proteins, most notably members of the SidE and SidC families. The finding that MavC conjugates ubiquitin to SdcB to regulate its function is novel, and sheds further light on the complex network of ubiquitin-related effectors from Lp. The manuscript is well written, and the experiments were performed carefully and examined meticulously.

      Weaknesses
      Unfortunately, in its current form this manuscript offers only little additional insight into the role of effector-mediated ubiquitination during Lp infection beyond what has already been published. The enzymatic activities of the SidC and SidE family members were already known prior to this study, as was the importance of Rab10 for optimal Lp virulence. Likewise, it had previously been shown that SidE and SidC family members ubiquitinate various host Rab GTPases, like Rab33 and Rab1. The main contribution of this study is to show that Rab10 is also a substrate of the SidE and SidC family of effectors. What remains unclear is if Rab10 is indeed the main biological target of SdcB (not just 'a' target), and how exactly Rab10 modification with ubiquitin benefits Lp infection.

    3. Reviewer #2 (Public Review):

      This manuscript explores the interplay between Legionella Dot/Icm effectors that modulate ubiquitination of the host GTPase Rab10. Rab10 undergoes phosphoribosyl-ubiquitination (PR-Ub) by the SidE family of effectors which is required for its recruitment to the Legionella containing vacuole (LCV). Through a series of elegant experiments using effector gene knockouts, co-transfection studies and careful biochemistry, Kubori et al further demonstrate that:

      1. The SidC family member SdcB contributes to the polyubiquitination (poly-Ub) of Rab10 and its retention at the LCV membrane.
      2. The transglutaminase effector MavC acts as an inhibitor of SdcB by crosslinking ubiquitin at Gln41 to lysine residues in SdcB.

      Some further comments and questions are provided below.

      1. From the data in Figure 1, it appears that the PR-Ub of Rab10 precedes and in fact is a prerequisite for poly-Ub of Rab10. The authors imply this but do not state it explicitly; is this the case?
      2. The complex interplay of Legionella effectors and their meta-effectors targeting a single host protein (as shown previously for Rab1) suggests the timing and duration of Rab10 activity on the LCV is tightly regulated. How does the association of Rab10 with the LCV early during infection and then its loss from the LCV at later time points impact LCV biogenesis or stability? This could be clearer in the manuscript and the summary figure does not illustrate this aspect.
      3. How do the activities of the SidE and SidC effectors influence the amount of active Rab10 on the LCV (not just its localisation and ubiquitination)?
      4. What is the fate of PR-Ub and then poly-Ub Rab10? How does poly-Ub of Rab10 result in its persistence at the LCV membrane rather than its degradation by the proteasome?
      5. Mutation of Lys518, the amino acid in SdcB identified by mass spec as modified by MavC, did not abrogate SdcB Ub-crosslinking, which leaves open the question of how MavC does inhibit SdcB. Is there any evidence of MavC-mediated modification to the active site of SdcB?
      6. I found it difficult to understand the role of the ubiquitin glycine residues and the transglutaminase activity of MavC on the inhibition of SdcB function. Would structural modelling, for example using AlphaFold, help to explain this?
      7. Are the Lys mutants of SdcB still active in poly-Ub of Rab10?

    1. eLife assessment

      This important study outlines a new role for caspases during cellular differentiation. The methodology used is convincing and state-of-the-art. The newly discovered cellular cascade described here uncovers that caspases can achieve high substrate specificity during differentiation. As such, the work will be of broad interest to cell biologists.

    2. Reviewer #1 (Public Review):

      This manuscript describes the transient proteolysis of several Nups during myogenesis due to activation of caspase 3, and how this "trimming" leads to defects in nuclear export. The authors show the NPC-related course of events during cellular differentiation and suggest mechanistic insights into exactly why this limited proteolysis is needed for myogenesis. In addition, the authors introduce a novel concept for caspase cellular function that might be worth investigating in the future. Overall, the authors present an elegant and interesting piece of work, performed at the usual superb quality of this group, and indeed the figures throughout the manuscript clearly show a very high level of experimental expertise.

    3. Reviewer #2 (Public Review):

      Cho and Hetzer provide evidence that nuclear pore complexes (NPCs) are "trimmed" by caspases as a key element of muscle (and other) differentiation programs. Overall, the data are of high quality and are well presented. There is an interesting mechanism demonstrated whereby nuclear and cytosolically-oriented nups are specifically degraded from the NPC (fragments are sometimes associated with the NPCs), which leads to a specific inhibition of nuclear export. A highlight is a quantitative proteomic analysis of nuclear fractions that nicely demonstrates the change in the nuclear proteome upon NPC trimming, which includes elevated levels of many NES-containing factors. An important control is that these nuclear proteome changes don't occur when caspases are inhibited. These data are valuable although they fall short in demonstrating that NPC trimming is actually required for the execution of the differentiation program. It is recognized, however, that editing several nup genes at several sites to prevent caspase recognition would be extremely challenging and unfeasible, thus ultimately this does not detract from the significance of the findings. Indeed, there is a new broadly impactful concept being introduced - that caspases need not be destructive but they can be productively utilized to contribute to cell fate decisions.

    1. eLife assessment

      In this valuable study, the authors seek to characterize the role of splicing factor SRSF1 during spermatogenesis with a conditional knockout for Srsf1 in male germ cells. The spermatogenesis phenotype is convincingly supported, but two central claims of the study, that SRSF1 is required for spermatogonial homing and self-renewal and that this function is mediated by regulation of splicing of the gene Tial1, are inadequately supported. Support is inadequate because homing and self-renewal phenotypes were not explicitly tested, and functional data were not provided to support a role for alternative splicing of Tial1 in the fertility phenotype. The work will be of interest to reproduction and stem cell biologists.

    2. Reviewer #1 (Public Review):

      The authors revealed that spermatogonia-related genes (e.g., Plzf, Id4, Setdb1, Stra8, Tial1/Tiar, Bcas2, Ddx5, Srsf10, Uhrf1, and Bud31) were bound by SRSF1 in the mouse testes by crosslinking immunoprecipitation and sequencing (CLIP-seq). Using a Vasa-Cre mouse line, the authors demonstrated that SRSF1 in the testis is essential for homing and self-renewal in mouse spermatogonial stem cells (SSCs). Further evidence showed that SRSF1 directly binds and regulates the expression of Tial1/Tiar via alternative splicing (AS) to implement SSC homing and self-renewal. Immunoprecipitation mass spectrometry (IP-MS) data showed that the AS of SSCs is regulated by SRSF1 in coordination with other RNA splicing-related proteins (e.g., SRSF10, SART1, RBM15, SRRM2, SF3B6, and SF3A2). The authors revealed the critical role of SRSF1-mediated AS in SSC homing and self-renewal, which may provide a framework to elucidate the molecular mechanisms of the posttranscriptional network underlying the formation of SSC pools and the establishment of niches. The experiments are well designed and conducted, and the overall conclusions are convincing. This work will be of interest to stem cell and reproductive biologists.

    3. Reviewer #3 (Public Review):

      In this study, Sun et al examine the role of the splicing factor SRSF1 in spermatogenesis in mice. Alternative splicing is important for spermatogenic development, but its regulation and major developmental roles during spermatogenesis are not well understood. The authors set out to better define both SRSF1 function in testes and the contribution of alternative splicing. They collect several large 'omics datasets to define SRSF1 targets in testis, including RNA interactions by CLIP-seq in whole testis, protein interactions by IP-mass spec in whole testis, and RNA sequencing to detect expression levels and splice variants. They also examine the phenotype of germline conditional knockouts (cKO) for Srsf1, using the early-acting Vasa-Cre, and find a severe depletion of germ cells starting at 7 days post partum (dpp) and culminating with a lack of germ cells (Sertoli Cell Only Syndrome) by adulthood. They detect differences in gene expression as well as differences in splicing between control and knockout, including 9 genes that are downregulated, experience alternative splicing, and whose transcripts are also bound by SRSF1, and identify the Tial1/Tiar transcript as one of these targets. They conclude that SRSF1 is required for homing and self-renewal of spermatogonial stem cells, at least in part through its regulation of Tial1/Tiar splicing.

      Strengths of the paper include detailed phenotyping of the Srsf1 cKO, which convincingly supports the Sertoli Cell Only phenotype, establishes the timing of the first appearance of the spermatogonial defect, and provides new insight into the role of splicing factors and SRSF1 specifically in spermatogenesis. Another strength is the generation of CLIP-seq, IP-MS, and RNA-seq datasets which will be a useful resource for the field of germ cell development. Major weaknesses include a lack of robust support for two major claims: first, there is inadequate support for the claim of defects in either "homing" or "self-renewal" of spermatogonia in the cKO, and second, there is inadequate support for the claim that altered splicing of the Tial1 transcript mediates the effect of SRSF1 loss. A moderate weakness is the superficial discussion of the CLIP, RNA-seq, and IP-MS datasets, limiting their otherwise high utility for other researchers. Overall, the paper as it stands will have a moderate impact on the field of male reproductive biology. Specific points that should be addressed to improve support for the claims are below.

      Major comments

      1) In Fig 1D, it appears that SRSF1 is expressed most strongly in spermatogonia by immunofluorescence, but this is inconsistent with the sharp rise in expression detected by RT-qPCR at 20 days post partum (dpp) (Fig. 1B), which is when round spermatids are first added; this discrepancy should be explained or addressed.

      2) It is important to provide a more comprehensive basic description of the CLIP-seq datasets beyond what is shown in the tracks shown in Fig. 2B. This would allow a better assessment of the data quality and would also provide information about the transcriptome-wide patterns of SRSF1 binding. No information or quality metrics are provided about the libraries, and it is not stated how replicates are handled to maximize the robustness of the analysis. The distribution of peaks across exons, introns, and other genomic elements should also be shown.

      3) The claim that SRSF1 is required for "homing and self-renewal" of SSCs is made in multiple places in the manuscript. However, neither homing nor self-renewal is ever directly tested. A single image is shown in Fig. 5E of a spermatogonium at 5dpp that does not appropriately sit on the basal membrane, potentially indicating a homing defect, but this is not quantified or followed up. There is good evidence for depletion of spermatogonia starting at 7 dpp, but no further explanation of how homing and/or self-renewal fit into the phenotype.

      4) In Fig. 6A (lines 258-260) very few genes downregulated in the cKO are bound by SRSF1 and undergo abnormal splicing. The small handful that falls into this overlap could simply be noise. A much larger fraction of differentially spliced genes are CLIP-seq targets (~33%), which is potentially interesting, but this set of genes is not explored.

      5) The background gene set for Gene Ontology analyses is not specified. If these were done with the whole transcriptome as background, one would expect enrichment of spermatogenesis genes simply because they are expressed in testes. The more appropriate set of genes to use as background in these analyses is the total set of genes that are expressed in testis.

      6) In general, the model is over-claimed: aside from interactions by IP-MS, little is demonstrated in this study about how SRSF1 affects alternative splicing in spermatogenesis, or how alternative splicing of TIAL1 specifically would result in the phenotype shown. It is not clear why Tial1/Tiar is selected as a candidate mediator of SRSF1 function from among the nine genes that are downregulated in the cKO, are bound by SRSF1, and undergo abnormal splicing. Although TIAL1 levels are reduced in cKO testes by Western blot (Fig. 7J), this could just be due to a depletion of germ cells from whole testis. The reported splicing difference for Tial1 seems very subtle and the ratio of isoforms does not look different in the Western blot image.
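
      The point in comment 5 about the Gene Ontology background set can be illustrated with a one-sided hypergeometric test. The sketch below is a generic illustration, not the authors' pipeline: the function name `hypergeom_enrichment_p` is hypothetical, and every gene count in the example is made up purely to show how the same hit list can look more enriched against a whole-genome background than against a testis-expressed background.

```python
from math import comb


def hypergeom_enrichment_p(background_size, term_in_background, hits, term_hits):
    """P(X >= term_hits) when `hits` genes are drawn without replacement
    from a background in which `term_in_background` genes carry the term."""
    upper = min(term_in_background, hits)
    total = comb(background_size, hits)
    tail = sum(
        comb(term_in_background, i) * comb(background_size - term_in_background, hits - i)
        for i in range(term_hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # All numbers below are illustrative assumptions, not values from the study.
    hits, term_hits = 200, 15  # hit list size and hits annotated to the term
    p_genome = hypergeom_enrichment_p(20000, 500, hits, term_hits)  # whole genome
    p_testis = hypergeom_enrichment_p(12000, 450, hits, term_hits)  # testis-expressed genes
    print(f"whole-genome background:     p = {p_genome:.3g}")
    print(f"testis-expressed background: p = {p_testis:.3g}")
```

      With these assumed counts, the term is expected in about 5 of 200 hits under the whole-genome background but about 7.5 under the testis-expressed background, so the same 15 observed hits yield a smaller p-value against the whole genome.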

    1. eLife assessment

      This important study presents findings with broad implications for the use of AlphaFold2 models in ligand binding pose modeling, a common task in protein structure modeling. The computational experiments and analyses provide compelling results for the GPCR protein family data, but the conclusions are likely to apply also to other proteins and they will therefore be of interest to biophysicists, physical chemists, structural biologists, and anyone interested or involved in structure-based ligand discovery.

    2. Reviewer #1 (Public Review):

      The authors assess the accuracy of AlphaFold2 (AF2) structures for small molecule ligand pose prediction versus the accuracy with traditional template-based homology models and experimental crystal structures (with a different ligand). They take a careful, rigorous approach leveraging the wealth of structural data around the GPCR protein family and using state-of-the-art docking methods. They find that binding sites are significantly more accurately modeled by AF2 compared to traditional template-based approaches, but this does not translate to greater accuracy in small-molecule docking pose prediction. The important findings around this conclusion have broad implications for the use of AF2 models in ligand binding pose prediction for proteins and drug design.

      Strengths:<br /> The strength of the work is the rigor and careful, thoughtful comparison that cleverly leverages the cut-off date of April 30th, 2018 used in the training of AF2. While the authors list their limited number of docking methods as a caveat, the fact that they use three state-of-the-art ligand docking methods is a strength of the work; many studies use just one. The rigorous analysis of the binding site RMSD and docked ligand pose RMSD is novel to my knowledge and is particularly insightful.

      Weaknesses:<br /> The authors are rigorous in their approach by using state-of-the-art workflows that are high-throughput in nature. However, human expert-refined models and expert selection from multiple models could improve the results of ligand pose prediction when using protein models. The authors alluded to this for traditional models but this can also be true when starting from AF2 models. This is difficult to test systematically and rigorously, however. One possible experiment is to explore whether using multiple AF2 models (there are five by default) would have an effect on pose accuracy, perhaps for selected examples such as NK1R, 5HT2A, and DRD1 to help build out the discussion further. Another possible weakness is that the authors focus only on GPCRs for reasons they state but make a good argument as to why the conclusions are likely to extend to other protein classes.

      Context:<br /> One of the most common and impactful uses of protein structures is in small molecule therapeutic and chemical tool design. When no experimental structure is available, models are frequently used and such models include traditional template-based homology models and, more recently, AF2 models. AF2 is widely recognized as an inflection point in protein structure prediction due to the unprecedented accuracy of the protein structure models produced automatically. However, understanding whether this unprecedented accuracy translates to better small molecule ligand pose prediction has been an open question, and this study directly addresses the question in a systematic, rigorous approach.

    3. Reviewer #2 (Public Review):

      While the question of 'are AlphaFold-predicted structures useful for drug design' has largely seen comparisons of AF versus experimental protein structures, this paper takes a less explored (but perhaps more practically important) angle of 'are AlphaFold-predicted structures any better than the previous generation of homology modeling tools' to the protein-ligand (rigid) docking problem. The conclusions of this work will be of largest interest to the audience less familiar with the precision required for successful rigid docking, while the expert crowd might find them obvious, yet a good summary of results previously shown in the literature. Further work, understanding the structural objectives/metrics that should be placed on future AlphaFold-like models for better pose prediction performance, would greatly expand the practicality of the observations made here.

      The main conclusion of the paper, that structural accuracy (expressed as RMSD) of the protein model is not a good predictor of the accuracy the model will show in rigid docking protein-ligand pose prediction, is a good reminder of the well-appreciated need for high-quality side chain placements in docking. The expected phenomenon of AlphaFold predicting 'more apo-like structures' is often discussed in the field, and readers should be cautious about drawing conclusions from the rigid (rather than flexible, as in some previous works) docking done here.
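
      The point above, that a low overall RMSD can hide a locally important error, follows from how RMSD averages over atoms. A minimal sketch, assuming pre-aligned coordinates and a hypothetical 100-atom binding site (the function name `rmsd` and all numbers are illustrative and not taken from the paper):

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized, pre-aligned
    lists of (x, y, z) coordinates. No superposition is performed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Hypothetical reference site: 100 atoms on a line, 1.5 Å apart.
reference = [(1.5 * i, 0.0, 0.0) for i in range(100)]

# Model identical to the reference except one atom shifted by 3 Å along y.
model = list(reference)
x, y, z = model[42]
model[42] = (x, y + 3.0, z)

site_rmsd = rmsd(reference, model)
print(f"binding-site RMSD: {site_rmsd:.2f} Å")
```

      Here a single 3 Å displacement, potentially large enough to clash with a ligand, yields a site RMSD of only 0.3 Å, which is one way a model can score well on RMSD and still dock poorly.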

      The authors have very clearly communicated that the use of AlphaFold-generated structures in traditional docking might not be a good idea, and motivated that the time of a molecular designer might be better spent preparing a high-quality homology model. The visual presentation of the conclusions is very clear but might leave the reader wanting a more in-depth discussion of which structural elements of the AF models lead to bad docking outcomes. For example, Fig. 3 presents an example of a very accurate AlphaFold prediction leading to the ligand being docked completely outside of the binding pocket. Close inspection of the Figure suggests a clash of the ligand with the slightly displaced tryptophan residue in the AF model that might be to blame, as can be confirmed by comparison of the model and PDB structure by the reader themselves but has not been discussed by the authors. Only a few examples of the systems used are shown even visually, leaving the reader unable to study more interesting cases in depth without re-doing the work themselves.

      The authors acknowledged that several recent studies exist in this space. They point out two advancements made in their work, worthy of further review. Similarly, it's important to evaluate the novelty of this work's claims vs previously available results, and the diversity of information made available to the reader.

      "First, we use structural models generated without any use of known structures of the target protein. For machine learning methods, this requires ensuring that no structure of the target protein was used to train the method." This is done by limiting the scope of the work to GPCRs whose structures became available only after the training date of AlphaFold (April 30, 2018), as well as not using templates available after that date during prediction. The use of a time limit seems less preferable than the approach taken in Ref. 1 of discarding templates above a sequence identity cutoff. On the other hand, the 'ablation test' performed in Ref. 2 showed no loss in accuracy when no templates were used at all. Authors should discuss in more detail whether these modeling choices could change anything in their conclusions and why they made their choices compared to those in previous work.

      "Second, we perform a systematic comparison that takes into account the variation between experimentally determined structures of the same protein when bound to different ligands." Cross-docking is indeed a more appropriate comparison than self-docking (as done in previous works), and the observation that the accuracy of AF models is similar to that between different holo structures of the same protein is interesting. Previous literature on cross-docking should however be discussed, and the well-known conclusions from it that small variations in side-chain positions, in otherwise highly similar structures, can lead to large changes in docked poses. It is important to realize that AlphaFold models are 'just another structure' - if previous literature is sufficient to show the sensitivity of rigid docking, doing it again on AF structures does not add to our understanding. Further, Ref. 3 might have already addressed the question of correlation between binding site RMSD and docking pose prediction accuracy - see e.g. Supplementary Figure 3 there (also Figure S15 in Ref. 2).

      Further, the authors should discuss the commonly brought up problem of AlphaFold generating 'more apo-like structures' - are the models used here actually 'holo-like' because of the low RMSDs? (what RMSD differences are to be expected between apo and holo structures of these systems?) How are the volumes of the pockets affected? The position on this problem taken by previous works is worth mentioning - "much higher rmsd values are found when using the AF2 models (...), which reflect the difficulties in performing docking into apo-like structures" in Ref. 1 and "computational model structures were predicted without consideration of binding ligands and resulted in apo structures" in Ref. 2.

      Because of this 'apo problem', Ref. 2 assumed that rigid docking (as done here) would not succeed and used flexible docking where "two sidechains at the binding site were set to be flexible". In fact, the reader of this new paper will be left to wonder if it is not simply presenting a subset of the results already seen in Ref. 2, where "the success ratios dropped significantly for them because misoriented sidechains prevented a ligand from docking (Figure S14)". While this conclusion is not made as clear in Ref. 2 as it is here, a comparison of Figures 4 and S14 there will lead the reader to the same conclusion, and more -- that flexible docking meaningfully improves the performance of AF models, and more so than homology models.

      Finally, certain data analyses present in previous works but absent here would make this work more informative to readers:<br /> a) Consideration of multiple top poses: e.g., in Ref. 2 (Figures 4 and S14, mentioned above), the comparison of success rates for the top 1 and top 3 docked poses adds much context.<br /> b) Notes on the structural features preventing successful docking; see, e.g., Table 2 in Ref. 1 or Tables 2 and 4 in Ref. 4.

      This work has the potential to become an important piece of the puzzle, if deeper insights into the reasons for AF model failures are drawn by the authors. These could include a discussion of the problematic structural elements (clashes of side chain with ligands, missing interactions/waters, etc.), potential solutions with some preliminary data (flexible docking, softening interactions, etc.), or proposals for metrics better than RMSD to score the soundness of pockets generated by AF for docking.

      References:<br /> 1. Díaz-Rovira, A. M., Martín, H., Beuming, T., Díaz, L., Guallar, V., & Ray, S. S. (2023). Are Deep Learning Structural Models Sufficiently Accurate for Virtual Screening? Application of Docking Algorithms to AlphaFold2 Predicted Structures. Journal of Chemical Information and Modeling, 63(6), 1668-1674. https://doi.org/10.1021/acs.jcim.2c01270<br /> 2. Heo, L., & Feig, M. (2022). Multi-state modeling of G-protein coupled receptors at experimental accuracy. Proteins: Structure, Function, and Bioinformatics, 90(11), 1873-1885. https://doi.org/10.1002/prot.26382<br /> 3. Beuming, T., & Sherman, W. (2012). Current assessment of docking into GPCR crystal structures and homology models: Successes, challenges, and guidelines. Journal of Chemical Information and Modeling, 52(12), 3263-3277. https://doi.org/10.1021/ci300411b<br /> 4. Scardino, V., Di Filippo, J. I., & Cavasotto, C. (2022). How good are AlphaFold models for docking-based virtual screening? [Preprint]. Chemistry. https://doi.org/10.26434/chemrxiv-2022-sgj8c

    1. eLife assessment

      This useful study investigates the roles of C. elegans MYRF transcription factors myrf-1 and myrf-2 in the temporally controlled activation of the miRNA lin-4, a key step in larval developmental timing. While some of the findings are solid, other evidence is incomplete because of concerns about the technical approaches. This study provides information that will be useful to those interested in the regulation of lin-4 expression in C. elegans.

    2. Reviewer #1 (Public Review):

      In this work, the authors set out to ask whether the MYRF family of transcription factors, represented by myrf-1 and myrf-2 in C. elegans, have a role in the temporally controlled expression of the miRNA lin-4. The precisely timed onset of lin-4 expression in the late L1 stage is known to be a critical step in the developmental timing ("heterochronic") pathway, allowing worms to move from the L1 to the L2 stage of development. Despite the importance of this step of the pathway, the mechanisms that control the onset of lin-4 expression are not well understood.

      Overall, the paper provides convincing evidence that MYRF factors have a role in the regulation of lin-4 expression. However, some of the details of this role remain speculative, and some of the authors' conclusions are not fully supported by the studies shown. These limitations arise from three concerns. First, the authors rely heavily on a transcriptional reporter (maIs134) that is known not to contain all of the regulatory elements relevant for lin-4 expression. Second, the authors use mutant alleles with unusual properties that have not been completely characterized, making a definitive interpretation of the results difficult. Third, some conclusions are drawn from circumstantial or indirect evidence that does not use field-standard methods.

      The authors convincingly demonstrate that the cytoplasmic-to-nuclear translocation of MYRF-1 coincides with the activation of lin-4 expression, making MYRF-1 a good candidate for mediating this activation. However, the evidence that MYRF-1 is required for the activation of lin-4 is somewhat incomplete. The authors provide convincing evidence that lin-4 activation fails in animals carrying the unusual mutation myrf-1(ju1121), which the authors describe as disrupting both myrf-1 and myrf-2 activity. The concern here is that it is difficult to rule out that ju1121 also disrupts the activity of other factors, and it does not disentangle the roles of myrf-1 and myrf-2. Partially alleviating this issue, they also find that expression from the maIs134 reporter is disrupted in putative myrf-1 null alleles, but making inferences from maIs134 about the regulation of endogenous lin-4 is problematic. Helpfully, an endogenous CRISPR-generated lin-4 reporter allele is used in some studies, but only with the ju1121 allele. Together, these findings provide solid evidence that MYRF factors probably do have a role in lin-4 activation, but the exact roles of myrf-1 and myrf-2 remain unclear because of limitations of the unusual ju1121 allele and the use of the maIs134 reporter. The creative use of conditional myrf-1 alleles (floxed and using the AID system) partially overcomes these concerns, providing strong evidence that myrf-1 acts cell-autonomously to regulate lin-4, though again, these key experiments are only carried out with the maIs134 transgene.

      A second important question asked by the authors is whether MYRF activity is sufficient to activate lin-4 expression. The authors provide evidence that supports this idea, but this support is somewhat incomplete, because the authors rely partially on the maIs134 array and, more importantly, on mutant alleles of MYRF-1 that they propose are constitutively active but are not completely characterized here.

      The authors also approach the question of whether MYRF-1 regulates lin-4 via direct interaction with its promoter. The evidence presented here is consistent with this idea, but it relies on indirect evidence involving genetic interactions between myrf-1 and the presence of multiple copies of the lin-4 promoter, as well as the detection of nuclear foci of MYRF-1::GFP in the presence of multiple copies of the lin-4 promoter. This is not the field-standard approach for testing this kind of hypothesis, and the positive control presented (using the TetR/TetO interaction) is unconvincing. Thus, the evidence here is consistent with the authors' hypothesis, but the studies shown are incomplete and do not represent a rigorous test of this possibility.

      Finally, the authors ask whether MYRF factors have a role in the regulation of other miRNAs. The evidence provided (RNAseq experiments, validated by several reporter transgenes) solidly supports this idea, with the provision that it is not completely clear that ju1121 is disrupting only the activity of myrf-1 and myrf-2.

    3. Reviewer #2 (Public Review):

      In this manuscript, the authors attempt to examine how the temporal expression of the lin-4 microRNA is transcriptionally regulated. However, the experimental support for some claims is incomplete. The authors repeatedly use the ju1121(G247R) mutation of myrf-1, but more information is required to evaluate their claim that this mutation "abolishes its DNA binding capability but also negatively interferes with its close paralogue MYRF-2". Additionally, in the lin-4 scarlet endogenous transcriptional reporter, the lin-4 sequence is removed. Since lin-4 has been reported to autoregulate, it seems possible that the removal of lin-4 coding sequence could influence reporter expression. Further, concrete evidence for direct lin-4 regulation by MYRF-1 is lacking, as the approaches used are indirect and not standard in the field. Overall, while the aims of the work are mostly achieved, data demonstrating direct regulation of lin-4 by MYRF-1 are lacking, as is a discussion placing the work in the context of previous related reports. Because of its very specific focus, this paper reports useful findings on how a single transcription factor family might control the expression of a microRNA.

    1. eLife assessment

      Resistance of Plasmodium falciparum to artemisinin, which has become a threat to malaria control, has been linked to mutations in the parasite protein K13. This study provides important new insights into the function of K13 in the endocytosis of hemoglobin, a central process for the activation of artemisinin derivatives. Conditional protein mislocalization combined with high-resolution imaging provides convincing evidence that K13 is involved in the formation of cytostomes, the structures involved in the endocytosis of host cytosol. This study will be of interest to scientists working on parasite biology as well as antimalarial drug resistance.

    2. Reviewer #1 (Public Review):

      In this paper, the authors investigated the localization and function of the protein Kelch 13 (K13) in Plasmodium falciparum. Mutations of K13 confer parasite resistance to artemisinin derivatives, the first-line treatment for malaria. Previous studies have shown that K13 is located at the cytostome - a structure that the parasite uses to take up hemoglobin - and that K13 mutations confer artemisinin resistance by dampening hemoglobin endocytosis. Digestion of host hemoglobin is thought to be essential for artemisinin activation through the production of haem. However, the exact function of K13 is currently unknown, and direct evidence for a role of K13 in the production of haem (and artemisinin activation) is missing.

      The authors used fluorescent dextran to visualize endocytosis, and show an accumulation of dextran-positive structures colocalizing with GFP-tagged K13. They confirm the localization of K13 to cytostomes by immune-electron microscopy, showing that the protein is localized to the cytostomal collar. Using a genetic knock-sideways strategy, the authors show that mislocalization of K13 results in defects in cytostome formation and morphology, with the disappearance of the electron-dense cytostomal collar, as evidenced by serial block face scanning electron microscopy and transmission electron tomography. Finally, they provide direct evidence that K13 mislocalization leads to a decrease in haemoglobin digestion products, haem and hemozoin.

      The paper is very well written and the work is very well performed, relying on a validated genetic approach and high-quality imaging. While conceptually the study does not bring many novel insights, the confirmation of K13 localization and, most importantly, the demonstration that K13 is required for cytostome formation and function constitute important pieces that consolidate the current model of artemisinin resistance. However, the exact role of K13 at the cytostomal collar remains undefined. Whether other proteins of the K13 compartments also play a role in cytostome formation remains to be determined. In addition, the study does not address whether the formation of abnormal cytostomes is also seen in artemisinin-resistant K13 mutant field isolates and is a general mechanism underlying resistance to artemisinin.

    3. Reviewer #2 (Public Review):

      Summary of major findings:<br /> The manuscript "The Plasmodium falciparum artemisinin resistance-associated protein Kelch 13 is required for formation of normal cytostomes" authored by Tutor et al. provides evidence that Kelch13 is necessary for the formation and maintenance of cytostomes. The group provides compelling evidence using multiple state-of-the-art microscopy imaging techniques to demonstrate that when Kelch13 is mislocalized to the nucleus, cytostomes are decreased, cytostome morphology is aberrant, and there are decreased levels of heme within the parasite.

      Impact of the study:<br /> Mutations in Kelch13 have been associated with artemisinin resistance. The biological function of Kelch13 has been a question of great interest. Kelch13 was shown to associate with proteins in the endocytic machinery although not with clathrin. It was previously shown that Kelch13 mutants have decreased levels of hemoglobin digestion-derived peptides, decreased Kelch13 protein (although levels are not decreased at all asexual stages), and decreased heme. Here, the authors show that when Kelch13 is mislocalized, there are decreased numbers of properly-formed cytostomes that lead to decreased heme within parasites. Although not formally demonstrated, it is thus possible that there is decreased subsequent heme-mediated activation of artemisinin, which would explain the connection between Kelch13 and artemisinin resistance.

    4. Reviewer #3 (Public Review):

      Tutor et al. present their work on Kelch13/K13 from Plasmodium falciparum, the causative agent of malaria. This protein is involved in resistance against artemisinin (ART), one of the most commonly used drugs to treat malaria. Although the mutation in K13 that leads to resistance to ART has been identified, the exact molecular mechanism, the function of K13, and the impact of the K13 mutations still need to be elucidated. This is where the authors step in to investigate the relationship between endocytosis and K13, as well as the impact of depleting the protein using knock-sideways (KS). Using light microscopy, the authors demonstrate how K13-YFP forms a pore associated with fluorescently labeled dextran, which is taken up into tubules that move toward the digestive vacuole. This tubule formation is not sensitive to jasplakinolide (JAS) treatment. Using electron microscopy, they show that K13 is localized at the dark contrast border of the cytostome, and knocking down K13 leads to the disruption of the cytostome structure. Upon removal of K13, the structure changes, and the opening enlarges. The impact of KS induction on the cytostome was quantified using TEM and tomography. The authors also provide reconstructions of the cytostome in both induced and non-induced parasites. Finally, they measure the impact of KS on haem degradation. These data provide clear information on the function of K13 in cytostome formation and the implication of this structure in endocytosis for Plasmodium falciparum.

      The conclusions of this paper are well supported by the data, but some data analysis should be clarified and extended, and some complementary experiments would further strengthen the authors' claims.

    1. eLife assessment

      This paper reports valuable results regarding the potential role and time course of the prefrontal cortex in conscious perception. Although the sample size is small, the results are clear and convincing, and strengths include the use of several complementary analysis methods. The behavioral test includes subject report so the results do not allow for distinguishing between theories of consciousness; nevertheless, results do advance our understanding of the contribution of prefrontal cortex to conscious perception.

    2. Reviewer #1 (Public Review):

      This is a clear and rigorous study of intracranial EEG signals in the prefrontal cortex during a visual awareness task. The results are convincing and worthwhile, and strengths include the use of several complementary analysis methods and clear results. The only methodological weakness is the relatively small sample size of only 6 participants compared to other studies in the field. Interpretation weaknesses that can easily be addressed are claims that their task removes the confound of report (it does not), and claims of primacy in showing early prefrontal cortical involvement in visual perception using intracranial EEG (several studies already have shown this). Also, the shorter reaction times for perceived vs not perceived stimuli (confident vs not confident responses) have been described many times previously and are not a new result.

    3. Reviewer #2 (Public Review):

      The authors attempt to address a long-standing controversy in the study of the neural correlates of visual awareness, namely whether neurons in prefrontal cortex are necessarily involved in conscious perception. Several leading theories of consciousness propose a necessary role for (at least some sub-regions of) PFC in basic perceptual awareness (e.g., global neuronal workspace theory, higher order theories), while several other leading theories posit that much of the previously reported PFC contributions to perceptual awareness may have been confounded by task-based cognition that co-varied between the aware and unaware reports (e.g., recurrent processing theory, integrated information theory). By employing intracranial EEG in human patients and a threshold detection task on low-contrast visual stimuli, the authors assessed the timing and location of neural populations in PFC that are differentially activated by stimuli that are consciously perceived vs. not perceived. Overall, the reported results support the view that certain regions of PFC do contribute to visual awareness, but at time-points earlier than traditionally predicted by GNWT and HOTs.

      Major strengths of this paper include the straightforward visual threshold detection task including the careful calibration of the stimuli and the separate set of healthy control subjects used for validation of the behavioral and eye tracking results, the high quality of the neural data in six epilepsy patients, the clear patterns of differential high gamma activity and temporal generalization of decoding for seen versus unseen stimuli, and the authors' interpretation of these results within the larger research literature on this topic. This study appears to have been carefully conducted, the data were analyzed appropriately, and the overall conclusions seem warranted given the main patterns of results.

      Weaknesses include the saccadic reaction time results and the potential flaws in the design of the reporting task. This is not a "no report" paradigm, rather, it's a paradigm aimed at balancing the post-perceptual cognitive and motor requirements between the seen and unseen trials. On each trial, subjects/patients either perceived the stimulus or not, and had to briefly maintain this "yes/no" judgment until a fixation cross changed color, and the color change indicated how to respond (saccade to the left or right). Differences in saccadic RTs (measured from the time of the fixation color change to moving the eyes to the left or right response square) were evident between the seen and unseen trials (faster for seen). If the authors' design achieved what they claim on page 3, "the report behaviors were matched between the two awareness states", then shouldn't we expect no differences in saccadic RTs between the aware and unaware conditions? The fact that there were such differences may indicate differences in post-perceptual cognition during the time between the stimulus and the response cue. Alternatively, the RT difference could reflect task-strategies used by subjects/patients to remember the response mapping rules between the perception and the color cue (e.g., if the YES+GREEN=RIGHT and YES+RED=LEFT rules were held in memory, while the NO mappings were inferred secondarily rather than being actively held in memory). This saccadic RT result should be better explained in the context of the goals of this particular reporting-task.

      Nevertheless, the current results do help advance our understanding of the contribution of PFC to visual awareness. These results, when situated within the larger context of the rapidly developing literature on this topic (using "no report" paradigms), e.g., the recent studies by Vishne et al. (2023) Cell Reports and the Cogitate consortium (2023) bioRxiv, provide converging evidence that some sub-regions of PFC contribute to visual awareness, but at latencies earlier than originally predicted by proponents of, especially, global neuronal workspace theory.

    4. Reviewer #3 (Public Review):

      The authors report a study in which they use intracranial recordings to dissociate subjectively aware and subjectively unaware stimuli, focusing mainly on prefrontal cortex. Although this paper reports some interesting findings (the videos are very nice and informative!) the interpretation of the data is unfortunately problematic for several reasons. I will detail my main comments below. If the authors address these comments well, I believe the paper may provide an interesting contribution to further specifying the neural mechanisms important for conscious access (in line with Gaillard et al., Plos Biology 2009).

      The main problem with the interpretation of the data is that the authors have NOT used a so-called "no-report paradigm". The idea of no-report paradigms is that subjects passively view a certain stimulus without the instruction to "do something with it", e.g., detect the stimulus, immediately or later in time. Because of the confusion around this term, specifically its association with the "act of reporting", some have argued we should use the term no-cognition paradigm instead (Block, TiCS, 2019, see also Pitts et al., Phil Trans B 2018). The crucial aspect is that, in these types of paradigms, the critical stimulus should be task-irrelevant and thus not be associated with any task (immediately or later). Because in this experiment subjects were instructed to detect the gratings when cued 600 ms later in time, the stimuli are task-relevant: they have to be reported later and therefore trigger all kinds of (known and potentially unknown) cognitive processes at the moment the stimuli are detected in real-time (so stimulus-locked). You could argue that the setup of this delayed response task excludes some very specific report-related processes (e.g., the preparation of an eye-movement), which is good; however, this is usually not considered the main issue. For example, when comparing masked versus unmasked stimuli (Gaillard et al., 2009 Plos Biology), these conditions usually also both contain responses, but these response-related processes are "averaged out" in the specific contrasts (unmasked > masked). In this paper, the RT differences between conditions (which are present in this dataset) are handled by the delayed response, which is a nice feature that the example set-up above lacks.

      Given the task instructions, and this being merely a delayed-response task, it is to be expected that prefrontal cortex shows stronger activity for subjectively aware versus subjectively unaware stimuli. Unfortunately, given the nature of this task, the novelty of the findings is severely reduced. The authors cannot claim that prefrontal cortex is associated with "visual awareness", or what people have called phenomenal consciousness (this is the goal of using no-cognition paradigms). The only conclusion that can be drawn is that prefrontal cortex activity is associated with accessing sensory input, and hence with conscious access. This less novel observation has been shown many times before and there is also little disagreement about this issue between different theories of consciousness (e.g., global workspace theory and local recurrency theories both agree on this).

      The best solution at this point seems to be to rewrite the paper entirely in light of this. My advice would be to state in the introduction that the authors investigate conscious access using iEEG and then not refer too much to the no-cognition paradigm, or maybe highlight some different strategies for using task-irrelevant stimuli (see Canales-Johnson et al., Plos Biology 2023; Hesse et al., eLife 2020; Hatamimajoumerd et al., Curr Bio 2022; Alilovic et al., Plos Biology 2023; Pitts et al., Frontiers 2014; Dwarakanath et al., Neuron 2023 and more). Obviously, the authors should then also not claim that their results solve debates about theories regarding visual awareness (in the "no-cognition" sense, or phenomenal consciousness), for example in relation to the debate about the "front or the back of the brain", because the data do not inform that discussion. Basically, the authors can just discuss their results in detail (related to timing, frequency, synchronization etc) and relate the different signatures that they have observed to conscious access.

      I think the authors have to discuss the Gaillard et al PLOS Biology 2009 paper in much more detail. Gaillard et al also report a study related to conscious access contrasting unmasked and masked stimuli using iEEG. In this paper they also report ERP, time frequency and phase synchronization results (and even Granger causality). Because of the similarities in approach, I think it would be important to directly compare the results presented in that paper with results presented here and highlight the commonalities and discrepancies in the Discussion.

      In the Gaillard paper they report a figure plotting the percentage of significant frontal electrodes across time (figure 4A), in which it can be seen that significant electrodes emerge after approximately 250 ms in PFC as well. It would be great if the authors could make a similar figure to compare results. In the current paper there are many more frontal electrode contacts than in the Gaillard paper, so that is interesting in itself.

      In my opinion, some of the most interesting results are not highlighted: the findings that subjectively unaware stimuli show increased activations in the prefrontal cortex as compared to stimulus absent trials (e.g., Figure 4D). Previous work has shown PFC activations to masked stimuli (e.g., van Gaal et al., J Neuroscience 2008, 2010; Lau and Passingham, J Neurosci 2007) as well as PFC activations to subjectively unaware stimuli (e.g., King, Pescetelli, and Dehaene, Neuron 2016) and this is a very nice illustration of that with methods having more detailed spatial precision. Although potentially interesting, I wonder about the objective detection performance of the stimuli in this task. So please report objective detection performance for the patients and the healthy subjects, using signal detection theoretic d'. This gives the reader an idea of how good subjects were at detecting the presence/absence of the gratings. Likely, this reveals far above chance detection performance and in that case I would interpret these findings as "PFC activation to stimuli indicated as subjectively unaware" and not unconscious stimuli. See Stein et al., Plos Biology 2021 for a direct comparison of subjectively and objectively unaware stimuli.
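
      As an illustration of the requested measure, the following is a minimal sketch of how d' could be computed from detection counts. The function name, the argument names and the log-linear correction are assumptions made for this example; they are not taken from the manuscript or from the review.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Log-linear correction (an assumption of this sketch): add 0.5 to each
    # count so that rates of exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

      Here hits and misses would be counted on grating-present trials and false alarms and correct rejections on grating-absent trials; a value near 0 indicates chance-level detection.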

      In Figure 7 of the paper the authors want to make the case that the contrast does not differ between subjectively aware stimuli and subjectively unaware stimuli. However, so far they have done the majority of their analyses across subjects, and for this analysis the authors only performed within-subject tests, which is not a fair comparison in my opinion. Because several P values are very close to significance, I anticipate that a test across subjects will clearly show that the contrast level of the subjectively aware stimuli is higher than that of the subjectively unaware stimuli, at the group level. A solution to this would be to subselect trials from one condition (NA) to match the contrast of the other condition (NU), and thereby create two conditions that are matched in contrast levels of the stimuli included. Then do all the analyses on the matched conditions.
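
      One way such a matching step could look is sketched below. It assumes that contrast takes a small number of discrete levels and that each trial is stored as a (trial_id, contrast) pair; the function name and data layout are hypothetical. Unlike the one-sided subselection described above, this sketch subsamples whichever condition has more trials at each contrast level.

```python
import random
from collections import defaultdict


def match_by_contrast(aware, unaware, seed=0):
    """Return subsets of both conditions with identical contrast distributions.

    aware, unaware: lists of (trial_id, contrast) tuples (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible

    def group(trials):
        by_level = defaultdict(list)
        for trial_id, contrast in trials:
            by_level[contrast].append(trial_id)
        return by_level

    a, u = group(aware), group(unaware)
    matched_a, matched_u = [], []
    # Only contrast levels present in both conditions can be matched.
    for level in sorted(set(a) & set(u)):
        n = min(len(a[level]), len(u[level]))
        matched_a += [(t, level) for t in rng.sample(a[level], n)]
        matched_u += [(t, level) for t in rng.sample(u[level], n)]
    return matched_a, matched_u
```

      Because the selection is random, repeating it with several seeds and averaging the downstream results would reduce the dependence on any single draw.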

      Related, Figure 7B is confusing and the results are puzzling. Why is there such a strong below chance decoding on the diagonal? (also even before stimulus onset) Please clarify the goal and approach of this analysis and also discuss/explain better what they mean.

      I was somewhat surprised by several statements in the paper and it felt that the authors may not be aware of several intricacies in the field of consciousness. For example a statement like the following "Consciousness, as a high-level cognitive function of the brain, should have some similar effects as other cognitive functions on behavior (for example, saccadic reaction time). With this question in mind, we carefully searched the literature about the relationship between consciousness and behavior; surprisingly, we failed to find any relevant literature." This is rather problematic for at least two reasons. First, not everyone would agree that consciousness is a high-level cognitive function and second there are many papers arguing for a certain relationship between consciousness and behavior (Dehaene and Naccache, 2001 Cognition; van Gaal et al., 2012, Frontiers in Neuroscience; Block 1995, BBS; Lamme, Frontiers in Psychology, 2020; Seth, 2008 and many more). Further, the explanation for the reaction time differences in this specific case is likely related to the fact that subjects' confidence in that decision is much higher in the aware trials than in the unaware trials, hence the speeded response for the first. This is a phenomenon that is often observed if one explores the "confidence literature". Although the authors have not measured confidence I would not make too much out of this RT difference.

      I would be interested in a lateralized analysis, in which the authors compare the PFC responses and connectivity profiles using PLV as a function of stimulus location (thus comparing electrodes contralateral to the presented stimulus and electrodes ipsilateral to the presented stimulus). If possible, this may give interesting insights into the mechanism of global ignition (global broadcasting), supposing that for contralateral electrodes information does not have to cross from one hemisphere to another, whereas for ipsilateral electrodes that is the case (which may take time). Gaillard et al refer to this issue as well in their paper, and this issue is sometimes discussed in relation to Global workspace theory. This would add novelty to the findings of the paper in my opinion.

    1. eLife assessment

      This paper is of interest to researchers and policy makers involved in cervical cancer prevention. The paper provides insight into how the Covid19 pandemic accelerated changes in organized cervical cancer screening. The claim that self-sampling led to a major improvement of test coverage seems somewhat exaggerated and alternative hypotheses to those provided by the authors on the population who chose self-sampling are possible. Nonetheless, this is a valuable piece of work given the scope of the intervention(s) and the precedent it sets i.e. a crisis can in fact accelerate positive changes in screening that have been academic possibilities rather than practical realities.

    2. Reviewer #1 (Public Review):

      During the Covid19 pandemic, most cervical cancer screening programs were temporarily put on hold. The authors describe how Swedish health authorities dealt with this situation by implementing primary self-sampling and by launching a campaign with concomitant vaccination and screening. They also show that, one year after the start of the pandemic, the coverage of the screening program was at pre-pandemic levels.

      Strengths of the paper are the clear presentation of the steps taken by the Swedish health authorities and the high quality of the presented screening coverage data which could be obtained directly from the screening registry. However, the paper would benefit from more in-depth analyses because the presented data raise questions. The number of invitations was >30 percent lower in the first year of the pandemic (Figure 1), but the screening coverage was only 4-5 percent lower. In the second year of the pandemic (year 2021), coverage was back at pre-pandemic levels, but the role of primary self-sampling in restoring screening coverage is a bit unclear. It is obvious that primary self-sampling made it possible to invite women again for screening during the pandemic, but there is no data on acceptance of primary self-sampling. Besides, the increase in coverage in year 2021 was only 4% and it is not clear whether such a modest increase could also have been achieved without primary self-sampling. In addition to self-sampling, the authors describe the launch of a concomitant vaccination and screening campaign. This is an interesting initiative but the authors do not show data on the coverage of this campaign in the target age range.

    3. Reviewer #3 (Public Review):

      The authors report on the nature of interventions that were applied to aid and improve engagement in cervical screening, brought about by the SARS CoV Pandemic in Sweden.

      I appreciate that the impact of these interventions, given that they are recent, will take some time to quantify but the description (and reach) of the policy changes that occurred in a short amount of time is of significant interest to the screening community. The piece on HPV Even Faster is particularly novel; I am not aware of another example of where this has been enacted within a routine programme.

      The authors make reference to (15) where the reader can find greater details relating to the population who received the offer of self sampling (and the nature of the device). However I was a little confused (in this stand alone piece) as to who the self sampling group constituted exactly. Did this group not include pregnant women, women invited for first screen or women on non routine recall?

      The authors state that "the most likely explanation for the large increase in population coverage seen is that the sending of self-sampling kits resulted in improved attendance in particular among previously non-attending women". Why is this written as speculation at this stage? Is it not possible to attribute directly the contribution made by self sampling, or is this in hand?

      While self sampling is certainly an option that can support uptake and enfranchisement in cervical screening - its overall performance is fundamentally contingent on the number of women who then comply with follow up should the HPV test be positive; it is not simply about who returns the sample. It would have been of interest to see the proportion of women who did comply with follow up.

    1. eLife assessment

      This paper addresses the important question of presynaptic homeostasis and convincingly demonstrates antagonistic interactions between Spinophilin and Syd-1 in this process. It also provides a useful hypothesis for the downstream mechanisms.

    2. Reviewer #1 (Public Review):

      The study by Ramesh et al identifies key components that support presynaptic plasticity (PHP) at Drosophila glutamatergic synapses: an accepted model for their mammalian equivalents. Specifically, they identify that PHP relies on the antagonism between Spinophilin (Spn) and Syd-1 (a Rho GTPase activating protein) to dynamically alter F-actin (de)polymerisation to facilitate increased synaptic vesicle release, thus supporting PHP. A pull-down of Spn identifies additional proteins including Mical, the over-expression of which is sufficient to rescue the excessive actin stabilisation present in an Spn loss-of-function mutant. The studies relate the mechanistic understanding of Spn to aversive mid-term olfactory memory formation formed in the mushroom bodies.

      Collectively, this study represents an important addition to the understanding of PHP and its involvement in the formation of memory. The experiments presented are carefully done and the conclusions drawn are appropriate. A potential criticism is that the study spans two big areas (PHP and memory) and that each may have been better considered as separate studies. However, this is a stylistic concern and not one that influences the insights presented by this study.

    3. Reviewer #2 (Public Review):

      The manuscript by Ramesh et al builds upon prior studies from the Sigrist group to examine synergistic interactions between the Spinophilin (Spn) and Syd-1 synaptic proteins and their role in regulating presynaptic homeostatic plasticity at Drosophila larval NMJs and adult olfactory memory in the Mushroom Body (MB). The authors show synergistic interactions between the two proteins in these processes, where late PHP and long-term memory are abolished in Spn mutants, but restored upon reduction of Syd-1 function in the mutants. The authors go on to show that Spn appears to act in PHP by regulating a late stage in AZ remodeling and longer-term increases in the readily releasable SV pool by controlling actin polymerization/dynamics through the Mical protein. Although key aspects of the overall bigger picture have been published before (Mical's role in PHP, antagonism between Spn and Syd-1 in AZ development, AZ remodeling in MB-dependent memory), the current paper ties together many of these observations into a bigger picture of how PHP plasticity at the NMJ is established and provides support for a role for PHP-required proteins in promoting long-term memory in the adult MB through effects on AZ structure and AZ protein content/amount. The study also provides new links to the role of Spn in regulating local synaptic actin dynamics and how this alters the readily releasable pool and SV release. Some points of note are provided below.

      1. I'm a bit confused about the time course experiments the authors describe that seem to be contradictory in Figures 1 and 2. The authors indicate control animals transiently increase BRP AZ levels during PHP at 10 mins, but by 30 minutes this increase is gone, even though PHP remains. As such, the data in these early figures suggests increases in BRP AZ levels may support an early aspect of the PHP effect (though I note this appears controversial, as other data indicate blocking the rapid AZ remodeling by several manipulations such as Arl8 transport disruption, permits early PHP, but disrupts late PHP). In contrast, the authors show that Spn mutants do not display AZ BRP increase at 10 mins, and still show early PHP, but lack late PHP. I assume the early PHP does not require AZ remodeling or an increase in the RRP at this early time point?

      2. In relation to point 1 above, the time course seems different in MB neurons, where the AZ remodeling (noted by increases in AZ BRP) seems to take 2-3 hours. Do the authors have any ideas into why the time course of PHP AZ remodeling at larval NMJs can occur in 10 minutes, but MB neuron remodeling seems to take hours?

      3. Could the lack of rapid BRP accumulation during early PHP in Spn mutants be secondary to the larger # of AZs in those mutants and a known rate-limiting amount of BRP available that might not be enough to go to the extra AZs?

      4. There isn't any validation of the Spn co-IP results shown in Figure 3 through other assays, and a lot of proteins are being pulled down. I can't see some of these being real (mitochondrial translation proteins? - how could Spn gain access to the inside of the mitochondria since it's a cytosolic protein?). As such, I don't know how to value that huge group of pull-down interactions without further validation, making it difficult to sort out how relevant these really are. The genetic validation of similar phenotypes in the Mical mutant, together with rescues, supports that interaction. Not sure about the rest of that list.

      5. Are the authors worried about the fact that the Actin-GFP line they use to look at synaptic actin dynamics is driven by a GAL4, and the 2nd top hit of their Spn IP pull downs are translation regulators? Could the changes in actin-GFP they see between control and Spn mutants have anything to do with a different translation of the exogenous UAS-actin-GFP? Would have been helpful to do an endogenous stain for actin levels with an anti-actin antibody so no transcription/translation issues of a transgene would be at play. This would be easy to do for the quantification of total actin levels at the synapse.

      6. Are Mical levels normalized in the Spn, Syd1 double mutants, given PHP is recovered?

    1. eLife assessment

      Moises and Harel generate an important set of novel molecular tools in African turquoise killifish, an innovative model to study aging biology. The new solid tools described in this paper can boost this budding model system for broad biotechnological applications. The authors showcase the efficacy of their tools in the context of peptide hormones involved in growth and gonad development. The killifish community will greatly benefit from these novel tools and the relevance of the developed methods will likely go beyond the killifish community.

    2. Reviewer #1 (Public Review):

      Moises and Harel develop an impressive set of novel molecular tools in African turquoise killifish, which include hormone tagging by a self-cleavable fluorescent reporter, intramuscular electroporation for ectopic transgene expression and a doxycycline-inducible system. All these tools are per se fundamental technological innovations in killifish. The authors apply their advanced techniques to modulate growth and gonad development in killifish, showing that the methods work and that it is possible to modulate these fundamental developmental milestones through the use of their molecular tools.

      Strengths:

      The tools developed are effective, convincing and will likely be adopted as a reference for future work, beyond the field of peptide hormones. I congratulate the authors for their ingenuity and resourcefulness. The figures are clear and of high standard.

      Weaknesses:

      The manuscript does not obviously follow a question-driven flow and the authors do not make a compelling case about the necessity of developing such a platform. The manuscript should be framed as a tool/resource, showcasing the interventions with gh and fshb to support the tool.

      The manuscript is not thoroughly edited and the authors should check and review extensively for improvements to the use of English. Overall, I find a disconnect between the way in which the manuscript is written and the quality of the figures. While the figures have a very high quality standard, the Abstract, Introduction and Discussion are not doing justice to the work done.

    1. eLife assessment

      This study addresses an important question in the field of antimicrobial chemotherapy: whether combinations of enzyme inhibitors that select for mutations that confer resistance to one inhibitor and at the same time increased sensitization to the other inhibitor can provide a path towards mitigating resistance risks. The authors here investigated one such combination of inhibitors of Plasmodium falciparum DHODH (dihydroorotate dehydrogenase), finding that despite "collateral sensitivity", it was still possible to select a mutation that mediated resistance to both inhibitors without any change in parasite fitness. Additional cross-susceptibility and structural modeling strengthen this study, which is performed to a high technical standard and presents a convincing body of data.

    2. Reviewer #1 (Public Review):

      The usual strategy to combat antimicrobial drug resistance is to administer a combination of two drugs with distinct mechanisms. An alternative, however, would be to use two drugs that attack the same target, if resistance to one is incompatible with resistance to the other. The authors previously studied parasites resistant to the dihydroorotate dehydrogenase (DHODH) inhibitor DSM265 through an E182D mutation and found that resistance to another inhibitor, IDI-6273, resulted in a reversion to wild-type. Here, they screened various other inhibitors and found that TCMDC-125334 is more active on DSM265-resistant parasites than the wild-type. In this case, however, it was possible for the parasites to become resistant to both inhibitors, either by increasing the copy number of DSM265-resistant DHODH genes (with a C276Y mutation) or by the emergence of a different mutation. The selection of wild-type parasites with both compounds resulted in resistance but this took considerably longer than for either compound alone. (The actual frequency of double resistance emergence was not measured.)

      Overall the results suggest that for DHODH, when pre-existing resistant parasites are selected with another inhibitor, the results will depend on both the initial mutation and the new inhibitor. The data are solid and convincing and suggest that DHODH has considerable scope for resistance development. The observations do have relevance for other inhibitors and/or enzyme drug targets. However from the data so far, the sweeping statements that the authors make concerning double resistance, in general, are not supported.

      The formatting of the Figures requires some improvement and in some cases, more details of the statistical analyses are needed.

    3. Reviewer #3 (Public Review):

      'Collateral sensitivity' occurs when drug-resistance mutations render a drug target more sensitive to inhibition by another drug, which has been previously described in some detail for malaria parasite dihydroorotate dehydrogenase (DHODH - see refs 36, 46, and 47, for example). Although it has been suggested that combinations of such drugs could potentially suppress the emergence of resistance, cross-resistance-associated mutation (or copy-number variation, CNV) could render such combination strategies ineffective. In the current study, the authors assess a new pairing of DHODH-targeting drugs. Cross-resistant parasites with DHODH mutation or CNV arise following either sequential or combined drug selection, suggesting that the drug combination described would likely fail to effectively suppress the emergence of resistance.

      The strength of the study is that it describes, for a particular drug combination, different mutations associated either with collateral sensitivity or with cross-resistance, and the authors conclude that "combination treatment with DSM265 and TCMDC-125334 failed to suppress resistance". They go on to say that this "brings into question the usefulness of pursuing further DHODH inhibitors." More specific interpretations and implications of the study are as follows:

      a. Other combinations may also fail but there may be combinations that can effectively suppress resistance. A more exhaustive analysis of mutational space will likely be required to determine which combinations, if any, would be predicted to succeed in a clinical setting.

      b. It was previously reported that "a combination of [DHODH] wild-type and mutant-type selective inhibitors led to resistance far less often than either drug alone. ... Comparative growth assays demonstrated that two mutant parasites grew less robustly than their wild-type parent, and the purified protein of those mutants showed a decrease in catalytic efficiency, thereby suggesting a reason for the diminished growth rate" (Ref 46). Also, "selection with a combination of Genz-669178, a wild-type PfDHODH inhibitor, and IDI-6273, a mutant-selective PfDHODH inhibitor, did not yield resistant parasites" (Ref 36). It is possible that these previously tested combinations would also yield cross-resistant mutants if selected further.

      c. Although increased DHODH copy number "confers only moderately reduced susceptibility" to the drug used for selection and although these clones were not assessed here for cross-resistance, it seems likely that CNV may represent a general mechanism that could undermine other collateral resistance strategies.

    1. eLife assessment

      The valuable study by Dumeaux et al examines the transcriptional response to antifungal treatment in the major opportunistic human fungal pathogen Candida albicans. Using solid methodology, including a novel droplet-based single cell transcriptomics platform, the authors report that fungal cells exhibit heterogeneity in their transcriptional response to antifungal drug treatment. The ability to study the trajectories of individual cells in a high-throughput manner provides a novel perspective on studying the emergence of drug tolerance and resistance in fungal pathogens.

    2. Reviewer #1 (Public Review):

      This study applies state-of-the-art single-cell transcriptome analysis to investigate the nature of drug tolerance, a phenomenon distinct from drug resistance, and a problem of considerable importance in the treatment of C. albicans infections. The authors first show that their transcriptomics platform can reveal sub-populations of untreated cells that display distinct transcription profiles related to metabolic and stress responses that are coupled with cell cycle regulation. They note the consistency of these findings with previous work indicating connections between cell cycle phase and expression of genes related to stress responses and metabolism and argue that this validates their experimental approach, which relies on a complex statistical analysis of sparse data from a relatively small number of single cells. They then proceed to analyze drug-treated cells, mostly focusing on fluconazole (FCZ; which targets ERG11, thus disrupting ergosterol biosynthesis and membrane integrity) and examining individual cells at 2, 3, and 6 days following treatment. Their primary finding is the identification of two major classes of cells, one of which they call the α response, characterized by high ribosomal protein (RP) gene expression and the absence of either heat shock or hyperosmotic stress gene expression as well as low expression of glycolytic, carbohydrate reserve pathway, and histone genes. The second survival state on day 2 (called the β response) instead displays low RP gene expression and high heat-shock stress response. Interestingly, the proportion of β cells clearly increases on day 3. In addition, responses to caspofungin (CSP) and rapamycin (RAPA) are examined and compared to FCZ or untreated cells. The main conclusion that the authors draw from their data is that the initial α response transitions to the β response, which is similar to a recently characterized ribosome assembly stress response (RASTR) in the budding yeast S. cerevisiae. They argue that the transcriptional state in α cells provokes the transition to the β state.

      This manuscript presents an enormous amount of complex data whose significance will be difficult to evaluate for those (e.g., this reviewer) not immersed in the specialized analytical techniques used here. Taken at face value, however, the experimental findings are consistent with the authors' main conclusions. Nevertheless, and consistent with the complexity of the responses observed, there are many findings that remain to be explored in mechanistic detail and for which conclusions are less precise.

    3. Reviewer #2 (Public Review):

      In this manuscript, Dumeaux et al. assess the heterogeneous cellular response of the fungal pathogen Candida albicans to antifungal agents, using single-cell RNA sequencing. The researchers develop and optimize a single-cell transcriptomics platform for C. albicans, and exploit this technique to monitor the cellular response to treatment with three distinct antifungal agents. Through this analysis, they identify two distinct subpopulations of cells that undergo differential transcriptomic responses to antifungal treatment: one involving upregulation of translation and respiration, and the other involving stress responses. This work monitors how different and prolonged antifungal exposure alters and shifts fungal cell populations between these responses. This is an innovative study that exploits novel single-cell transcriptomic techniques to address a very interesting question regarding the heterogeneous nature of the fungal response to antifungal drug treatment. This work optimizes a protocol for single-cell RNA sequencing, which is a significant contribution to the fungal research community and will bolster future research efforts in this area. The identification of two distinct subpopulations of fungal cells with differential responses to antifungal treatment is an exciting and novel finding. While there are aspects of this manuscript that are of significant interest, there are also limitations to this work. The research is framed as a method to study antifungal drug tolerance, but it is not clear how it does so, based on the methods. This work also compares very different populations of cells (rapidly growing untreated cells compared with cells grown in antifungal for several days), making it difficult to assess the role of antifungal treatment specifically in this analysis. This manuscript is also written with a great deal of highly technical language that makes it difficult to dissect the major findings and outcomes from the study.

    4. Reviewer #3 (Public Review):

      The authors described their extensive single-cell analysis of Candida undergoing (sub-inhibitory) antibiotic treatment versus no treatment. To do so, the authors used a microfluidics platform they had previously developed, and they optimized, characterized, and validated it for this particular application. Their findings included: (a) the transcription of untreated cells is driven mostly by cell cycle phase, (b) treated cells can be clustered into several major groups and a few outlier groups that the authors termed comets, (c) cells undergoing FCZ treatment can adopt one of two different states (possibly bistability). I found the results interesting and the approach to be sound, and many of the results confirmed my prior expectations. The authors provide a detailed depiction of what is going on in the transcriptome during sub-inhibitory treatment, although this did not always lead to a mechanistic explanation. The clinical relevance was unclear to me beyond a proof of concept application for single-cell transcriptomics. In my opinion, an interesting follow-up would be to follow the transcriptional trajectory of lineages undergoing antimicrobial switching (on and off). The main issues I identified were the authors' use of the term tolerance versus resistance, interpretation of "comets", clustering approach, description of fitness, and comparison between time points.

    1. eLife assessment

      This manuscript addresses the important issue of hemodynamic response function (HRF) variability across brain areas and will be valuable to researchers who use fMRI and other types of functional imaging that rely on neurovascular coupling. Using simulations and experiments, the authors provide solid evidence that differences in the HRF can impact spectrum-based metrics such as ALFF and fALFF. A better understanding of the variability of the HRF is critical for the proper interpretation of activation onset times and of differences observed in clinical populations where both neural and vascular alterations can be expected.

    2. Reviewer #1 (Public Review):

      The authors first show in simulated data that differences in the speed of the HRF are reflected in the power spectra of the BOLD signal obtained during oscillatory stimulation at different frequencies. They then identified voxels that were fast or slow responders in data obtained from the primary visual cortex and LGN during visual stimulation and found that the fast and slow groups exhibited the same differences in power spectra observed in the simulations. Moreover, resting state data obtained separately from the same areas also exhibited these spectral differences. In contrast, the onset time of a response to a breath hold was less able to differentiate between fast and slow voxels.
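
      The link between HRF speed and spectral content can be illustrated with a toy calculation. The sketch below is not the authors' simulation: it assumes a single gamma-shaped HRF whose shape and scale parameters, sampling step and duration are all chosen arbitrarily for illustration, and it evaluates the frequency response with a plain discrete Fourier sum.

```python
import cmath
import math


def gamma_hrf(shape, scale, dt=0.1, duration=40.0):
    """Gamma-shaped impulse response sampled every dt seconds, unit area."""
    n = int(duration / dt)
    h = [((i * dt) ** (shape - 1)) * math.exp(-(i * dt) / scale) for i in range(n)]
    area = sum(h) * dt
    # Unit-area normalization keeps the zero-frequency gain at 1 for every HRF.
    return [v / area for v in h]


def gain(h, freq, dt=0.1):
    """Magnitude of the HRF's frequency response at freq (Hz); power is gain**2."""
    total = sum(v * cmath.exp(-2j * math.pi * freq * i * dt) for i, v in enumerate(h))
    return abs(total * dt)


# Hypothetical "fast" and "slow" responders: same shape, different time scale,
# so the peak sits at (shape - 1) * scale = 3 s and 6 s respectively.
fast = gamma_hrf(shape=6, scale=0.6)
slow = gamma_hrf(shape=6, scale=1.2)

# Relative attenuation of a 0.2 Hz oscillation compared with a 0.05 Hz one.
fast_ratio = gain(fast, 0.2) / gain(fast, 0.05)
slow_ratio = gain(slow, 0.2) / gain(slow, 0.05)
```

      In this toy model the faster HRF attenuates the higher frequency less than the slower one does, which is the qualitative pattern described above. Latency and duration scale together here, so the sketch cannot separate their contributions.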

      The combination of simulations and experiments in this work provides evidence that power spectra from rs-fMRI can provide information about the HRF in different locations across the brain. However, the simulated HRFs differ in amplitude and duration as well as latency, and all of these features can affect the power spectrum. The authors show that differences remain in the power spectra for amplitude-normalized HRFs, which strengthens their work. However, the entire premise of the work is that the actual HRFs in the brain can be modeled using the range of shapes that were simulated. As the authors point out, we know little about the actual HRF in much of the brain, and it may be that this model does not adequately represent HRFs in other regions. At a minimum, it would be useful to disentangle the effects of latency and duration of the response, in addition to amplitude, because with the current model early onset voxels also have shorter response durations. It is not hard to imagine that a brain area might have a rapid onset but a long duration of HRF, and the power spectrum in this case may look more like that of a slow responder. The current approach was validated in the visual system, which has been the basis for much of what we know about HRFs, and it may not be as accurate in other areas of the brain. This is admittedly a difficult issue to address, but merits consideration as a limitation.
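      The point about separating onset latency from response duration can be illustrated with a minimal sketch. It is not the authors' simulation: the single-gamma shape, the shape parameter, the 0.1 Hz cutoff, and the function names are all assumptions chosen for illustration. It compares the share of high-frequency power for a fast and a slow response, and for a fast response whose onset is simply delayed.

```python
import numpy as np

DT = 0.1                      # sampling step in seconds (assumed)
T = np.arange(0.0, 60.0, DT)  # 60 s window, long enough for the response to decay

def gamma_hrf(peak_s, shape=6.0):
    """Illustrative single-gamma response peaking at `peak_s` seconds (hypothetical model)."""
    scale = peak_s / (shape - 1.0)      # a gamma curve peaks at (shape - 1) * scale
    h = T ** (shape - 1.0) * np.exp(-T / scale)
    return h / h.max()                  # amplitude-normalized

def high_freq_fraction(h, f_cut=0.1):
    """Fraction of spectral power at or above `f_cut` Hz (cutoff is an assumption)."""
    power = np.abs(np.fft.rfft(h)) ** 2
    freqs = np.fft.rfftfreq(h.size, d=DT)
    return power[freqs >= f_cut].sum() / power.sum()

fast = gamma_hrf(peak_s=4.0)                 # narrower, earlier-peaking response
slow = gamma_hrf(peak_s=6.0)                 # wider, later-peaking response
delayed_fast = np.roll(fast, int(2.0 / DT))  # same shape, onset shifted by 2 s

print("fast        :", high_freq_fraction(fast))
print("slow        :", high_freq_fraction(slow))
print("delayed fast:", high_freq_fraction(delayed_fast))
```

      In this toy model the narrower response carries a larger share of power above the cutoff, while a pure onset delay leaves the power spectrum unchanged, because a time shift alters only the phase. Under these assumptions, a voxel with a rapid onset but a long response would look spectrally like a slow responder, which is the scenario raised above.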

      Despite my skepticism of the general applicability of the technique, it remains a significant advance in understanding the variability of HRFs in the brain. The authors make a strong case that cerebrovascular reactivity as measured in response to a breath hold does not accurately capture all of the aspects of neurovascular coupling, an important finding. The work also clearly shows that differences in fALFF or other power-based metrics can reflect differences in neurovascular coupling rather than neural activity, something that is widely suspected but commonly ignored in the interpretation of fALFF data. We still have far to go to fully understand neurovascular coupling throughout the brain and under various conditions, and this manuscript contributes to our knowledge of how two investigative tools perform at the task.

    3. Reviewer #2 (Public Review):

      This study highlights a connection between the power spectra of fMRI signals and the temporal dynamics of the hemodynamic response function (HRF). Using visual stimulation experiments and resting-state scans, the spectral features of resting-state fMRI signals in V1 and LGN are found to have a significant relationship to the relative timing of HRF responses during the task.

      Overall, I found this to be an interesting and clearly written study, with high-quality data. The connection between BOLD signal spectra and vascular responses is not discussed in much of the resting-state fMRI literature, and represents an important message and consideration. While the connection between the amplitude of resting-state BOLD fluctuations and the amplitude of task HRFs has been investigated in the past, I am not aware of prior work that had considered the timing aspect. The present comparison between resting-state spectra and breath-holding task responses also provides useful data about the hemodynamic information carried by these two conditions.

      The present experiments were conducted at 7T with high temporal resolution and focused on a visual experiment. The generalization of the findings to other task conditions, brain regions, and acquisition parameters would be a valuable future step. Understanding the translation to other datasets would be a practical consideration for researchers who are considering applying this method. Regarding the evaluation of the classification models, it currently appears possible that the train/test sets might contain closely spaced and thus correlated voxels. Accounting for this effect could help to better support the conclusions of this analysis. More discussion about the neural or vascular basis of the slow- versus fast-HRF voxels could also bring further insights to the work (for instance, the location of the fast and slow V1 voxels with respect to functional boundaries and vascular anatomy).
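      One way to reduce leakage between spatially adjacent voxels is to split by spatial block rather than by individual voxel. The following is a minimal sketch under assumptions: the function name `blocked_split`, the block size, and the test fraction are hypothetical and are not taken from the manuscript. Voxels that fall in the same cubic block are always assigned to the same side of the split.

```python
import numpy as np

def blocked_split(coords, block_size, test_fraction=0.25, seed=0):
    """Assign whole spatial blocks of voxels to train or test (hypothetical helper).

    coords: (n_voxels, 3) array of voxel coordinates.
    block_size: edge length of the cubic blocks, in the same units as coords.
    Returns two boolean masks (train, test) over the voxels.
    """
    coords = np.asarray(coords)
    block_ids = np.floor(coords / block_size).astype(int)
    # One integer label per voxel, shared by all voxels in the same block.
    _, labels = np.unique(block_ids, axis=0, return_inverse=True)
    labels = labels.ravel()
    rng = np.random.default_rng(seed)
    blocks = rng.permutation(labels.max() + 1)
    n_test = max(1, int(round(test_fraction * blocks.size)))
    test_mask = np.isin(labels, blocks[:n_test])
    return ~test_mask, test_mask

# Usage on a synthetic 8 x 8 x 2 voxel grid split into 4 x 4 x 4 blocks.
grid = np.array([[x, y, z] for x in range(8) for y in range(8) for z in range(2)])
train, test = blocked_split(grid, block_size=4)
print(train.sum(), "training voxels;", test.sum(), "test voxels")
```

      Voxels on either side of a block boundary remain neighbours, so this reduces rather than eliminates the correlation between sets; leaving a gap between blocks would be a stricter variant.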

    1. eLife assessment

      This valuable study shows that Rab12 is required for LRRK2 activation. However, while some of the data are compelling, some claims, especially the ones related to LRRK2's membrane association, are not supported. Addressing discrepancies between figures (pointed out by reviewers) and re-writing certain sections will greatly improve this manuscript.

    2. Reviewer #1 (Public Review):

      This careful study reports the importance of Rab12 for Parkinson's disease-associated LRRK2 kinase activity in cells. The authors carried out a targeted siRNA screen of Rab substrates and found lower pRab10 levels in cells depleted of Rab12. It has previously been reported that LLOME treatment of cells breaks lysosomes and, with time, leads to major activation of LRRK2 kinase. Here they show that LLOME-induced kinase activation requires Rab12 but does not require Rab12 phosphorylation.

      1. Throughout the text, the authors claim that "Rab12 is required for LRRK2 dependent phosphorylation" (Page 4 line 78; Page 9 line 153; Page 22 line 421). This is not correct according to Figure 1 Figure Supp 1B - there is still pRab10. It is correct only in relation to the LLOME activation. Please correct this error.

      2. The authors conclude that Rab12 recruitment precedes that of LRRK2, but the rate of recruitment (slopes of curves in 3F and G) is actually faster for LRRK2 than for Rab12, with no proof that Rab12 is faster. Please modify the text; it looks more like coordinated recruitment.

      3. The title is misleading because the authors do not show that Rab12 promotes LRRK2 membrane association. This would require Rab12 to be sufficient to localize LRRK2 to a mislocalized Rab12. The authors DO show that Rab12 is needed for the massive LLOME activation at lysosomes. Please re-word the title.

    3. Reviewer #2 (Public Review):

      This study shows that rab12 has a role in the phosphorylation of rab10 by LRRK2. Many publications have previously focused on the phosphorylation targets of LRRK2 and the significance of many remains unclear, but the study of LRRK2 activation has mostly focused on the role of disease-associated mutations (in LRRK2 and VPS35) and rab29. The work is performed entirely in an alveolar lung cell line, limiting relevance for the nervous system. Nonetheless, the authors take advantage of this simplified system to explore the mechanism by which rab12 activates LRRK2. In general, the work is performed very carefully with appropriate controls, excluding trivial explanations for the results, but there are several serious problems with the experiments and in particular the interpretation.

      First, the authors note that rab29 appears to have a smaller or no effect when knocked down in these cells. However, the quantitation (Fig1-S1A) shows a much less significant knockdown of rab29 than rab12, so it would be important to repeat this with better knockdown or preferably a KO (by CRISPR) before making this conclusion. And the relationship to rab29 is important, so if a better KD or KO shows an effect, it would be important to assess by knocking down rab12 in the rab29 KO background.

      Secondly, the knockdown of rab12 generally has a strong effect on the phosphorylation of the LRRK2 substrate rab10 but I could not find an experiment that shows whether rab12 has any effect on the residual phosphorylation of rab10 in the LRRK2 KO. There is not much phosphorylation left in the absence of LRRK2 but maybe this depends on rab12 just as much as in cells with LRRK2 and rab12 is operating independently of LRRK2, either through a different kinase or simply by making rab10 more available for phosphorylation. The epistasis experiment is crucial to address this possibility. To establish the connection to LRRK2, it would also help to compare the effect of rab12 KD on the phosphorylation of selected rabs that do or do not depend on LRRK2.

      A strength of the work is the demonstration of p-rab10 recruitment to lysosomes by biochemistry and imaging. The demonstration that LRRK2 is required for this by biochemistry (Fig 4A) is very important but it would also be good to determine whether the requirement for LRRK2 extends to imaging. In support of a causal relationship, the authors also state that lysosomal accumulation of rab12 precedes LRRK2 but the data do not show this. Imaging with and without LRRK2 would provide more compelling evidence for a causative role.

      The authors also touch on PD mutations, showing that loss of rab12 reduces the phosphorylation of rab10. However, it is interesting that loss of rab12 has the same effect with R1441G LRRK2 and D620N VPS35 as it does in controls. This suggests that the effect of rab12 does not depend on the extent of LRRK2 activation. It is also surprising that R1441G LRRK2 does not increase rab10 phosphorylation (Fig 2G) as suggested in the literature and stated in the text.

      Most important, the final figure suggests that PD-associated mutations in LRRK2 and VPS35 occlude the effect of lysosomal disruption on lysosomal recruitment of LRRK2 (Fig 4D) but do not impair the phosphorylation of rab10 also triggered by lysosomal disruption (4A-C). Phosphorylation of this target thus appears to be regulated independently of LRRK2 recruitment to the lysosome, suggesting another level of control (perhaps of kinase activity rather than localization) that has not been considered.

    4. Reviewer #3 (Public Review):

      Increased LRRK2 kinase activity is known to confer Parkinson's disease risk. While much is known about disease-causing LRRK2 mutations that increase LRRK2 kinase activity, the normal cellular mechanisms of LRRK2 activation are less well understood. Rab GTPases are known to play a role in LRRK2 activation and to be substrates for the kinase activity of LRRK2. However, much of the data on Rabs in LRRK2 activation comes from over-expression studies and the contributions of endogenously expressed Rabs to LRRK2 activation are less clear. To address this problem, Bondar and colleagues tested the impact of systematically depleting candidate Rab GTPases on LRRK2 activity as measured by its ability to phosphorylate Rab10 in the human A549 type 2 pneumocyte cell line. This resulted in the identification of a major role for Rab12 in controlling LRRK2 activity towards Rab10 in this model system. Follow-up studies show that this role for Rab12 is of particular importance for the phosphorylation of Rab10 by LRRK2 at damaged lysosomes. Increases in LRRK2 activity in cells harboring disease-causing mutants of LRRK2 and VPS35 also depend (at least partially) on Rab12. Confidence in the role of Rab12 in supporting LRRK2 activity is strengthened by parallel experiments showing that either siRNA-mediated depletion of Rab12 or CRISPR-mediated Rab12 KO both have similar effects on LRRK2 activity. Collectively, these results demonstrate a novel role for Rab12 in supporting LRRK2 activation in A549 cells. It is likely that this effect is generalizable to other cell types. However, this remains to be established. It is also likely that lysosomes are the subcellular site where Rab12-dependent activation of LRRK2 occurs. 
Independent validation with additional experiments would strengthen these conclusions and help to address the concern that much of the data supporting a lysosomal localization for Rab12-dependent activation of LRRK2 comes from a single method (LysoIP). Furthermore, there is a discrepancy between panel 4A and panel 4D in the effect of LLoMe-induced lysosome damage on LRRK2 recruitment to lysosomes that will need to be addressed to strengthen confidence in conclusions about lysosomes as sites of LRRK2 activation by Rab12.

    1. eLife assessment

      This exceptional work substantially advances our understanding of the mechanics of the Reissner's fibre (RF) by performing in-vivo experiments that track and analyze the behavior of the RF when it is cut and the behavior of ciliated cells touching the RF when contact is interrupted. The data is valuable and the conclusions are compelling. The work will be of broad interest to many research communities including developmental neuroscience and cilia biology.

    2. Reviewer #1 (Public Review):

      This manuscript provides novel and intriguing experiments that aim to elucidate the mechanical properties of the Reissner fiber (RF) and to probe its interactions with the motile cilia in the central canal of the spinal cord. Using in vivo imaging in larval zebrafish, the authors show that the RF is under tension and oscillates dorsoventrally. Importantly, ablation of the RF triggered retraction and relaxation of the fiber cut ends. The retraction speed depends on where the fiber was ablated, with the fastest retraction on the rostral side, indicating that tension in the RF builds up rostrally. Based on observations from live imaging of the intact and ablated RF and central canal, the authors conjecture that numerous ependymal motile monocilia, which are tilted caudally and interact frequently with the RF, contribute to heterogeneous RF tension via weak interactions.

      The work is important. The experiments are thorough and intricate. The findings are fascinating and open up the prospect for future investigations and models. I'm particularly curious as to what future experiments can be used to test the hypothesis put forward by the authors about the role of cilia-fiber interactions in the RF mechanical properties and function.

    3. Reviewer #2 (Public Review):

      The present manuscript by the Claire Wyart group analyses the behaviour of Reissner's fibre (RF) when it is cut, as well as the behaviour of cells touching RF when contact is interrupted. They show that RF is under tension that is higher in the rostral than in the caudal spinal cord. One of the proposed mechanisms is a caudally oriented movement of the cilia of ependymal radial glial cells (ERG) that is inherent rather than caused by the contact with RF. Kolmer-Agduhr neurons, which are also CSF-contacting (CSF-cN), alter their activity when contact is lost through laser ablation of RF.

      This is an interesting paper - RF has long been proposed to be a source of signalling molecules in the development and physiological function of neural cells in the spinal cord. Cilia are the main centre of signalling activity in ciliated cells (e.g. for sonic hedgehog signalling) and the fact that ERG cilia are in direct contact with RF is intriguing. Presumably, signalling molecules could be directly transferred from RF to ERG at the contact points.

      Functionally, CSF-cN are augmenting spinal cord intrinsic sensory feedback on body curvature. This had been shown in vitro/ex vivo, but not clearly evaluated in the living animal. The data shown here demonstrate a possible mechanism for how the feedback can be mediated through contact with RF. This is of fundamental interest to understand the functioning of a locomotor network that is under evolutionary pressure to function early, since fish hatch at 3 days post fertilisation.

      Interestingly, the authors propose (and discuss against the relevant literature) that the presence of RF in the central canal can influence the flow of the CSF, which should be investigated in further work.

      Overall, the results are clearly presented, and methods are thoroughly given, including some indication of how bias was reduced (by blinding movies before analysis). The authors also clearly state the limitations of their work, which mostly derive from optical limitations (size of the RF in the larval fish, and speed of the recording in the laser-equipped microscope). This doesn't affect the fundamental statements.

    4. Reviewer #3 (Public Review):

      This manuscript by Bellegarda et al. examined the in vivo dynamic behavior of the Reissner fiber and its interactions with cilia and sensory neurons in the central canal of zebrafish larvae. The authors accomplished this by performing live imaging with a transgenic reporter zebrafish line in which the fiber is GFP-tagged and by finely tracking the movement of the fiber. Interestingly, they discovered that the fiber undergoes a dynamic vibratory-like movement along the dorsoventral axis. The authors then utilized a pulsed laser to precisely cut the fiber, which frequently resulted in a fast retraction behavior and a loss of calcium activity in sensory neurons in the central canal called CSF-CNs. Mechanical modeling of the elastic properties of the fiber indicated that the fiber is a soft elastic rod with graded tension along the rostrocaudal axis. Finally, by performing live imaging of motile cilia and the fiber in the central canal, they found that the two interact in close proximity and that cilia motility is affected when the fiber was cut. The authors concluded that the Reissner fiber is a dynamic structure under tension that interacts with sensory neurons and cilia in the central canal.

      Strengths:

      1. The study utilizes state-of-the-art microscopy techniques and beautiful transgenic zebrafish tools to characterize the in vivo behavior of the Reissner fiber and finds that it exhibits surprising dynamic movements along the dorsal-ventral axis. This observation has important implications for the physiology and function of the Reissner fiber.

      2. By performing a series of clever laser cutting experiments, the authors reveal that the Reissner fiber is under tension in the central canal of zebrafish. This finding provides direct experimental evidence to support the hypothesis that the Reissner fiber functions in a biomechanical manner during spinal cord development and body axis straightening.

      3. By developing a mechanical model of the Reissner fiber and its retraction behavior, the authors estimate the elastic properties of the fiber and find that it is more akin to an elastic polymer than to a stiff rod. This is a useful finding that illuminates the biophysical properties of the fiber.

      4. Through calcium and cilia imaging studies, the authors demonstrate that the Reissner fiber likely interacts with motile cilia and regulates the activity of ciliated sensory neurons (CSF-CNs). The authors propose a model in which fiber-cilia interactions may occur via weak interactions or frictional forces. This model is plausible and opens several new doors for additional investigation.

      Weaknesses:

      1. All the live imaging experiments appear to be performed with animals paralyzed via the injection of a chemical agent (bungarotoxin). Do paralysis and/or bungarotoxin negatively impact the behavior of the Reissner fiber? Some data from non-paralyzed animals would ameliorate this concern.

      2. Although the authors convincingly demonstrate that the Reissner fiber is under graded tension, the relevance and function of tension on this structure remain unclear. The photoablation data presented do not distinguish between the relevance of the fiber being intact and that of tension on the fiber, as cutting the fiber affects both. Is fiber tension required for body straightening? At the site of fiber photoablation, does a spinal curvature develop? If cultured, do the ablated animals exhibit a scoliotic phenotype?

      3. One of the most potentially impactful conclusions of the paper is that the Reissner fiber interacts with cilia, but the evidence is insufficient to support this. Although some motile cilia are near the fiber (Figure 3A), many cilia are not. The provided images and videos do not clearly demonstrate that cilia physically contact or influence the behavior of the Reissner fiber. Further, the data are insufficient to conclude that the Reissner fiber directly impacts cilia motility, as the authors did not observe an overall statistically significant difference before and after ablation (Supplemental Figure 1A). Higher magnification, higher resolution, higher acquisition rate and/or colocalization analyses of fiber-cilia interactions could alleviate this concern.

      4. Similarly, how does the Reissner fiber interact with CSF-CN sensory neurons? The authors suggest that the fiber interacts with CSF-CN sensory neurons by modulating their spontaneous calcium activity via weak interactions or frictional forces from motile ciliated ependymal radial glial cells. While the calcium imaging data of the CSF-CNs is convincing and sound, the exact nature of the fiber-neuron interaction is unclear. Do cilia or apical extensions on CSF-CN sensory neurons sense the fiber or forces through a mechanosensing or chemosensing mechanism? There is some additional confusion as the authors appear to focus their cilia experiments on ependymal radial glial cells in section 4, rather than CSF-CNs. The addition of an illustrative cartoon would add clarity.

      Overall, the conclusions of the study are well supported by the data presented. However, the strength of the conclusions could be enhanced by additional controls, alternative experimental approaches and clarifications.

      This manuscript is an important contribution to the fields of spinal cord development and body axis development, which are fundamental questions in neurobiology, developmental biology, and musculoskeletal biology. In recent years, the Reissner fiber and motile cilia function have been linked to cerebrospinal fluid flow signaling and body straightening, but the precise form and function of the fiber remain unclear. This study provides new insight into the dynamic and biophysical properties of the Reissner fiber in vivo in zebrafish and proposes a model in which the fiber interacts with cilia and sensory neurons. This study provides novel insight into the cellular mechanisms that underlie the pathogenesis of disorders such as idiopathic scoliosis.

    1. eLife assessment

      This manuscript reports the fundamental finding that an oligomeric protein kinase, CaMKII, can be phosphorylated by another molecule of the holoenzyme in a manner that does not involve subunit exchange. The evidence for the main conclusion is compelling, supported by several independent experiments. If independently confirmed in future, the study will stand as having provided a novel regulatory mechanism for the autophosphorylation of this kinase. The work will be of broad interest to molecular and cellular neuroscientists as well as biochemists.

    2. Reviewer #1 (Public Review):

      The potential role of the CaMKII holoenzyme in synaptic information processing, storage, and spread has fascinated neuroscientists ever since it was described that self-phosphorylation of CaMKII at T286 (pT286) can maintain the kinase in an activated state beyond the initial Ca2+ stimulus that induced kinase activation and pT286. The current study by Lučić et al utilizes biochemical and biophysical methods to re-examine two pT286 mechanisms and finds:

      (1) that a previously proposed activation-induced subunit exchange within the holoenzyme cannot provide pT286 maintenance or propagation; and

      (2) that pT286 can occur not only within a holoenzyme but also between two holoenzymes, at least at sufficiently high concentrations.

      For the observation regarding the subunit exchange, the authors go above and beyond to demonstrate that a previously proposed activation-induced subunit exchange does not actually occur in their hands and that the previous appearance of such a subunit exchange may instead be due to activation-induced interactions between the kinase domains of separate holoenzymes. This provides important clarification, as the imagination about the possible functions of this subunit exchange has been running wild in the literature.

      By contrast, pT286 between holoenzymes at sufficiently high concentrations was largely predicted by the previously reported concentration-dependence of pT286 between monomeric truncated CaMKII (although these previous experiments did not rule out that such pT286 could have been excluded for intact full-length holoenzymes). Notably, the reaction rate reported here for pT286 between two holoenzymes is more than two orders of magnitude slower compared to the previously described rate of the pT286 reaction within a holoenzyme.

      In summary, this study contains two somewhat disparate parts: (1) one technical tour-de-force to provide evidence that argues against activation-induced subunit exchange, which was a tremendous effort that provides influential novel information, and (2) another set of experiments showing the somewhat predictable potential for pT286 between holoenzymes, but without indication for the functional relevance of this rather slow reaction. Unfortunately, in the current/initial title of the manuscript, the authors chose to emphasize the weaker part of their findings.

    3. Reviewer #2 (Public Review):

      This well-written manuscript provides a technical tour-de-force to provide a novel mechanism for sustaining CaMKII autophosphorylation through an interholoenzyme reaction mechanism the authors term inter-holoenzyme phosphorylation (IHP). The authors use molecular engineering to create designer molecules that permit detailed testing of the proposed interholoenzyme reaction mechanism. By catalytically inactivating one population of enzymes, they show using standard assays that the inactive enzyme can be phosphorylated by active holoenzymes. They go on to show that in cells, the inactive enzyme is phosphorylated only in the presence of co-expressed active CaMKII and that this does not appear to be due to active and inactive subunits mixing within the same holoenzyme. The authors suggest reasons for why previous experiments failed to expose IHP and in some experiments provide evidence that reproduces and then extends earlier studies. Some noted differences from earlier experiments are the reaction temperature, the time course of the reactions, and that significantly higher concentrations of the inactive (substrate) kinase in the present study amplify the IHP. These are plausible reasons for earlier studies not finding significant evidence for IHP and the presented data is well-controlled and of high quality.

      The authors then take on the idea of subunit exchange, employing multiple strategies. Using genetic code expansion, they engineer an unnatural amino acid into the hub domain of the kinase (residue 384). In the presence of the photoactivatable crosslinker BZF and UV illumination, a ladder of subunits was generated, indicating that intraholoenzyme crosslinks were established. Using this cross-linked enzyme, presumably incapable of subunit exchange, the authors show significant phosphorylation of the kinase-dead mutant. This further supports that IHP, and not subunit exchange, is the cause of phosphorylation. Extending these experiments, when CaMKIIF394BZF was mixed with the kinase-dead mutant and exposed to UV light, they found no evidence that kinase-dead subunits had exchanged into CaMKIIF394 (active) enzymes.

      With an entirely different approach, the authors use isotopic labeling of different pools of wt CaMKII (N14 or N15) followed by bifunctional cross-linking and mass spec to assess potential intra- and inter-holoenzyme contacts. Several interesting findings came out of these studies, which are detailed in Figure 4, mapped in detail in Figure 5, and extensively documented in supplementary tables. Numerous cross-links were found between different domains of the enzyme (catalytic, regulatory, hub) that are themselves a nice database of proximity measurements but, critically for the hypothesis, no heterotypic cross-links were found in the hub domains at any activated state or time point of incubation. These data support two findings: that catalytic domains come into close proximity between holoenzymes when activated, supporting the potential for IHP, and that no subunit exchange occurs.

      The authors then pursue the approach used originally to provide evidence of subunit mixing, single molecule-based fluorescence imaging. Using pools of CaMKII labeled with spectrally separable dyes, the authors reproduce the earlier findings (Stratton et al, 2016) showing that under activating conditions, but not basal conditions, colocalized spots were detected. Numerous controls were done that confirm the need for full activation (Ca2+/CaM + Mg2+/ATP) to visualize co-localized CaMKII holoenzymes. Extending these studies, the authors mix holoenzymes, fully activate them, and after sufficient time for subunit exchange (if it occurs), the reactions were quenched, and then samples were analyzed. The result was that no evidence of dual-colored holoenzymes was present; if subunits had mixed between holoenzymes, dual-colored spots should have been evident after quenching the reactions. This was not the case. Further, experiments repeated with pools of differentially labeled kinase dead enzymes produced no colocalization, as predicted, if activation of the catalytic domains is necessary to establish IHP.

      Finally, the authors employ mass photometry to investigate the potential for interholoenzyme interactions. At basal conditions, only a mass peak consistent with CaMKII dodecamers was evident. Upon activation, a small fraction of dimeric complexes was evident (with Ca2+/CaM bound) but the majority of the peak was a dodecamer with 12 associated CaM molecules and, importantly, a significant fraction of the population had a mass consistent with a pair of holoenzymes with associated CaM. As an aside, the holoenzyme population appeared to be modestly destabilized, as a minor fraction of dimers appeared when the authors diluted the enzyme, but the pools of holoenzyme and pairs of holoenzymes (with CaM) remained the dominant species when activated under all three enzyme concentrations assessed. Supporting the importance of activation for interactions between holoenzymes, the catalytically dead kinase, even under activating conditions, shows no evidence of dimers of holoenzymes.

      Each of the approaches is well-controlled, the data is of uniformly high quality, and the authors' interpretations are generally well-supported.

    4. Reviewer #3 (Public Review):

      CaMKII is a multimeric kinase of great biologic interest due to its crucial roles in long-term memory, cardiac pacemaking, and fertilization. CaMKII subunits organize into holoenzymes comprised of 12-14 subunits, adopting a donut-like, double-ringed structure. In this manuscript, Lucic et al challenge two models in the CaMKII field, which are somewhat related. The first is a longstanding question in the field: whether a crucial residue, Thr286, can be phosphorylated between intact holoenzymes (inter-holoenzyme phosphorylation). The second is a more recent biochemical finding, which tested the long-running theory that CaMKII exchanges subunits between holoenzymes to create mixed oligomers. These two models are connected by the idea that subunit exchange could facilitate phosphorylation between subunits of different holoenzymes by allowing subunits to integrate into a different holoenzyme and driving transphosphorylation within the CaMKII ring. Here, the authors attempt to show that one intact holoenzyme phosphorylates another intact holoenzyme at Thr286. The authors also provide evidence suggesting that subunit exchange is not occurring under their conditions, and therefore not driving this phosphorylation event. The authors propose a model where instead of exchanging subunits, two holoenzymes interact via their kinase domains to enable transphosphorylation at Thr286 without integrating into the holoenzyme structure. In order for the authors to successfully convince readers of all three facets of this new model, they need to provide evidence that 1) transphosphorylation at Thr286 happens when subunit exchange is blocked, 2) subunit exchange does not occur under their conditions, and 3) there are interactions between kinases of different holoenzymes that lead to productive autophosphorylation at Thr286.

      Strengths:

      The authors have performed a battery of cleverly designed, orthogonal experiments to test these models. Using mutagenesis, they mixed a kinase-dead mutant with an active kinase to ask whether transphosphorylation occurs. They observe phosphorylation of the kinase-dead variant in this experiment, which indicates that the active kinase must have phosphorylated it. A few key questions arise here: 1) whether this phosphorylation occurred within a single CaMKII holoenzyme ring (which is the canonical mechanism for Thr286 phosphorylation), 2) whether the phosphorylation occurred between two separate holoenzyme rings, and 3) why was this not observed in previous literature? To address questions 1 and 2, the authors implemented an innovative strategy introducing a genetically-encoded photocrosslinker in the oligomerization domain, which when crosslinked using UV light, should lock the holoenzyme in place. The rate of phosphorylation was the same when comparing uncrosslinked and crosslinked CaMKII variants, indicating that phosphorylation is occurring between holoenzymes, rather than through a subunit exchange mechanism that would require some type of disassembly and reassembly (presumably blocked by crosslinking). The 3rd question remains as to why this has not been previously observed, as it has not been for lack of effort. The authors mention low temperature and low concentration as culprits; however, Bradshaw et al, JBC v. 277, 2002 carry out a series of careful experiments that indicated that autophosphorylation at T286 is not concentration-dependent (meaning that the majority of phosphorylation occurs via intra-holoenzyme), and this is done over a concentration and temperature range. It is possible that the mutants used in the current manuscript allow for different behavior of the kinase-dead domains, which will have an empty nucleotide-binding pocket. 
Further studies will need to elucidate these details, and importantly, understand what physiological conditions facilitate this mechanism.

      The most convincing data that subunit exchange does not occur is from the crosslinking mass spectrometry experiment. The authors created mixtures of 'light' and 'heavy' CaMKII holoenzymes, either activated or not and then used a Lys-Lys crosslinker (DSS) to trap the enzyme in its final state. The results of this experiment indicate that subunit exchange is not occurring under their conditions. A caveat here is that there are not many lysines at hub-hub interfaces, which is the crux of this experiment. If there is no subunit exchange under their conditions, how does transphosphorylation occur between holoenzymes? The authors show very nice mass photometry data indicating that there are populations of 24-mers, which corresponds to a double-holoenzyme. Paired with the data from their crosslinking mass spectrometry which shows crosslinks between kinase domains of different holoenzymes, this indicates that perhaps kinases between holoenzymes do interact, and they do so in a competent manner to allow transphosphorylation to occur.

      Weaknesses:

      The authors should be commended for performing three orthogonal experiments to test whether CaMKII holoenzymes exchange subunits to form heterooligomers. However, there are technical issues that dampen the strength of the results shown here. For simplicity, let's consider that CaMKII holoenzymes are comprised of two stacked hexameric rings. It has been proposed that the stable unit of CaMKII assembly and perhaps also disassembly and subunit exchange is a vertical dimer unit (comprised of one subunit from each hexameric ring). In the UV crosslinking data shown in this paper, the authors have a significant number of monomers, some crosslinked dimers (of which there are two populations), and fewer higher-order oligomers. To effectively block subunit exchange, robust crosslinking into hexamers is necessary, which the authors have not done. Incomplete crosslinking results in smaller species that can still exchange (and/or dissociate), confounding the results of this experiment. In addition, Figure 3 shows a trapping experiment, where if the exchange was occurring, there would be an oligomeric band in Lane 8, which is visible and highlighted with a blue arrow by the authors. This result is explained by nonspecific UV effects; however, by eye it is not clear if there is an equivalent band in lane 10. The overall issue here is inefficient crosslinking.

      The authors also employ a single-molecule TIRF experiment to further interrogate subunit exchange. Upon inspection of the TIRF images, it is not clear that the authors are achieving single molecule resolution (there are evident overlapping and distorted particles). The analysis employed here is Pearson's correlation coefficient, which is not sufficient for single molecule analysis and would not account for particle overlap, particles that are too bright, and/or particles that are too dim. For example, an alternative explanation for the authors' results is that activation results in aggregation (high correlation), and subsequent EGTA treatment leads to dissociation at these low concentrations (low correlation). However, further experimentation and analysis are necessary.

      Taken together, the authors have provided important food for thought regarding inter-holoenzyme phosphorylation and subunit exchange. However, given the shortcomings discussed here, it remains unclear exactly what mechanisms are at play within and between CaMKII holoenzymes once activated.

    1. eLife assessment

      This is an important analysis of two sleep datasets in children and adolescents that contributes to our understanding of sleep spindle and slow oscillation dynamics during development and is expected to be of interest to interdisciplinary fields including development and sleep. The analyses are solid and adequately complex to capture the changes in sleep spindle to slow oscillation coupling between the age groups. However, the paper would be strengthened by performing the same analyses in an adult sample to sufficiently characterize the maturation of sleep spindles and their coupling to slow oscillations.

    2. Reviewer #1 (Public Review):

      During NonREM sleep, two major oscillations, the slow oscillation (SO) and the sleep spindle, have been shown to interact, putatively to support memory consolidation. These oscillations and their interrelation have been shown to change during development. The authors reanalyse two datasets in children and adolescents. One is longitudinal, assessed at 8-11 years and 14-18 years, the other is cross-sectional, assessed at 5-6 years. The manuscript reports several interesting findings. They identify three types of spindles, canonical slow and fast spindles as well as "age adjusted" fast spindles. They show that fast spindles are modulated by the slow oscillation more in the older children and relate this improved modulation to a sleep-spindle maturation index. The authors use many highly complex data analysis tools and apply them to different transformations of the data, which they explain in great detail. The manuscript is written clearly although it is at times very technical. The findings could be highly interesting to the field of sleep research, as they nicely examine the developmental trajectory of spindles and their coupling to the SO. Although the manuscript makes use of two adequate samples of children and adolescents, they do not compare their findings to adults. In addition, the maturation index is not well justified and the authors could do more to show that the "age adjusted" fast spindles actually develop into fast spindles. The analysis also does not take sex into account, which could be affecting findings in puberty. In general, there are some analyses that could be added to make the findings clearer. For example, it would be great to show averages of the detected spindles to show how they may or may not differ. More descriptive data in form of figures would also help readers understand the complex analyses that are reported (i.e., spectrogram and SO phase locked activity in the spindle bands). 
Finally, children have been reported to have superior declarative memory consolidation, which itself has been closely linked to spindle-SO coupling. It would be great to have a broader discussion of how the current findings relate to other developmental changes in the field of sleep (and memory).

    3. Reviewer #2 (Public Review):

      The article by Joechner et al is a reanalysis of a large cohort dataset on sleep oscillation development. By combining an analysis with fixed frequencies derived from adults with adaptive frequency ranges, they highlight that spindle oscillations are initially slower and that it takes until mid-adolescence for spindles to become more adult-like. Further, those spindles that already have adult-like frequency ranges also show the other properties known from adults. These results are intriguing and the analysis is well-done and thorough. I only have minor comments on how the article could be improved.

      Some additional analyses would complement the current findings: in Fig 1 it would be good to include the adult-like slow frontal spindles for comparison (similar to the inclusion of the centro-parietal ones). Further, providing distributions could give readers some valuable insight into the events. Could the authors combine all events and show 3D scatter plots with frequency X amplitude X duration of each spindle event? And then either color code the events from different age groups or have them in separate plots. Additionally, the frequency cut-off for adult events could be added to the plot. This would likely show nicely how the events shift in their properties over age and thus slowly reach adult-like characteristics.

      On page 2, line 17, the authors state that spindles align ripples. While this is the case, the interaction between these oscillations is more complex. Ripples also occur before the spindle, and the ripples before spindles have been shown to be causally related to memory consolidation. Please cite Maingret et al Nat Neurosci 2016. Further, the authors should also discuss other rodent work, for example Garcia et al Frontiers 2022, which also investigates the development of spindles.

    4. Reviewer #3 (Public Review):

      Joechner and their co-authors performed an extensive analysis of two existing datasets from sleeping children aged 5 to 18 years. By identifying discrete events of slow oscillations (SOs) and (fast) sleep spindles, they examined not only the developmental changes of these distinct sleep grapho-elements but also their interplay, e.g., to what extent sleep spindles co-occur with slow oscillation up-states, as this coupling is thought to underlie sleep-dependent memory consolidation.

      The authors found that both sleep spindles and slow oscillations change across the young age range: while sleep spindles increased in frequency, approaching the typical 12-16 Hz range found in adults, slow oscillations showed a shift in occurrence patterns from posterior to anterior sites. Likewise, the coupling of fast spindles within slow oscillation up-states, which is almost non-existent in 5- to 6-year-old children, manifested with age. However, and most intriguingly, a coupling analysis based on the adult-like 12-16 Hz range revealed an already existing SO-spindle phase-relation across all age ranges. Altogether, this data nicely demonstrates the trajectory of sleep spindles and SOs in children and highlights the almost inherent coupling between SOs and "adult" sleep spindles. In my view, these results not only provide a good overview of a healthy development but also interesting food for thought regarding the function of SO-spindle coupling in healthy or clinical development.

      Overall, this work is well-written, and the performed analyses are well conceptualized. Still, there are one general and a few minor aspects that could be addressed to hopefully strengthen this manuscript a bit further.

      The biggest aspect that was striking is the sheer amount of data reported, e.g., a supplement with 28 tables is too extensive. The authors should consider reducing a few aspects.

      For example, the authors employ a linear mixed effects model and report coefficients etc. in the supplement. However, in the main text, the authors mainly report ANOVA-based results. Obviously, a LMM and an ANOVA are equivalent; however, focusing on one approach could streamline everything.

      Another example is the assessment of spindle frequency via the discrete events: First, spindle peak frequency is derived via power spectra. Using the then individually identified peaks, discrete events are detected. Shouldn't it be obvious that these events show the same behavior with regard to their frequency?

      As a final example, the authors first report changes in fast spindle properties across age and, e.g., find an increase in frequency towards the 12-16 Hz adult range. They then repeat the whole analysis in the 12-16 Hz range and examine the "distance" to the individualized results. It should again be obvious that this approach comes to the same conclusion, a smaller distance in older children. Even more obvious is the conclusion "Hence, it appears as if fast centro-parietal SPs become more dominant and adult-like in their frequency and amplitude characteristics in older children" because it describes a normal development of a healthy child. Altogether, the authors could streamline a few aspects by removing hidden redundancies and focus on the - in my view - central aspect of an inherent 12-16 Hz coupling across all ages.

    1. eLife assessment

      This study presents a valuable finding on the associations and causal relationship between second primary cancers and the initial diagnosis of a primary cancer via using a large database. The evidence supporting the claims of the authors is solid. The work will be of interest to cancer clinicians.

    2. Reviewer #1 (Public Review):

      The authors used pan-cancer Standardized Incidence Ratio analyses and Mendelian Randomization analysis to establish causal relationships between first primary cancers and second primary cancers, proving that a primary cancer may cause another type of primary cancer. The results supported that pharynx cancer, ovary cancer, kidney cancer may cause non-Hodgkin lymphoma, soft tissue cancer, lung cancer and myeloma, respectively. This research provides a useful direction for further elucidation of the underlying mechanisms of second primary tumors and guides the community to attach importance to their prevention. According to previous research, the number of patients with multiple primary cancers is growing rapidly and second solid tumors are a leading cause of mortality among several populations of long-term survivors, which underscores the significance of this work.

      The methods of the work are logically rigorous: the authors revealed the incidence relationships among numerous types of cancer using SEER database analyses and further confirmed the causal relationship between first primary cancers and second primary cancers through MR analysis, using GWAS as an exposure database and UK Biobank as an outcome database. Then, two outlier-detection methods were used and the agreement between the SIR and MR analyses was validated, making the results more solid.

      Nonetheless, the SEER SIR analyses might be affected by confounding from screening and do not represent the whole population. In addition, too few SNPs were included for some of the cancer types mentioned in the research, such as larynx, stomach and male breast cancer.

    3. Reviewer #2 (Public Review):

      This study investigates the associations and causal relationship between second primary cancers and the initial diagnosis of a primary cancer, utilizing a large-scale database. The study's unique contribution lies in its combination of pan-cancer analysis and the incorporation of Mendelian randomization, which adds novelty and enhances the value of the research.

      Furthermore, the findings of this study have the potential to provide valuable insights into important clinical considerations, such as patients' prognosis, treatment decisions, and survivorship care.

    1. eLife assessment

      This important study combines engineered mesenchymal stem cells together with mouse models of kidney injury to determine the ability of these cells to reduce kidney damage upon acute kidney injury. The evidence supporting the claims is solid, although the inclusion of more than one type of stem cell and the use of male mice which are more prone to acute kidney injury, would strengthen the study. This work will be of interest to both basic scientists and clinicians working on mechanisms of kidney injury and repair.

    2. Reviewer #1 (Public Review):

      For a gaseous therapeutic agent such as NO, delivery to the site and release to the injured area are both required for efficacy. Previous work has focused on hydrogels for delivery. The authors engineered a combined gene/cell therapy plus a pharmaceutical approach to NO delivery. Engineered MSC produced a mutant beta-galactosidase (B-GALH363A) that when a prodrug is administered, will release NO locally.

      One can imagine applications involving such a novel concept for gaseous signaling molecule delivery to include other kinds of cells, other prodrugs, other gaseous agents, and other injury types. In this elegant study, the concept has been explored deeply in one potential application, making it a landmark contribution to the field of regenerative medicine.

      Limitations of the current study are that the mice utilized were C57/Bl6 females, the most resistant sex and strain to kidney injury. Another limitation is the use of human placental MSCs only; as such we do not know if other MSCs will perform equally well.

    3. Reviewer #2 (Public Review):

      In this manuscript, Huang et al generated engineered MSC (eMSC) that produce mutant b-GALH363A and release NO when stimulated with a pro-drug (MGP). These cells were tested in vivo in a mouse model of AKI. When MGP is systemically administered in AKI mice, it can induce eMSC to release NO in a precise and spatiotemporally controlled manner, possibly enhancing the therapeutic efficacy of these stem cells.

      The authors have conducted a very interesting study. The results are likely of interest to the renal scientific community, especially in the context of acute kidney injury.

      Weaknesses are present. Methods (animals, groups, time points, cell lines, bulk RNA-seq, etc.) are not clearly described and details are missing. Legends are not clear, and some Figures do not clearly represent the results discussed.

    4. Reviewer #3 (Public Review):

      Mesenchymal stem cells have been shown to have potent immunomodulatory and regenerative properties and have been tested and tried in kidney transplantation. In a previous paper, the authors of this paper reviewed the beneficial effects of nitric oxide (NO) on the action of MSC. In this manuscript, they describe a method to generate NO in the therapeutic MSC. While NO donors like the short-acting nitrates have been used for angina pectoris patients, few therapeutic approaches have been published aiming at the local delivery of NO to specific tissues or organs like the kidney. Gene therapy with adenoviral vectors overexpressing the eNOS gene itself failed because the eNOS enzyme, when overexpressed, quickly runs out of sufficient co-factors like BH4. As a consequence, the enzyme uncouples and becomes cytotoxic due to the generation of peroxynitrite. Hence, the current strategy to generate NO in the MSC itself is novel and interesting.

      The authors first describe the cryoprotective effects and antioxidant effects of NO generation in MSC in vitro and subsequently in vivo in a mouse model of ischemia-reperfusion injury that may reflect acute kidney injury (or ischemia associated with kidney transplantation) in patients. While the MSC are transplanted intracortically at a local position in the kidneys, the manuscript describes surprising efficacy on serum creatinine, urea, casts, and protection of the brush border. Also, upon immunohistochemical analyses, fibrosis and kidney injury markers decrease. Most likely there is a strong paracrine effect. It is unfortunate that the control "PBS + MGP" is lacking to exclude some low-grade background conversion of the compound with subsequent release of NO. MGP alone is tested; however, studies in kidney sections with state-of-the-art EPR give the authors the wanted control.

      The paper provides an interesting proof of concept for a novel therapeutic approach. However, in the clinical arena, some questions remain concerning the survival of the MSC after transplantation and the introduction of novel antigens associated with the engineered cells.

    1. eLife assessment

      This important study identifies the functional consequence of myelination of interneuronal axons on circuit function by showing that 4.1B deletion leads to altered myelination in a subset of interneurons and altered intrinsic and synaptic physiological parameters. The authors' conclusions about how myelination of inhibitory axons affects physiological properties are based on solid evidence using a combination of imaging and electrophysiological approaches.

    2. Reviewer #1 (Public Review):

      The authors of this study have investigated the consequence of knocking out protein 4.1B on hippocampal interneurons. They observed that in 4.1B KO mice, the myelinization of axons of PV and SST interneurons was altered. In addition, the molecular organization of the nodal, heminodal, and juxtaparanodal parts of the interneuron axons was disrupted in 4.1B KO mice. Further, the authors found some changes in the spiking features of SST, but not PV, interneurons, as well as in synaptic inhibition recorded in CA1 pyramidal cells. Lastly, 4.1B KO mice showed some impairment in spatial memory.

      Strengths

      One of the strengths of this manuscript is the multilevel approach to the question of how myelinization of interneuron axons can contribute to hippocampal functions. Further, the cell biological results support the claim of the reorganization of channel distributions at axonal nodes.

      Weaknesses

      1. Although the authors acknowledge that SST is expressed in different GABAergic cell types in the hippocampus, they claim that OLM cells, which express SST, are subject to changes in 4.1B KO mice. However, this claim is not supported by data. Both OLM cells and GABAergic projection cells expressing SST have many long-running axons in the stratum radiatum, where the investigations have been conducted (e.g. Gulyas et al., 2003; Jinno et al., 2007). Thus, the SST axons can originate from any of these cell types. In addition, both these GABAergic cells have a sag in their voltage responses upon negative current injections (e.g. Zemankovics et al., 2010), making it hard to separate these two SST inhibitory cell types based on the single-cell features. In summary, it would be more appropriate to name the sampled interneurons as SST interneurons. Alternatively, the authors may want to label intracellularly individual interneurons to visualize their dendrites and axons, which would allow them to verify that the de-myelinization occurs along the axons of OLM cells, but not SST GABAergic projection neurons.

      2. Although both the cellular part and the behavioral part are interesting, there is no link between them at present. The changes observed in spatial memory tests may not be caused by the changes in the axonal de-myelinization of hippocampal interneurons. Such a claim can be made only using rescue experiments, since changes in 4.1B KO mice leading to behavioral alterations may occur i) in other cell types and ii) in other regions, which have not been investigated.

    3. Reviewer #2 (Public Review):

      In this study, Pinatel et al. address the role of interneuron myelination in the hippocampus using a 4.1B protein mouse knockout model. They show that deficiency in 4.1B significantly reduces myelin in CA1 stratum radiatum, specifically myelin along axons of parvalbumin and somatostatin hippocampal interneurons. In addition, there are striking defects in the distribution of ion channels along myelinated axons, with misplacement of Na channel clusters along the nodes of Ranvier and the heminodes, and a pronounced decrease in potassium channels (Kv1) at juxtaparanodes. The axon initial segments of SST are also shorter. Because the majority of myelinated axons in the stratum radiatum of the hippocampus belong to PV and SST interneurons such profound changes in myelination are expected to affect interneuronal function. Interestingly, the authors show that PV basket cells' properties appear largely unaffected, while there are substantial changes in stratum oriens O-LM cells. Inhibitory inputs to pyramidal neurons are also changed. Behaviorally, the 4.1B KO mice exhibit deficits in spatial working memory, supporting the role of interneuronal myelination in hippocampal function. This study provides important insights into the role of myelination for the function of inhibitory interneurons, as well as in the mechanisms of axonal node development and ion channel clustering, and thus will be of interest to a broad audience of circuit and cellular neuroscientists. However, the claims of the specificity of the reported changes in myelination need to be better supported by evidence.

      Strengths:

      The authors combine a wide array of genetic, immunolabeling, optical, electrophysiological, and behavioral tools to address a still unresolved complex problem of the role of myelination of locally projecting inhibitory interneurons in the hippocampus. They convincingly show that changing myelination and ion channel distribution along nodes and heminodes significantly impairs the function of at least some interneuron types in the hippocampus and that this is accompanied by behavioral deficits in spatial memory.

      Regarding the organization of myelinated axons, the lack of 4.1B causes striking changes at the nodes of Ranvier that are convincingly and beautifully presented in the Figures. While the reduction in Kv1 in 4.1B KO mice has been previously reported, the mislocalization of sodium channels at the nodes and heminodes had only been observed in developing but not adult spinal cords. This difference in the dependence of the sodium channel distribution on 4.1B in adult hippocampus vs spinal cord may hold important clues for the varying role of myelin along axons of different neuronal types.

      The manuscript is very well written, the discussion is comprehensive, and provides detailed background and analysis of the current findings and their implications.

      Weaknesses:

      Because of the wide diversity of interneuron types in the hippocampus, and also the presence of myelinated axons from other neuron types as well, including pyramidal neurons, it is very difficult to disentangle the effects of the observed changes in the 4.1B KO mouse model. While the authors have been careful to explore different possibilities, some of the claims of the specificity of the reported changes in myelination are not completely founded. For example, there is no compelling evidence that the myelination of axons other than the local interneurons is unchanged. The evidence strongly supports the claims of changes in interneuronal myelination, but it leaves open the question of whether the lack of 4.1B affects the myelination of hippocampal pyramidal neurons or of long-range projections.

      To be able to better interpret the changes in the 4.1B KO mice, knowledge of the distribution of 4.1B in the hippocampus of control mice will be very helpful. The authors state that 4.1B is observed in PV neurons but not in pyramidal neurons; however, the evidence is not convincing. In particular, the lack of immunolabeling at the pyramidal neuron cell bodies does not indicate that 4.1B is missing at the axonal level. The analysis also leaves out the question of whether 4.1B is seen in the axons of somatostatin neurons.

    4. Reviewer #3 (Public Review):

      Pinatel and colleagues addressed a currently understudied topic in neurobiology, namely, the architecture and function of myelination in subsets of Parvalbumin (PV)- and Somatostatin (SST)-positive GABAergic hippocampal interneurons and its dependence on juxtaparanodal organizer proteins. In order to elucidate the structural and functional implications of interneuron myelination, the authors visualized inhibitory neurons by utilizing a Lhx2-tdTomato reporter line in combination with crucial cytoskeletal linker proteins such as Contactin2/TAG-1, Caspr2, and Protein 4.1B. They then applied a comprehensive set of histological, electrophysiological, and behavioral experiments to dissect the role these proteins play in proper myelination and function of PV- and SST-interneurons.

      The bulk of the study's data is based on immunofluorescence, which is presented in a number of figures comprised of high-quality images. As much as this is a strength of the study, the underlying image analysis as described in the methods falls short. All structural data rely on the measurements of physical parameters such as length of internodes, the distance between (juxta)paranode and node, the distance between node and myelin sheath, length of the axon initial segment (AIS), etc. In light of this, and considering the small physical dimensions of the nodal region in general, the methods remain unclear about the depth of 3D reconstruction/deconvolution applied to the samples. Measurements presented in the results show significant differences in sub-micrometer dimension, which at least according to the stated methods, are unlikely to be precise given that the confocal imaging parameters do not seem to reach Nyquist conditions. For a study in which a third of all data is aimed at elucidating (sub)micrometer changes, this is crucial and the study would benefit from a more rigorous method description by the authors.

      Another methodological weakness is the somewhat small n, and its inconsistency across the experiments and, therefore, the statistics performed in some of the experiments. Statistics are based on either n for animals, or n for individual data points from several animals. Why is not all data represented as mean/animal? Also, the sampling in general with n = 3 animals is borderline acceptable; in some cases, it seems that only 2 animals were used, and in others, no number is given at all (please refer to author comments for details). This needs to be addressed, either by explaining why so few animals were used, or by adding more data from individual animals. Assigning structures (AIS, nodes) as n results in overstating effects, since especially for AIS, there is significant heterogeneity in the length across neurons from the same type, and this is masked when 100 AIS are considered as individual n instead of 100 AIS per animal, and the animal is (correctly) the n. Since the study seems to switch back and forth between these assignments, it would be helpful to level these data across all experiments unless there are specific reasons not to do so, which then need to be explained. As outlined in the methods, all values are given as means ± SEM; this needs to be corrected for those cases where the standard deviation is the appropriate choice (e.g. all graphs showing n = individual structure, and not the mean of an animal).
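The point about the choice of n can be illustrated with a minimal sketch. The AIS lengths below are invented for illustration only and are not taken from the study; the dictionary `ais_by_animal` and the helper `sem` are hypothetical names. The sketch compares the standard error obtained when every structure is treated as an independent n with the one obtained when each animal contributes a single mean.

```python
import numpy as np


def sem(values):
    """Standard error of the mean, using the sample standard deviation (ddof=1)."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / np.sqrt(values.size)


# Hypothetical AIS lengths in micrometers, grouped by animal (illustrative only).
ais_by_animal = {
    "animal_1": [28.0, 30.0, 32.0, 30.0],
    "animal_2": [34.0, 36.0, 35.0, 35.0],
    "animal_3": [24.0, 26.0, 25.0, 25.0],
}

# Approach A: every AIS is treated as an independent observation (n = 12).
pooled = np.concatenate([np.asarray(v, dtype=float) for v in ais_by_animal.values()])
sem_pooled = sem(pooled)

# Approach B: one mean per animal, and the animal is the n (n = 3).
animal_means = np.array([np.mean(v) for v in ais_by_animal.values()])
sem_animals = sem(animal_means)

print(f"n = {pooled.size} structures: mean = {pooled.mean():.2f}, SEM = {sem_pooled:.2f}")
print(f"n = {animal_means.size} animals:    mean = {animal_means.mean():.2f}, SEM = {sem_animals:.2f}")
```

With these made-up numbers most of the variability lies between animals, so pooling structures gives a smaller standard error than the per-animal analysis and would make a group difference look more certain than the number of animals justifies.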

      As far as the analysis of geometrical AIS changes is concerned, the method section should be extended to address how, if at all, AIS length and position were analyzed in 3D, also considering the somewhat "spotty" immunosignal outlined in Fig. 8D. The observed AIS length change is then discussed in the context of a study conducted in a pharmacological model of myelin loss, however, that particular study (Hamada & Kole, 2015) found not only a length change but a position change after cuprizone-induced AIS plasticity. The authors should therefore discuss this finding in a bit more detail than simply stating "Adaptation of the AIS has been reported in the cuprizone chemical model of demyelination" (p. 14, ll. 512).

      Similarly to the points made about structural data above, the data from electrophysiological recordings should be presented in such a way that e.g. the number of cells and/or animals is readily accessible from the graph or legend. In its current form, this information - while available - needs to be pieced together from in-text information supplemented by figure legends. Sometimes, the authors do not include the number of animals behind individual cell data (for details please see author comments). Please carefully review all figures and edit accordingly.

      The behavioral data presented in the study is interesting, but the conclusions drawn are not supported by the data presented, as many unknown factors remain in place that could contribute to the observed phenotype.

    1. eLife assessment

      This is an important study that uses chromatin accessibility as a measure to determine the impact of neuronal activity on the state of chromatin regulatory elements in striatal neurons. The authors provide convincing evidence of how Pdyn gene expression is highly dependent on a distal regulatory genomic region both at basal and upon neuronal activation in this particular system, a mechanism conserved as well in human neuronal cells. Although some findings are not novel, this paper ties previous findings all together in one place and uses the analysis to then identify a functionally relevant and conserved enhancer for the prodynorphin gene with potential relevance for neuropsychiatric disorders beyond basic cellular neuroscience.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors use ATAC-seq to find regions of the genome of rat embryonic striatal neurons in culture that show changes in regulatory element accessibility following stimulation by KCl-mediated membrane depolarization. The authors compare 1hr and 4hr transcriptomes to see both rapid and late response genes. When they look at ATAC-seq data they see no changes in accessibility at 1hr but strong changes at 4hr. The differentially accessible sites were enriched for the AP-1 site, suggesting regulation by Fos-Jun family members, and consistent with the requirement for IEG expression, anisomycin blocked the increase in accessibility. To test the functional importance of this regulation the authors focus on a putative enhancer 45kb upstream of the activity-induced gene encoding the neuromodulator dynorphin (Pdyn). To test the function of this region, the authors recruited CRISPRi to the site, which blocked KCl-dependent Pdyn induction, or CRISPRa, which selectively increased Pdyn expression in the absence of KCl. Finally, the authors reanalyze other human and rat datasets to show cell-type specific function of this enhancer correlated to Pdyn expression.

      Strengths:

      The idea that stimuli that induce expression of Fos in neurons can change the accessibility of regulatory elements bound by Fos has been shown before, but almost all the data are from hippocampal neurons so it is nice to see the different cell type used here. The most interesting part of the study is the identification of the Pdyn enhancer because of the importance of this gene product in the function of striatal neurons. Overall the conclusions appear to be well supported by the data.

      Weaknesses:

      The timing and the location of the accessibility changes are meaningfully different from other similar studies, which should be discussed. The authors provide good data for the function of a single enhancer near Pdyn, but could contextualize this with respect to other regulatory elements nearby.

    3. Reviewer #2 (Public Review):

      In this manuscript, the authors characterize activity-dependent transcriptional and epigenetic changes at two different time points (1h and 4hrs) after neuronal activation using rat striatal primary cultures. They show that while at 1h post-stimulation mostly a selective set of IEGs are up-regulated, at 4hrs a wider set of genes, identified as late-response genes (LRGs), are upregulated, with distinct functional signatures. By using ATAC-seq, the authors show how chromatin accessibility is mostly spared at 1h post-stimulation, while a prominent set of differentially accessible regions (DARs) could be identified at 4hrs post-stimulation, enriched in motifs for TFs upregulated at their initial time-point. These chromatin changes appear to be dependent on the earlier translation of proteins, as they are prevented when neuronal cultures are pre-treated with the protein synthesis inhibitor anisomycin. Afterwards, the authors characterized a set of regulatory regions of a particular LRG, Pdyn, associated with neuropsychiatric disorders, by using CRISPR to activate or inactivate an enhancer that increases its accessibility at 4hrs post-stimulation, showing that the expression of Pdyn is highly dependent on this regulatory region, both at basal level and for its proper activity-dependent stimulation, and that it is enriched in motifs for IEGs upregulated at 1h post-stimulation. Using publicly available data from human GABAergic and glutamatergic neurons similarly stimulated with KCl, the authors show that this enhancer is conserved in humans and that it is mostly modified in GABAergic neurons in response to neuronal stimulation, but not in glutamatergic neurons. Finally, the authors suggest that the regulatory role of the Pdyn enhancer they focus on might be cell-type specific, as single-nuclei ATAC-seq data generated in rat Nucleus Accumbens (NAc) shows that its coaccessibility score together with the Pdyn promoter is more prominent in Drd1- and Grm8-MSNs.

      Among the major strengths of the article, there is the generation of neuronal RNA-seq and ATAC-seq data in a model system, rat striatal neuronal cells, that hasn't been so broadly characterized as other more common ones such as mouse hippocampal neuronal cells and the functional characterization of an enhancer of the Pdyn gene that might be of interest for translational applications in which alterations of this gene might be occurring in neurological disorders.

      On the other hand, the manuscript presents several weaknesses to consider. First of all, at a conceptual level, most of the findings related to the induction of particular transcriptional programs upon neuronal activation, the changes in chromatin state, and the need for protein translation for proper induction of LRGs have been broadly characterized previously in the literature (Tyssowski et al., Neuron, 2018; Ibarra et al., Mol. Syst. Biol., 2022; and also reviewed by Yap and Greenberg, Neuron, 2018). In addition, it is not so obvious why to focus on Pdyn gene regulatory regions among the thousands of genes upregulated and with modified chromatin landscape after neuronal activation. The authors highlight three particular traits of this gene as the reason to choose it, but those traits are probably shared by most of the genes that are part of the LRGs set.

      At the methodological level, some attention should be paid to the timings chosen for generating the data. The authors claim that these time points (1h and 4hrs) identify the first (i.e., IEGs) and second (i.e., LRGs) waves of transcription. However, at 4hrs the highest over-expressed genes are still IEGs, as shown in the volcano plots of Figure 1B and 1C, showing a high overlap with up-regulated genes found at 1h (Figure 1D). This might suggest that the 4hrs time point is somewhere in between the first and second wave of transcription, probably missing some of the still-to-be-induced LRGs of the latest one.

      Finally, while only proposed as a suggestion, the assumption that from the data generated in this article we can envision a mechanism by which the AP-1 family of transcription factors interacts with the SWI/SNF chromatin remodeling complex is going too far, as no evidence implicating SWI/SNF is provided in the data presented in the manuscript.

    4. Reviewer #3 (Public Review):

      This work contributes to the literature characterizing early and late waves of transcription and associated chromatin remodeling following neuronal depolarization, here in cultured embryonic striatum. While they find IEG transcription 1h after depolarization, they find chromatin remodeling is slower (opening at the 4h time point). This may be due to chromatin at IEG regulatory regions already being open (in embryonic striatum), although previous work has found remodeling occurring at the 1h time point (in adult dentate gyrus). The authors next show that the chromatin remodeling that occurs at the late (4h) stage is largely in putative regulatory regions of the genome (rather than gene bodies), and is dependent on translation, which validates and extends the prior literature. The authors then transition from genome-wide basic neuroscience to focus on a specific gene of interest, prodynorphin (Pdyn), and a putative enhancer they identify from their chromatin analysis. They target CRISPR-activating and -inhibiting complexes to the putative enhancer and demonstrate that accessibility of this locus is necessary and sufficient for Pdyn transcription. They then show that at least one PDYN enhancer is conserved from rodents to humans, and is only activity-regulated in human GABAergic but not glutamatergic neurons. Finally, the authors generate snATAC-seq and show Pdyn gene and enhancer activity are also cell-type-specific in the rat striatum. The Pdyn work in particular is thorough and novel.

      Strengths:

      This work integrates multiple cutting-edge methods (multiple forms of genome-wide sequencing, combining new and published data across species, applying new forms of bioinformatic analysis, and targeted epigenome editing) to repeatedly and convincingly demonstrate these waves of chromatin remodeling and transcription. The figures and visual representations of data in particular set a new standard for the field. Although several findings within this paper are not novel, this paper ties previous findings all together in one place and goes on to show potential relevance for neuropsychiatric disorders beyond basic cellular neuroscience. The conclusions are mostly supported by the data.

      Results and conclusions that would benefit from clarification/extension:

      1. Throughout the paper, the authors emphasize a "temporal decoupling" of transcriptional and chromatin response to depolarization, based on a lack of significant chromatin changes at 1h, despite IEG transcription. However, previous publications show significant chromatin remodeling at 1h (e.g. Su et al., NN 2017 in adult dentate gyrus) or 2h (Kim et al., Nature 2010; Malik et al., NN 2014 in cultured embryonic cortical neurons). The discussion briefly mentions this contrast, but it remains difficult to conclude decisively whether there is temporal decoupling when such decoupling is not found consistently. If one is to make broad conclusions about basic neural chromatin response to depolarization, it would be ideal to know under which conditions there is temporal decoupling, or if this is a region-specific phenomenon.

      2. The UMAP analysis is a novel way to probe transcription factor enrichment, but it's unclear what this is actually showing. The authors sought to ask whether "DARs could be separated based on transcription factor motifs in these regions." However, the motifs present in any genomic stretch are fixed based on genomic sequence, so it seems like this analysis might be asking whether certain motifs are more likely to be physically clustered together in the genome, in activity-regulated regions (rather than certain transcription factors acting in concert, as is implied in discussion). While still potentially interesting, this analysis does not seem to give much additional insight into activity-dependent chromatin remodeling beyond the motif enrichment analysis already performed. Nevertheless, to draw stronger conclusions, it would be necessary to compare clustering to a random set of genomic regions of the same length/size to interpret the clustering here. It would also be useful to know whether the ISL1 motif is also enriched in ubiquitously accessible genomic regions in the striatum (and not just DARs).

      3. The authors identify late-response gene enhancers by 3 criteria. However, only Pdyn was highlighted thereafter. How many putative DARs met these three criteria in striatum? Only Pdyn?

    1. eLife assessment

      This is an important study that characterized the activity of optogenetically identified dopaminergic and GABAergic neurons in the ventral tegmental area (VTA) in mice performing a memory-guided T-maze task. The authors show that subpopulations of dopaminergic and GABAergic neurons exhibited choice-related activity during the delay period, which was enhanced when the task requires short-term memory. The reviewers found that the results are surprising, novel, and convincing, while some relatively minor issues were pointed out regarding the data presentation and analysis.

    2. Reviewer #1 (Public Review):

      Midbrain dopamine neurons have attracted attention as a part of the brain's reward system. A different line of research, on the other hand, has shown that these neurons are also involved in higher cognitive functions such as short-term memory. However, these neurons are thought not to encode short-term memory itself because they just exhibit a phasic response in short-term memory tasks, which does not seem able to maintain information during the memory period. To understand the role of dopamine neurons in short-term memory, the present study investigated the electrophysiological properties of these neurons in rodents performing a T-maze version of a short-term memory task, in which a visual cue indicated which arm (left or right) of the T-maze was associated with a reward. The animals needed to maintain this information while they were located between the cue presentation position and the selection position of the T-maze. The authors found that the activity of some dopamine neurons changed depending on the information while the animals were located in the memory position. This dopamine neuron modulation could not be explained by the motivational or motor components of the task. The authors concluded that this modulation reflected the information stored as short-term memory.

      I was simply surprised by their finding because these dopamine neurons are similar to neurons in the prefrontal cortex that store memory information with a sustained activity. Dopamine neurons are an evolutionarily conserved structure, which is seen even in insects, whereas the prefrontal cortex is developed mainly in the primate. I feel that their findings are novel and would attract much attention from readers in the field. But the authors need to conduct additional analyses to consolidate their conclusion.

    3. Reviewer #2 (Public Review):

      The authors phototag DA and GABA neurons in the VTA in mice performing a T-maze task, and report choice-specific responses in the delay period of a memory-guided task, more so than in a variant task without a memory component. Overall, I found the results convincing. While showing responses that are choice selective in DA neurons is not entirely novel (e.g. Morris et al NN 2006, Parker et al NN 2016), the fact that this feature is stronger when there is a memory requirement is an interesting and novel observation.

      I found the plots in 3B misleading because it looks like the main result is the sequential firing of DA neurons during the T-maze. However, many of the neurons aren't significant by their permutation test. Often people either only plot the neurons that are significant, or plot with cross-validation (i.e., sort by half of the trials, and plot the other half).
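
      As a concrete version of the cross-validated plot suggested above, here is a minimal sketch under stated assumptions: the data layout (per-neuron lists of trials, each a list of binned firing rates), the even/odd trial split, and the function names are all hypothetical, and the data are synthetic. Neurons are ordered by their peak position on one half of the trials, and the peaks from the held-out half are then read out in that order.

```python
import random


def peak_bin(trials):
    """Return the index of the bin with the highest trial-averaged rate."""
    n_bins = len(trials[0])
    averaged = [sum(trial[b] for trial in trials) / len(trials) for b in range(n_bins)]
    return max(range(n_bins), key=averaged.__getitem__)


def cross_validated_order(neurons):
    """Sort neurons by peak bin on even trials; report odd-trial peaks in that order."""
    train_peaks = [peak_bin(trials[0::2]) for trials in neurons]
    test_peaks = [peak_bin(trials[1::2]) for trials in neurons]
    order = sorted(range(len(neurons)), key=train_peaks.__getitem__)
    return order, [test_peaks[i] for i in order]


# Synthetic example: five neurons, each tuned to one of ten position bins.
rng = random.Random(0)
true_peaks = [7, 1, 9, 3, 5]
neurons = []
for peak in true_peaks:
    trials = []
    for _ in range(10):
        # Baseline rate 1.0, rate 5.0 in the preferred bin, plus small uniform noise.
        trials.append([(5.0 if b == peak else 1.0) + rng.uniform(0.0, 0.5) for b in range(10)])
    neurons.append(trials)

order, held_out_peaks = cross_validated_order(neurons)
print(order, held_out_peaks)
```

      If the sequence is genuine, the held-out peaks remain ordered; for untuned neurons the held-out ordering would be scrambled, which is what a plot sorted and displayed on the same trials hides.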

      Relatedly, the cross-task comparisons of sequences (Figs. 4, 5) are hampered by the fact that they sort in one task, then plot in the other, which will make the sequences look less robust even if they were equally strong. What happens if they swap which task's sequences they use to order the neurons? I do realize they also show statistical comparisons of modulated units across tasks, which is helpful.

      Overall, the introduction was scholarly and did a good job covering a vast literature. But the explanation of T-maze data towards the end of the introduction was confusing. In Line 87, I would not say "in the same task" but "in a similar task" because there are many differences between the tasks in question. And it is not clear what is meant by "by averaging neuronal population activities, none of these computational schemes would have been revealed." There was trial averaging, at least in Harvey et al. I thought the main result of that paper related to coding schemes was that neural activity was sequential, not persistent. I think it would help the paper to say that clearly. Also, I'm not aware it was shown that choice selectivity diminishes when the memory demand of the task is removed - please clarify if that is true in both referenced papers. If so, an interpretation of this present data could be found in Lee et al bioRxiv 2022, which presents a computational model that implies that the heterogeneity in the VTA DA system is a reflection of the heterogeneity found in upstream regions (the state representation), based on the idea that different subsets of DA neurons calculate prediction errors with respect to different subsets of the state representation.

      I am surprised only 28% of DA neurons responded to reward - the reward is not completely certain in this task. This seems lower than other papers in mice (even Pavlovian conditioning, when the reward is entirely certain). It would be helpful if the authors comment on how this number compares to other papers.

    1. eLife assessment

      This study proposes a deep learning-based segmentation pipeline of fetal brain MRI, with parcellation based on a newly implemented atlas. This represents an important contribution to the field of developmental neuroscience and pediatric neuroimaging, especially as the pipeline and atlas are publicly available. The evidence for the pipeline robustness and atlas relevance is convincing given the extensive validations provided and the very high-quality ground truth dataset. Although beyond the state of the art, the study would benefit from further comparisons with existing methods and additional evaluations of the framework generalizability according to image quality, subject age or brain abnormalities.

    2. Reviewer #1 (Public Review):

      Main contributions / strengths

      The authors propose a process to improve the ground truth segmentation of fetal brain MRI via a semi-supervised approach based on several iterations of manual refinement of atlas label propagations. This procedure represents an impressive amount of work, likely resulting in a very high-quality ground truth dataset. The corrected labels (obtained from multiple different datasets) are then used to train the final model which performs the brain extraction and tissue segmentation tasks. We also acknowledge the caution paid by the authors regarding the future application of their pipeline to unseen datasets.

      The conclusions of this paper are mostly well supported by data, but some aspects of the analysis and validation procedure need to be clarified and extended. In addition, the article would greatly benefit from providing further descriptions of crucial aspects of the study.

      Main limitations and potential improvements

      1) New nomenclature/atlas not sufficiently described/justified.

      The proposed nomenclature and atlas are one of the main contributions of this work. We clearly acknowledge the importance for the community of such a contribution. The definition of any nomenclature implies that decisions were taken regarding the acceptable level of ambiguity in the identification of the boundary between neighboring anatomical structures with respect to the gradient in the intensities in the MRI. It is acceptable (and probably inevitable) to set relatively arbitrary criteria in ambiguous regions, providing that these criteria are explicitly stated. The explicit statement of the decisions taken is essential in particular for better interpretation of residual segmentation inaccuracies in application studies.

      As a matter of comparison, the postnatal atlas and nomenclature were based on the Albert protocol, which is described in extensive detail. While such a complete description might fall beyond the scope of this work, we believe that an additional description of the nomenclature and protocol, allowing reproduction of the manual segmentation on external datasets, is required, at least for the most ambiguous junctions between structures. For instance, the boundaries across substructures within the DGM are difficult to visualize on the exemplar subjects shown in Fig. 5 and Fig. 6.

      Please provide additional precision on how the following were defined: boundaries between lateral ventricles and cavum; between cavum and CSF; the delineation of 3rd and 4th ventricles; the definition of the vermis, especially its junctions with the cerebellum and the brainstem. How are these boundaries impacted by the changes in the image intensities related to tissue maturation?

      We would also greatly appreciate an extension of the qualitative comparison with the two most commonly used protocols (Albert and FETA), for instance, why didn't the authors isolate the hippocampus/amygdala structure? And then how is the boundary between gray and white matter defined in this region?

      2) More detailed comparison with FETA for some structures would be informative despite obvious limitations.

      More specifically, the GM should have a very similar definition. In the "Impact of anomalies" section (page 7) the authors compare their results with the Dice score from the FETA challenge and conclude that the difference "highlights the advantages of using high-quality consistent ground truth labels for training". The better performances (from ~0.78 to ~0.88) might be mostly due to the improvement of the ground truth (of the test set). This could be confirmed by observing the ground truth from FETA of the GM for a few cases for which the Dice score shows a strong increase in performance with respect to FETA. Note that the gain in performance is appreciable even if it is due to a better ground truth.

      3) Improvement of the ground truth labels is an important contribution of this work, thus we would appreciate a more quantitative description of the impact of the manual correction, such as reporting the change in the dice score induced by the correction.

      Quantification of the refinement process would help to better evaluate the relevance of the proposed approach in future studies e.g. introducing a different nomenclature. More specifically, a marked change would be expected after the first training when there is a switch (and refinement) from the registration-propagated labels to the ones predicted by the DL model (as shown in Fig. 5, the changes are quite strong). Again a dice score indicating quantitatively how much improvement results from each iteration would be informative. In the same line, is the last iteration of this process needed or did the authors observe a 'stabilization' (i.e. less manual editing needing to be performed)?
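
      For reference, the kind of per-label quantification requested here could look like the following minimal sketch. The flattened label maps, the label values, and the function name are hypothetical and purely illustrative; the Dice score itself is the standard overlap measure 2|A ∩ B| / (|A| + |B|), computed between the labels before and after manual correction.

```python
def dice_score(labels_before, labels_after, label):
    """Dice overlap for one label between two flattened label maps of equal length."""
    before = {i for i, value in enumerate(labels_before) if value == label}
    after = {i for i, value in enumerate(labels_after) if value == label}
    if not before and not after:
        # Label absent from both maps: treat as perfect agreement.
        return 1.0
    return 2 * len(before & after) / (len(before) + len(after))


# Hypothetical flattened label maps (0 = background, 1 and 2 = two structures).
propagated = [0, 1, 1, 1, 2, 2, 0, 0]  # labels before manual correction
corrected = [0, 1, 1, 2, 2, 2, 0, 0]   # labels after manual correction

scores = {label: dice_score(propagated, corrected, label) for label in (0, 1, 2)}
print(scores)
```

      Reporting such scores per structure after each refinement iteration would show directly whether the amount of manual editing stabilizes.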

      4) The testing / training data-splitting strategy is not sufficiently detailed and difficult to follow. The following points deserve clarification:

      a) Why did the authors select only four sites for the test set (out of six studies presented in the 3.1 section)?

      b) Data used for training: in the first step the authors selected 200 for label propagation and selected only the best 100. In the second stage, the predictions are computed for all training/validation sets (380) and only 200 are selected. When the process was iterated, why did the authors select only 200 out of the 380? Are the same subjects selected across iterations? Were the acquisition parameters / gestational age controlled for each selection? If yes please specify the distributions precisely.

      Did the authors control the potential imbalanced proportion that is present in the dataset (more subjects from dHCP for instance)? (line 316, 100 subjects were selected from only three centers. Why only three? Did the authors keep the same sub-site for other stages?)

      c) "The testing dataset includes 40 randomly selected images from four different acquisition protocols" which shows that attention was paid to variations in the scanning parameters, which is of crucial importance. However, no precision is provided regarding the gestational age of this dataset, which impedes the interpretation since a potential influence of age on the accuracy of the segmentation would be problematic. Indeed, the authors mention that the manual correction deserved special attention for late GA (>34 weeks). Please specify precisely the age distribution across the 10 subjects of each of the four acquisition protocols. In addition, the qualitative results shown in Fig.6 and subsection "Impact of GA at scan" are not sufficient and an additional result table reporting the same population and metrics as in Table 2, but dissociating younger versus older fetuses, would be much more informative to rule out potential bias related to gestational age.

      d) The definition of the ground truth labels for the test set is not described.

      We understand (from the result) that the ground truth for the test set is defined by manual refinement of the atlas label propagated. This should be explicitly described on page 5 after the "Preparation of training datasets" section.

      5) The validation of segmentation accuracy based on the volumetry growth chart is invalid.

      In Section "4.3. Growth charts of normal fetal brain development", since manual corrections were involved, the reported results cannot be considered as a validation of the segmentation pipeline. Regarding the validation of the segmentation pipeline, the quantitative and qualitative results provided in Table 2 and the corresponding text and figures seem sufficient to us (providing our concerns above are addressed, especially regarding the impact of the gestational age).

      The growth charts are still valuable to support the validity of the nomenclature and segmentation protocol, but then why are the growth charts computed only for some structures? Reporting the growth chart and statistical evaluation of the impact of acquisition settings using ANCOVA for all the substructures from the proposed protocol would be expected here, in particular for the structures for which the delineation might be ambiguous such as the cavum, the vermis, and DGM substructures such as the thalamus.

      Finally, please provide further details on the type and amount of manual correction needed for computing the growth charts.

      6) MRI data was acquired only on Philips scanners.

      We acknowledge the efforts to maximize heterogeneity in the MRIs, e.g. with both 1.5T and 3T scanners, variations in TE and image resolution, but still, all MRIs included in this study were acquired using the SSTSE sequence on Philips scanners. The study does not include any MRI acquired on Siemens or GE scanners, and no image was acquired using the balanced-FFE/TrueFISP/FIESTA type sequence. This might limit generalizability.

    3. Reviewer #2 (Public Review):

      This work presents a new, automated, deep learning-based segmentation pipeline for fetal cerebral MRI based on the anatomical definitions of the new fetal atlas of the Developing Human Connectome Project. The authors' new software pipeline demonstrated robust performance across different acquisition protocols and gestational age ranges, reducing the need for manual refinement. To provide ground truth data for training their deep learning network, the authors employed a semi-supervised approach, in which atlas labels were propagated to the datasets, and they were corrected manually.

      This work stands out for its extensive training on a large number of datasets, it achieves precise anatomical definition through a refined brain tissue parcellation protocol, and it evaluates the segmentation results against growth curves, allowing for a comprehensive assessment of fetal brain development. Due to the fact that abnormal anatomy was largely unobserved by the segmentation network, it is highly likely, however, that the BOUNTI pipeline would lead to some incorrect segmentations in subjects with moderate to large ventriculomegaly, as well as in cases of malformations of the corpus callosum, brainstem or neural tube defects. Further work is required for BOUNTI to generalize its application to pathological brains, as the vast majority of fetal cerebral MRI cases in clinical practice involve such abnormalities rather than normal brain development. This step is crucial for facilitating the clinical translation of BOUNTI. The algorithm is publicly available and works without limitations on datasets acquired in other centers.

    4. Reviewer #3 (Public Review):

      This work provides a novel framework for semi-automatic segmentation and parcellation of brain tissues from fetal magnetic resonance imaging (MRI) by fusing an advanced deep learning technique and manual correction by experts. Over the broad age spectrum spanning newborns to adults, several fully-automatic segmentation/parcellation techniques have been proposed, showing robust, reliable performance across MR images with varying imaging quality. Unlike other age groups, however, scanning of the fetal brain is conducted in the womb; thus, there are additional and unique challenges, such as ambiguous positioning of the fetal brain, the surrounding maternal tissue in the fetal MRI, and fetal and maternal motion. These challenges in fetal MRI have collectively served as important bottlenecks in developing robust, reliable automatic segmentation/parcellation frameworks to date. This paper proposes a methodological framework for the segmentation and parcellation of fetal MRI scans using a two-step deep learning model, each for segmentation and parcellation. It is also noteworthy that the validity of the proposed framework has been extensively tested over different datasets with different image quality and different recording parameters, so the robust generalizability of the framework over other fetal MRI datasets is clearly suggested.

      Strengths:

      In general, a novel design framework, with separation of segmentation and parcellation schemes under each deep learning model, provides ample room for improving the model performance, as suggested by the results of this study. In addition, thanks to the flexibility in the model design (e.g., the choice of deep learning model) and parameters (e.g., manual correction step during training), an identical or similar framework can be easily extended to other datasets for different age groups or diagnostic groups/brain disorders. Another strength is the minimal requirement of human interaction after the training stage as significant time and effort of manual correction is often required following the automatic segmentation of fetal MR images. Lastly, thorough investigation of the inter-dataset generalizability of the proposed segmentation/parcellation framework will be well-received by the fetal neuroscience community.

      Weakness:

      The main weakness of this paper is the vague definition of the scientific novelty. By design, this paper is a technical study. The technical advancement claimed by the authors is a novel design of deep learning and a two-step deep-learning framework; each for segmentation and parcellation. There have been, however, other deep learning studies, and some share nearly identical model architecture to the one published by Asis-Cruz et al. (Frontiers in Neuroscience, 2022). As such the conceptual improvement in terms of deep learning model architecture is overstated. Regarding the separate framework for segmentation and parcellation, the conventional preprocessing protocol (e.g., Draw-EM; Makropoulos et al. IEEE Transactions on Medical Imaging, 2014) already presented a similar concept. Overall, it is unclear what unique technical advances have been made in the current paper.

      A second weakness of the work is the insufficient comparison to other conventional published methods. While the authors' claim that there is no "universally accepted" protocol for fetal brain segmentation/parcellation is at least partially true, Draw-EM, which was originally designed for neonatal brain segmentation, has been widely and successfully utilized in many fetal MRI studies, as discussed by the authors. Instead of a direct comparison to Draw-EM, the authors only performed a descriptive comparison using two exemplar MRI scans. It is unclear whether the superior performance of the proposed framework in these selected scans would be generalizable to others. Similarly, the authors claim that the proposed deep-learning-based segmentation/parcellation framework required minimal time for manual post-preprocessing refinement (1-3 mins), compared to 1-3 hours in another study using Draw-EM (Story et al. Neuroimage: Clinical, 2021). Again, this may not represent a fair comparison considering that the intensity/precision of manual refinement may differ depending on the different goals/objectives of other studies.

    1. eLife assessment

      The authors provide a valuable analysis of what neural circuit mechanisms enable varying the speed of retrieval of sequences, as needed for say reproducing motor patterns. Their use of heterogeneous plasticity rules to allow external currents to control speed of sequence recall is a novel alternative to other mechanisms proposed in the literature. They perform a solid characterization of relevant properties of recall via simulations and theory, which would benefit from a better mapping to biologically plausible mechanisms.

    2. Reviewer #1 (Public Review):

      While there are many models for sequence retrieval, it has been difficult to find models that vary the speed of sequence retrieval dynamically via simple external inputs. While recent works [1,2] have proposed some mechanisms, the authors here propose a different one based on heterogeneous plasticity rules. Temporally symmetric plasticity kernels (that do not distinguish between the order of pre and post spikes, but only their time difference) are expected to give rise to attractor states, asymmetric ones to sequence transitions. The authors incorporate a rate-based, discrete-time analog of these spike-based plasticity rules to learn the connections between neurons (leading to connections similar to Hopfield networks for attractors and sequences). They use either a parametric combination of symmetric and asymmetric learning rules for connections into each neuron, or separate subpopulations having only symmetric or asymmetric learning rules on incoming connections. They find that the latter is conducive to enabling external inputs to control the speed of sequence retrieval.
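
      The distinction drawn above between symmetric rules (attractors) and asymmetric rules (sequence transitions) can be illustrated with a toy network. The following is only a minimal sketch under simplifying assumptions that are not taken from the manuscript: binary ±1 units with a sign nonlinearity stand in for the rate units, patterns are random, and the names `make_patterns`, `hebbian_weights`, `step`, and `overlap` are hypothetical helpers introduced for illustration.

```python
import random


def make_patterns(num_patterns, num_units, seed=0):
    """Draw random +/-1 patterns (an assumption; not the authors' pattern statistics)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(num_units)]
            for _ in range(num_patterns)]


def hebbian_weights(patterns, shift):
    """W[i][j] = (1/N) * sum_mu xi^{mu+shift}_i * xi^{mu}_j.

    shift=0 pairs each pattern with itself (temporally symmetric rule);
    shift=1 pairs each pattern with its successor (temporally asymmetric rule).
    """
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for mu in range(len(patterns) - shift):
        post, pre = patterns[mu + shift], patterns[mu]
        for i in range(n):
            for j in range(n):
                w[i][j] += post[i] * pre[j] / n
    return w


def step(w, state):
    """One synchronous discrete-time update with a sign nonlinearity."""
    return [1 if sum(wij * sj for wij, sj in zip(row, state)) >= 0 else -1
            for row in w]


def overlap(a, b):
    """Normalized dot product between two +/-1 states."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


patterns = make_patterns(num_patterns=5, num_units=200)
w_sym = hebbian_weights(patterns, shift=0)
w_asym = hebbian_weights(patterns, shift=1)

# Symmetric weights should hold the network on the cued pattern,
# asymmetric weights should push it on to the next pattern in the sequence.
print("symmetric, overlap with cue:", overlap(step(w_sym, patterns[0]), patterns[0]))
print("asymmetric, overlap with next:", overlap(step(w_asym, patterns[0]), patterns[1]))
```

      In this toy setting the symmetric matrix behaves like a Hopfield attractor network and the asymmetric matrix like a sequence generator. How external inputs to the two kinds of units set the retrieval speed is the subject of the manuscript and is not reproduced here.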

      Strengths:<br /> The authors have expertly characterised the system dynamics using both simulations and theory. How the speed and quality of retrieval vary across phase space has been well studied. The authors are also able to vary the external inputs to reproduce a preparatory phase followed by an execution phase of sequence retrieval, as seen experimentally in motor control. They also propose a simple reinforcement learning scheme for learning to map the two external inputs to the desired retrieval speed.

      Weaknesses:<br /> 1. The authors translate spike-based synaptic plasticity rules into a way to learn/set connections for rate units operating in discrete time, similar to their earlier work in [5]. The bio-plausibility issues of learning in [5] carry over here; e.g., the authors ignore any input due to the recurrent connectivity during learning and effectively fix the pre and post rates to the desired ones. While the learning itself is not fully bio-plausible, it does lend itself to writing the final connectivity matrix in a manner that is easier to analyze theoretically.

      2. While the authors learn to map the set of two external input strengths to speed of retrieval, they still hand-wire one external input to the subpopulation of neurons with temporally symmetric plasticity and the other external input to the other subpopulation with temporally asymmetric plasticity. The authors suggest that these subpopulations might arise due to differences in the parameters of Ca dynamics as in their earlier work [29]. How these two external inputs would connect to neurons differentially based on the plasticity kernel / Ca dynamics parameters of the recurrent connections is still an open question which the authors have not touched upon.

      3. The authors require that temporally symmetric and asymmetric learning rules be present in the recurrent connections between subpopulations of neurons in the same brain region, i.e. some neurons in the same brain region should have temporally symmetric kernels, while others should have temporally asymmetric ones. The evidence for this seems thin. Though, in the discussion, the authors clarify 'While this heterogeneity has been found so far across structures or across different regions in the same structure, this heterogeneity could also be present within local networks, as current experimental methods for probing plasticity only have access to a single delay between pre and post-synaptic spikes in each recorded neuron, and would therefore miss this heterogeneity'.

      4. An aspect which the authors have not connected to is an earlier work by one of the authors:<br /> Brunel, N. (2016). Is cortical connectivity optimized for storing information? Nature Neuroscience, 19(5), 749-755. https://doi.org/10.1038/nn.4286<br /> which argues that the experimentally observed over-representation of symmetric synapses indicates that cortical networks are optimized for attractors rather than sequences.

      Despite the above weaknesses, the work is a solid advance in proposing an alternate model for modulating speed of sequence retrieval and extends the use of well-established theoretical tools. This work is expected to spawn further works like extending to a spiking neural network with Dale's law, more realistic learning taking into account recurrent connections during learning, and experimental follow-ups. Thus, I expect this to be an important contribution to the field.

    3. Reviewer #2 (Public Review):

      Sequences of neural activity underlie most of our behavior. As experience suggests, we are (in most cases) able to flexibly change the speed of our learned behavior, which essentially means that brains are able to change the speed at which a sequence is retrieved from memory. The authors here propose a mechanism by which networks in the brain can learn a sequence of spike patterns and retrieve them at variable speed. At a conceptual level I think the authors have a very nice idea: use of symmetric and asymmetric learning rules to learn the sequences and then use different inputs to neurons with symmetric or asymmetric plasticity to control the retrieval speed. The authors have demonstrated the feasibility of the idea in a rather idealized network model. I think it is important that the idea is demonstrated in more biologically plausible settings (e.g. spiking neurons, a network with exc. and inh. neurons with ongoing activity).

      Summary

      In this manuscript the authors have addressed the problem of learning and retrieving sequential activity in neuronal networks. In particular, they have focussed on the problem of how sequence retrieval speed can be controlled.<br /> They have considered a model with excitatory rate-based neurons. The authors show that when sequences are learned with both temporally symmetric and asymmetric Hebbian plasticity, the sequence retrieval speed can be modulated by modulating the external inputs to the network. With the two types of Hebbian plasticity in the network, sequence learning essentially means that the network has both feedforward and recurrent connections related to the sequence. By giving different amounts of input to the feed-forward and recurrent components of the sequence, the authors are able to adjust the speed.

      Strengths<br /> - Authors solve the problem of sequence retrieval speed control by learning the sequence in both feedforward and recurrent connectivity within a network. It is a very interesting idea for two main reasons: 1. It does not rely on delays or short-term dynamics in neurons/synapses 2. It does not require that the animal is presented with the same sequences multiple times at different speeds. Different inputs to the feedforward and recurrent populations are sufficient to alter the speed. However, the work leaves several issues unaddressed as explained below.

      Weaknesses<br /> - The main weakness of the paper is that it is mostly driven by a motivation to find a computational solution to the problem of sequence retrieval speed. In most cases they have not provided any arguments about the biological plausibility of the solution they have proposed e.g.:

      -- Is there any experimental evidence that some neurons in the network have symmetric Hebbian plasticity and some temporally asymmetric? The authors have cited some references to support this. But usually the switch between temporally symmetric and asymmetric rules is dependent on spike patterns used for pairing (e.g. bursts vs single spikes). In the context of this manuscript, it would mean that in the same pattern, some neurons burst and some don't, and this is the same for all the patterns in the sequence. As far as I see here, the authors have assumed a binary pattern of activity which is the same for all neurons that participate in the pattern.

      -- How would external inputs know that they are impinging on a symmetric or asymmetric neuron? Authors have proposed a mechanism to learn these inputs. But that makes the sequence learning problem a two stage problem -- first an animal has to learn the sequence and then it has to learn to modulate the speed of retrieval. It should be possible to find experimental evidence to support this?

      -- Authors have only considered homogeneous DC input for sequence retrieval. This kind of input is highly unnatural. It would be more plausible if the authors considered fluctuating input which is different for each neuron.

      -- All the work is demonstrated using a firing rate based model of only excitatory neurons. I think it is important that some of the key results are demonstrated in a network of both excitatory and inhibitory spiking neurons. As the authors very well know it is not always trivial to extend rate-based models to spiking neurons.

      I think at a conceptual level authors have a very nice idea but it needs to be demonstrated in a more biologically plausible setting (and by that I do not mean biophysical neurons etc.).

    1. eLife assessment

      This important study addresses the fundamentally unresolved question of why many thousands of small-effect loci contribute more to the heritability of a trait than the large-effect lead variants. The authors explore resource competition within the transcriptional machinery as one possible explanation with a simple theoretical model, concluding that the effects of resource competition would be too small to explain the heritability effects. The topic and approximation of the problem are very timely and offer an intuitive way to think about polygenic variation, but the analysis of the simple model appears to be incomplete, leaving the main claims only partially supported.

    2. Reviewer #1 (Public Review):

      This study explores whether the extreme polygenicity of common traits can be explained in part by competition among genes for limiting molecular resources (such as RNA polymerases) involved in gene regulation. The authors hypothesise that such competition would cause the expression levels of all genes that utilise the same molecular resource to be correlated and could thus, in principle, partly explain weak trans-regulatory effects and the observation of highly polygenic architectures of gene expression. They study this hypothesis under a very simple model where the same molecule binds to regulatory elements of a large number m of genes, and conclude that this gives rise to trans-regulatory effects that scale as 1/m, and which may thus be negligible for large m.

      The main limitation of this study lies in the details of the mathematical analysis, which does not adequately account for various small effects, whose magnitude scales inversely with the number m of genes that compete for the limiting molecular resource. In particular, the fraction of "free" molecule (which is unbound to any of the genes) also scales as 1/m, but is not accounted for in the analysis, making it difficult to assess whether the quantitative conclusions are indeed correct. Second, the questions raised in this study are better analysed in the framework of a sensitivity or perturbation analysis, i.e., by asking how *changes* in expression level or binding affinity at one gene (rather than the total expression level or total binding affinity) affect expression level at other genes.
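
      The perturbation framing suggested above can be made concrete with a deliberately crude toy calculation. This is a sketch under strong assumptions that are not taken from the manuscript: the limiting resource is entirely bound (the "free" fraction noted above is ignored), each gene's expression is proportional to its share of the resource, and all genes start with equal affinity. The function names `shares` and `relative_trans_effect` are hypothetical.

```python
def shares(affinities):
    """Fraction of a fully bound, limiting resource captured by each gene."""
    total = sum(affinities)
    return [a / total for a in affinities]


def relative_trans_effect(m, delta):
    """Relative change in gene 2's share when gene 1's affinity rises by a factor (1 + delta).

    With m equal-affinity genes this equals -delta / (m + delta),
    i.e. it shrinks roughly as 1/m for large m.
    """
    base = shares([1.0] * m)
    perturbed = shares([1.0 + delta] + [1.0] * (m - 1))
    return perturbed[1] / base[1] - 1.0


for m in (10, 100, 1000, 10000):
    effect = relative_trans_effect(m, delta=0.1)
    print(m, effect, m * effect)  # m * effect stays close to -delta
```

      Under these assumptions a cis perturbation of size delta at one gene shifts every other gene by about delta/m. Whether this scaling survives once the free fraction is included is exactly the point that the analysis would need to establish.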

      Thus, while the qualitative conclusion that resource competition in itself is unlikely to mediate trans-regulatory effects and explain highly polygenic architectures of gene expression traits probably holds, the mathematical reasoning used to arrive at this conclusion requires more care.

      In my opinion, the potential impact of this kind of analysis rests at least partly on the plausibility of the initial hypothesis, namely whether most molecular resources involved in gene regulation are indeed "limiting resources". This is not obvious, and may require a careful assessment of existing evidence, e.g., what is the concentration of bound vs. unbound molecular species (such as RNA polymerases) in various cell types?

    3. Reviewer #2 (Public Review):

      The question the authors pose is very simple and yet very important. Does the fact that many genes compete for Pol II to be transcribed explain why so many trans-eQTL contribute to the heritability of complex traits? That is, if a gene uses up a proportion of Pol II, does that in turn affect the transcriptional output of other genes relevant or even irrelevant for the trait in a way that their effect will be captured in a genome-wide association study? If yes, then the large number of genetic effects associated with variation in complex traits can be explained by such trans-propagating effects on the transcriptional output of many genes.

      This is a very timely question given that we still don't understand how, mechanistically, so many genes can be involved in complex traits variation. Their approach to this question is very simple and it is framed in classic enzyme-substrate equations. The authors show that the trans-propagating effect is too small to explain the ~70% of heritability of complex traits that are associated with trans-effects. Their conclusion relies on the comparison of the order of magnitude of a) the quantifiable transcriptional effects due to Pol II competition, and b) the observed percentage of variance explained by trans effects (data coming from Liu et al 2019, from the same lab).

      The results shown in this manuscript rule out that competition for limited resources in the cell (not restricted to Pol II, but applicable to any other cellular resource like ribosomes, etc) could explain the heritability of complex traits.

    4. Reviewer #3 (Public Review):

      Human complex traits including common diseases are highly polygenic (influenced by thousands of loci). This observation is in need of an explanation. The authors of this manuscript propose a model that competition for a single global resource (such as RNA polymerase II) may lead to a highly polygenic architecture of traits. Following an analytical examination, the authors reject their hypothesis. This work is of clear interest to the field. It remains to be seen if the model covers the variety of possible competition models.

    1. eLife assessment

      This is a useful study of the connection between the ubiquitin ligase protein deltex and the wingless signaling pathway. Two different links are inferred from genetic interactions in vivo between loss-of-function mutations and overexpression. While providing useful in vivo physiological context, the approach is necessarily incomplete in so far as it cannot distinguish between direct and indirect mechanisms.

    2. Reviewer #1 (Public Review):

      This study presents a genetic and molecular analysis of the role of the cytoplasmic ub ligase Deltex (Dx) in regulating the Drosophila Wingless (Wg) pathway in the larval wing disc. The study exploits the strength of the fly system to uncover a series of genetic interactions between dx and wg and fz alleles that support a role for Dx upstream of the Wg pathway. These are paired with molecular evidence that dx lof alleles lower Wg protein in 'source' cells at the DV margin, and that Dx associates with Arm and lowers its levels in a manner that can be rescued by pharmacological inhibition of the proteasome. The genetic data are solid but subject to alternative explanations based on the authors' model that Dx both inhibits and activates the pathway, and the published link between Dx and its target Notch, which regulates wg transcription. The molecular data are suggestive but need follow-up tests of the model to prove that Dx mediates poly-ub of Arm, and the degree to which Dx shares this role with the validated Arm E3 ligase Slmb. Overall, the story is very interesting but has mechanistic gaps that lead to speculative models that require more rigorous study to clarify the mechanism. Dx sharing a role in Arm degradation with the Slmb/APC destruction complex would have important implications for the many Wg/Wnt regulated processes in development and disease.

    3. Reviewer #2 (Public Review):

      The manuscript investigates the connections between the ubiquitin ligase protein deltex and the wingless pathway. Two different connections are proposed. The first is that deltex modulates the gradient of wingless diffusion and hence the spatial pattern of wingless pathway targets, which are regulated at different thresholds of wingless concentration. The second is a direct interaction between deltex and armadillo, a downstream component of the wingless pathway. Deltex is proposed to cause the degradation of armadillo, resulting in suppression of wingless pathway activity. The results and conclusions of the manuscript are interesting and, for the most part, novel, although previously published work linking Notch and deltex to wingless signal regulation, and endocytosis to wingless gradient formation, could be more extensively discussed. However, neither of the two parts of the manuscript seems sufficiently complete in itself, and combining both parts together therefore seems to lack focus.

      The main issue with the manuscript is that many of the conclusions are inferred from genetic interactions in vivo between loss of function mutants and overexpression. While providing useful in vivo physiological context, this type of approach struggles to support definitive conclusions on whether an interaction is due to a direct or indirect mechanism, as the authors themselves conclude at the end of section 2.3. The problem is compounded by the fact that much cross-talk between the Notch signaling pathway and wingless at the transcriptional level is already documented, and deltex is already a Notch modulator that can alter wingless mRNA expression (see Hori et al 2004). Deltex, in addition to promoting a ligand-independent Notch signal, can also induce expression of Notch ligand, allowing further non-autonomous Notch activation and subsequent cell autonomous cis-inhibition of the initial deltex-induced signal. The dynamics and outcomes of the Notch signal response to deltex in vivo are therefore already very complicated to interpret before even considering unraveling indirect (via Notch) and direct interactions with wingless, although the two possibilities are not mutually exclusive.

    1. eLife assessment

      This study provides an important cell atlas of the gill of the mussel Gigantidas platifrons using a single nucleus RNA-seq dataset, a resource for the community of scientists studying deep sea physiology and metabolism and intracellular host-symbiont relationships. The work, which offers solid insights into cellular responses to starvation stress and molecular mechanisms behind deep-sea chemosymbiosis, is of relevance to scientists interested in host-symbiont relationships across ecosystems.

    2. Reviewer #1 (Public Review):

      Wang et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang et al sample mussels from 3 different environments: animals from their native methane-rich environment, animals transplanted to a methane-poor environment to induce starvation, and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the upregulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them.

      Strengths:<br /> This paper makes available a high-quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and the collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors do an excellent job of making all their data and analysis available, making this not only an important dataset but a readily accessible and understandable one.

      The authors also use a diverse array of tools to explore their data. For example, the quality of the data is augmented by the use of in situ hybridizations to validate cluster identity and KEGG analysis provides key insights into how the transcriptomes of bacteriocytes change.

      The authors also do a great job of providing diagrams and schematics to help orient non-mussel experts, thereby widening the audience of the paper.

      Weaknesses:<br /> One of the main weaknesses of this paper is the lack of coherence between the images and the text, with some parts of the figures never being referenced in the body of the text. This makes it difficult for the reader to interpret how they fit in with the author's discussion and assess confidence in their analysis and interpretation of data. This is especially apparent in the cluster annotation section of the paper.

      Another concern is the linking of the transcriptomic shifts associated with starvation with changes in interactions with the symbiotes. Without examining and comparing the symbiote population between the different samples, it cannot be concluded that the transcriptomic shifts correlate with a shift to the 'milking' pathway and not other environmental factors. Without comparing the symbiote abundance between samples, it is difficult to disentangle changes in cell state that are due to their changing interactions with the symbiotes from other environmental factors.

      Additionally, conclusions in this area are further complicated by using only snRNA-seq to study intracellular processes. This is limiting since cytoplasmic mRNA is excluded and only nuclear reads are sequenced after the organisms have had several days to acclimate to their environment and major transcriptomic shifts have occurred.

    3. Reviewer #2 (Public Review):

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways.

      A major strength of this study includes the successful application of advanced single-nucleus techniques to a non-model, deep-sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep-sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons.

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design. In this area, I would appreciate more in-depth discussion of these impacts when interpreting the data.

      Because cells from multiple individuals were combined before sequencing, the in situ transplantation experiment lacks clear biological replicates. This may potentially result in technical variation (i.e., batch effects) confounding biological variation, directly impacting the interpretation of observed changes between the Fanmao, Reconstitution, and Starvation conditions. It is notable that Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. It is not clear whether this is due to a technical factor impacting sequencing or whether these numbers are the result of the unique biology of Fanmao cells. Furthermore, from Table S19 it appears that while 98% of Fanmao cells survived doublet filtering, only ~40% and ~70% survived for the Starvation and Reconstitution conditions respectively, suggesting some kind of distinction in quality or approach.

      There is a pronounced divergence in the relative proportions of cells per cell type cluster in Fanmao compared to Reconstitution and Starvation (Fig. S11). This is potentially a very interesting finding, but it is difficult to know if these differences are the expected biological outcome of the experiment or the fact that Fanmao cells are much more sparsely sampled. The study also finds notable differences in gene expression between Fanmao and the other two conditions- a key finding is that bacteriocytes had the largest Fanmao-vs-starvation distance (Fig. 6B). But it is also notable that for every cell type, one or both comparisons against Fanmao produced greater distances than comparisons between Starvation and Reconstitution (Fig. 6B). Again, it is difficult to interpret whether Fanmao's distinctiveness from the other two conditions is underlain by fascinating biology or technical batch effects. Without biological replicates, it remains challenging to disentangle the two.

    4. Reviewer #3 (Public Review):

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand the fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization, to characterize many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate that a diversity of cell types supports the structure and function of the gill, including bacteriocytes (specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts) as well as a suite of other cell types including supportive cells, ciliary cells, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume their endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement.

      The one particular area for clarification and improvement surrounds the concept of a proliferative progenitor population within the gill. The authors imply that three types of proliferative cells within gills have long been known, but their study may be the first to recover molecular markers for these putative populations. The markers the authors present for gill posterior end budding zone cells (PEBZCs) and dorsal end proliferation cells (DEPCs) are not intuitively associated with cell proliferation and some additional exploration of the data could be performed to strengthen the argument that these are indeed proliferative cells. The authors do utilize a trajectory analysis tool called Slingshot which they claim may suggest that PEBZCs could be the origin of all gill epithelial cells, however, one of the assumptions of this analysis is that differentiated cells are developed from the same precursor PEBZC population.

      However, these conclusions do not detract from the overall significance of the work of identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work of symbiosis in cnidarians may converge on similar principles or there may be independent ways in which organisms have been able to solve these problems.

    1. eLife assessment

      The authors provide a fundamental resource, detailing genetic variation of nutrient-responsive islet calcium regulation in mice through the lens of proteomics. The evidence for the mechanisms identified using this resource is compelling and strongly supported by integration with results from genome-wide association studies in humans. The construction of a streamlined and searchable web interface for the data will maximize their accessibility and utilization by the community.

    2. Reviewer #1 (Public Review):

      This paper looks at nutrient-responsive Ca++ flux in islet cells of eight genetically diverse mouse strains. The investigators correlate Ca++ flux with insulin secretory capacity, demonstrating that calcium parameters in response to different nutrients are a better predictor of insulin secretory capacity than average calcium. They also correlate Ca++ flux with previously collected islet protein abundance followed by integration with human genome-wide association studies. This integration allows them to identify a sub-set of proteins that are both relevant to human islet function and that may play a causal role in regulating islet Ca++ oscillations. All data have been deposited in a searchable public database. There are many strengths to this paper. To my knowledge, this is the first work to assess the genetics of nutrient-responsive Ca++ flux in islets. Given the importance of Ca++ for beta cell insulin secretion, this work is of high importance. Investigators also use the founders of two powerful genetic mouse models: the diversity outbred and collaborative cross, opening up several avenues of future research into the genetics of Ca++ flux. By looking at multiple parameters of Ca++ flux, investigators are able to start to understand which parameters may be driving low or high insulin secretion. Integration with protein abundance and human GWAS has allowed identification of proteins with known roles in insulin secretory capacity, as well as several novel regulators, again opening up several avenues of future research. Finally, the public database is likely to be useful to multiple investigators interested in following up specific protein targets or in conducting future genetic studies.

    3. Reviewer #2 (Public Review):

      This is an interesting paper from a reputable group in the field of islet physiology. The authors have provided the results from extensive studies, which will contribute to the knowledge of islet dysfunction and diabetes pathophysiology. The authors studied "the human orthologues of the correlated mouse proteins that are proximal to the glycemia-associated SNPs in human GWAS". This implies two assumptions - (1) human and mouse proteins do not differ in terms of islet physiology and calcium signaling; (2) the proteins proximal to the SNPs are the causal factors for functional differences, though the SNPs could affect protein/gene function distant from the SNPs.

    1. Author Response

      Many thanks for the detailed and sometimes sharp, yet appropriate criticism of our study. It was an incentive for us to carry out additional analyses and to devote more effort to an elaboration of concepts. The outcome is that the results have changed slightly and that we now give more space to a discussion of concepts. We first address here the points raised by more than one reviewer before responding to comments contributed by individual reviewers.

      The points raised can be divided into three thematic groups, 1) conceptual issues, 2) experimental and analytical questions, and 3) comments challenging the novelty of our results. On the first theme, we think it is essential to make a clear distinction between the conceptual and observational domains. As such, the criteria defining a “mirror neuron” and what is meant by the term "mirror mechanism" belong to the conceptual domain. This understanding of terms requires agreement among scientists, but is not experimentally testable. Unfortunately, there is no agreement on how to define a “mirror neuron” and what is meant by “mirror mechanism”. Thus, for the present work, the only option is to refer to specific definitions or to use our own, definitions which try to capture what others, and here most importantly Rizzolatti and colleagues, probably meant. We have adjusted the introduction in an attempt to convey our understanding and usage of the two terms in a hopefully comprehensible manner. Briefly, we use a definition for "mirror neuron" that we take from the first paragraph of the results section of Gallese et al. (Brain, 1996). We do not consider the "properties of mirror neurons" described in that paper as defining a mirror neuron (MN). Classifying neurons as MNs only on the basis of the presence of a modulation of discharge rate during an executed and an observed action compared with a baseline is a common practice also in other single neuron studies on MNs, consistent with this definition. Regarding "mirror mechanism", we refer to Rizzolatti and Sinigaglia (2016) and make a distinction between a broad and a strict definition. 
Given our finding that there are almost no F5 MNs whose activity during observation is a motor representation according to our strict definition of a mirror mechanism, and also given the problem that the term “mirror mechanism” itself is not uniformly understood, the question arises whether and how the term "mirror neuron" should be used in the future. The answer to this may vary and belongs to the conceptual domain. We briefly address this question at the end of the discussion of the revised manuscript.

      From that understanding of terms, conceptual hypotheses are to be distinguished, which of course must allow experimental predictions, i.e., must be falsifiable. We now distinguish more clearly between a "representation hypothesis" and an "understanding hypothesis". Both hypotheses focus on F5 MNs and are based on the strictly defined mirror mechanism. We test the “representation hypothesis” in our study, and just because it is the basis for the “understanding hypothesis”, falsifying the “representation hypothesis” would allow us to conclude that the “understanding hypothesis” is not valid. In contrast, confirmation of the “representation hypothesis” would not, of course, allow us to conclude that the “understanding hypothesis” holds. That would really be circular reasoning (this conclusion was drawn by some and rightly criticized). However, support for the “representation hypothesis” would be the necessary prerequisite for the “understanding hypothesis” to be true. These two hypotheses take up the original argument that a certain understanding of observed actions could follow from an equality of action-specific F5 MN activity during execution and observation. Because we considered the data on equality of action-specific F5 MN activity to be insufficient, we designed this study. Since our result largely argues against the "representation hypothesis" and thus against the "understanding hypothesis," we now discuss alternative concepts for the function of F5 MNs in more detail. It should be noted here that our fourth concept ("goal-pursuit-by-actor") could well represent the observed action without contradiction to our broad definition of a mirror mechanism, which in principle could also serve a subjective experience (which could be conceived as a kind of understanding). The way we structure the concepts in the discussion of this revised manuscript is, in our opinion, a useful overview of the concepts. The third concept is new in this context.
We would like to emphasize that we focus on F5 MNs and intentionally avoid a discussion of mirror neurons beyond F5 in this paper. With the data from this study, we cannot say anything about MNs outside of F5.

      Regarding the key question of how the "understanding hypothesis" is testable, or whether it may not be testable at all, we agree, of course, that for the conclusion of whether F5 MNs contribute to perception, only a manipulation of F5 MNs can clarify it. We now say that explicitly in the introduction. We agree with reviewer #2 that "understanding" here is not limited to "action recognition" or "action categorization”, which in principle could be implemented by purely sensory processing. Therefore, we also do not believe that the approach proposed by reviewer #3, which builds on the distinction of actions, would allow for a critical examination of the "understanding hypothesis”. But we disagree that the "understanding hypothesis" is not testable at all. Operationalization is necessary. If we accept that we can measure certain visual or auditory perceptions of an animal by operationalization (e.g., the subjective visual vertical, see for example Khazali et al., PNAS, 2020), then we must also accept that we can, in principle, measure other subjective experiences by operationalization, such as pain or aiming at a goal or even the co-experience of pain. An example of how to approach this is the study by Carrillo et al. (Curr Biol, 2019), which reviewer #2 and colleagues discussed in a recent review article (Bonini et al., TCS, 2022).

      With regard to the second theme, experimental and analytical questions, we noticed while reading the comments that in our first version we did not distinguish clearly enough between statements about single neurons and statements about populations of neurons. Therefore, we now clearly separate single neuron analysis and population code analysis in the structure of the article. In view of the fact that statements about mirror neurons in the literature mostly refer to single neurons, we added extensive single neuron analyses, so that only now statistically reliable statements about single neurons are possible. This has led to the realization that the number of neurons with exclusively shared code is so small that these neurons should be considered a rare exception. Given the small number of time periods with shared code, we additionally tested against a hypothesis already rightly proposed as an alternative explanation by G. Csibra in 2005 (Mirror neurons and action observation: Is simulation involved? In: What do mirror neurons mean? Interdisciplines Web Forum 2005). We were able to reject this hypothesis based on two of three methods for testing for a shared code. This is the second piece of evidence besides the clustering of time periods with shared code already described in the first version that time periods with shared code cannot be considered random.

      We discuss in more detail the question of whether neurons that exhibit a shared code at least at times support the representation hypothesis. To this end, we additionally examined whether certain action segments are more frequently represented with a shared than with a non-shared code, whether neurons with shared code differ from those with non-shared code in anatomical location, and whether an accuracy can be achieved with a time bin-wise selection of neurons with shared code by population cross-task classifiers as with within-task classifiers in the whole population.

      Another issue was how to test for shared code and how to decide if a code has enough sharing. To answer the question, the exact hypothesis we intended to test here is crucial. The representation hypothesis states that the representation of the observed actions in F5 MNs corresponds to the representation as it occurs during the execution of the same actions. Therefore, the relationship between discharge rate and actions that holds during execution should also hold during observation, which is measurable with a classifier trained on execution trials and tested on observation trials. Moreover, the actions should not be more distinguishable during observation with a classifier other than the execution-trained classifier, because if that were so, it would mean that the representation of observed actions is different from that of executed actions. The detection of a cluster of time bins for which both conditions are satisfied confirms that it is possible to discover in this way the shared codes postulated by the representation hypothesis.
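The two conditions described above can be illustrated with a minimal sketch. It is not the analysis code of the study: a nearest-centroid classifier stands in for the classifiers actually used, the statistical tests are replaced by plain comparisons, and all function names and firing-rate values are hypothetical.

```python
from statistics import mean


def fit_centroids(rates, labels):
    """Return the mean firing-rate vector for each action label."""
    groups = {}
    for x, y in zip(rates, labels):
        groups.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*xs)] for y, xs in groups.items()}


def predict(centroids, x):
    """Assign x to the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def accuracy(centroids, rates, labels):
    """Fraction of trials whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == y for x, y in zip(rates, labels))
    return hits / len(labels)


def leave_one_out_accuracy(rates, labels):
    """Within-task accuracy: train on all trials but one, test on the held-out trial."""
    hits = 0
    for i in range(len(rates)):
        train_x = rates[:i] + rates[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += predict(fit_centroids(train_x, train_y), rates[i]) == labels[i]
    return hits / len(rates)


def shared_code(exe_x, exe_y, obs_x, obs_y, chance=1 / 3):
    """Check both conditions for one time bin (simplified, no statistics).

    1. A classifier trained on execution trials discriminates observed actions
       above chance (cross-task accuracy).
    2. A classifier trained on observation trials does not discriminate the
       observed actions better than the execution-trained one.
    """
    cross = accuracy(fit_centroids(exe_x, exe_y), obs_x, obs_y)
    within = leave_one_out_accuracy(obs_x, obs_y)
    return cross > chance and within <= cross, cross, within
```

In this simplified form, a time bin counts as carrying a shared code only if both conditions hold; a bin in which the observation-trained classifier clearly outperforms the execution-trained one would be treated as a non-shared code.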

      With respect to concerns that the monkey may not have used the cue at all when the action was executed, we added a comparison with control trials with a non-informative cue and also compared the duration of the approach phase between the three actions. Regarding oculomotor behavior, we verified that the monkey had actually directed his gaze toward the action during action observation for all three actions.

      On the third issue, concerning the novelty of our results, we have now explained in more detail in the introduction why we felt it necessary to conduct a study we considered fundamental. As a result of our study, it can be clearly stated now that representations of observed actions as predicted by the strictly defined mirror mechanism are rare in F5 MNs, but nevertheless cannot be dismissed as random. This dispels the objection rightly raised by Csibra in 2005 and contradicts the currently prevailing view that such a representation can only be found at a population level. Even if these representations are ultimately explained by a concept other than the strictly defined mirror mechanism, their existence must be accounted for by any theory of the function of F5 neurons. Moreover, it is also shown that the observed actions are well discriminated with a non-shared code, at times even optimally. This contradicts the notion – which has been widespread for a long time since the work of Gallese et al. (Brain, 1996) – that mapping to motor representations in terms of broad congruence is simply not perfect. The applied cross-task decoding approach seems promising to test also in the future for a shared action code. Finally, reconsideration of alternative concepts has led us to highlight the possibility of a representation of a goal pursuit by the observer.

      Reviewer #1 (Public Review):

      The authors set out to investigate the hypothesis that mirror neurons in ventral premotor area F5 code actions in a common motor representation framework. To achieve this, they trained a linear discriminant classifier on the neural discharge of three types of action trials and test whether the thus trained classifier could decode the same categories of actions when observed. They showed that codes were fully matched for a small subset of neurons during the action epoch, while a wider set of "mirror neurons" showed only poorly matched codes for different epochs.

      This is one of the descriptions of our results, where we realized that in our first version we did not distinguish clearly enough between statements about single neurons and statements about populations of neurons. This prompted us to perform a detailed single neuron analysis.

      The authors controlled for potential visual object confounds by having identical objects be manipulated in three different ways and by having the animal carry out the motor execution in the dark. The main strength of the study lies in the clever decoding approach testing the matched tuning to behavioural categories in a model-free way. The central result is in the identification of the small sub-group of mirror neurons that show true matching during the execution epoch, which can dissociate the three types of action almost perfectly. This aligns well with some previous work while offering a novel avenue to identify and investigate those neurons. The underlying neuronal mechanism and behavioural relevance of these neurons remain an open question. It would have been interesting to understand better whether the specific motor representations at a recording site, for instance identified through microstimulation prior to recording (see Methods), the reaction times on individual trials or the specific gaze targets (object/hand) had a bearing on the decoding performance for a neuron/trial.

      We agree that these are interesting questions.

      In this study, the focus is on testing for a shared code according to a strictly defined mirror mechanism. We have now compared the anatomical locations of neurons with only time bins in which observed actions were discriminated with a shared code (according to one of the methods) to the locations of neurons with only time bins with non-shared code (see last paragraph in Results). We did not find any relevant difference and this is why one cannot expect topographically specific effects of microstimulation.

      We do not expect the reaction time (i.e., the time interval between LED onset and start button release, or the duration of the approach epoch) during execution or observation to have any effect on our results on shared coding as the analysis was based on relative time bins. The observed actions were predominantly distinguished late in the approach epoch, but especially in the manipulation epoch. At this time, reaction time is not expected to have a relevant influence.

      The relationship between gaze/eye position and the activity of mirror neurons, during execution or observation, is an interesting topic in itself. However, for testing for a shared code according to a strictly defined mirror mechanism, it is only relevant that the observing monkey actually observes the action. We have ensured this in our experiment by a fixation window and have now also confirmed that the monkey actually looked into the area of the object during all three actions (see Results, lines 209-219 in the manuscript with tracked changes).

      Ultimately, the uncovered matched mirror representations should in future experiments be tested with causal interventions and linked trial-by-trial to action selection performance.

      The authors put the focus of their discussion on the wider, less well-matched neuronal pool to support an action selection framework, which is of course a valid view and well established in motor representations. From a sensory perspective, sparse coding, as suggested by the small group of "true" mirror neurons identified with the decoding approach, should also be considered as the basis for a possible neuronal mechanism. A particular strength of the paper is that it could give new data and impetus to the important discussion about how motor and sensory coding frameworks come together in cortical processing.

      We have expanded the discussion considerably and also address the possibility of sparse coding.  

      Reviewer #2 (Public Review):

      The paper by Pomper and coworkers is an elegant neurophysiological study, generally sound from a methodological point of view, which presents extremely relevant data of considerable interest for a broad audience of neuroscientists. Indeed, they shed new light on the mirror mechanism in the primate brain, trying to approach its study with a novel paradigm that successfully controls for some important factors that are known to impact mirror neuron response, particularly the target object. In this work, a rotating device is used to present the very same object to the monkey or the experimenter, in different trials, and neurons are recorded while the monkey (motor response) or the experimenter (visual response) performed a different action (twist, shift, lift) cued by a colored LED.

      The results show that there is a small set of neurons with congruent visual and motor selectivity for the observed actions, in line with classical mirror neuron studies, whereas many more cells showed temporally unstable matched or even completely non-matched tuning for the observed and executed actions. Importantly, the population codes allow accurate decoding of both executed and observed actions and, to some extent, even cross-decoding of observed actions based on the coding principles of the executed ones.

      In my view, however, the original hypothesis that an observer understands the actions of others by the activation of his/her motor representations of the observed actions constitutes circular reasoning that cannot be challenged or falsified, as the authors may want to claim. Indeed, 1) there is no causal evidence in the paper favoring or ruling out this hypothesis (and there couldn't be), 2) there is no independent definition (neither in this paper nor in the literature) of what "action understanding" should mean (or how it should be measured). Instead, the findings provide important and compelling evidence for the recently proposed hypothesis that observed actions are remapped onto (rather than matched with) motor substrates, and this recruitment may primarily serve, as coherently hypothesized by the authors, to select behavioral responses to others (at least in monkeys).

      1) One of the main problems of this manuscript is, in my view, a theoretical one. The authors follow a misleading, though very influential, proposal, advanced since the discovery of mirror neurons: if there are (mirror) neurons in the brain of a subject with an action tuning that is matched between observation and execution contexts, then the subject "understands" the observed action. This is clearly circular reasoning because the "understanding" hypothesis uniquely derives from the neuron firing features, which are what the hypothesis should explain. In fact, there is no independent, operational definition of the term "understanding". Not surprisingly there is no causal evidence about the role of mirror neurons in the monkey, and the human studies that have claimed to provide causal evidence of "action understanding" ended up using, practically, operational definitions of "recognition", "match-to-sample", "categorization", etc. Thus, "action understanding" is a theoretical flaw, and there is no way "to challenge" a theoretical flaw with any methodologically sound experiment, especially when the flaw consists of circular reasoning. It cannot be falsified, by definition: it must simply be abandoned. On these bases, I strongly encourage the authors to rework the manuscript, from the title to the discussion, by removing any useless attempt to falsify or challenge a circular concept and, instead, constructively shed new light on how mirror neurons may work and which may be their functional role.

      Please see the response to all.

      2) An important point to be stressed, strictly related to the previous one, concerns the definition of "mirror neuron". I premise that I am perfectly fine with the definition used by the authors, which is in line with the very permissive one adopted in most studies of the last 20 years in this field. However, it does not at all fulfill the very restrictive original criteria of the study in which the "action understanding" concept was proposed (see Gallese et al. 1996 Brain): no response to object, no response to pantomimed action or tool actions, activation during execution in the dark and during the observation of another's action.

      We do not agree that the enumerated "very restrictive original criteria" emerge from the Gallese et al. (Brain, 1996) study. Except for the first paragraph in the results section, there is no clear statement on how mirror neurons should be defined.

      If the idea (which I strongly disagree with) was to simply challenge a (very restrictive) definition of mirroring (a very out-of-date one, indeed, and different from the additional implication of "action understanding"), the original definition of this concept should be at least rigorously applied. In the absence of additional control conditions, only the example neuron in Figure 2A could be considered a mirror neuron according to Gallese et al. 1996.

      We have the impression that the question does not distinguish clearly enough between the definition of "mirror neuron" and the definition of "mirror mechanism". In defining "mirror mechanism", we refer to the work of Rizzolatti and Sinigaglia (Nat Rev Neurosci, 2016). We do not think that this definition is out-of-date (see for example the 2018 article by Rizzolatti and Rozzi in Handbook of Clinical Neurology). If the term "mirror mechanism" is to be defined differently, then another term should be used for a new definition or an annotation should be added (such as "version 2"). This would be necessary to avoid unnecessary confusion resulting from unclear terms.

      Permissive criteria imply that more "non-mirror" neurons are accepted as "mirror": simply because they are permissively named "mirror" does not imply they are mirroring anything as initially hypothesized.

      Even for a neuron that would be classified as a "mirror neuron" according to your previously stated "very restrictive original criteria”, it does not follow that it "mirrors” according to a mirror mechanism. And, of course, it is quite possible that more neurons do not "mirror” according to a mirror mechanism if one tests more neurons.

      (Example neuron in Fig 2B, for example, could be related to mouth, rather than hand, movements, since it responds strongly and similarly around the reward delivery also during the observation task, when the monkey should be otherwise still).

      We agree, it is not excluded that this neuron has a relation to mouth movements. However, since the neuron meets the conditions to be classified as a "mirror neuron", an additional relation to mouth movements would not be relevant. If mouth movements are to be an exclusion criterion, then this would have to be included and justified in the definition of a "mirror neuron".

      Clearly, these concerns impact all the action preference analyses. To practically clarify what I mean, it should be sufficient to note that 74% (reported in this study) is the highest percentage ever reported so far in a study of neurons with "mirror" properties in F5 (see Kilner and Lemon 2013, Curr Biol) and it is similar to the 68% recently reported by these same authors (Pomper et al. 2020 J Neurophysiol) with very similar criteria. Clearly, there is a bias in the classification criteria relative to the original studies: again, no surprise if by rendering most of the recorded neurons "mirror by definition" then they don't "mirror" so much. I suggest keeping the authors' definition but removing the pervasive idea to challenge the (misleading) concept of understanding.

      We think that it is very important to clearly separate "mirror neuron" from "mirror mechanism". And the question arises whether one should not include a mirroring criterion, which is derived from a definition of a mirror mechanism, in the definition of mirror neurons. We address this briefly in the discussion. Ultimately, the point of our study is to find out how many of the - if you want to put it that way - "permissively defined" mirror neurons actually “mirror”. And the answer depends on how one defines “mirror mechanism”. We provide an answer by resorting to a “strictly defined mirror mechanism”. We have now also given throughout the results section the percentages of neurons with certain properties with respect to all measured F5 neurons. This is a reference that allows comparisons among studies, provided that no neurons were directly discarded during recording, which we avoided in our study.

      3) It would be useful to provide more information on the task. Panel B in Figure 1 is the unique information concerning the type of actions performed by the monkey and the experimenter. Although I am quite convinced of the generally low visuomotor congruence, there are no kinematics data nor any other evidence of the statement "the experimental monkey was asked to pay attention to the same actions carried out by a human actor". First, although the objects were the same, the same object cannot be grasped or manipulated in the same way by a human and a macaque, even just because of the considerable difference in the size of their hands; this certainly changes the way in which monkeys' and experimenter's hands interact with the same object, and this is a quantifiable (but not quantified) source of visuomotor difference between observed and executed actions and a potential source of reduced congruency.

      We agree, of course, that there are kinematic differences in how a monkey and how a human manipulate the same object. We have not measured the kinematics and thus cannot make a systematic statement about this. We now report in the results section the rather incidental observation that already the reaching trajectories for the three actions differed and show corresponding differences in the timing of the approach epoch. However, for the question of this study, how many neurons are eligible to represent observed actions according to a strictly defined mirror mechanism, the kinematic repertoire of the observed actor is irrelevant. The reference is the F5 mirror neuron activity during the monkey's own action, i.e., how the monkey approaches the object with his hand, how he grasps it, and how he brings it to a certain target position and holds it there. The observed action, according to the strictly defined mirror mechanism, is to be mapped to this reference. Therefore, we did not collect kinematic data. But it is of course a possible explanation for a non-shared code if the strictly defined mirror mechanism does not apply.

      Second, there is little information about monkey's oculomotor behavior in the two conditions, which is known to affect mirror neuron activity when exploratory eye movements are allowed (Maranesi et al. 2013 Eur J Neurosci), potentially influencing the present findings: a ±7 (vertical) and ±5 (horizontal) window at 49 cm implies that the monkey could explore a space larger than 10 cm horizontally and 14 cm vertically, which is fine, but certainly leaves considerable freedom to perform different exploratory eye movements, potentially different among observed actions and hence capable to account for different "attention" paid by the monkey to different conditions and hence a source of neural variability, in addition to action tuning.

      We agree that the topic of the relationship between F5 MN activity and eye movements is interesting. And we know from the work of Maranesi et al. (2013) that at least larger eye movements during action observation are related to the activity of F5 MNs. In our study, we ensured that the observing monkey was actually observing the action. For this purpose, we used a fixation window. We now additionally verified that the monkey really looked into the area of the object during all three actions (see Results, lines 209-219 in the manuscript with tracked changes). In our study, the fixation window was so small that the monkey could not see the face of the human actor, in contrast to the study of Maranesi et al. (2013). It was mainly the face that attracted the monkey's attention in that study (measured by gaze position). In our study, the risk that the gaze of the observing monkey left the fixation window was high when he looked at the human actor's hand above the wrist. The execution of the action by the monkey took place in darkness. We did not use a fixation window during execution because the monkey's own execution of the action can be assumed to direct his attention to the action.

      We cannot rule out the possibility that smaller eye movements during observation, larger eye movements during execution in darkness, covert shifts of spatial attention, or more generally attentional fluctuations have an influence on F5 MNs that might have counteracted a shared action code in our study. However, if this were the case, then the investigated hypothesis that the activity of F5 MNs during action observation is a motor representation according to the strictly defined mirror mechanism would also have to be rejected.

      4) Information about error trials and their relationship with action planning. The monkey cannot really "make errors" because, despite the cue, each object can be handled in a unique way. The monkey may not pay attention to the cue and adjust the movement based on what the object permits once grasped, depending on online object feedback. From the behavioral events and the times reported in Table 1, I initially thought that "shift" action was certainly planned in advance, whereas "lift" and "twist" could in principle be obtained by online adjustments based on object feedback; nonetheless, from the Methods section it appears that these times are not at all informative because they seem to depend on an explicit constraint imposed by the experimenters (in a totally unpredictable way). Indeed, it is stated that "to motivate the monkey even more to use the LED in the execution task, another timeout was active in 30% (rarely up to 100%) of trials for the time period between touch of object to start moving the object: 0.15 (rarely 0.1) for a twist and shift, 0.35 (rarely 0.3s) for a lift". This is totally confusing to me; I don't understand 1) why the monkey needed to be motivated, 2) how can the authors be sure/evaluate that the monkeys were actually "motivated" in this way, and 3) what kind of motor errors the monkey could actually do if any. If there is any doubt that the monkeys did actually select and plan the action in advance based on the cue, there is no way to study whether the activity during action execution truly reflects the planned action goal or a variety of other undetermined factors, that may potentially change during the trials. Please clarify.

      It is true that the three actions could in principle be performed without using the LED as an informative cue. While this is unlikely under the assumption that a monkey prefers the easiest and fastest way to get reward, it remains a possibility. For this reason, we introduced time constraints in a part of the trials. The selection of time constraints and the proportion of trials in which they were applied, was a pragmatic compromise between a time limit, at which the LED must be used as an informative cue for action selection in order to comply with the task, and a time span that allows the task to be completed even when overall motivation is low. The latter takes into account the general experimental experience that a monkey's engagement or motivation in such experiments varies across trials, sessions, and days. To evaluate whether the LED color was, indeed, used as a cue for action planning in the execution task, we randomly interleaved trials with a different LED, non-informative regarding the type of object, as a control in 5% of the trials. We compared the behavioral responses in trials with informative cues and those with a non-informative cue. The behavioral analysis established that both monkeys indeed used the informative cues to guide their choices (see Fig. 1D).

      Further evidence that the monkey used the cue for action selection and planning is the finding that the type of action was encoded before the release of the start button and then further during the approach phase, i.e., much earlier than somatosensory feedback about the manipulability of the object was available (see Fig. 3A and Fig. 6A).

      Regarding the question of which "motor errors" were possible, the answer can be found in the description of the cases in which a trial was aborted (see Material and methods): releasing the start button too early (< 100 ms after turning on the LED), manipulating the object too slowly after touching it (the time constraints mentioned), not holding the object until the reward was given, or not performing the task at all (10 s timeout).

      5) Classification analysis. There seems to be no statistical criterion to establish where and when the decoding is significantly higher than chance: the classifier performance should be formally analyzed statistically. I would expect that, in this way, both the exe-obs and the obs-exe decoding may be significant. Together with the considerations of the previous point 2 about the permissive inclusion criteria for mirror neurons, this is a remarkable (even quite unexpected) result, which would prove somehow contrary to what the authors claim in the title of the paper. The fact that in any classification the "within task" performance is significantly better than the "between task" performance does not appear in any way surprising, considering both the inclusive selection criteria for "mirror neurons" and the unavoidably huge different sources of input (e.g. proprioceptive, tactile, top-down, etc. afferences) between execution and observation. So, please add a statistical criterion to establish and show in the figures when and where the classifications are significantly above chance.

      We have added - in addition to the statistics already performed in the first version (Fig. 3A in the previous version, now Fig. 6A) - a number of analyses including statistics. This mainly concerns the analyses regarding a shared code at the single neuron level, in which we additionally tested against the null hypothesis proposed by Csibra in 2005 using permutation tests. And we have now also calculated confidence intervals for the population classifications that allow the comparison with chance level. We re-performed the classification analyses using eight-fold cross-validation. We also added a statistical analysis to the finding of clustering of time periods with shared code (Fig. 4). In Figure 5, we additionally compared the frequency of action segments with shared and non-shared codes, which is a descriptive, exploratory analysis. For this reason, it does not make sense to perform inferential statistics. Overall, these analyses represent a significant expansion of the analyses in the first version. We have done this primarily to arrive at statistically sound conclusions at the single neuron level.
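
      To illustrate what testing decoding accuracy against chance can look like, the following is a minimal sketch and not the analysis code used in the study. It assumes a simple nearest-centroid classifier, synthetic firing rates for three actions, interleaved eight-fold cross-validation, and a label-permutation null distribution; all function names and parameter values are hypothetical.

```python
import random

ACTIONS = ["lift", "twist", "shift"]


def make_synthetic_trials(n_per_action=24, n_neurons=5, seed=1):
    """Synthetic population rates: neuron k fires more for action k (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_action):
        for k, action in enumerate(ACTIONS):
            X.append([rng.gauss(4.0 if i == k else 0.0, 1.0) for i in range(n_neurons)])
            y.append(action)
    return X, y


def fit_centroids(X, y):
    """Return {label: mean rate vector} computed from the training trials."""
    grouped = {}
    for rates, label in zip(X, y):
        grouped.setdefault(label, []).append(rates)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, rates):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    return min(centroids,
               key=lambda lab: sum((r - c) ** 2 for r, c in zip(rates, centroids[lab])))


def cv_accuracy(X, y, n_folds=8):
    """Accuracy pooled over n_folds interleaved cross-validation folds."""
    correct = 0
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        test = [i for i in range(len(X)) if i % n_folds == fold]
        centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(centroids, X[i]) == y[i] for i in test)
    return correct / len(X)


def permutation_test(X, y, n_perm=200, n_folds=8, seed=0):
    """Compare the observed accuracy with accuracies obtained from shuffled labels."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y, n_folds)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cv_accuracy(X, shuffled, n_folds) >= observed:
            exceed += 1
    # The +1 terms count the observed labeling as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)


X, y = make_synthetic_trials()
observed, p_value = permutation_test(X, y)
```

      With three equally frequent actions the chance level is 1/3; the permutation p-value states how often shuffled labels reach the observed accuracy.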

      Regarding the comparison between within-task classification (o2o) and cross-task classification (e2o), it is important to keep in mind that the goal was to test the hypothesis that the activity of F5 MNs during action observation is a motor representation of the observed action according to the strictly defined mirror mechanism. This hypothesis requires both 1) an above chance level accuracy of the e2o classifier and 2) no better accuracy of the o2o classifier as compared to the e2o classifier. If the o2o classifier were better, then the actions would not be represented as they are executed. And the reference in this hypothesis is the motor representation, that is, the code at execution. Thus, the e2o direction of classification is the crucial one, not the reverse direction (o2e). One explanation for the fact that o2o shows better accuracy in the population may be the different sensory inputs mentioned above. In this case, the tested hypothesis has to be rejected and replaced by another one, which should then have a different name.
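
      The two requirements can be made concrete with a small sketch. It is a toy illustration under stated assumptions, not the classifier used in the study: a nearest-centroid readout, noise-free hypothetical firing-rate patterns with a small deterministic jitter, and two invented observation scenarios (a shared code and a remapped code).

```python
def fit_centroids(trials):
    """trials: list of (label, rates). Return {label: mean rate vector}."""
    grouped = {}
    for label, rates in trials:
        grouped.setdefault(label, []).append(rates)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def accuracy(centroids, trials):
    """Fraction of trials whose nearest centroid carries the correct label."""
    def predict(rates):
        return min(centroids,
                   key=lambda lab: sum((r - c) ** 2 for r, c in zip(rates, centroids[lab])))
    return sum(predict(rates) == label for label, rates in trials) / len(trials)


def e2o_and_o2o(exe_trials, obs_train, obs_test):
    """Cross-task (execution-trained) and within-task (observation-trained)
    accuracy, both evaluated on the same held-out observation trials."""
    e2o = accuracy(fit_centroids(exe_trials), obs_test)
    o2o = accuracy(fit_centroids(obs_train), obs_test)
    return e2o, o2o


def make_trials(patterns, jitter_steps):
    """Build trials from mean patterns plus a small deterministic offset."""
    return [(label, [rate + 0.2 * (j - 1.5) for rate in pattern])
            for label, pattern in patterns.items() for j in jitter_steps]


# Hypothetical mean rates of three neurons for each action.
EXE = {"lift": [10, 2, 2], "twist": [2, 10, 2], "shift": [2, 2, 10]}
OBS_SHARED = EXE  # observation uses the same code as execution
OBS_REMAPPED = {"lift": [2, 10, 2], "twist": [2, 2, 10], "shift": [10, 2, 2]}

exe_trials = make_trials(EXE, range(4))
shared = e2o_and_o2o(exe_trials,
                     make_trials(OBS_SHARED, [0, 2]), make_trials(OBS_SHARED, [1, 3]))
remapped = e2o_and_o2o(exe_trials,
                       make_trials(OBS_REMAPPED, [0, 2]), make_trials(OBS_REMAPPED, [1, 3]))
```

      In the shared scenario both accuracies are equal, which is what the tested hypothesis predicts. In the remapped scenario the actions remain fully decodable within observation, but not with the execution-trained readout, so the hypothesis would be rejected.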

      Nevertheless, we also show the result of the o2e cross-task classification in Fig. 6 (yellow curve), which was already included in Fig. 3 of the first version. However, we do not address it in more detail in the main text because it is not relevant for the hypothesis to be tested. It is only a reportable additional result.

      6) "As the concept of a mirror mechanism posits that the observation performance can be led back to an activation of a motor representation, we restricted this analytical step to a comparison of the exe-obs and the obs-obs discrimination performance". I don't understand the rationale of this choice. The so-called "concept" of mirror mechanism in classical terms posits that mirror neurons have a motor nature and hence their functioning during observation should follow the same principle as during action execution. But this logical consideration has never been demonstrated directly (it is indeed contested by several papers), and when motor neurons are concerned (e.g. pyramidal tract neurons, see Kraskov et al. 2009) their behavior during action observation is by far more complex (e.g. suppression vs facilitation) than that hypothesized for classical "mirror neurons". Furthermore, when across-task decoding for execution and observation code has been used, both in neurophysiological (e.g. Livi et al. 2019, PNAS) and neuroimaging (Fiave et al. 2018 Neuroimage) data, the visual-to-motor direction typically produces better performance than the opposite one. Thus, I don't see any good reason not to show also (if not even just) the obs-exe results. Furthermore, I wonder whether the possible impact of a rescaling in the single neuron firing rate across contexts has been considered, as the observation response is typically less strong than the execution response in basically all brain areas hosting neurons with mirror properties, and this should not impact on the matching if the tuning for the three actions remains the same (e.g. see Lanzilotto et al. 2020 PNAS). The analysis shown in Figures 4 and 5 is, for the rest, elegant and very convincing - somehow surprising to me, as the total number of "congruent" neurons (7.5%) is even greater than in the original study by Gallese et al. (5.4%).

      As to the rationale of our approach, please see our response to the previous point.

      On the issue of rescaling: the hypothesis tested here requires that the F5 MNs activity on observation is a motor representation of the observed action. Hence, from the activity during observation the action should be just as readable as from the execution-related activity. If we had to use rescaling to find a shared code, then observed actions would not be represented in F5 MNs in the same way as on execution. Additional information on whether the action is being executed or observed would be needed. This would of course be possible in principle, but would contradict the hypothesis. And we then not only have the difficulty of which readout is the physiological one (here we make a parsimonious assumption with a linear readout), but we would have to make an additional assumption about rescaling. For this study, we have now chosen the solution of performing the action preference analysis on a single neuron level in a statistically clean way. This represents a very liberal form of rescaling, as it only tests whether the action with the highest or lowest discharge rate is the same when executed and observed. That is, if the result here is not fundamentally different, which is the case, then it can also be assumed that one does not get qualitatively different results for other forms of rescaling.
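
      Why a comparison of action preferences tolerates rescaling can be seen in a short sketch. It only compares which action has the highest and which the lowest mean rate in each task; the permutation statistics of the actual analysis are omitted, and the firing rates, the gain of 0.3, and the offset of 2.0 are hypothetical values chosen for illustration.

```python
def preference_match(exe_rates, obs_rates):
    """Return (same_preferred, same_least_preferred) for one neuron,
    given mean rates per action during execution and observation."""
    same_max = max(exe_rates, key=exe_rates.get) == max(obs_rates, key=obs_rates.get)
    same_min = min(exe_rates, key=exe_rates.get) == min(obs_rates, key=obs_rates.get)
    return same_max, same_min


# Hypothetical mean rates (spikes/s) of one neuron during execution.
exe = {"lift": 40.0, "twist": 25.0, "shift": 10.0}

# Observation response as a weaker, shifted copy of the execution tuning.
obs_rescaled = {action: 0.3 * rate + 2.0 for action, rate in exe.items()}

# Observation response with a different tuning across the three actions.
obs_remapped = {"lift": 5.0, "twist": 12.0, "shift": 8.0}

rescaled_result = preference_match(exe, obs_rescaled)
remapped_result = preference_match(exe, obs_remapped)
```

      Any positive gain and any offset leave the rank order of the three actions unchanged, so a neuron with rescaled but otherwise identical tuning still counts as matching, whereas a neuron with different tuning does not.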

      7) The discussion may need quite deep revision depending on the authors' responses and changes following the comments; for sure it should consider more extensively the numerous recent papers on mirror neurons that are relevant to frame this work and are not even mentioned.

      The discussion has been thoroughly revised considering the comments raised and suggestions of this and the other two reviewers.

      Reviewer #3 (Public Review):

      Mirror neurons are a big deal in the neuroscience literature and have been for thirty years. I (and many others) remain skeptical of whether they serve the functions often attributed to them - specifically, whether they are motor planning neurons that contribute to understanding the actions of others. Testing their functions, therefore, is of great interest and importance. The present study, however, is not a cogent or convincing test. I do not think this study helps to answer the questions surrounding mirror neurons. It purports to provide a crucial test, that comes out mostly against the mirror neuron hypothesis, but the test has too many weaknesses to be convincing.

      Thank you for the clear words. We take from it, first of all, that in the first version of the manuscript we failed to convey the relevance of our study for the discussion of mirror neuron function. The concerns of this reviewer are in line with those of the others and are addressed in our response to all three reviewers.

      First, consider that the motor tuning and the visual tuning match "poorly." How poor or good must the match be before the mirror neuron hypothesis is rejected? I do not know, and the study does not help here. Even a "poor" match could contribute significantly to a social perception function.

      The specific hypothesis tested here assumes that an action-specific activity of F5 MNs evoked by observed actions corresponds to an action-specific activity of these actions if executed. The approach taken here to compare cross-task classification accuracy (execution-trained, tested in observation) with within-task classification accuracy (observation-trained, tested in observation) tests this hypothesis. The fact that we found a cluster of time periods of single neurons in which both accuracies are almost equal supports this approach and also the hypothesis for these time periods. In principle, of course, the decision for the presence of a difference or equality is always only a statistical statement and contains assumptions. For example, the assumption that a linear readout has physiological relevance enters here. But this problem exists in all studies that ultimately try to understand biological neuronal networks in order to explain perceptions and behavior. However, it is precisely studies of this kind, which attempt to elucidate what information is contained in which neurons, that set the stage for experiments that, in the optimal case, manipulate specific neurons in a targeted way and then measure the behavior that is expected to depend on those neurons.

      Second, the results remind me in some ways of other multi-modal responses in the brain. For example, in the visual area MST, neurons are tuned to optic flow fields that imply specific directions of self-motion. Many of the same neurons are tuned to vestibular signals that also imply specific directions of self-motion. But the optic flow tuning and the vestibular tuning are not perfectly matched. There is considerable slop and complexity in how the two tunings compare within individual neurons. That complexity is not evidence against multi-modal tuning. Instead, it suggests a hidden-layer complexity that is simply not fully understood yet. Just so here, the fact that the apparent motor tuning and apparent visual tuning match "poorly" is not evidence against both a motor planning and a visual encoding function.

      We hope that it is now clearer, in contrast to the first version, that we tested a specific hypothesis that is only a prerequisite for the hypothesis of a very specific form of understanding. Referring to the example, the hypothesis analogous to ours would be that the representation of self-motion direction due to optic flow ("observation") corresponds to the representation of self-motion direction due to vestibular stimulation ("execution"). If it were then found that the self-motion direction due to optic flow cannot be predicted from a classifier trained on vestibular stimulation, and that another classifier trained on optic flow performs better, then the hypothesis would have to be rejected. This is then a reason to realize that "everything is a bit more complex" and to search for better explanations.

      Third, the animals are massively over-trained in three actions. They perform these actions and see them performed thousands of times toward the same object. Surely, if I were in the place of the monkey, every time I saw the object, I'd mentally imagine all three actions. As I saw a person act on the object, I'd mentally imagine the alternative two actions at the same time. Even if the mirror neuron hypothesis is strictly correct, this experiment might still find a confusion of signals, in which neurons that normally might respond mainly to one action begin to respond in a less predictable way during all three trial types.

      In our study, we tested a specific hypothesis related to the time an action is observed. Here, you suggest an alternative hypothesis. The question is whether this alternative hypothesis better explains the result of our study. The alternative hypothesis can be formulated as follows: the F5 MNs activity elicited by an observed action in this experiment corresponds to a mixture of the activities that occur when the other two actions are executed. This hypothesis is to be rejected because it fails to explain why a shared code occurs in single neurons and why cross-task population classifiers show an accuracy above chance level. A modified alternative hypothesis, which states that what is represented in the experiment during observation is a mixture of all three actions, cannot explain why the three actions are very well represented in the population and are optimally represented exactly when the target position of the object is reached.

      Fourth, the experiment relies on a colored LED that acts as an instructional cue, telling the monkey which action to perform. What is to stop the neurons from developing a cue-sensitive response, as in classic studies from Steve Wise and others in the premotor cortex? Perhaps the neuronal signal that the experimenters are trying to measure is partly obscured by other, complex responses influenced in some manner by the instructional cue?

      In principle, there is the possibility that purely sensory information is also represented in area F5, at least in some neurons or at certain points in time. We take your suggestion and discuss this as one of the alternative concepts (we call it "sensory concept"). However, several findings argue against this concept. For example, neural responses to cues usually represent the subsequent action, but not sensory information of the cue such as the color of the cue. In our study, it is evident from Figures 3A, 6A and 6B that during action execution, actions are discriminated even before the start button is released. Since this discrimination of actions occurs with a time delay after the cue and then increases continuously, this is evidence that the action to be executed is represented, but not the cue itself.

      Fifth, finally, and most importantly, the fundamental problem with this study is that it is correlational. Studies that purport to test the function of a set of neurons, and do so by use of correlational measurements, cannot provide strong answers. There are always half a dozen different interpretations and caveats, such as the ones I raised here. Both sides of a debate can always spin the results, and the arguments are never resolved. To test the mirror neuron hypothesis properly would require a causal study. For example, lesion area F5 and test if the monkey is less able to discriminate the actions of others. Or, electrically microstimulate in area F5 and test if the stimulation interferes (either constructively or destructively) with the task of discriminating the actions of others. Only in this way will it be possible to answer the question: do mirror neurons functionally participate in understanding the actions of others? The present study does not answer that question.

      We would like to reiterate that studies aimed at elucidating what information is contained in which neurons or areas are necessary to understand neural network processes and are a prerequisite for conducting well-considered experiments that measure behavioral effects through specific manipulation of the neural network. Without the work of Gallese, Rizzolatti and colleagues, the idea of associating F5 neurons with action understanding would not have occurred in the first place. The current tricky question is whether at all, and if so, to what understanding, to what perception, to what behavior that uses information about mental states of another, F5 MNs might be able to contribute. And for this, it helps to have a clearer idea of what information is contained in F5 MNs during action observation.

    1. eLife assessment

      This study presents a valuable dataset and tool that can aid in arthropathies' assessment, potentially enabling such evaluation to be done outside the lab. There is solid evidence supporting the comparison between the force plate and insole data, which can be strengthened by improvements in cross-validation, but the evidence for distinguishing disease signatures and elimination of walking speed as a factor is inconclusive and would need further analysis. This work will be of interest to physical therapists, clinicians, and researchers in the field of ankle/knee/hip osteoporosis and other lower limb joint diseases.

    2. Reviewer #1 (Public Review):

      This work aims to evaluate the use of pressure insoles for measurements that are traditionally done using force platforms in the assessment of people with knee osteoarthritis and other arthropathies. This is vital for providing an affordable assessment that does not require a fully equipped gait lab as well as utilizing wearable technology for personalized healthcare.

      Towards these aims, the authors were able to demonstrate that individual subjects can be identified with high precision using raw sensor data from the insoles and a convolutional neural network model. The authors have done a great job creating the models and combining an already available public dataset of force platform signals and utilizing them for training models with transferable ability to be used with data from pressure insoles. However, there are a few concerns, regarding substantiating some of the goals that this manuscript is trying to achieve.

      In addressing these concerns, if the results are further corroborated using the suggestions provided to the authors, this provides an exciting tool for identifying an individual's gait patterns out of a cluster of data, which is extremely useful for providing identifiable labels for personalized healthcare using wearable technologies.

    3. Reviewer #2 (Public Review):

      The authors aimed to investigate whether digital insoles are an appropriate alternative to laboratory assessment with force plates when attempting to identify the knee injury status. The methods are rigorous and appropriate in the context of this research area. The results are impressive, and the figures are exceptional. The findings of this study can have a great impact on the field, showing that digital insoles can be accurately used for clinical purposes. The authors successfully achieved their aims.

    4. Reviewer #3 (Public Review):

      In this manuscript, the authors describe the development of a machine-learning model to be used for gait assessment using insole data. They first developed a machine learning model using an existing, large data set of ground reaction forces collected during walking with force plates in a lab, from healthy adults and a group of people with knee injuries. Subsequently, they tested this model on ground reaction forces derived from insoles worn by a group of 19 healthy adults and a group of n=44 people with knee osteoarthritis (OA). The model was able to accurately identify individuals belonging to the knee OA group or the healthy group using the ground reaction forces during walking. Note: I do not have expertise on machine learning and will therefore refrain from reviewing the ML methods that were applied in this paper.

      Strengths: The authors successfully externally validated the trained model for GRF on insole data. Insole data carries potentially rich information, including the path of the CoP during the stance phase. The additional value of insoles over force plates in itself is clear, as insoles can be used independently of laboratory facilities. Moreover, insoles provide information on the CoP path, which can have added value over other mobile assessment methods such as inertial sensors.

      Limitations: The second ML model, using only insole data to identify knee arthropathy from healthy subjects, was trained on a small sample of subjects. Although I have no background in ML, I can imagine that external validation in an independent and larger sample is needed to support the current findings.

      Gait speed has a major influence on the majority of gait-related outcomes. Slow or more cautious gait, due to pain or other causes, is reflected in vertical GRF's with less pronounced peaks. A difference in gait speed between people with pain in their knee (due to injury) and healthy subjects can be expected. This raises the question of what the added value of a model to estimate vertical GRF is over a simpler output (e.g. gait speed itself). Moreover, the paper does not elucidate what the added value of machine learning is over a simpler statistical model.

      In line with this issue, the current analyses do not strongly convince me that the described model identified a knee arthropathy-specific signature. Only knee arthropathy vs healthy (relatively young) subjects was compared, and we cannot rule out that this group only reflects general cautious, slow, or antalgic gait. As such, the data does not provide any evidence that the tool might be valuable to identify people with more or less severity of symptoms, or that the tool can be used to discriminate knee osteoarthritis from hip, or ankle osteoarthritis, or even to discriminate between people with musculoskeletal diseases and people with neurological gait disorders. This substantially limits the relevance for clinical (research) practice. In short, the output of the model seems to be restricted to "something is going on here", without further specification. Further development towards more specific aims using the insole data may substantially amplify clinical relevance.

    1. eLife assessment

      This study makes important contributions to our understanding of spinal locomotor circuits by manipulating the function of excitatory and inhibitory V2 interneurons and revealing their role in locomotor control. The data collected and the methods used by the authors are solid and the authors suggest that V2 excitatory and inhibitory neurons have antagonistic functions in intralimb coordination. This work will be of broad interest for neuroscientists studying development and function of motor circuits.

    2. Reviewer #1 (Public Review):

      This is a well-written manuscript addressing a fundamental question regarding the functional organization of spinal circuits controlling the execution of locomotor movements. The authors take advantage of the power of mouse genetics to exploit the expression of Hes2 to study the function of the whole population of V2 interneurons. Previous studies could only focus on either the excitatory V2a or inhibitory V2b subpopulations. Here, by combining two different genetic manipulations based on either silencing or acute ablation of V2 interneurons with rigorous functional analysis the authors showed that V2 interneurons can act together to control interlimb coordination and antagonistically to regulate joint movements. The data are convincing and properly analyzed, the conclusions are in line with the results, and the limitations of the study are appropriately addressed. The discussion nicely frames the work in a conceptual framework that takes into account the current literature on the mode of operation of spinal motor circuits. There are a few weaknesses that should be addressed and would further improve what is already a very nice study.

      1) While previous work from the authors has consistently shown the validity and reliability of these neuronal silencing and ablation approaches, the study presents no data showing the efficiency and specificity of these genetic manipulations. These are critical parameters for interpreting the results and should be presented, especially considering that the strategies employed are susceptible to the limitations of a lineage-tracing approach. These data would also be important for the discussion section to interpret the differences between the two genetic models and could address some of the options proposed by the authors, as well as the possibility of incomplete and/or unexpected recombination.

      2) The authors suggest that the changes in interlimb coordination are "consistent with mice keeping the limbs closer to the body, limiting forward movements in the attempt to preserve body stability". A common reaction to body instability in quadrupeds is a widening of the limbs to lower the center of gravity: limbs are positioned further away from the body. Not quite sure whether I would be so certain of the interpretation that the observed phenotypes are due to body/postural instability. It is possible that the changes in gait are just a direct consequence of the inactivation of V2 interneurons. To clarify this issue, it could be useful to test whether other features of postural control are affected by perturbation of V2 neurons, for example, swimming and rearing analyses would provide interesting insights.

    3. Reviewer #2 (Public Review):

      The manuscript by Hayashi et al provides the characterization of a new mouse line that targets V2 neurons and demonstrates the locomotor consequences of manipulating the large V2 population. Prior work has examined the effects of silencing and/or ablation of the excitatory V2a and inhibitory V2b neuronal populations independently. Since the two populations are derived from the same V2 lineage but have opposite transmitter phenotypes, one may expect some common synaptic targets and/or similar or complementary functional roles that require excitatory/inhibitory balance. Overall, the value and importance of the study is that comparison of prior manipulations of the V2a and V2b populations (individually in prior studies) with the more global V2 manipulation (here) provides additional insights into spinal locomotor circuitry.

      The authors successfully generate a new Hes2cre mouse line that targets the V2 population with high accuracy. The characterizations as far as the specificity and efficiency of the line are compelling. This line is then used to examine the locomotor effects of, first, synaptically silencing all Hes2 neurons throughout the neuroaxis beginning in early development and, then, ablating spinal Hes2 neurons in the adult. The phenotypes of both groups of mice are quite similar, with some small exceptions. The most obvious disturbance in both is the shortened steps, faster step cycle, and more steps required to travel the same distance. As the authors point out, much of the phenotype may be due to a disruption in balance. Interestingly, the hyperextension that is characteristic of V2b neuronal ablation is lost when the function of V2a neurons is compromised as well, suggesting antagonistic functions of these populations in intralimb coordination.

      The experiments are rigorous and the data are clearly presented. The findings are interesting to consider in context with prior work. Some comparisons are difficult since gait is not considered and one of the major roles of spinal V2a neurons has been demonstrated to be speed/gait-dependent. The ipsilateral deficits are a major conclusion but some of the supporting data are not clearly derived (or there was an error in the figure?). The use of spinal restricted manipulation removes many of the potential confounds of the full Hes2 silencing. It is still, however, not possible to disentangle the local spinal circuit effects from altered proprioceptive input pathways or ascending information from the lumbar cord to the cervical regions or the brainstem. Although of value to inform future experiments, this impacts the strength of the conclusions that can be drawn.

    4. Reviewer #3 (Public Review):

      Hayashi et al., investigate the role of spinal neurons derived from the V2 progenitor domain. They identify a molecular marker, Hes2, specific to the V2 lineage in the spinal cord. The authors use this result to generate a new mouse line allowing specific access to the Hes2 lineage and show that this lineage is composed of excitatory V2a and inhibitory V2b spinal interneurons plus some populations of supraspinal neurons. Taking advantage of this new tool, they demonstrate that the developmental silencing of the Hes2 lineage leads to a disruption of mouse locomotor gait characterized by shorter strides and an increased cadence with no alteration of the alternation between flexion and extension. In addition, the authors show that the silencing of the Hes2 lineage also leads to an alteration of the interlimb coordination and a decreased capacity of the mice to achieve complex motor tasks. Using an intersectional genetic approach, the authors further demonstrate that the selective ablation of spinal V2 neurons in adult mice recapitulates the festination phenotype as well as the altered execution of complex motor tasks.

      By identifying a novel marker of the V2 lineage in the spinal cord and using this finding to generate a new mouse line Hayashi and colleagues suggest an intriguing interplay between excitatory and inhibitory V2 spinal neurons modulating differentially, multiple facets of motor behavior.

      The conclusions of this study are for the vast majority well supported by data. However, a few additional validations of the mouse model that is used and clarification about the methods of statistical analysis would improve the quality of this manuscript.

      1) Additional validations of the Hes2iCre mouse line generated and used in this study would improve the quality of the manuscript as well as shed light on the potential value of the use of the Hes2iCre mouse line for future investigations.

      - When reporting the cell population labeled by GFP in Hes2iCre; R26LSL-Sun1-GFP the authors need to report the number of animals on which these quantifications were performed to strengthen their conclusions (Figure 3C-E). Similarly, when showing the number of Hes2+, Chx10+ (V2a) and GATA3+ (V2b) neurons in Hes2iCre heterozygous vs homozygous the number of animals should be reported (Figure 3G; Figure S2E-F).

      - The numbers of Hes2+, Chx10+ (V2a) and GATA3+ (V2b) neurons in Hes2iCre heterozygous vs homozygous animals are reported. However, it would improve the validation of the mouse line if the authors could provide a quantification of the numbers of Chx10+ and GATA3+ cells in heterozygous Hes2WT/iCre animals versus littermates lacking the Cre.

      - Although the study focuses on spinal V2 neurons and the intersectional approach used in the last part of the paper is compelling, a better description of the supraspinal neurons that are part of the Hes2 lineage would give a better insight into the potential contribution of the supraspinal Hes2 lineage to the motor phenotype described in Hes2-silenced mice. In particular, an experiment showing whether V2 (especially Chx10+ V2a) neurons from the medullary reticular nucleus are part of the Hes2 lineage would allow us to get a better grasp on the potential supraspinal effect of Hes2 neuron silencing.

      2) A section in the methods explaining the statistical analysis is needed. In this section, the choice of each statistical test should be clearly explained and the assumptions stated. Although the intersectional genetic approach is challenging and does not allow for obtaining numerous animals, the use of parametric Student's t-tests on groups with only 4 animals is debatable and at least needs to be justified in the methods (results presented in Figure 6 and Figure S5). When the number of statistical units allows it, the normality of the distributions and the homoscedasticity should be tested prior to the use of a parametric test. In some instances, tests taking into account the hierarchical structure of the data could be used. Furthermore, running statistical analysis on what seems to be a group of n=2 statistical units (Figure S3L) is not appropriate.

      3) Although this decision belongs to the authors, the use of the term "synergy" in the title and abstract might be misleading and might lead to confusion regarding the important outcome of this study. The authors show compelling evidence that the spinal ablation of the V2 lineage leads to a disruption of the ipsilateral coordination of body movements. However, as well explained by the authors, prior studies ablating individual V2a and V2b populations did not show any abnormal ipsilateral body coordination. This rather suggests a redundant or complementary function of inhibitory and excitatory V2 spinal neurons in spinal circuits, with the possibility for one individual population to compensate for the effect on the ipsilateral coordination following the ablation of the other population. Alternatively, "synergy" may suggest a simultaneous activity of V2a and V2b neurons that is not in the scope of this work.
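
      To make the sample-size concern in point 2 concrete, the following is a minimal sketch (not the authors' analysis) of an exact two-sided permutation test on the difference in group means, written in Python with only the standard library. The function name `exact_permutation_p` and the stride-length values are hypothetical and chosen purely for illustration. With 4 animals per group there are only 70 possible group assignments, so the smallest attainable two-sided p-value is 2/70; with 2 per group it is 2/6.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means.

    Enumerates every way of splitting the pooled observations into groups of
    the original sizes and counts splits at least as extreme as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    as_extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point round-off in the comparison.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical data: completely separated groups, the most extreme case possible.
control = [5.1, 5.4, 5.0, 5.3]  # made-up stride lengths (cm)
ablated = [4.2, 4.0, 4.4, 4.1]  # made-up stride lengths (cm)
print(exact_permutation_p(control, ablated))  # 2/70, about 0.0286
print(exact_permutation_p([1, 2], [3, 4]))    # 2/6, about 0.333
```

      Even for perfectly separated groups, n=4 per group only just clears the conventional 0.05 threshold in this test, and n=2 per group cannot reach it at all, which is why the choice of test for such small groups needs explicit justification.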

    1. eLife assessment

      This study seeks to determine how synaptic relationships between principal cell types in the olfactory system vary with glomerulus selectivity and is therefore valuable to the sub-field. The methodology is solid, but technical limitations require that claims regarding local interneurons be tempered as they were grouped with other neuron types for analyses, and with only one sample from each glomerulus, it is difficult to assess the import of differences between glomeruli without measures of inter-animal variability.

    2. Reviewer #1 (Public Review):

      In this manuscript, Gruber et al perform serial EM sections of the antennal lobe and reconstruct the neurites innervating two types of glomeruli - one that is narrowly tuned to geosmin and one that is broadly tuned to other odours. They quantify and describe various aspects of the innervations of olfactory sensory neurons (OSNs), uniglomerular projection neurons (uPNs), and the multiglomerular local interneurons (LNs) and PNs (mPNs). They find that narrowly tuned glomeruli had stronger connectivity from OSNs to PNs and LNs, and considerably more connections between sister OSNs and sister PNs than the broadly tuned glomeruli. They also had less connectivity with the contralateral glomeruli. These observations are suggestive of strong feed-forward information flow with minimal presynaptic inhibition in narrowly tuned glomeruli, which might be ecologically relevant, for example, while making quick decisions such as avoiding a geosmin-laden landing site. In contrast, information flow in more broadly tuned glomeruli shows much more lateralisation of connectivity to the contralateral glomerulus, as well as to other ipsilateral glomeruli.

      The data are well presented, the manuscript clearly written, and the results will be useful to the olfaction community. I wonder, given the hemibrain and FAFB datasets exist, whether the authors have considered verifying whether the trends they observe in connectivity hold across three brains? Is it stereotypic?

    3. Reviewer #2 (Public Review):

      The chemoreceptor proteins expressed by olfactory sensory neurons differ in their selectivity such that glomeruli vary in the breadth of volatile chemicals to which they respond. Prior work assessing the relationship between tuning breadth and the demographics of principal neuron types that innervate a glomerulus demonstrated that narrowly tuned glomeruli are innervated by more projection neurons (output neurons) and fewer local interneurons relative to more broadly tuned glomeruli. The present study used high-resolution electron microscopy to determine which synaptic relationships between principal cell types also vary with glomerulus tuning breadth using a narrowly tuned glomerulus (DA2) and a broadly tuned glomerulus (DL5). The strength of this study lies in the comprehensive, synapse-level resolution of the approach. Furthermore, the authors implement a very elegant approach of using a 2-photon microscope to score the upper and lower bounds of each glomerulus, thus defining the bounds of their restricted regions of interest. There were several interesting differences including greater axo-axonic afferent synapses and dendrodendritic output neuron synapses in the narrowly tuned glomerulus, and greater synapses upon sensory afferents from multiglomerular neurons and output neuron autapses in the broadly tuned glomerulus.

      The study is limited by a few factors. There was a technical need to group all local interneurons, centrifugal neurons, and multiglomerular projection neurons into one category ("multiglomerular neurons") which complicates any interpretations as even multiglomerular projection neurons are very diverse. Additionally, there were as many differences between the two narrowly tuned glomeruli as there were comparing the narrowly and broadly tuned glomeruli. Architecture differences may therefore not reflect differences in tuning breadth, but rather the ecological significance of the odors detected by cognate sensory afferents. Finally, some synaptic relationships are described as differing and others as being the same between glomeruli, but with only one sample from each glomerulus, it is difficult to determine when measures differ when there is no measure of inter-animal variability. If these caveats are kept in mind, this work reveals some very interesting potential differences in circuit architecture associated with glomerular tuning breadth.

      This work establishes specific hypotheses about network function within the olfactory system that can be pursued using targeted physiological approaches. It also identifies key traits that can be explored using other high-resolution EM datasets and other glomeruli that vary in their tuning selectivity. Finally, the laser "branding" technique used in this study establishes a reduced-cost procedure for obtaining smaller EM datasets from targeted volumes of interest by leveraging the ability to transgenically label brain regions in Drosophila.

    1. eLife assessment

      This is an important study about the mechanisms underlying our capacity to represent and hold recent events in our memory and how they are influenced by past experiences. A key aspect of the model put forward here is the presence of discrete jumps in neural activity within the posterior parietal region of the cortex. The strength of evidence is largely solid, with some weaknesses noted in the methodology. Both reviewers suggested ways in which this aspect of the model can be tested further and conflicts with previously published experimental results resolved, in particular the study by Papadimitriou et al 2014 in Journal of Neurophysiology.

    2. Reviewer #1 (Public Review):

      This paper aims to explain recent experimental results that showed deactivating the PPC in rats reduced both the contraction bias and the recent history bias during working memory tasks. The authors propose a two-component attractor model, with a slow PPC area and a faster WM area (perhaps mPFC, but unspecified). Crucially, the PPC memory has slow adaptation that causes it to eventually decay and then suddenly jump to the value of the last stimulus. These discrete jumps lead to an effective sampling of the distribution of stimuli, as opposed to a gradual drift towards the mean that was proposed by other models. Because these jumps are single-trial events, and behavior on single events is binary, various statistical measures are proposed to support this model. To facilitate this comparison, the authors derive a simple probabilistic model that is consistent with both the mechanistic model and behavioral data from humans and rats. The authors show data consistent with model predictions: longer interstimulus intervals (ISIs) increase biases due to a longer effect over the WM, while longer intertrial intervals (ITIs) reduce biases. Finally, they perform new experiments using skewed or bimodal stimulus distributions, in which the new model better fits the data compared to Bayesian models.
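
      To illustrate the sampling idea described above, here is a toy sketch in Python. It is not the authors' circuit model or their statistical model. It assumes that, with a hypothetical probability `p_replace`, the memory of the first stimulus is overwritten by a stimulus drawn uniformly from a hypothetical set of past stimuli; the function names and the stimulus values are likewise made up for illustration.

```python
import random
from statistics import mean


def sample_memory(s1, past_stimuli, p_replace, rng):
    """Single trial: the memory is either s1 or, with probability p_replace,
    a stimulus drawn uniformly from past_stimuli (a discrete jump, not a drift)."""
    if rng.random() < p_replace:
        return rng.choice(past_stimuli)
    return s1


def expected_memory(s1, past_stimuli, p_replace):
    """Trial-averaged memory under the same assumption."""
    return (1 - p_replace) * s1 + p_replace * mean(past_stimuli)


past = [60, 64, 68, 72, 76, 80]  # hypothetical stimulus values, mean 70
rng = random.Random(0)           # fixed seed so the single-trial draws are repeatable
for s1 in past:
    bias = expected_memory(s1, past, 0.2) - s1
    trial = sample_memory(s1, past, 0.2, rng)
    print(s1, round(bias, 2), trial)
```

      Under these assumptions each single trial holds either the true stimulus or a past one, yet the trial average is pulled toward the mean of the stimulus set: upward for stimuli below the mean and downward for stimuli above it. This is the sense in which sampling, rather than gradual drift, can produce contraction bias.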

      The proposed mechanistic model is simple and elegant, and it captures both biases that were previously observed in behavior, and how these are affected by the ISI and ITI (as explained above). Their findings help rethink whether our understanding of contraction bias is correct.

      On the other hand, the main proposal - discrete jumps in PPC - is only indirectly verified.

      The model predicts a systematic change in bias with inter-trial interval. Unless I missed it, this is not shown in the experimental data. Perhaps the self-paced nature of the experiments allows this to be tested?

      The data in some of the figures in the paper are hard to read. For instance, Figure 3B might be easier to understand if only the first 20 trials or so are shown with larger spacing. Likewise, Figure 5C contains many overlapping curves that are hard to make out.

      There is a gap between the values of tau_PPC and tau_WM. First - is this consistent with reports of slower timescales in PFC compared to other areas? Second - is it important for the model, or is it mostly the adaptation timescale in PPC that matters?

      Regarding the relation to other models, the model by Hachen et al (Ref 43) also has two interacting memory systems. It could be useful to better state the connection, if it exists.

    3. Reviewer #2 (Public Review):

      Working memory is not error free. Behavioral reports of items held in working memory display several types of bias, including contraction bias and serial dependence. Recent work from Akrami and colleagues demonstrates that inactivating rodent PPC reduces both forms of bias, raising the possibility of a common cause.

      In the present study, Boboeva, Pezzotta, Clopath, and Akrami introduce circuit and descriptive variants of a model in which the contents of working memory can be replaced by previously remembered items. This volatility manifests as contraction bias and serial dependence in simulated behavior, parsimoniously explaining both sources of bias. The authors validate their model by showing that it can recapitulate previously published and novel behavioral results in rodents and neurotypical and atypical humans.

      Both the modeling and the experimental work are rigorous, providing compelling evidence that a model of working memory in which reports sometimes sample past experience can produce both contraction bias and serial dependence, and that this model is consistent with behavioral observations across rodents and humans in the parametric working memory (PWM) task.

      Evidence for the model advanced by the authors, however, remains incomplete. The model makes several bold predictions about behavior and neural activity, untested here, that either conflict with previous findings or have yet to be reported but are necessary to appropriately constrain the model.

      First, in the most general (descriptive) formulation of the Boboeva et al. model, on a fraction of trials items in working memory are replaced by items observed on previous trials. In delayed estimation paradigms, which allow a more direct behavioral readout of memory items on a trial-by-trial basis than the PWM task considered here, reports should therefore be locked to previous items on a fraction of trials rather than display a small but consistent bias towards previous items. However, the latter has been reported (e.g., in primate spatial working memory, Papadimitriou et al., J Neurophysiol 2014). The ready availability of delayed estimation datasets online (e.g., from Rademaker and colleagues, https://osf.io/jmkc9/) will facilitate in-depth investigation and reconciliation of this issue.

      Second, the bulk of the modeling efforts presented here are devoted to a circuit-level description of how putative posterior parietal cortex (PPC) and working-memory (WM) related networks may interact to produce such volatility and biases in memory. This effort is extremely useful because it allows the model to be constrained by neural observations and manipulations in addition to behavior, and the authors begin this line of inquiry here (by showing that the circuit model can account for effects of optogenetic inactivation of rodent PPC). Further experiments, particularly electrophysiology in PPC and WM-related areas, will allow further validation of the circuit model. For example, the model makes the strong prediction that WM-related activity should display 'jumps' to states reflecting previously presented items on some trials. This hypothesis is readily testable using modern high-density recording techniques and single-trial analyses.

      Finally, while there has been a refreshing movement away from an overreliance on p-values in recent years (e.g., Amrhein et al., PeerJ 2017), hypothesis testing, when used appropriately, provides the reader with useful information about the amount of variability in experimental datasets. While the excellent visualizations and apparently strong effect sizes in the paper mitigate the need for p-values to an extent, the paucity of statistical analysis does impede interpretation of a number of panels in the paper (e.g., the results for the negatively skewed distribution in 5D, the reliability of the attractive effects in 6a/b for 2- and 3- trials back).

    1. eLife assessment

      This study provides a clearly presented and thoughtfully analyzed single cell-resolution dataset of gene expression in wildtype and mutant zebrafish skin. These data are used by the authors to develop and test hypotheses about cell lineage relationships and signaling interactions between cell types in the skin, allowing them to identify roles for several signaling pathways and the hypodermis in scale and pigment cell development. These findings constitute a fundamental contribution to the field, and the rigor of the analyses make this manuscript compelling.

    2. Reviewer #1 (Public Review):

      In their study, Aman et al. utilized single cell transcriptome analysis to investigate wild-type and mutant zebrafish skin tissues during the post-embryonic growth period. They identified new epidermal cell types, such as ameloblasts, and shed light on the effects of TH on skin morphogenesis. Additionally, they revealed the important role of the hypodermis in supporting pigment cells and adult stripe formation. Overall, I find their figures to be of high quality, their analyses to be appropriate and compelling, and their major claims to be well-supported by additional experiments. Therefore, this study will be an important contribution to the field of vertebrate skin research.

    3. Reviewer #2 (Public Review):

      This work describes transcriptome profiling of dissected skin of zebrafish at post-embryonic stages, at a time when adult structures and patterns are forming. The authors have used the state-of-the-art combinatorial indexing RNA-seq approach to generate single cell (nucleus) resolution. The data appears robust and is coherent across the four different genotypes used by the authors.

      The authors present the data in a logical and accessible manner, with appropriate reference to the anatomy. They include helpful images of the biology and schematics to illustrate their interpretations.

      The datasets are then interrogated to define cell and signalling relationships between skin compartments in six diverse contexts. The hypotheses generated from the datasets are then tested experimentally. Overall, the experiments are appropriate and rigorously performed. They ask very interesting questions of interactions in the skin and identify novel and specific mechanisms. They validate these well.

      The authors use their datasets to define lineage relationships in the dermal scales and also in the epidermis. They show that circumferential pre-scale forming cells are precursors of focal scale forming cells while there appeared a more discontinuous relationship between lineages in the epidermis.

      The authors present transcriptome evidence for enamel deposition function in epidermal subdomains. This is convincingly confirmed with an ameloblastin in situ. They further demonstrate distinct expression of SCPP and collagen genes in the SFC regions.

      The authors then demonstrate that Eda and TH signalling to the basal epidermal cells generates FGF and PDGF ligands to signal to surrounding mesenchyme, regulating SFC differentiation and dermal stratification respectively.

      Finally, they exploit RNA-seq data performed in parallel in the bnc2 mutants to identify the hypodermal cells as critical regulators of pigment patterning and define the signalling systems used.

      Whilst these six interactions in the skin are disparate, the stories are unified by use of the sci-RNA-seq data to define interactions. Overall, it's an assembly of work which identifies novel and interesting cell interactions and cross-talk mechanisms.

      The paper provides robust evidence of cell interrelationships in the skin undergoing morphogenesis and will be a welcome dataset for the field.

    1. eLife assessment

      This important study utilizes the nematode C. elegans and mammalian cell culture to investigate the role of MML-1/Mondo in conserved regulation of metabolism and aging. The evidence supporting the conclusions is compelling in some areas, such as localization, upstream pathways, and conservation. It is still incomplete in other areas, such as longevity pathway analysis and the link between Mondo and the key downstream mitochondrial metabolic pathways identified. The paper will be of interest to a broad range of biologists studying aging, metabolism, and transcriptional regulation.

    2. Reviewer #1 (Public Review):

      In this manuscript entitled "Hexokinase regulates Mondo-mediated longevity via the PPP and organellar dynamics", Laboy and colleagues investigated upstream regulators of MML-1/Mondo, a key transcription factor that regulates aging and metabolism, using the nematode C. elegans and cultured mammalian cells. By performing a targeted RNAi screen for genes encoding enzymes in glucose metabolism, the authors found that two hexokinases, HXK-1 and HXK-2, regulate nuclear localization of MML-1 in C. elegans. The authors showed that knockdown of hxk-1 and hxk-2 suppressed longevity caused by germline-deficient glp-1 mutations. The authors demonstrated that genetic or pharmacological inhibition of hexokinases decreased nuclear localization of MML-1, via promoting mitochondrial β-oxidation of fatty acids. They found that genetic inhibition of hxk-2 changed the localization of MML-1 from the nucleus to mitochondria and lipid droplets by activating pentose phosphate pathway (PPP). The authors further showed that the inhibition of PPP increased the nuclear localization of mammalian MondoA in cultured human cells under starvation conditions, suggesting the underlying mechanism is evolutionarily conserved. This paper provides compelling evidence for the mechanisms by which novel upstream metabolic pathways regulate MML-1/Mondo, a key transcription factor for longevity and glucose homeostasis, through altering organelle communications, using two different experimental systems, C. elegans and mammalian cells. This paper will be of interest to a broad range of biologists who work on aging, metabolism, and transcriptional regulation.

    3. Reviewer #2 (Public Review):

      Raymond Laboy et al. explored how the transcriptional Mondo/Max-like complex (MML-1/MXL-2) is regulated by glucose metabolic signals using a germline-removal longevity model. They believed that MML-1/MXL-2 integrated multiple longevity pathways through nutrient sensing and therefore screened the glucose metabolic enzymes that regulated MML-1 nuclear localization. Hexokinase 1 and 2 were identified as the most vigorous regulators, which function through mitochondrial beta-oxidation and the pentose phosphate pathway (PPP), respectively. MML-1 localized to mitochondria associated with lipid droplets (LD), and MML-1 nuclear localization was correlated with LD size and metabolism. Their findings are interesting and may help us to further explore the mechanisms in multiple longevity models; however, the study is not complete and the working model remains obscure. For example, the exact metabolites that account for the direct regulation of MML-1 were not identified, and more detailed studies of the related cellular processes are needed.

      The identification of the responsible metabolites is necessary, since multiple pieces of evidence from the study suggest that lipid rather than glucose metabolites may be more likely to be the direct regulators of MML-1, and that HXK regulates MML-1 indirectly by affecting lipid metabolism: 1) inhibiting the PPP is sufficient to rescue MML-1 function independent of G6P levels; 2) HXK-1 regulates MML-1 by increasing fatty acid beta-oxidation; 3) LD size correlates with MML-1 nuclear localization and LD metabolism can directly regulate MML-1. The identification of metabolites will be helpful for understanding the mechanism.

      Beta-oxidation and the PPP are involved in the regulation of MML-1 by HXK-1 and HXK-2, respectively. But how these two pathways participate in the regulation is not clear. Is it the beta-oxidation rate or the intermediate metabolites that matters? As for the PPP, it provides substrates for nucleotide synthesis and also its product NADPH is essential for redox balance. Is one of the metabolites or the NADPH levels involved in MML-1 regulation? More studies are needed to provide answers to these concerns.

    1. eLife assessment

      This is an important follow-up study to a previous paper in which the authors reconstituted CO2 metabolism in Escherichia coli (autotrophy). Here, the authors define a set of three mutations that promote autotrophy, highlighting the malleability of E. coli metabolism. The authors make a convincing case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle, but claims about the role of mutations in two other genes - crp and rpoB - are currently incomplete. This research will be particularly interesting to synthetic biologists, systems biologists, and metabolic engineers aiming to develop synthetic autotrophic microorganisms.

    2. Reviewer #1 (Public Review):

      The main objective of this study is to achieve the development of a synthetic autotroph using adaptive laboratory evolution. To accomplish this, the authors conducted chemostat cultivation of engineered E. coli strains under xylose-limiting conditions and identified autotrophic growth and the causative mutations. Additionally, the mutational mechanisms underlying these causative mutations were also explored with drill down assays. Overall, the authors demonstrated that only a small number of genetic changes were sufficient (i.e., 3) to construct an autotrophic E. coli when additional heterologous genes were added. While natural autotrophic microorganisms typically exhibit low genetic tractability, numerous studies have focused on constructing synthetic autotrophs using platform microorganisms such as E. coli. Consequently, this research will be of interest to synthetic biologists and systems biologists working on the development of synthetic autotrophic microorganisms. The conclusions of this paper are mostly well supported by appropriate experimental methods and logical reasoning. However, further experimental validation of the mutational mechanisms involving rpoB and crp would enhance readers' understanding and provide clearer insights, despite acknowledgement that these genes impact a broad set of additional genes. Additionally, a similar study, 10.1371/journal.pgen.1001186, where pgi was deleted from the E. coli genome and evolved to reveal an rpoB mutation is relevant to this work and should be placed in the context of the presented findings.

      The authors addressed rpoB and crp as one unit and performed validation. They cultivated the mutant strain and wild type in a minimal xylose medium with or without formate, comparing their growth and NADH levels. The authors argued that the increased NADH level in the mutant strain might facilitate autotrophic growth. Although these phenotypes appear to be closely related, their relationship cannot be definitively concluded based on the findings presented in this paper alone. Therefore, one recommendation is to investigate the transcriptomic changes induced by the rpoB and crp mutations. Alternatively, experimental verification of whether the NADH level directly causes autotrophic growth would provide further support for the authors' claim.

    3. Reviewer #2 (Public Review):

      Synthetic autotrophy of biotechnologically relevant microorganisms offers exciting chances for CO2 neutral or even CO2 negative production of goods. The authors' lab has recently published an engineered and evolved Escherichia coli strain that can grow on CO2 as its only carbon source. Lab evolution was necessary to achieve growth. Evolved strains displayed tens of mutations, of which likely not all are necessary for the desired phenotype.

      In the present paper the authors identify the mutations that are necessary and sufficient to enable autotrophic growth of engineered E. coli. Three mutations were identified, and their phenotypic role in enhancing growth via the introduced Calvin-Benson-Bassham cycle were characterized. It was demonstrated that these mutations allow autotrophic growth of E. coli with the introduced CBB cycle without any further metabolic intervention. Autotrophic growth is demonstrated by 13C labelling with 13C CO2, measured in proteinogenic amino acids. In Figures 2B and S1, the labeling data are shown, with an interval of the "predicted range under 13CO2". Here, the authors should describe how this interval was derived.

      The methodology is clearly described and appropriate.

      The present results will allow other labs to engineer E. coli and other microorganisms further to assimilate CO2 efficiently into biomass and metabolic products. The importance is evident in the opportunity to employ such strains in CO2-based biotech processes for the production of food and feed protein or chemicals, to reduce atmospheric CO2 levels and the consumption of fossil resources.

    4. Reviewer #3 (Public Review):

      The authors previously showed that expressing formate dehydrogenase, rubisco, carbonic anhydrase, and phosphoribulokinase in Escherichia coli, followed by experimental evolution, led to the generation of strains that can metabolise CO2. Using two rounds of experimental evolution, the authors identify mutations in three genes - pgi, rpoB, and crp - that allow cells to metabolise CO2 in their engineered strain background. The authors make a strong case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle. The authors also argue that mutations in crp and rpoB lead to an increase in the NADH/NAD+ ratio, which would increase the concentration of the electron donor for carbon fixation. While this may explain the role of the crp and rpoB mutations, there is good reason to think that the two mutations have independent effects, and that the change in NADH/NAD+ ratio may not be the major reason for their importance in the CO2-metabolising strain.

      Specific comments:

      1. Deleting pgi rather than using a point mutation would allow the authors to more rigorously test whether loss-of-function mutants are being selected for in their experimental evolution pipeline. The same argument applies to crp.

      2. Page 10, lines 10-11, the authors state "Since Crp and RpoB are known to physically interact in the cell (26-28), we address them as one unit, as it is hard to decouple the effect of one from the other". CRP and RpoB are connected, but the authors' description of them is misleading. CRP activates transcription by interacting with RNA polymerase holoenzyme, of which the Beta subunit (encoded by rpoB) is a part. The specific interaction of CRP is with a different RNA polymerase subunit. The functions of CRP and RpoB, while both related to transcription, are otherwise very different. The mutations in crp and rpoB are unlikely to be directly functionally connected. Hence, they should be considered separately.

      3. A Beta-galactosidase assay would provide a very simple test of CRP H22N activity. There are also simple in vivo and in vitro assays for transcription activation (two different modes of activation) and DNA-binding. H22 is not near the DNA-binding domain, but may impact overall protein structure.

      4. There are many high-resolution structures of both CRP and RpoB (in the context of RNA polymerase). The authors should compare the position of the sites of mutation of these proteins to known functional regions, assuming H22N is not a loss-of-function mutation in crp.

      5. RNA-seq would provide a simple assay for the effects of the crp and rpoB mutations. While the precise effect of the rpoB mutation on RNA polymerase function may be hard to discern, the overall impact on gene expression would likely be informative.

  2. Jul 2023
    1. Author Response:

      Reviewer #1 (Public Review):

      This is a short but important study. Basically, the authors show that α-synuclein overexpression's negative impact on synaptic vesicle recycling is mediated by its interaction with E-domain containing synapsins. This finding is highly relevant for synuclein function as well as for the pathophysiology of synucleinopathies. While the data is clear, functional analysis is somewhat incomplete.

      We will perform all additional functional analyses requested by the reviewer (listed in “recommendations for the authors”) and report them in the revised version. These include dissociation of exo/endocytosis in the context of the synapsin E-domain, and further quantification of the rise and fall of pHluorin curves.

      Reviewer #2 (Public Review):

      In this manuscript the authors established synapsin's E-domain as an essential functional binding partner that allows α-syn functionality. They show very elegantly that only synapsin isoforms that have an E-domain bind α-syn and allow the inhibition mediated by α-syn. Deletion of the C-terminus (α-syn 96-110) eliminated this interaction. Hence, synapsin E-domain binds to α-syn enabling the inhibitory effect of α-syn on synaptic transmission.

      The paper will be improved significantly if additional experiments are added to expand and provide a more mechanistic understanding of the effect of α-syn and the intricate interplay between synapsin, α-syn, and the SV. For an enthusiastic reader, the manuscript as it looks now with only 3 figures, ends prematurely. Some of the experiments above or others could complement, expand and strengthen the current manuscript, moving it from a short communication describing the phenomenon to a coherent textbook topic. Nevertheless, this work provides new and exciting evidence for the regulation of neurotransmitter release and its regulation by synapsin and α-syn.

      We will address all the technical and conceptual points raised by the reviewer, perform all the necessary experiments (listed in “recommendations for the authors”), and report the results in the revised version. These include quantification of the expression levels of various proteins, evaluation of the dispersion of synapsin and α-syn under the stimulation conditions used in our studies, and consideration of other proposed roles of α-syn.

    2. eLife assessment

      Alpha-synuclein is a synaptic vesicle associated protein that is linked to a number of neurodegenerative disorders. In this manuscript, the authors provide compelling evidence of alpha-synuclein's interaction with E-domain synapsins as the main culprit mediating the suppression of neurotransmitter release and synaptic vesicle recycling by alpha-synuclein. This important work provides molecular mechanisms underlying the pathophysiological functions of alpha-synuclein.

    3. Reviewer #1 (Public Review):

      This is a short but important study. Basically, the authors show that α-synuclein overexpression's negative impact on synaptic vesicle recycling is mediated by its interaction with E-domain containing synapsins. This finding is highly relevant for synuclein function as well as for the pathophysiology of synucleinopathies. While the data is clear, functional analysis is somewhat incomplete.

    4. Reviewer #2 (Public Review):

      In this manuscript the authors established synapsin's E-domain as an essential functional binding partner that allows α-syn functionality. They show very elegantly that only synapsin isoforms that have an E-domain bind α-syn and allow the inhibition mediated by α-syn. Deletion of the C-terminus (α-syn 96-110) eliminated this interaction. Hence, synapsin E-domain binds to α-syn enabling the inhibitory effect of α-syn on synaptic transmission.

      The paper will be improved significantly if additional experiments are added to expand and provide a more mechanistic understanding of the effect of α-syn and the intricate interplay between synapsin, α-syn, and the SV. For an enthusiastic reader, the manuscript as it looks now with only 3 figures, ends prematurely. Some of the experiments above or others could complement, expand and strengthen the current manuscript, moving it from a short communication describing the phenomenon to a coherent textbook topic. Nevertheless, this work provides new and exciting evidence for the regulation of neurotransmitter release and its regulation by synapsin and α-syn.

    1. Author Response:

      We thank the reviewers for their very thoughtful suggestions. We will submit a revised manuscript addressing these comments and including a point-by-point response to reviewers. We will provide evidence that Wnt3a treatment increases macropinocytosis and that PMA increases this cellular response in cultured cells, but only in the presence of Wnt3a. This will be done using the current gold standard for macropinocytosis assays, the uptake of high molecular weight dextran sensitive to the Na+/H+ exchanger inhibitor EIPA. A time-lapse video of rapid macropinocytosis cup induction by PMA in colorectal cancer cells will also be provided. Other new experiments will show that levels of the upstream macropinocytosis regulator Rac1 are increased by β-catenin DNA, constitutively active Lrp6, or LiCl. The criticism that by taking a broad approach our study lacks depth of mechanistic analysis is a valid one. The reason we used a multiplicity of approaches – Xenopus embryo assays, cancer cells in culture, colon cancer tissue arrays and mouse xenografts – was to validate, in as many different ways as possible, a central finding: that the classical phorbol ester tumor promoters can act by potentiating Wnt/β-catenin signaling through membrane trafficking.

    2. eLife assessment

      This valuable study describes an interesting synergism between macropinocytosis and Wnt signaling in multiple biological systems. The main claims are at least partially supported with solid evidence. The pharmacological manipulations come with a number of caveats, and the mechanistic basis of the described synergism remains unclear. The study will be of interest to cell biologists and biomedical researchers, particularly in the Wnt field and in tumor biology.

    3. Reviewer #1 (Public Review):

      In this ms, Tejeda-Muñoz and colleagues examine the roles of macropinocytosis in WNT signalling activation in development (Xenopus) and cancer (CRC sections, cell lines and xenograft experiments). Furthermore, they investigate the effect of the inflammation inducer Phorbol-12-myristate-13-acetate (PMA) in WNT signalling activation through macropinocytosis. They propose that macropinocytosis is a key driver of WNT signalling, including upon oncogenic activation, with relevance in cancer progression.

      I found the analyses and conclusions of the relevance of macropinocytosis in WNT signalling compelling, notably upon constitutive activation both during development and in CRC. However, I think this manuscript only partially characterises the effects of PMA in WNT signalling, largely due to a lack of an epistatic characterisation of PMA roles in Wnt activation. For example:

      1- The authors show that PMA cooperates with 1) GSK3 inhibition in Xenopus to promote WNT activation, and 2) (possibly) with APCmut in SW480 to induce b-cat and FAK accumulation. To sustain a specific functional interaction between WNT and PMA, the effects should be tested through additional epistatic experiments. For example, does PMA cooperate with Wnt8 in axis duplication analyses? Does PMA cooperate with any other WNT alteration in CRC or other cell lines? Importantly, does APC re-introduction in SW480 rescue the effect of PMA? Such analyses could be critical to determine specificity of the functional interactions between WNT and PMA. This question could be addressed by performing classical epistatic analyses in cell lines (CRC or HEK) focusing on WNT activity, and by including rescue experiments targeting the WNT pathway downstream of the effects e.g., dnTCF, APC re-introduction, etc.

      2- While the epistatic analyses of WNT and macropinocytosis are clear in frog, the causal link in CRC cells is limited to b-catenin accumulation. While it is clear that macropinocytosis reduces spheroid growth in SW480, the lack of rescue experiments with e.g., constitutively active b-catenin or any other WNT perturbation and/or APC re-introduction, limits the conclusions of this experiment.

      Minor comments:

      3- Different compounds targeting membrane trafficking are used to rescue modes of WNT activation (Wnt8 vs LiCl) in Xenopus.

      4- The abstract does not state the results in CRC/xenografts

      5- Labels of Figure 2E might be swapped

      6- Figures 4I-J, 6 and S4 rely on qualitative analyses rather than quantification, which hampers their evaluation. On the other hand, the detailed quantifications in Figure S3A-D strongly support the images in Figure 5

    4. Reviewer #2 (Public Review):

      Tejeda Muñoz et al. investigate the intersection of Wnt signaling, macropinocytosis, lysosomes, focal adhesions and membrane trafficking in embryogenesis and cancer. Following up on their previous papers, the authors present evidence that PMA enhances Wnt signaling and embryonic patterning through macropinocytosis. Proteins that are associated with the endo-lysosomal pathway and Wnt signaling are co-increased in colorectal cancer samples, consistent with their pro-tumorigenic action. The function of macropinocytosis is not well understood in most physiological contexts, and its role in Wnt signaling is intriguing. The authors use a wide range of models - Xenopus embryos, cancer cells in culture and in xenografts and patient samples to investigate several endolysosomal processes that appear to act upstream or downstream of Wnt. A downside of this broad approach is a lack of mechanistic depth. In particular, few experiments monitor macropinocytosis directly, and macropinocytosis manipulations have pleiotropic effects that are open to alternative interpretations. Several experiments are confirmatory of previous findings; the manuscript could be improved by focusing on the novel relationship between PMA-induced macropinocytosis and by better supporting these conclusions with additional experiments.

      The authors use a range of inhibitors that suppress macropinosome formation (EIPA, Bafilomycin A1, Rac1 inhibition). However, these are not specific macropinocytosis inhibitors (EIPA blocks an Na+/H+ exchanger, which is highly toxic and perturbs cellular pH balance; Bafilomycin blocks the V-ATPase, which has essential functions in the Golgi, endosomes and lysosomes; Rac1 signals through multiple downstream pathways). A specific macropinocytosis inhibitor does not exist, and it is thus important to support key conclusions with dextran uptake experiments.

      The title states that PMA increases Wnt signaling through macropinocytosis. However, the mechanistic relationship between PMA-induced macropinocytosis and Wnt signaling is not well supported. The authors refer to a classical paper that demonstrates macropinocytosis induction by PMA in macrophages (PMID: 2613767). Unlike most cell types, macrophages display growth factor-induced and constitutive macropinocytic pathways (PMID: 30967001). It would thus be important to demonstrate macropinocytosis induction by PMA experimentally in Xenopus embryos / cancer cells. Does treatment with EIPA / Bafilomycin / Rac1i decrease the dextran signal in embryos? In macrophages, the PKC inhibitor Calphostin C blocks macropinocytosis induction by PMA (PMID: 25688212). Does Calphostin C block macropinocytosis in embryos / cancer cells? Do the various combinations of Wnts / Wnt agonists and PMA have additive or synergistic effects on dextran uptake? If the authors want to conclude that PMA activates Wnt signaling, it would also be important to demonstrate the effect of PMA on Wnt target gene expression.

      The experiments concerning macropinosome formation in Xenopus embryos are not very convincing. Macropinosomes are circular vesicles whose size in mammalian cells ranges from 0.2 to 10 µm (PMID: 18612320). The TMR-dextran signal in Fig. 1A does not obviously label structures that look like macropinosomes; rather the signal is diffusely localized throughout the dorsal compartment, which could be extracellular (or perhaps cytosolic). I have similar concerns for the cell culture experiments, where dextran uptake is only shown for SW480 spheroids in Fig. S2. It would be helpful to quantify the size of the circular structures (is this consistent with macropinosomes?).

      In Fig. 4I - J, the dramatic decrease in b-catenin and especially in Rac1 after overnight EIPA treatment is rather surprising. How do the authors explain these findings? Is there any evidence that macropinocytosis stabilizes Rac1? Could this be another effect of EIPA or general toxicity?

      On a similar note, Fig. 6 K - L the FAK staining in control cells appears to localize to focal adhesions, but in PMA-treated cells is strongly localized throughout the cell. Do the authors have any thoughts on how PMA stabilizes FAK and where the kinase localizes under these conditions? Does PMA treatment increase FAK signaling activity?

      The tumor stainings in Figure 5 are interesting but correlative. Pak1 functions in multiple cellular processes and Pak1 levels are not a direct marker for macropinocytosis. In the discussion, the authors discuss evidence that the V-ATPase translocates to the plasma membrane in cancer to drive extracellular acidification. To which extent does the Voa3 staining reflect lysosomal V-ATPase? Do the authors have controls for antibody specificity?

    5. Reviewer #3 (Public Review):

      The manuscript by Tejeda-Munoz examines signaling by Wnt and macropinocytosis in Xenopus embryos and colon cancer cells. A major problem with the study is the extensive use of pleiotropic inhibitors as "specific" inhibitors of macropinocytosis in embryos. It is true that BafA and EIPA block macropinocytosis, but they do many other things as well. A major target of EIPA is the NHE1 Na+/proton transporter, which also regulates invasive structures (podosomes, invadopodia) which could have major roles in development. Similarly, BafA will disrupt lysosomes and the endocytic system, with secondary effects on mTOR signaling and growth factor receptor trafficking. The authors cannot assume that processes inhibited by these drugs demonstrate a role of macropinocytosis. While correlations in tumor samples between increased expression of PAK1 and V0a3 and decreased expression of GSK3 are consistent with a link between macropinocytosis and Wnt-driven malignancy, the cell and embryo-based experiments do not convincingly make this connection. Finally, the data on FAK and TES are not well integrated with the rest of the manuscript.

      1. The data in Fig. 1A do not convincingly demonstrate macropinocytosis - it is impossible to tell what is being labeled by the dextran.

      2. The data in Fig. 2 do not make sense. LiCl bypasses the WNT activation pathway by inhibiting GSK3. If subsequent treatment with BafA blocks the effects of GSK3 inhibition, then BafA is doing something unrelated to Wnt activation, whose target is the inhibition/sequestration of GSK3. While BafA might block GSK3 sequestration by inhibiting MVB function, it should have no effect on the inhibition of GSK3 by LiCl.

      3. The effect of EHT on MP in SW480 cells is not clearly related to what is happening in the embryos. The nearly total loss of staining for Rac and β-catenin after overnight EIPA does not implicate MP in protein stability - critical controls for cell viability and overall protein turnover are absent. Inhibition of WNT signaling might be expected to enhance β-catenin turnover, but the effect on Rac1 is surprising. A more quantitative analysis by western blotting is required.

      4. The data on FAK inhibition and TES trafficking are poorly integrated with the rest of the paper.

    1. eLife assessment

      This study presents a valuable deep learning-based model for predicting fracture within the next five years from just a standard distal radius and ulna scan obtained using high-resolution computed tomography images. The evidence supporting the conclusion that the model-predicted fracture prediction score can be used clinically to identify women at risk of fracture more effectively than with the current standard clinical approach is convincing. This work will be of interest to biomechanists and biomedical engineers working on osteoporosis.

    2. Reviewer #1 (Public Review):

      This is an interesting study, covering a future direction for the diagnosis of osteoporosis.

      Strength: well validated cohorts, authors are more than experts in the field, use of technology.

      Weakness: the approach is still very experimental and far from being clinically relevant.

      The authors have performed a very interesting analysis combining data from different, well designed cohorts. The authors are leaders in the field. The topic is of interest, the statistical analysis is well designed, and the paper is well written and easy to read even for non-experts.

      I have a few comments:

      1) Although the authors are very optimistic about HR-pQCT, they should recognize (and acknowledge in the discussion) that their data have a very low clinical impact for the majority of the population. The cost of the machine is still prohibitive for the majority of clinical centers, the technology needs more validation outside the reference centers, and there is a lot of controversy over the methodology for cortical porosity. Basically, 20 years after its introduction, it remains more a research tool than a clinical opportunity. This comment is of course not directed against the scientific hypothesis or the conduct of the study, which remain brilliant.

      2) How have the authors managed the role of possible secondary causes of osteoporosis? Did they exclude patients with GIOP, for example? Are all study subjects treatment naïve?

      3) It would be worthwhile to better describe the role of cortical porosity and the predictive value of this parameter, which has been extensively studied by Dr Seeman.

    3. Reviewer #2 (Public Review):

      The authors apply a deep learning approach to predict fracture using forearm HR-pQCT data pooled from 3 longitudinal cohorts totaling 2666 postmenopausal women. The deep learning based 'Structural Fragility Score - AI' was compared to FRAX w/BMD and BMD alone in its ability to identify women who went on to fracture within the next 5 years. SFS-AI performed significantly better than FRAX w/BMD and BMD alone in all metrics except specificity. This work establishes that deep learning methods applied to HR-pQCT data have great potential for use in predicting (and therefore preventing) fractures.

      The low specificity of SFS-AI compared to FRAX and BMD is not adequately acknowledged or addressed: will this lead to overdiagnosis or unnecessary interventions, and is that a problem?

      The paper does not adequately address the relative role of bone vs soft tissue features in the determination of SFS-AI. It would be possible to feed the algorithm only the segmented bone volumes, and compare AUC, etc, of SFS-AI (bone) to that acquired using the entire bone + muscle volume. It's possible (likely?) that most of the predictive power will remain. If muscle is an important part of this algorithm, then mid-diaphyseal tibia scans will be an interesting next application - since that scan site is closer to the muscle belly compared to the distal radius site which contains very little muscle volume.

    4. Reviewer #3 (Public Review):

      This work presents a novel approach for predicting fracture risk from high-resolution peripheral quantitative computed tomography (HR-pQCT): by training a deep learning model to predict five-year fracture risk where the sole input is the full 3D HR-pQCT image. Prior studies have developed models, of varying complexity, to predict fracture risk from HR-pQCT. However, this study is novel in that neither the typical manual efforts required for HR-pQCT image analysis nor additional biomarker collection are required, simplifying potential clinical implementation. The authors show that their model predicts fracture within five years with greater sensitivity than FRAX (with an assumed diagnostic threshold of FRAX > 20% or T-score < -2.5 SD), albeit with reduced specificity. The authors further investigate how their model output, the structural fragility score derived by artificial intelligence (SFS-AI), is correlated with two microarchitectural parameters that can be measured with HR-pQCT, demonstrating that their model captures many relevant characteristics of a patient's bone quality that cannot be captured by the standard clinical tools used to diagnose osteoporosis, and thus to identify patients at elevated risk of fracture.

      Strengths

      The authors use a very large dataset and a combination of state-of-the-art methods for training and validating their fracture prediction model: k-fold cross-validation is used for training and a held-out external test dataset is used to evaluate ensembled model predictions compared to the current clinical standard for fracture screening. The results with the test dataset show that the model can identify women at risk of fracture in the next five years with greater sensitivity than both FRAX with BMD and BMD alone.
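
      The training scheme summarized above (k-fold cross-validation followed by ensembled predictions on a held-out test set) can be sketched as follows. This is a minimal illustration only, assuming contiguous folds and a plain average as the ensembling rule; the function names `k_fold_splits` and `ensemble_mean` are hypothetical, and the manuscript's actual splitting and ensembling procedure may differ.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


def ensemble_mean(per_model_predictions):
    """Average the predictions of several models, sample by sample."""
    n_models = len(per_model_predictions)
    return [sum(column) / n_models for column in zip(*per_model_predictions)]


# Hypothetical usage: three folds over ten training samples, then averaging
# two fold-models' scores for two held-out test samples.
for train, validation in k_fold_splits(10, 3):
    print(len(train), len(validation))
print(ensemble_mean([[0.2, 0.8], [0.4, 0.6]]))
```

      Each fold-model is validated on data it never saw during training, and the external test set is touched only once, by the averaged ensemble.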

      Because the model takes only a full 3D HR-pQCT image as input, the feasibility of clinical implementation is maximized. Standard morphological analysis with HR-pQCT is semi-automated and the labour required for the manual portions of analysis poses a significant barrier to clinical implementation. There is mounting evidence for the clinical utility of HR-pQCT (see Gazzotti et al. Br. J. Radiol. 2023) and fully automated models such as the one presented in this work will be critical for making clinical applications of HR-pQCT feasible.

      The authors quantify the contributions to the variance of the model output and examine activation maps overlaid on the HR-pQCT images. These sub-analyses indicate that the model is identifying relevant characteristics of hierarchical bone structure for fracture prediction that are not available from aBMD measurements from DXA and thus are not accounted for in the current standard clinical diagnostic tool.

      Weaknesses

      The authors make the claim that SFS-AI outperforms FRAX with BMD and BMD in terms of sensitivity and specificity of predicting fragility fractures within 5 years. This claim is supported by looking at the ROCs in figure 1, but the specific comparison made in the discussion is not completely fair as currently presented in the article. The thresholds of FRAX > 20% and T-score < -2.5SD were selected by the authors for binary comparison. FRAX and BMD achieve specificities of ~95% at these thresholds, while SFS-AI achieves a specificity of only 77% at the selected threshold, SFS-AI > 0.5. Conversely, SFS-AI achieves a sensitivity of 50% to 60% while FRAX and BMD achieve very poor sensitivities, between 4% and 16%. The authors have not justified their choice of binarization thresholds for FRAX or BMD by citing literature or clinical guidelines, nor have they motivated their choice of any of the thresholds with a discussion of how clinical considerations could influence the sensitivity-specificity trade-off. It is difficult to directly compare the prognosticative performance of SFS-AI to that of FRAX or BMD when the thresholds for FRAX and BMD are at such different locations on the respective ROCs when compared to where the threshold for SFS-AI places it on the ROC. The authors have also not compared their estimates of the sensitivity and specificity of FRAX and BMD to literature to provide important context for the comparison to SFS-AI. An additional unacknowledged limitation is that the FRAX tool is designed to predict 10-year fracture risk, while the outcome used to train the SFS-AI model and to compare to FRAX was 5-year fracture risk.
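
      One way to make the comparison raised above more even-handed is to fix a common specificity for every predictor and compare sensitivities at that operating point. The sketch below is a minimal illustration, assuming hypothetical scores and labels (not data from the study); the function names `sensitivity_specificity` and `threshold_for_specificity` are illustrative and do not come from the manuscript.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when score > threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_specificity(scores, labels, target_specificity):
    """Smallest observed threshold whose specificity reaches the target.

    Specificity cannot decrease as the threshold rises, so the first
    threshold that meets the target also keeps sensitivity as high as possible.
    """
    for t in sorted(set(scores)):
        _, spec = sensitivity_specificity(scores, labels, t)
        if spec >= target_specificity:
            return t
    return max(scores)


# Hypothetical toy data: 1 = fractured within follow-up, 0 = did not.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
t = threshold_for_specificity(scores, labels, 0.75)
print(t, sensitivity_specificity(scores, labels, t))
```

      Applying the same target specificity to SFS-AI, FRAX and BMD would place all three at comparable points on their ROC curves before their sensitivities are compared.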

      Direct comparison may be impossible due to differences in study design or reported performance metrics, but the authors have not at all discussed the quantitative performance of prior models for fracture prediction or discrimination that use HR-pQCT (see Lu et al. Bone 2023 or Whittier et al. JBMR 2023) to contextualize the performance of their novel model. While the model presented in this article has the advantage that it does not require the typical expertise and manual effort needed for HR-pQCT image analysis, it is still important to acknowledge the potential trade-off of ease of implementation vs performance. Models that incorporate additional clinical data or that use standard HR-pQCT analysis outputs rather than raw images may perform well enough to justify the increase in the difficulty of clinical implementation or to motivate further work on fully automating microarchitectural analysis with HR-pQCT images.

      Finally, the article does not indicate that either the code used for model training or the trained model itself will be made publicly available. This limits the ability of future researchers to replicate and build on the results presented in the article.

    1. eLife assessment

      This study provides important findings on the distinct functions of resident and recruited macrophages during cardiac healing after myocardial ischemia. Using state-of-the-art fate-mapping models and genetic and pharmacological targeting approaches, the authors provide solid evidence that the absence of resident macrophages does not influence infarct size but instead alters the immune cell crosstalk in response to injury. However, the functional evaluation of resident macrophages is limited by potential off-target effects in ∆FIRE mice. This study should be of interest to the fields of Development, Immunology and Cardiology.

    2. Reviewer #1 (Public Review):

      Weinberger et al. use different fate-mapping models, the FIRE model and PLX-diet to follow and target different macrophage populations and combine them with single-cell data to understand their contribution to heart regeneration after I/R injury. This question has already been addressed by other groups in the field using different models. However, the major strength of this manuscript is the use of the FIRE mouse model that, for the first time, allows specific targeting of only fetal-derived macrophages.

      The data show that the absence of resident macrophages does not influence infarct size but instead alters the immune cell crosstalk in response to injury, which is in line with the current idea in the field that macrophages of different origins have distinct functions in tissues, especially after an injury.

      To fully support the claims of the study, specific targeting of monocyte-derived macrophages or the inhibition of their influx at different stages after injury would be of high interest.

      In summary, the study is well done and important for the field of cardiac injury. But it also provides a novel model (FIRE mice + RANK-Cre fate-mapping) for other tissues to study the function of fetal-derived macrophages while monocyte-derived macrophages remain intact.

    3. Reviewer #2 (Public Review):

      In this study Weinberger et al. investigated cardiac macrophage subsets after ischemia/reperfusion (I/R) injury in mice. The authors studied a ∆FIRE mouse model (deletion of a regulatory element in the Csf1r locus), in which only tissue resident macrophages might be ablated. The authors showed a reduction of resident macrophages in ∆FIRE mice and characterized its macrophage populations via scRNAseq at baseline conditions and after I/R injury. Two days after the I/R protocol, ∆FIRE mice showed an enhanced pro-inflammatory phenotype in the RNAseq data and differential effects on echocardiographic function 6 and 30 days after I/R injury. Via flow cytometry and histology the authors confirmed existing evidence of increased bone marrow-derived macrophage infiltration to the heart, specifically to the ischemic myocardium. Macrophage populations in ∆FIRE mice after I/R injury were only changed in the remote zone. Further RNAseq data on resident or recruited macrophages showed transcriptional differences between both cell types in terms of homeostasis-related genes and inflammation. Depleting all macrophages using a Csf1r inhibitor resulted in reduced cardiac function and increased fibrosis.

      Strengths:

      1. The authors utilized robust methodology encompassing state of the art immunological methods, different genetic mouse models and transcriptomics.

      2. The topic of this work is important given the emerging role of tissue resident macrophages in cardiac homeostasis and disease.

      Weaknesses:

      1. Specificity of the ∆FIRE mouse model for ablating resident macrophages.

      The study builds on the assumption that only resident macrophages are ablated in ∆FIRE mice, while bone marrow-derived macrophages are unaffected. While the effects of the ∆FIRE model are nicely shown for resident macrophages, the authors did not directly assess bone marrow-derived macrophages. Moreover, in the immunohistological images in Fig. 1D nearly all macrophages appear to be absent. It would be helpful to further address the question of whether recruited macrophages are influenced in ∆FIRE mice. Evaluation of YFP positive heart and blood cells in ∆FIRE mice crossed with Flt3CreRosa26eYFP mice could clarify whether bone marrow-derived cardiac macrophages are influenced in ∆FIRE mice. This would be even more relevant in the I/R model where recruitment of bone marrow-derived macrophages is increased. A more direct assessment of recruited macrophages in ∆FIRE mice could also help to discuss potential similarities or discrepancies to the study of Bajpai et al, Circ Res 2018 (https://doi.org/10.1161/CIRCRESAHA.118.314028), which showed distinct effects of resident versus recruited macrophages after myocardial infarction. Providing the quantification of flow cytometry data (fig. 1E-F) would be supportive.

      2. Limited adverse cardiac remodeling in ∆FIRE mice after I/R.

      The authors suggested an adverse cardiac remodeling in ∆FIRE mice. However, the relevance of a <5% reduction in ejection fraction/stroke volume within an overall normal range in ∆FIRE mice is questionable. Moreover, 6 days after I/R injury ∆FIRE mice were protected from the impairment in ejection fraction and had a smaller viability defect. Based on the data, a few questions arise: Why was ablation of resident macrophages beneficial at earlier time points? Are recruited macrophages affected in ∆FIRE mice (see above)? Overall, the manuscript could benefit if the claim of an adverse remodeling in ∆FIRE mice were discussed more carefully.

      3. Underlying mechanisms.

      The study did not functionally evaluate targets from transcriptomics to provide further mechanistic insights. It would be helpful if the authors discussed potential mechanisms of the differential effects of macrophages after ischemia in more detail.

      Other:

      - It is unclear why the authors performed RNAseq experiments 2 days after I/R (fig. 5/6), while the proposed functional phenotype occurred later.

      - A sample size of 2 animals per group appears very limited for RNAseq in ∆FIRE mice (fig. 6).

    1. eLife assessment

      This manuscript provides novel and important findings regarding the impact of noradrenergic signaling from the locus coeruleus on hippocampal gene expression. The locus coeruleus is the sole source of noradrenaline to the hippocampus and many rapid molecular changes induced by stress are regulated by noradrenaline. This manuscript provides a rigorous investigation into hippocampal genes uniquely regulated by noradrenaline in the presence or absence of stress. Data were collected and analyses were performed using solid methodology, and the results mostly convincingly support the conclusions, with few weaknesses. The study would benefit from a more comprehensive analysis of sex differences.

    2. Reviewer #1 (Public Review):

      Privitera et al., provide a comprehensive and rigorous assessment of how noradrenaline (NA) inputs from the locus coeruleus (LC) to the hippocampus regulate stress-induced acute changes in gene expression. They utilize RNA-sequencing with selective activation/inhibition of LC-NA activity using pharmacological, chemogenetic and optogenetic manipulations to identify a great number of reproducible sets of genes impacted by LC activation. It is noteworthy that this study compares transcriptomic changes in the hippocampus induced by stress alone, as compared with selective circuit activation/inhibition. This reveals a small set of genes that were found to be highly reproducible. Further, the publicly available data will be highly useful to the scientific community.

      A major strength of the study is the inclusion of both males and females. However, this aspect of the study is also its biggest weakness. While the experiments tested males and females, they were not powered for identifying sex differences. There is a vast literature documenting the inherent sex differences in the LC-NA system, both under resting and stress-evoked conditions, and this is a major missed opportunity to better understand whether these sex-specific differences have an impact at the genetic level in a major LC projection region. There are many instances in which sex effects are apparent but do not pass multiple testing correction due to low n's. The authors highlight one of them (Ctla2b) in supplemental figure 6. This gene is only upregulated by stress in females. It is appreciated that the manuscript provides an incredible amount of novel data, making the investigation of sex differences ambitious. Data are publicly available for others to conduct follow-up work, and it may therefore be useful to provide a list of the genes that differed on targeted interrogation of the dataset, with a clear statement that they did not survive multiple testing correction. This will aid further investigations that are powered to evaluate sex effects.
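
      To illustrate how a nominally significant sex effect can fail multiple testing correction, here is a minimal sketch of the Benjamini-Hochberg adjustment. The correction method actually used by the authors is not stated in this review, so the choice of Benjamini-Hochberg is an assumption, and the p-values below are invented purely for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal p-values for four genes (not taken from the study).
nominal = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(nominal)
```

      In this made-up example the third gene has a nominal p-value of 0.03 but an adjusted value just above 0.05, which is the situation the reviewer describes for underpowered sex comparisons.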

      A major finding of the present study is the involvement of noradrenergic transcriptomic changes occurring in astrocytic genes in the hippocampus. Given the stated importance of this finding within the discussion, it seems that some additional dialogue integrating this with current literature about the role of astrocytes in the hippocampus during stress or fear memory would be important.

      The comparison of the candidate genes activated by the LC in the present study (swim) with datasets published by Floriou-Servou et al., 2018 (novelty, swim, restraint, and footshock) is interesting and important. Were there other stressors identified in this paper or other publications that do not regulate these candidate genes? Further, can references be added to clarify to the reader that prior studies have shown that novelty, restraint and footshock all activate LC-NA neurons?

      Comparisons are made between chemogenetic studies and yohimbine, stating that fewer genes were activated by chemogenetic activation of LC neurons. There is clear justification for why this may occur, but a caveat may need to be mentioned: neuronal activation in the LC was assessed at 90 minutes for yohimbine versus 45 minutes for hM3Dq, and therefore it cannot be ruled out that differences in LC-NA activity levels might also contribute.<br /> Please add information about how virus or cannula placement was confirmed in these studies. Were missed placements also analyzed separately?

      Time of day for tissue collection used in genetic analysis should be reported for all studies conducted or reanalyzed.

    3. Reviewer #2 (Public Review):

      The present manuscript investigates the involvement of the locus coeruleus-noradrenaline system in the stress-induced transcriptional changes of the dorsal and ventral hippocampus, combining pharmacological, chemogenetic, and optogenetic techniques. The authors show that stress-induced release of noradrenaline from the locus coeruleus plays a modulatory role in the expression of a large number of genes in both ventral and dorsal hippocampus through activation of β-adrenoreceptors. Similar transcriptional responses were observed after optogenetic and chemogenetic stimulation of the locus coeruleus. Among all the genes analysed, the authors identified Dio2, Ppp1r3c, Ppp1r3g, Sik1, and Nr4a1 as the most affected in response to locus coeruleus-noradrenaline stimulation. By comparing their transcriptomic data with publicly available datasets, the authors revealed that these genes were upregulated upon exposure to different stressors. Additionally, the authors found that upregulation of the Ppp1r3c, Ppp1r3g, and Dio2 genes following swim stress was sustained from 90 min up to 2-4 hours after stress and that it was predominantly restricted to hippocampal astrocytes, while the Sik1 and Nr4a1 genes showed a broader cellular expression and a sharp rise and fall in expression within 90 min of stress onset.

      Overall, the paper is well written and provides a useful inventory of dorsal and ventral hippocampal gene expression upregulated by activation of the LC-NA system, which can be used as a starting point for more functional studies related to the effects of stress-induced physiological and pathological changes. However, I believe that the study would have benefited from a more comprehensive analysis of sex differences. Females were included in only one experiment, and analyses were restricted to the ventral hippocampus. Although the experiments were overall sound and the results broadly support the conclusions made, I think some methodological choices should be better explained and rationalized. For instance, the study focuses on identifying transcriptional changes in the hippocampus induced by stress-mediated activation of the LC-NA system; however, NA release following stress exposure and pharmacological or optogenetic manipulation was mostly measured in the cortex. Furthermore, behavioral changes following systemic pharmacologic or chemogenetic manipulation were observed in the open field task immediately after peripheral injections of yohimbine or CNO, respectively. Is this timing sufficient for both drugs to cross the blood-brain barrier and to exert behavioral effects? Finally, the study shows that activation of noradrenergic hippocampus-projecting LC neurons is sufficient to regulate the expression of several hippocampal genes. Although the necessity of these projections for the observed transcriptional effects has been tested to some extent through systemic blockade of beta-adrenoceptors, I believe the study would have benefited from more selective (optogenetic or chemogenetic) necessity experiments.

    1. Reviewer #1 (Public Review):

      This study uses single-cell genomics and gene pathway analysis to characterize the transcriptional effects of influenza H1N1 infection on hypothalamic cell types. The authors use droplet-based single-nuclei RNA-seq to profile genome-wide RNA expression in adult mouse hypothalamic cells at 3, 7, and 23 days after intranasal infection with the H1N1 influenza virus. Through state-of-the-art and rigorous computational methods, the authors find that many hypothalamic cell types, glia, and especially neurons, are transcriptionally altered by respiratory infection with a non-neurotropic influenza virus and that these alterations can persist for weeks and potentially affect cell type interactions that disrupt function. For instance, microglia shift towards a pro-inflammatory molecular phenotype at 3 days post-infection, while astrocytes and oligodendrocytes significantly alter their expression of oxidoreductase activity genes and transport genes, respectively, at 7 days post-infection. In addition, POMC neurons of the arcuate hypothalamus, which suppress appetite and increase metabolism, appear to be unusually sensitive to H1N1 infection, upregulating more genes than other hypothalamic neurons. The authors' thorough discussion of the findings raises interesting questions and hypotheses about the functional implications of the molecular changes they observed, including the physiological changes that can persist long after acute viral infection. Given the role of the hypothalamus in homeostasis, this work sheds light on potential mechanisms by which the H1N1 virus can disrupt cell function and organismal homeostasis beyond the cells that it directly infects.

    2. Reviewer #2 (Public Review):

      The new work from Lemcke et al. suggests that the infection with Influenza A virus causes such flu symptoms as sleepiness and loss of appetite through the direct action on the responsible brain region, the hypothalamus. To test this idea, the authors performed single-nucleus RNA sequencing of the mouse hypothalamus in controlled experimental conditions (0, 3, 7, and 23 days after intranasal infection) and analyzed changes in the gene expression in the specific cell populations. The key results are promising and spurring future research. After revision, the analysis was considerably improved. Alternative approaches were used for testing. Specifically, during the revision: 1) The annotation of cell types was considerably improved; 2) The authors performed an additional analysis comparing case-control studies (Cacoa), where they could partly confirm their earlier findings.

    1. eLife assessment

      This useful study presents data regarding the presence of synaptic proteins in the extracellular vesicles present in the blood of Parkinson's patients and healthy people, trying to correlate changes in such levels with the progression of Parkinson's symptoms. The results are preliminary, suggesting that these biomarkers might be useful for this purpose. The evidence is incomplete, and more adequate methods to isolate the extracellular vesicles and quantify the proteins are recommended. A better presentation of the results would also help the reader to understand the significance of the report, and more focused Introduction and Discussion sections are recommended.

    2. Reviewer #1 (Public Review):

      The study isolated extracellular vesicles (EV) from healthy controls (HCs) and Parkinson patients (PwP), using plasma from the venous blood of non-fasting people. Such EVs were characterized and validated by the presence of markers, their size, and their morphology. The main aim of the manuscript is to correlate the presence of synaptic proteins, namely SNAP-25, GAP-43, and SYNAPTOTAGMIN-1, normalized with HSP70, with the clinical progression of PwP. Changes in synaptic proteins have been documented in the CSF of Alzheimer's and Parkinson's patients. The demographics of participants are adequately presented. One important limitation, as well as a puzzling aspect, is that the authors did not find differences between groups at the beginning of the study or after one year, after age and sex adjustment. Tables in general are hard to follow. Specifically, Table 2 does not convey a clear message, either in the text or in the Table itself, and the per 100% of change needs to be explained in the corresponding legend. It is only when PwP were classified in the first quartile that a significantly greater deterioration was found. However, in the case of tremor, the top 25% had values going from 0.46-0.47 to 0.32-0.35, whereas the lower three quarters went from 0.33-0.34 to 0.27-0.28 depending on the protein analyzed. This needs to be clarified in the text. Table 3 is hard to read and some of the values seem repetitive, especially for tremor, AR, and PIGD. It looks as if Figure 2 represents the same information as Table 3. The text and figure legends are not helpful in guiding the reader to understand the presented information.

    3. Reviewer #2 (Public Review):

      Hong and collaborators investigated variations in the amount of synaptic proteins in plasma extracellular vesicles (EV) in Parkinson's Disease (PD) patients on one-year follow-up. Their findings suggest that plasma EV synaptic proteins may be used as clinical biomarkers of PD progression.

      It is a preliminary study using semi-quantitative analysis of synaptic proteins.

      The authors have a cohort of PD patients with clinical examination and expertise in EV purification. Regarding the latter, they could improve their description of EV purification. EVs may break into smaller EVs after freezing. Does this explain the relatively small size in their EV preparation? Do the authors refer to the MISEV guidelines for EV purity? Regarding synaptic protein quantification, western blotting may not be the best choice. ELISA and other multiplex arrays are available. How do the authors justify their choice? Did the authors try to sort plasma EVs by membrane-associated neuronal EV markers using either vesicle sorting or immunoprecipitation?

      Many technical aspects may be improved. Such technical questions weaken the authors' conclusions.

      The discussion is pretty long to justify the data. It may be shortened by adding some information in the introduction.

    1. eLife assessment

      This study presents a valuable new behavioral apparatus aimed at differentiating the strategies animals use to orient themselves in an environment. The evidence supporting the claims is solid, with statistical modeling of animal behavior. Overall, this study will attract the interest of researchers exploring spatial learning and memory.

    2. Reviewer #1 (Public Review):

      The authors design an automated 24-well Barnes maze with 2 orienting cues inside the maze, then model what strategies the mice use to reach the goal location across multiple days of learning. They consider a set of models and conclude that one of these models, a combined strategy model, best explains the experimental data.

      This study is concisely written and the results are concisely presented. The best fit model is reasonably simple and fits the experimental data well (at least the summary measures of the data that were presented).

      Major points:

      1. One combined strategy (once the goal location is learned) that might seem reasonable would be that the animal knows roughly where the goal is, but not exactly where, so it first uses a spatial strategy just to get to the first vestibule, then switches to a serial strategy until it reaches the correct vestibule. How well would such a strategy explain the data for the later sessions? The best combined model presented in the manuscript is one in which the animal starts with a roughly 50-50 chance of a serial (or spatial) strategy from the start vestibule (i.e. by the last session before the reversal the serial and spatial strategies are at ~50-50 in Fig. 5d). Is it the case that even after 15 days of training the animal starts with a serial strategy from its starting point approximately half of the time? The broader point is whether additional examination of the choices made by the animal, combined with consideration of a larger range of possible models, would be able to provide additional insight into the learning and strategies the animal uses.

      2. To clarify, in the Fig. 4 simulations, is the "last" vestibule visit of each trial, which is by definition 0, not counted in the plots of Fig. 4b? Otherwise, I would expect that vestibule 0 is overrepresented because a trial always ends with Vi = 0.
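
      The spatial-then-serial strategy raised in the first major point can be sketched as a small simulation. This is only an illustration under stated assumptions, not the authors' model: one vestibule per well, a Gaussian error on the first (spatial) choice with a hypothetical standard deviation `spatial_sd`, and a serial sweep in a randomly chosen direction. All function and parameter names are invented here.

```python
import random

N_VESTIBULES = 24  # the maze in the review has 24 wells; one vestibule per well is assumed


def signed_offset(position, goal, n=N_VESTIBULES):
    """Offset of a vestibule from the goal, wrapped to the range [-n/2, n/2)."""
    return (position - goal + n // 2) % n - n // 2


def spatial_then_serial_trial(goal, rng, spatial_sd=2.0, n=N_VESTIBULES):
    """Return the goal-relative offsets visited in one simulated trial."""
    # Spatial phase: a noisy first guess centred on the goal (spatial_sd is hypothetical).
    position = (goal + round(rng.gauss(0.0, spatial_sd))) % n
    visits = [signed_offset(position, goal, n)]
    # Serial phase: sweep neighbouring vestibules in one randomly chosen direction.
    step = rng.choice((-1, 1))
    while position != goal:
        position = (position + step) % n
        visits.append(signed_offset(position, goal, n))
    return visits


rng = random.Random(0)  # fixed seed so the sketch is reproducible
trials = [spatial_then_serial_trial(goal=5, rng=rng) for _ in range(200)]
```

      Every simulated trial ends at offset 0 by construction, which is the reason the second point asks whether the final visit is counted in the Fig. 4b histograms.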

    3. Reviewer #2 (Public Review):

      This paper uses a novel maze design to explore mouse navigation behaviour in an automated analogue of the Barnes maze. Overall I find the work to be solid, with the cleverly designed maze/protocol to be its major strength - however there are some issues that I believe should be addressed and clarified.

      1. Whilst I'm generally a fan of the experimental protocol, the design means that internal odor cues on the maze change from trial to trial, along with cues external to the maze such as the sounds and visual features of the recording room, ultimately making it hard for the mice to use a completely allocentric spatial 'place' strategy to navigate. I do not think there is a way to control for these conflicts between reference frames in the statistical modelling, but I do think these issues should be addressed in the discussion.

      2. Somewhat related - I could not find how the internal maze cues are moved for each trial to demarcate the new goal (i.e. the luminous cues) ? This should be clarified in the methods.

      3. It appears some data is being withheld from Figures 2&3? E.g. Days 3/4 from Fig 2b-f and Days 1-5 on for Fig 3. Similarly, Trials 2-7 are excluded from Fig 3. If this is the case, why? It should be clarified in the main text and Figure captions, preferably with equivalent plots presenting all the data in the supplement.

      4. I strongly believe the data and code should be made freely available rather than "upon reasonable request".

    4. Reviewer #3 (Public Review):

      Royer et al. present a fully automated variant of the Barnes maze to reduce experimenter interference and ensure consistency across trials and subjects. They train mice in this maze over several days and analyze the progression of mouse search strategies during the course of the training. By fitting models involving stochastic processes, they demonstrate that a model combined of the random, spatial, and serial processes can best account for the observed changes in mice's search patterns. Their findings suggest that across training days the spatial strategy (using local landmarks) was progressively employed, mostly at the expense of the random strategy, while the serial strategy (consecutive nearby vestibule check) is reinforced from the early stages of training. Finally, they discuss potential mechanistic underpinnings within brain systems that could explain such behavioral adaptation and flexibility.

      Strength:<br /> The development of an automated Barnes maze allows for more naturalistic and uninterrupted behavior, facilitating the study of spatial learning and memory, as well as the analysis of the brain's neural networks during behavior when combined with neurophysiological techniques. The system's design has been thoughtfully considered, encompassing numerous intricate details. These details include the incorporation of flexible options for selecting start, goal, and proximal landmark positions, the inclusion of a rotating platform to prevent the accumulation of olfactory cues, and careful attention given to automation, taking into account specific considerations such as the rotation of the maze without causing wire shortage or breakage. When combined with neurophysiological manipulations or recordings, the system provides a powerful tool for studying the spatial navigation system.<br /> The behavioral experiment protocols, along with the analysis of animal behavior, are conducted with care, and the development of behavioral modeling to capture the animal's search strategy is thoughtfully executed. It is intriguing to observe how the integration of these innovative stochastic models can elucidate the evolution of mice's search strategy within a variant of the Barnes maze.

      Weakness:<br /> 1. The development of the well-thought-out automated Barnes maze may attract the interest of researchers exploring spatial learning and memory. However, this aspect of the paper lacks significance due to insufficient coverage of the materials and methods required for readers to replicate the behavioral methodology for their own research inquiries.<br /> Moreover, as discussed by the authors, the methodology favors specialists who utilize wired recordings or manipulations (e.g. optogenetics) in awake, behaving rodents. However, it remains unclear how the current maze design, which involves trapping mice in start and goal positions and incorporating angled vestibules resulting in the addition of numerous corners, can be effectively adapted for animals with wired implants.

      2. Novelty: In its current format, the main axis of the paper falls on the analysis of animal behavior and the development of behavioral modeling. In this respect, while it is interesting to see how thoughtfully designed models can explain the evolution of mice search strategy in a maze, the conclusions offer limited novel findings that align with the existing body of research and prior predictions.

      3. Scalability and accessibility: While the approach may be intriguing to experts who have an interest in or are familiar with the Barnes maze, its presentation seems to primarily target this specific audience. Therefore, there is a lack of clarity and discussion regarding the scalability of behavioral modeling to experiments involving other search strategies (such as sequence or episodic learning), other animal models, or the potential for translational applications. The scalability of the method would greatly benefit a broader scientific community. In line with this view, the paper's conclusions heavily rely on the development of new models using custom-made codes. Therefore, it would be advantageous to make these codes readily available, and if possible, provide access to the processed data as well. This could enhance comprehension and enable a larger audience to benefit from the methodology.

      4. Cross-validation of models: The authors have not implemented any measures to mitigate the risk of overfitting in their modeling. It would have been beneficial to include at least some form of cross-validation with stochastic models to address this concern. Additionally, the paper lacks the presence of analytics or measures that assess and compare the performance of the models.

      5. Quantification of inter-animal variations in strategy development: It is important to investigate and address the possibility that not all animals recruit and develop the three processes (random, spatial, and serial) in a similar manner over days of training. It would be valuable to quantify the transition in strategy across days for each individual mouse and analyze how the population average, reflecting data from individual mice, corresponds to these findings. Currently, there is a lack of such quantification and analysis in the paper.
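
      The cross-validation requested in point 4 can be sketched as follows. This is a generic illustration, not the authors' pipeline: the two candidate models (a uniform "random" model and a smoothed categorical model fitted to the training folds) are simple stand-ins for the stochastic strategy models, and the data are invented. Models are compared by mean held-out log-likelihood, so a model that overfits the training folds is penalised on the test fold.

```python
import math


def fit_categorical(train, n_outcomes, alpha=1.0):
    """Fit outcome probabilities with add-alpha smoothing (alpha is an assumption)."""
    counts = [alpha] * n_outcomes
    for x in train:
        counts[x] += 1
    total = sum(counts)
    return [c / total for c in counts]


def log_likelihood(probs, data):
    """Log-likelihood of the observed outcomes under the given probabilities."""
    return sum(math.log(probs[x]) for x in data)


def k_fold_scores(data, n_outcomes, k=5):
    """Mean held-out log-likelihood per observation for the two stand-in models."""
    folds = [data[i::k] for i in range(k)]  # interleaved split into k folds
    uniform = [1.0 / n_outcomes] * n_outcomes
    scores = {"uniform": 0.0, "fitted": 0.0}
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        fitted = fit_categorical(train, n_outcomes)
        scores["uniform"] += log_likelihood(uniform, test)
        scores["fitted"] += log_likelihood(fitted, test)
    return {name: s / len(data) for name, s in scores.items()}


# Hypothetical first-choice outcomes coded 0..3 (not data from the study).
observations = [0] * 40 + [1] * 5 + [2] * 5
scores = k_fold_scores(observations, n_outcomes=4)
```

      The same loop could be run per mouse rather than on pooled data, which would also speak to the inter-animal variation raised in point 5.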

    1. eLife assessment

      By combining electrophysiological analysis of mutant channels and molecular dynamics simulations, this important study identifies a common binding site for two structurally distinct activators of KCNQ1-KCNE1 channels. The findings represent an important advance for the field, with convincing functional and computational data to support the claims. The work will be of interest to those studying the binding of small molecule drugs to membrane protein complexes.

    2. Reviewer #1 (Public Review):

      Chan et al. attempted to identify the binding sites or pockets for the KCNQ1-KCNE1 activator mefenamic acid. Because the KCNQ1-KCNE1 channel is responsible for cardiac repolarization, genetic impairment of either the KCNQ1 or KCNE1 gene can cause cardiac arrhythmias. Therefore, the development of activators without side effects is highly desired. Since mefenamic acid binding requires both KCNQ1 and KCNE1 subunits, the authors performed drug docking simulations using the KCNQ1-psKCNE1 structural model, in which six extracellular residues of KCNE3 (R53-Y58) are substituted with D39-A44 of KCNE1. They successfully identified some critical amino acid residues, including W323 of KCNQ1 and K41 and A44 of KCNE1. They then tested these residues by analyzing point mutants and confirmed that they were critical for the binding of the activator. They also examined another, structurally different activator, DIDS, reported that DIDS and mefenamic acid share the binding pocket, and concluded that the extracellular region composed of S1, S6, and KCNE1 is a generic binding pocket for IKs activators.

      The limitation of this study is that they had to use the KCNQ1-KCNE3-based structural model for the docking simulation. Although they only focused on the extracellular region substituted by the six amino acid residues of KCNE1, the binding mode or location of KCNE1 might be different from KCNE3. Another weakness is that unbinding may be facilitated in the closed state, whereas they had to use the open channel for the MD simulation. Therefore, their MD simulations do not necessarily reflect the unbinding process in the closed state, which should occur in the comparable electrophysiological experiments. Nevertheless, the data are solid and well support their conclusions. This work should be valuable to the field, not only for future drug design but also for the biophysical understanding of the binding/unbinding of drugs to ion channel complexes.

    3. Reviewer #2 (Public Review):

      The voltage-gated potassium channel KCNQ1/KCNE1 (IKs) plays important physiological roles, for instance in the repolarization phase of the cardiac action potential. Loss-of-function of KCNQ1/KCNE1 is linked to disease. Hence, KCNQ1/KCNE1 is a highlighted pharmacological target, and mechanistic insight into how channel modulators enhance the function of the channel is of great interest. The authors have, through several previous studies, provided mechanistic insights into how small-molecule activators like ML277 act on KCNQ1. However, less is known about the binding site and mechanism of action of other types of channel activators, which require KCNE1 for their effect. In this study, Chan and co-workers use molecular dynamics approaches, mutagenesis and electrophysiology to propose an overall similar binding site for the KCNQ1/KCNE1 activators mefenamic acid and DIDS, located at the extracellular interface of KCNQ1 and KCNE1. The authors propose an induced-fit model for the binding site, which critically engages residues in the N-terminus of KCNE1. Moreover, the authors discuss possible mechanisms by which drug binding to this site may enhance channel function.

      The authors address an important question, of broad relevance to researchers in the field. The manuscript is well written and the text easy to follow. A strength of the work is the parallel use of experimental and simulation approaches, which enables both functional testing and mechanistic predictions and interpretations. For instance, the authors have experimentally assessed the putative relevance of a large set of residues based on simulation predictions. A minor limitation is that not all residues of putative importance for drug binding/effects can be reliably evaluated in experiments, which is, however, clearly discussed by the authors and a challenge shared by electrophysiologists in the field.

    4. Reviewer #3 (Public Review):

      The authors identified the mefenamic acid (Mef) binding site and the DIDS binding site on the KCNQ1-KCNE1 complex. The authors also investigated the mechanism of interaction using electrophysiological recording, calculating the V1/2 of different mutants, and looking at the instantaneous and tail currents. The contribution of each residue within the binding pocket was analysed using GBSA and PBSA and traditional molecular dynamics simulation.

      The manuscript has been substantially revised from the previous version with a greater depth of computational analysis.

    1. eLife assessment

      This important study uses near full-length HIV-1 sequencing to examine proviral persistence in various tissues derived from three individuals who received antiretroviral therapy until time of death. Intact as well as defective HIV-1 proviruses are found at various anatomical sites including the central nervous system, results that are convincing and relevant for our understanding of latent viral reservoirs, especially in the brain.

    2. Reviewer #1 (Public Review):

      Despite durable viral suppression by antiretroviral therapy (ART), HIV-1 persists in cellular reservoirs in vivo. The viral reservoir in circulating memory T cells has been well characterized, in part due to the ability to safely obtain blood via peripheral phlebotomy from people living with HIV-1 infection (PWH). Tissue reservoirs in PWH are more difficult to sample and are less well understood. Sun and colleagues describe isolation and genetic characterization of HIV-1 reservoirs from a variety of tissues including the central nervous system (CNS) obtained from three recently deceased individuals at autopsy. They identified clonally expanded proviruses in the CNS in all three individuals.

      Strengths of the work include the study of human tissues that are under-studied and difficult to access, and the sophisticated near-full length sequencing technique that allows for inferences about genetic intactness and clonality of proviruses. The small sample size (n=3) is a drawback. Furthermore, two individuals were on ART for just one year at the time of autopsy and had T cells compatible with AIDS, and one of these individuals had a low-level detectable viral load (Figure S1). This makes generalizability of these results to PWH who have been on ART for years or decades and have achieved durable viral suppression and immune reconstitution difficult.

      While anatomic tissue compartment and CNS region accompany these PCR results, it is unclear which cell types these viruses persist in. As the authors point out, it is possible that these reservoir cells might have been infiltrating T cells from blood present at the time of autopsy tissue sampling. Cell type identification would greatly enhance the impact of this work. Several other groups have undertaken similar studies (with similar results) using autopsy samples (links below). These studies included more individuals, but did not make use of the near-full length sequencing described here. In particular, the Last Gift cohort, based at UCSD and led by Sara Gianella and Davey Smith, has established protocols for tissue sampling during autopsy performed soon after death.<br /> https://pubmed.ncbi.nlm.nih.gov/35867351/<br /> https://pubmed.ncbi.nlm.nih.gov/37184401/

      Overall, this small, thoughtful study contributes to our understanding of the tissue distribution of persistent HIV-1, and informs the ongoing search for viral eradication.

    3. Reviewer #2 (Public Review):

      The manuscript by Sun et al. applies the powerful technology of profiling viral DNA sequences in numerous anatomical sites in autopsy samples from participants who maintained their antiviral therapy up to the time of death. The sequencing is of high quality in using end-point dilution PCR to generate individual viral genomes. There is a thoughtful discussion, although there are points that we disagree with. This is an important data set that increases the scope of how the field thinks about the latent reservoir with a new look at the potential of a reservoir within the CNS.

      1. The participants are very different in their exposure to HIV replication and disease progression. Participant 1 appears to have been on ART for most of the time after diagnosis of infection (16 years) and died with a high CD4 T cell count. The other two participants had only one year on ART and died with relatively low CD4 T cell counts (under 200). This could lead to differences in the nature of the reservoir. In this regard, the amount of DNA per million cells appears to be about 10-fold lower across the compartments sampled for participant 1. Also, one might expect fewer intact proviruses surviving after 16 years on ART compared to only 1 year on ART. The depth of sampling may be too limited and the number of participants too few to assess if these differences are features of these participants because of their different exposures to HIV replication. On the positive side, finding similarities across these big differences in participant profiles does reinforce the generalizability of the observations.

      2. The following analysis will be limited by sampling depth but where possible it would be interesting to compare the ratio of intact to defective DNA. A sanctuary might allow greater persistence of cells with intact viral DNA even without viral replication (i.e. reduced immune surveillance). Detecting one or two intact proviruses in a tissue sample does not lend itself to a level of precision to address this question, but statistical tests could be applied to infer when there is sampling of 5 or more intact proviruses to determine if their frequency as a ratio of total DNA in different anatomical sites is similar or different. This would allow adjustment for the different amount of viral DNA in different compartments while addressing the question of the frequency of intact versus defective proviruses. One complication in this analysis is if there was clonal expansion of a cell with an intact genome, which would represent a fortuitous over-representation of intact genomes in that compartment.

      3. The key point of this work is that the participants were on therapy up to the time of death ("enforcing" viral latency). The predominance of defective genomes is consistent with this assumption. Is there data from untreated infections to compare to as a signature of whether the viral DNA population was under selective pressure from therapy or not? Presumably untreated infections contain more intact DNA relative to total DNA. This would represent independent evidence that therapy was in place.

      4. There are several points in Figure 5 to raise about V3 loop sequences. The analysis includes a large number of "undetermined" sequences that did not have a V3 loop sequence to evaluate. We would argue it is a fair assumption that the deleted proviruses have the same distribution of X4 and R5 sequences as the ones that have a V3 sequence to evaluate. In this view it would be possible to exclude the sequences for which there is no data and just look at the ratio of X4 and R5 in the different compartments, specifically does this ratio change in a statistically significant way in different compartments? The authors use "CCR5 and non-CCR5" as the two entry phenotypes. The evidence is pretty strong that the "other" coreceptor the virus routinely uses is CXCR4, and G2P is providing the FPR for X4 viruses. Perhaps the authors are trying to create some space for other coreceptors on microglia, but we are pretty sure what they are measuring is X4 viruses, especially in this late disease state of participant 2. Finally, we have previously observed that the G2P FPR score of <2 is a strong indicator of being X4, FPR scores between 2 and 10 have a 50% chance of being X4, and FPR scores above 10 are reliably R5 (PMID27226378). In addition, we observed that X4 viruses form distinct phylogenetic lineages. The authors might consider these features of X4 viruses in the evaluation of their sequences. Specifically, it would be helpful to incorporate the FPR scores of the reported X4 viruses.

      5. We have puzzled over the many reports of different cell types in the CNS being infected. When we examined these cell types (both as primary cells and as iPSC-derived cells), all cells could be infected with a version of HIV that had the promiscuous VSV-G protein on the virus surface as a pseudotype. However, only macrophages and microglia could be infected using the HIV Env protein, and then only if it was the M-tropic version and not the T-tropic version (PMID35975998). RNAseq analysis was consistent with this biological readout in that only macrophages and microglia expressed CD4; neurons and astrocytes did not. From the virology point of view, astrocytes are no more infectable than neurons.

      6. The brain gets exposed to virus from the earliest stages of infection but this is not synonymous with viral replication. Most of the time there is virus in the CSF but it is present at 1-10% of the level of viral load in the blood and phylogenetically it looks like the virus in the blood, most consistent with trafficking T cells, some of which are infected (PMID25811757). The fact that the virus in the blood is almost always T cell-tropic in needing a high density of CD4 for entry makes it unlikely that monocytes are infected (with their low density of CD4) and thus are not the source of virus found in the CNS. It seems much more likely that infected T cells are the "Trojan Horse" carrying virus into the CNS.

      7. While all participants were taking antiretroviral therapy at the time of their death, they were not all suppressed when the tissues were collected. The authors are careful not to mention "suppressive ART" in the text, which is appreciated. However, the title should be changed to also reflect this fact.
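
      The comparisons suggested in points 2 and 4 (intact versus defective proviruses, or X4 versus R5 sequences, across compartments) can be sketched as a 2x2 contingency test. The following is a minimal illustration only, not an analysis from the manuscript: the function names are hypothetical, the FPR bins simply restate the thresholds quoted in point 4, and Fisher's exact test is one reasonable choice for small counts rather than a prescribed method.

```python
from math import comb


def classify_fpr(fpr_percent):
    """Bin a geno2pheno FPR (in percent) using the thresholds quoted in point 4.

    <2: likely X4; 2-10: ambiguous (roughly 50% chance of X4); >10: likely R5.
    """
    if fpr_percent < 2:
        return "X4"
    if fpr_percent <= 10:
        return "ambiguous"
    return "R5"


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are compartments; columns are categories (e.g. intact, defective).
    Returns the sum of probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts (not from the study): intact vs defective proviruses
# in two compartments.
p_value = fisher_exact_two_sided(6, 94, 5, 45)
```

      Clonally expanded sequences would need to be collapsed before counting, since the test assumes independent observations.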

    1. eLife assessment

      This useful study presents a genetically encoded barcoding system that could not only advance transcriptomic studies but that also has potential further applications, such as in high-throughput population-scale behavioral measurements. The evidence supporting the claims of the authors is currently inadequate to demonstrate that the method is indeed greatly superior to existing approaches in behavioural and transcriptomic studies.

    2. Reviewer #1 (Public Review):

      The aim of this paper is to describe a novel method for genetic labelling of animals or cell populations, using a system of DNA/RNA barcodes.

      Strengths:
      • The authors' attempt at providing a straightforward method for multiplexing Drosophila samples prior to scRNA-seq is commendable. The prospect of being able to load multiple samples on a 10X Chromium without antibody labelling is appealing.
      • The authors are generally honest about potential issues in their method, and areas that would benefit from future improvement.
      • The article reads well. Graphs and figures are clear and easy to understand.

      Weaknesses:
      • The usefulness of TaG-EM for phototaxis, egg laying or fecundity experiments is questionable. The behaviours presented here are all easily quantifiable, either manually or using automated image-based quantification, even when they include a relatively large number of groups and replicates. Despite their claims (e.g., L311-313), the authors do not present any real evidence about the cost- or time-effectiveness of their method in comparison to existing quantification methods.
      • Behavioural assays presented in this article have clear outcomes, with large effect sizes, and therefore do not really challenge the efficiency of TaG-EM. By showing a T-maze in Fig 1B, the authors suggest that their method could be used to quantify more complex behaviours. Not exploring this possibility in this manuscript seems like a missed opportunity.
      • Experiments in Figs S3 and S6 suggest that some tags have a detrimental effect on certain behaviours or on GFP expression. Although the authors rightly acknowledge these issues, they do not investigate their causes. Unfortunately, this calls into question the overall suitability of TaG-EM, as other barcodes may also affect certain aspects of the animal's physiology or behaviour. Revising barcode design will be crucial to make sure that sequences with potential regulatory function are excluded.
      • For their single-cell experiments, the authors have used the 10X Genomics method, which relies on sequencing just a short segment of each transcript (usually 50-250bp - unknown for this study as read length information was not provided) to enable its identification, with the matching paired-end read providing cell barcode and UMI information (Macosko et al., 2015). With average fragment length after tagmentation usually ranging from 300-700bp, a large number of GFP reads will likely not include the 14bp TaG-EM barcode. When a given cell barcode is not associated with any TaG-EM barcode, demultiplexing is impossible. This is a major problem, which is particularly visible in Figs 5 and S13. In 5F, BC4 is only detected in a couple of dozen cells, even though the Jon99Ciii marker of enterocytes is present in a much larger population (Fig 5C). Therefore, in this particular case, TaG-EM fails to detect most of the GFP-expressing cells. Similarly, in S13, most cells should express one of the four barcodes; however, many of them (maybe up to half - this should be quantified) do not. Therefore, the claim (L277-278) that "the pan-midgut driver were broadly distributed across the cell clusters" is misleading. Moreover, the hypothesis that "low expressing driver lines may result in particularly sparse labelling" (L331-333) is at least partially wrong, as Fig S13 shows that the same Gal4 driver can lead to very different levels of barcode coverage.
      • Comparisons between TaG-EM and other, simpler methods for labelling individual cell populations are missing. For example, how would TaG-EM compare with expression of different fluorescent reporters, or a strategy based on the brainbow/flybow principle?
      • FACS data is missing throughout the paper. The authors should include data from their comparative flow cytometry experiment of TaG-EM cells with or without additional hexameric GFP, as well as FSC/SSC and fluorescence scatter plots for the FACS steps that they performed prior to scRNA-seq, at least in supplementary figures.
      • The authors should show the whole data described in L229, including the cluster that they chose to delete. At the very least, they should provide more information about how many cells were removed. In any case, the fact that their data still contains a large amount of debris and dead cells despite sorting out PI-negative cells with FACS and filtering low-abundance barcodes with Cellranger is concerning.

      Overall, although a method for genetically tagging cell populations prior to multiplexing in single-cell experiments would be extremely useful, the method presented here is inadequate. However, despite all the weaknesses listed above, the idea of barcodes expressed specifically in cells of interest deserves more consideration. If the authors manage to improve their design to resolve the major issues and demonstrate the benefits of their method more clearly, then TaG-EM could become an interesting option for certain applications.

    3. Reviewer #2 (Public Review):

      In this manuscript, Mendana et al developed a multiplexing method - Targeted Genetically-Encoded Multiplexing or TaG-EM - by inserting a DNA barcode upstream of the polyadenylation site in a Gal4-inducible UAS-GFP construct. This multiplexing method can be used for population-scale behavioral measurements or can potentially be used in single-cell sequencing experiments to pool flies from different populations. The authors created 20 distinctly barcoded fly lines. First, TaG-EM was used to measure phototaxis and oviposition behaviors. Then, TaG-EM was applied to the fly gut cell types to demonstrate its applications in single-cell RNA-seq for cell type annotation and retrieval of cell origin.

      This TaG-EM system can be useful for multiplexed behavioral studies based on next-generation sequencing (NGS) of pooled samples and for transcriptomic studies. I don't have major concerns about the first application, but I think the scRNA-seq part has several major issues and needs to be further optimized.

      Major concerns:
      1. It seems the barcode detection rate is low according to Fig S9 and Fig 5F, J and N. Could the authors evaluate the detection rate? If the detection rate is too low, it can cause problems when it is used to decode cell types.
      2. Unsuccessful amplification of TaG-EM barcodes: The authors attempted to amplify the TaG-EM barcodes in parallel to the gene expression library preparation but encountered difficulties, as the resulting sequencing reads were predominantly off-target. This unsuccessful amplification raises concerns about the reliability and feasibility of this amplification approach, which could affect the detection and analysis of the TaG-EM barcodes in future experiments.
      3. For Fig 5, the single-cell clusters are not annotated. It is not clear which cell types correspond to which clusters. So, it is difficult to evaluate the accuracy of the assignment of barcodes.
      4. The scRNA-seq UMAP in Fig 5 is a bit strange to me. The fly gut epithelium contains only a few major cell types, including ISC, EB, EC, and EE. However, the authors showed 38 clusters in Fig 5B. It is true that some cell types, like EE (Guo et al., 2019, Cell Reports), have sub-populations, but I don't expect they will form this many sub-types. There are many peripheral small clusters that are not shown in other gut scRNA-seq studies (Hung et al., 2020; Li et al., 2022 Fly Cell Atlas; Lu et al., 2023 Aging Fly Cell Atlas). I suggest the authors try different data-processing methods to validate their clustering result.
      5. Different gut drivers, PMC-, PC-, EB-, EC-, and EE-GAL4, were used. The authors should carefully characterize the expression of these GAL4 drivers in larval guts and validate the sequencing data. For example, does the ratio of each cell type in Fig 5B reflect the in vivo cell type ratio? The authors used cell-type markers mostly based on the knowledge from adult guts, but there are significant morphological and cell ratio differences between larval and adult guts (e.g., Mathur...Ohlstein, 2010 Science).
      6. Doublets are removed based on the co-expression of two barcodes in Fig 5A. However, there are also other possible doublets, for example, from cells with the same barcode or when one cell doesn't have a detectable barcode. Did the authors try other computational approaches to remove doublets, like DoubletFinder (McGinnis et al., 2019) and Scrublet (Wolock et al., 2019)?
      7. Did the authors remove ambient RNA, which is a common issue for scRNA-seq experiments?
      8. Why does TaG-EM barcode #4, driven by EC-GAL4, not label other classes of enterocyte cells such as betaTry+ ECs (Figures 5D-E)? Similarly, why does TaG-EM barcode #9, driven by EE-GAL4, not label all EEs? Again, it is difficult to evaluate this part without proper data processing and accurate cell type annotation.
      9. For Figure 2, when the authors tested different combinations of groups with various numbers of barcodes, they found remarkable consistency for the even groups. Once the numbers start to increase to 64, barcode abundance becomes highly variable (range of 12-18% for both male and female). I think this would be problematic because the differences seen between two groups, for example, may be due to the barcode selection rather than an actual biologically meaningful difference.
      10. Barcode #14 cannot be reliably detected in the oviposition experiment. This suggests that the BC 14 fly line might have additional mutations in the attP2 chromosome arm that affect this behavior. Perhaps other barcode lines also have unknown mutations that would cause issues for other untested behaviors. One possible solution is to back-cross all 20 lines with wild-type flies of the same genetic background for >7 generations so that all these lines have the same (or very similar) genetic background. This strategy is common for aging and behavior assays.
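
      The detection rate requested in concern 1 and the barcode-based doublet call discussed in concern 6 can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the input structure (a mapping from cell ID to per-barcode UMI counts), the function name and the `min_umis` threshold are all assumptions. As concern 6 notes, a call based on barcode co-expression cannot catch doublets formed by two cells carrying the same barcode, or by one labelled and one unlabelled cell.

```python
def summarize_barcodes(cell_barcode_counts, min_umis=1):
    """Summarize TaG-EM barcode detection per cell.

    cell_barcode_counts: dict mapping cell ID -> {barcode name: UMI count}
    (hypothetical structure). A barcode counts as detected in a cell when
    its UMI count is at least min_umis.
    """
    unlabeled, singlets, multiplets = [], {}, []
    for cell, counts in cell_barcode_counts.items():
        detected = sorted(bc for bc, n in counts.items() if n >= min_umis)
        if not detected:
            unlabeled.append(cell)          # cannot be demultiplexed
        elif len(detected) == 1:
            singlets[cell] = detected[0]    # unambiguous assignment
        else:
            multiplets.append(cell)         # two or more barcodes co-detected
    n_cells = len(cell_barcode_counts)
    detection_rate = (n_cells - len(unlabeled)) / n_cells if n_cells else 0.0
    return {
        "detection_rate": detection_rate,
        "singlets": singlets,
        "multiplets": multiplets,
        "unlabeled": unlabeled,
    }


# Hypothetical toy input: four cells, two barcodes.
toy = {
    "cell1": {"BC4": 3},
    "cell2": {},
    "cell3": {"BC4": 1, "BC9": 2},
    "cell4": {"BC9": 0},
}
summary = summarize_barcodes(toy)
```

      Reporting the detection rate per cluster, rather than only overall, would show directly whether sparse labelling is driver-specific.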

    4. Reviewer #3 (Public Review):

      The work addresses challenges in linking anatomical information to transcriptomic data in single-cell sequencing. It proposes a method called Targeted Genetically-Encoded Multiplexing (TaG-EM), which uses genetic barcoding in Drosophila to label specific cell populations in vivo. By inserting a DNA barcode near the polyadenylation site in a UAS-GFP construct, cells of interest can be identified during single-cell sequencing. TaG-EM enables various applications, including cell type identification, multiplet droplet detection, and barcoding experimental parameters. The study demonstrates that TaG-EM barcodes can be decoded using next-generation sequencing for large-scale behavioral measurements. Overall, the results are solid in supporting the claims and will be useful for a broader fly community. I have only a few comments below:

      Specific comments:

      1. The authors mentioned that the results of structured pool tests in Fig. 2 showed a high level of quantitative accuracy in detecting the TaG-EM barcode abundance. Although the data were generally consistent with the input values in most cases, there were some obvious exceptions such as barcode 1 (under-represented) and barcodes 15, 20 (over-represented). It would be great if the authors could comment on these and provide a guideline for choosing the appropriate barcode lines when implementing this TaG-EM method.

      2. In Supplemental Figure 6, the authors showed GFP antibody staining data with 20 different TaG-EM barcode lines. The variability in GFP antibody staining results among these different TaG-EM barcode lines raises concerns about using these lines for FACS sorting on native GFP followed by sequencing. I would expect native GFP expression to be weaker and much more variable than the GFP antibody staining results shown in Supplemental Figure 6. If this is the case, variation in tissue-specific expression among TaG-EM barcode lines will likely be a confounding factor.

      3. As the authors mentioned in the manuscript, multiple barcodes for one experimental condition would be a better experimental design. Could the authors suggest a recommended number of barcodes for each experimental condition? 3? 4? Or more? Also, it would be great if the authors could provide a short discussion on the cost of the TaG-EM method. For example, for the phototaxis assay, if it is much more expensive to perform TaG-EM as compared to manually scoring the preference index by videotaping, what would be the practical considerations or benefits of doing TaG-EM over manual scoring?

    1. Author Response:

      We are very grateful to the Editors and the three Reviewers for their valuable reviews of our submission. We will take into account all the comments and provide a revised manuscript with our point-by-point responses as soon as possible. In the meantime, we would like to respond provisionally to the reservation expressed in the eLife editorial assessment and by Reviewer #3 about the validity of our models for studying the neurobehavioral consequences of purine deficiency and the pathogenesis of Lesch-Nyhan disease (LND) in Drosophila.

      Two enzymes are responsible for purine recycling in mammals: APRT and HGPRT. Only HGPRT deficiency causes neurobehavioral disturbances and LND in humans, while APRT deficiency leads to metabolic deficits without neurological or behavioral symptoms. In contrast, as we have been able to confirm, Drosophila expresses a single purine recycling enzyme, Aprt, and no HGPRT or HGPRT-like activity. Here we propose different ways to model LND in Drosophila, based either on Aprt deficiency or the expression of mutant HGPRT.

      Although it may be difficult to accept that the inactivation of a different gene in a distant organism could be a good model for LND, we have found that, in contrast to humans, Aprt deficiency has both metabolic and neurobehavioral consequences in Drosophila. This suggested that Aprt, being the unique fly purine recycling enzyme, might share the enzymatic function of human APRT and the neurodevelopmental function of human HGPRT, because its inactivation should recapitulate all pathological consequences of a lack of purine recycling in this organism, and in particular in the brain.

      The statement by Reviewer #3 that “it is unknown whether Aprt is also a structural homologue [of HGPRT]” is not accurate. APRT and HGPRT are known to be functionally and structurally related. Both human APRT and HGPRT belong to the type I PRTase family, identified by a conserved binding motif for phosphoribosyl pyrophosphate (PRPP), the substrate from which the phosphoribosyl group is transferred to purines. This binding motif is only found in PRTases from the nucleotide synthesis and salvage pathways (see: Sinha and Smith (2001) Curr Opin Struct Biol 11(6):733-9. PMID: 11751055). The purine substrates adenine, hypoxanthine and guanine share the same chemical skeleton and APRT can bind hypoxanthine, indicating that APRT and HGPRT also share similarities in their substrate binding sites (Ozeir et al. (2019) J Biol Chem. 294(32):11980-11991. PMID: 31160323). Moreover, Drosophila Aprt and human APRT are closely related, as the amino acid sequences of APRTs have been highly conserved throughout evolution (shown in Fig. S3B of our paper). We apologize for not providing this information in our original submission. This point will be made clearer in the revised article.

      Here we report a body of evidence that Drosophila can be used as a model to study LND. A strong argument, we believe, is that the same drugs have been found effective in rescuing the seizure-like phenotype in Aprt-deficient flies (Figure 7 in our manuscript) and the viability of fibroblasts and neural stem cells derived from iPSCs of LND patients, in which de novo purine synthesis was prevented (as discussed on page 37). This is a good sign that Drosophila could be used to identify new genetic targets and pharmacological compounds capable of rescuing HGPRT mutations in humans.

      Finally, we would like to emphasize that Reviewer #1 and Reviewer #2 expressed confidence in the potential usefulness of our work to better understand and treat LND in their public reviews. Reviewer #1 indeed stated that: “The findings provide a new example of how manipulating specific genes in the fruit fly allows the study of fundamental molecular processes that are linked to a human disease”, and Reviewer #2 further wrote: “Altogether, these are very important and fundamental findings that convincingly demonstrate the establishment of a Drosophila model for the scientific community to investigate LND, to carry out drug testing screens and find cures”, and added: “To conclude, this is a fundamental piece of work that opens the opportunity for the broader scientific community to use Drosophila to investigate LND”.

    2. eLife assessment

      The manuscript looks at how dysregulated purine metabolism impacts the behavior, neural circuits and survival of the fruit fly with the possibility of generating a model for Lesch-Nyhan Disease. The study is valuable but the strength of evidence is currently incomplete regarding the use of this model for LND.

    3. Reviewer #1 (Public Review):

      The current manuscript focuses on the adenine phosphoribosyltransferase (Aprt) and how the lack of its function affects nervous system function. It puts it into the context of Lesch-Nyhan disease, a rare hereditary disease linked to hypoxanthine-guanine phosphoribosyltransferase (HGPRT). Since HGPRT appears absent in Drosophila, the study focuses initially on Aprt and shows that aprt mutants have a decreased life-span and altered uric acid levels (the latter can be attenuated by allopurinol treatment). Moreover, aprt mutants show defects in locomotor reactivity behaviors. A comparable phenotype can be observed when specifically knocking down aprt in dopaminergic cells. Interestingly, glia-specific knock-down also caused a similar behavioral defect, which could not be restored when re-expressing UAS-aprt, while neuronal re-expression did restore the mutant phenotype. Moreover, aprt mutants, as well as pan-neuronal and pan-neuronal plus glia RNAi for aprt, showed sleep defects. Based on immunostaining, dopamine levels are increased; UPLC shows that adenosine levels are reduced, and PCR showed an increase in Ent2 levels (but not AdoR). Moreover, aprt mutants display seizure-like behaviors, which can be partly restored by purine feeding (adenosine and N6-methyladenosine). Finally, expression of the human HGPRT also causes locomotor defects.

      The authors provide a wide range of genetic experimental data to assess behavior and some molecular assessment on how the defects may emerge. It is clearly written, and the arguments follow the experimental evidence that is provided.

      The findings provide a new example of how manipulating specific genes in the fruit fly allows the study of fundamental molecular processes that are linked to a human disease.

    4. Reviewer #2 (Public Review):

      The manuscript by Petitgas et al demonstrates that loss of function for the only enzyme responsible for the purine salvage pathway in fruit-flies reproduces the metabolic and neurologic phenotypes of human patients with Lesch-Nyhan disease (LND). LND is caused by mutations in the enzyme HGPRT, but this enzyme does not exist in fruit-flies, which instead only have Aprt for purine recycling. They demonstrate that mutants lacking the Aprt enzyme accumulate uric acid, which like in humans can be rescued by feeding flies allopurinol, and have decreased longevity, locomotion and sleep impairments and seizures, with striking resemblance to HGPRT loss of function in humans. They demonstrate that both loss of function throughout development or specifically in the adult ubiquitously or in all neurons, or dopaminergic neurons, mushroom body neurons or glia, can reproduce the phenotypes (although knock-down in glia does not affect sleep). They show that the phenotypes can be rescued by over-expressing a wild-type form of the Aprt gene in neurons. They identify a decrease in adenosine levels as the cause underlying these phenotypes, as adenosine is a neurotransmitter functioning via the purinergic adenosine receptor in neurons. In fact, feeding flies throughout development and in the adult with either adenosine or m6A could prevent seizures. They also demonstrate that loss of adenosine caused a secondary up-regulation of ENT nucleoside transporters and of dopamine levels, which could explain the phenotypes of decreased sleep and hyperactivity at night. Finally, they provide the remarkable finding that over-expression of the human mutant HGPRT gene but not its wild-type form in neurons impaired locomotion and induced seizures. This means that the human mutant enzyme does not simply lack enzymatic activity, but it is toxic to neurons in some gain-of-function form. Altogether, these are very important and fundamental findings that convincingly demonstrate the establishment of a Drosophila model for the scientific community to investigate LND, to carry out drug testing screens and find cures.

      The experiments are conducted with great rigour, using appropriate and exhaustive controls, and on the whole the evidence does convincingly or compellingly support the claims. The exception is an instance when authors mention 'data not shown' and here data should either be provided, or claims removed: "feeding flies with adenosine or m6A did not rescue the SING phenotype of Aprt mutants (data not shown)". It is important to show these data (see below).

      Sleep is used to refer to the failure of flies to move across a beam for more than 5 minutes. However, lack of movement does not necessarily mean the flies are asleep, as they could be un-motivated to move (which could reflect abnormal dopamine levels) or engaged in incessant grooming instead. These differences are important for future investigation into the neural circuits affected by LND.

      The authors claim that, based on BLAST genome searches, there are no HPRT1 (encoding HGPRT) homologues in Drosophila. However, such a claim would instead require structure-based searches that take into account structural conservation despite high sequence divergence, as this may not be detected by regular BLAST.

      This work raises important questions that still need resolving. For example, the link between uric acid accumulation, reduced adenosine levels, increased dopamine and behavioural neurologic consequences remains unresolved. It is important that they show that restoring uric acid levels rescues neither locomotion nor seizure phenotypes, as this means that uric acid accumulation is not the cause of the neurologic phenotypes. Instead, their data indicate adenosine deficiency is the cause. However, one weakness is that for the manipulations they test some behaviours but not all. The authors could attempt to improve the link between mechanism and behaviour by testing whether over-expression of Aprt in neurons or glia, throughout development or in the adult, and feeding with adenosine and m6A can rescue each of the behavioural phenotypes handled: lifespan, SING, sleep and seizures. The authors could also attempt to knock-down dopamine levels concomitantly with feeding with adenosine or m6A to see if this rescues the phenotypes of SING and sleep. Visualising the neural circuits that express the adenosine receptor could reveal why the deficit in adenosine can affect distinct behaviours differentially, and which neurologic phenotypes are primary and which secondary consequences of the mutations. This would allow them to carry out epistasis analysis by knocking-down AdoR in specific circuits, whilst at the same time feeding Aprt mutants with adenosine.

      The revelation that the mutant form of human HGPRT has toxic effects is very intriguing and important and it invites the community to investigate this further into the future.

      To conclude, this is a fundamental piece of work that opens the opportunity for the broader scientific community to use Drosophila to investigate LND.

    5. Reviewer #3 (Public Review):

      The study attempts to develop a Drosophila model for the human disease of LND. The issue here, and the main weakness of this study, is that Drosophila does not express the enzyme, HGPRT, which when mutated causes LND. The authors, instead, mutate the functionally-related Drosophila Aprt enzyme. However, it is unknown whether Aprt is also a structural homologue. Because of this, it will likely not be possible to identify pharmacological compounds that rescue HGPRT activity via a direct interaction (unless modelling predicts high conservation of substrate binding pocket between the two enzymes, etc). An additional weakness is that the study does not identify a molecule that may act as a lead compound for further development for treating LND. Rather, the various rescues reported are selective for only a subset of the disease-associated phenotypes. Thus, whilst informative, this first section of the study does not meet the study ambitions.

      The second approach adopted is to express a 'humanised mutated' form of HGPRT in Drosophila, which holds more promise for the development of a pharmacological screen. In particular, the locomotor defect is recapitulated but the seizure-like activity, whilst reported as being recapitulated, is debatable. A recovery time of 2.3 seconds is very much less than timings for typical seizure mutants. Nevertheless, the SING behaviour could be sufficient to screen against. However, this is not explored.

      In summary, this is a largely descriptive study reporting the behavioural effects of an Aprt loss-of-function mutation. RNAi KD and rescue expression studies suggest that a mix of neurons (particularly dopaminergic and possibly adenosinergic signalling pathways) and glia is involved in the behavioural phenotypes affecting locomotion, sleep and seizure. There is insufficient evidence to have confidence that the Aprt fly model will prove valuable for understanding / treating LND.

    1. Author Response:

      We thank the reviewers for the constructive feedback and detailed reviews. To avoid any misunderstandings, we would like to add the following clarification. The comments from Reviewer 3 seem to indicate that in our simulator, synthspot, we mix cells from different data sets and even different species to create synthetic spots. The comment is the following:

      The choice to blend mouse and human scRNA-seq datasets in the simulation setup for generating synthetic spots is not ideal due to its departure from a realistic biological scenario.

      We would like to point out that the synthetic spots we create for the silver standard data sets are always sampled from the same scRNAseq or snRNAseq data set to keep the simulations as biologically plausible as possible.

      For each of the 6 public data sets, we create 9 different synthetic data sets, resulting in a total of 54 synthetic data sets. Each of these 9 data sets corresponds to a different abundance pattern, with spots representing combinations of cells sampled from this same public data set. Hence, these synthetic data sets always reflect cell types that actually co-occur in the tissue sections used to generate the underlying public scRNAseq or snRNAseq data set.
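
      The principle described above, that every synthetic spot is built from cells of a single reference data set, can be illustrated with a short sketch. This is not synthspot's implementation: the function name, the input structure and the `type_weights` parameter (standing in for an abundance pattern) are hypothetical, and cells are drawn with replacement for simplicity.

```python
import random
from collections import Counter


def make_synthetic_spot(cells, n_cells, rng, type_weights=None):
    """Build one synthetic spot from cells of a single reference data set.

    cells: list of (cell_type, {gene: count}) tuples, all from ONE data set.
    type_weights: optional {cell_type: weight} controlling how often each
    cell type is drawn (a stand-in for an abundance pattern).
    Returns (summed expression per gene, cell-type proportions in the spot).
    """
    if type_weights is None:
        weights = [1.0] * len(cells)
    else:
        weights = [type_weights.get(cell_type, 0.0) for cell_type, _ in cells]
    # Sample cells with replacement according to the weights.
    sampled = rng.choices(cells, weights=weights, k=n_cells)

    expression, composition = Counter(), Counter()
    for cell_type, counts in sampled:
        expression.update(counts)      # add this cell's counts gene by gene
        composition[cell_type] += 1
    proportions = {ct: n / n_cells for ct, n in composition.items()}
    return dict(expression), proportions


# Hypothetical toy reference with two cell types and two genes.
reference = [
    ("neuron", {"geneA": 2, "geneB": 0}),
    ("neuron", {"geneA": 1, "geneB": 1}),
    ("astrocyte", {"geneA": 0, "geneB": 2}),
]
spot_expr, spot_props = make_synthetic_spot(reference, 10, random.Random(0))
```

      Because the true proportions are returned alongside the summed counts, each spot carries its own ground truth for scoring a deconvolution method.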

    2. Reviewer #1 (Public Review):

      Cell type deconvolution is one of the early and critical steps in the analysis and integration of spatial omic and single cell gene expression datasets, and there are already many approaches proposed for the analysis. Sang-aram et al. provide an up-to-date benchmark of computational methods for cell type deconvolution.

      In doing so, they provide some (perhaps subtle) additional elements that I would say are above the average for a benchmarking study: i) a full Nextflow pipeline to reproduce their analyses; ii) methods implemented in Docker containers (which can be used by others to run their datasets); iii) a fairly decent assessment of their simulator compared to other spatial omics simulators. A key aspect of their results is that they are generally very concordant between real and synthetic datasets. It is also important that the authors include an appropriate "simpler" baseline method to compare against; surprisingly, several methods performed below this baseline. Overall, this study has the potential to set the standard of benchmarks higher because of these elements.

      The only weakness of this study that I can readily see is that this is a very active area of research and we may see other types of data start to dominate (CosMx, Xenium) and new computational approaches will surely arrive. The Nextflow pipeline will make the prospect of including new reference datasets and new computational methods easier.

    3. Reviewer #2 (Public Review):

      In this manuscript, Sang-aram et al. provide a systematic methodology and pipeline for benchmarking cell type deconvolution algorithms for spatial transcriptomic data analysis in a reproducible manner. They developed a tissue pattern simulator that starts from single-cell RNA-seq data to create silver standards and used spatial aggregation strategies from real in situ-based spatial technologies to obtain gold standards. By using several established metrics combined with different deconvolution challenges, they systematically scored and ranked 11 deconvolution methods and assessed both functional and usability criteria. Altogether, they present a reusable and extendable platform and reach very similar conclusions to other deconvolution benchmarking papers, including that RCTD, SpatialDWLS and Cell2location typically provide the best results.

      More specifically, the authors of this study sought to construct a methodology for benchmarking cell type deconvolution algorithms for spatial transcriptomic data analysis in a reproducible manner. The authors leveraged publicly available scRNA-seq, seqFISH, and STARMap datasets to create synthetic spatial datasets modeled after those of the Visium platform. It should be noted that the underlying experimental techniques of seqFISH and STARMap (in situ hybridization) do not parallel that of Visium (sequencing), which could bias the simulated data. Furthermore, to generate the ground truth datasets, cells and their corresponding count matrix are represented by simple centroids. Although this simplifies the analysis, it might not accurately reflect Visium spots, where cells could lie on a boundary and affect deconvolution results. In addition, the authors state that in the silver standard datasets one half of the scRNA-seq data was used for simulation and the other half was used as a reference for the algorithms, but the method of splitting the data, i.e., at random or proportionally by cell type, was not specified. Supplying optimal reference data is important to achieve the best performance, as the authors note in their conclusions.
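
      To make the two splitting options mentioned above concrete, here is a minimal sketch of a purely random half split versus a split that is proportional by cell type. It is an illustration under assumptions, not the authors' procedure (which the manuscript does not specify): the function name `split_half` and the toy labels are hypothetical.

```python
import numpy as np

def split_half(labels, rng, stratified):
    """Return a boolean mask: True = simulation half, False = reference half."""
    labels = np.asarray(labels)
    n = len(labels)
    sim_mask = np.zeros(n, dtype=bool)
    if stratified:
        # Proportional split: take half of the cells within each cell type
        for cell_type in np.unique(labels):
            idx = np.flatnonzero(labels == cell_type)
            chosen = rng.choice(idx, size=len(idx) // 2, replace=False)
            sim_mask[chosen] = True
    else:
        # Random split: take half of all cells, ignoring cell type
        chosen = rng.choice(n, size=n // 2, replace=False)
        sim_mask[chosen] = True
    return sim_mask

# Toy labels with one abundant and one rare cell type
labels = np.array(["abundant"] * 100 + ["rare"] * 10)
rng = np.random.default_rng(0)

for stratified in (False, True):
    mask = split_half(labels, rng, stratified)
    n_rare_sim = int(mask[labels == "rare"].sum())
    print(f"stratified={stratified}: rare cells in simulation half = {n_rare_sim}")
```

      With a random split, the number of rare cells landing in each half varies from run to run, whereas the proportional split fixes it at half of each type. This matters most for the rare cell type scenarios, where the reference may otherwise contain very few examples.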

      The authors thoroughly and rigorously compare methods while addressing situational discrepancies in model performance, indicative of a strong analysis. The authors make a point to address both inter- and intra- dataset reference handling, which has a significant impact on performance. Major strengths of the simulation engine include the ability to downsample and recapitulate several cell and tissue organization patterns.

      It's important to realize that deconvolution approaches are typically part of larger exploratory data analysis (EDA) efforts and require users to change parameters and input data multiple times. Furthermore, many users might not have access to more advanced computing infrastructure (e.g. GPU) and thus running time, computing needs, and scalability are probably key factors that researchers would like to consider when looking to deconvolve their datasets.

      The authors achieve their aim to benchmark different deconvolution methods and the results from their thorough analysis support the conclusions that many methods are still outperformed by bulk deconvolution methods. This study further informs the need for cell type deconvolution algorithms that can handle both cell abundance and rarity throughout a given tissue sample.

      The reproducibility of the methods described will have significant utility for researchers looking to develop cell type deconvolution algorithms, as this platform will allow simultaneous replication of the described analysis and comparison to new methods.

    4. Reviewer #3 (Public Review):

      The authors thoroughly evaluate the performance and scalability of existing cell-type deconvolution methods. The paper builds on the existing knowledge by considering the suitability of deconvolution algorithms in the context of more challenging analyses where rare cell types are present or when dealing with unmatched references or noise introduced by a highly abundant cell type within the data. The paper also presents a new simulation framework for spatial transcriptomics data to support their benchmarking effort.

      ● Major strengths and weaknesses of the methods and results.

      While most of the benchmarking studies rely on publicly available spatial transcriptomics datasets, one of the major strengths of the paper is the additional supporting evidence from their silver standard datasets. Using their simulator, synthspot, the authors generated abundant synthetic spatial transcriptomics data with replicates. In addition, the data generation process accounts for 9 different biological patterns to stay close to the quality of real data. The authors also communicated with the original authors of each benchmarked method to ensure correct implementation and optimal performance. Figure 2 provides a clear and concise summary of the benchmark results, which will be of great assistance to users who are contemplating conducting deconvolution analysis.

      The simulation setup has a significant weakness in the selection of reference single-cell RNAseq datasets used for generating synthetic spots. It is unclear why a mix of mouse and human scRNA-seq datasets were chosen, as this does not reflect a realistic biological scenario. This could call into question the findings of the "detecting rare cell types remains challenging even for top-performing methods" section of the paper, as the true "rare cell types" would not be as distinct as human skin cells in a mouse brain setting as simulated here. Furthermore, it is unclear why the authors developed Synthspot when other similar frameworks, such as SRTsim, exist. Have the authors explored other simulation frameworks? Finally, we would have appreciated the inclusion of tissue samples with more complex structures, such as those from tumors, where there may be more intricate mixing between cell types and spot types.

      The authors have effectively accomplished their objectives in benchmarking deconvolution methods by thoughtfully designing the experiments and selecting appropriate evaluation metrics. This paper will be highly beneficial for the community.

      This paper can provide guidance for selecting the most suitable deconvolution methods for user-defined scenarios of interest. Synthspot allows for generating more realistic artificial tissue data with specific spatial patterns and is integrated as part of an easy-to-use and adaptable Nextflow pipeline. It might be worthwhile to clearly differentiate this work from previous work in either the benchmarking area or the SRT data simulation area.

    1. Author Response:

      Reviewer #1 (Public Review):

      In this paper, Hui and colleagues investigate how the predictive accuracy of a polygenic score (PGS) for body mass index (BMI) changes when individuals are stratified by 62 different covariates. After showing that the PGS has different predictive power across strata for 18 out of 62 covariates, they turn to understanding why these differences arise and whether predictive performance could be improved. First, they investigated which types of covariates result in the largest differences in PGS predictive power, finding that covariates with larger "main effects" on the trait and covariates with larger interaction effects (interacting with the PGS to affect the trait) tend to better stratify individuals by PGS performance. The authors then test whether including interactions between the PGS and covariates improves predictive accuracy, finding that linear models result in only modest increases in performance but nonlinear models result in more substantial performance gains.

      Overall, the results are interesting and well-supported. The results will be broadly interesting to people using and developing PGS methods. Below I list some strengths and minor weaknesses.

      Strengths:

      A major impediment to the clinical use of PGS is the interaction between the PGS and various other routinely measured covariates, and this work provides a very interesting empirical study along these lines. The problem is interesting, and the work presented here is a convincing empirical study of the problem.

      The result that PGS accuracy differs across covariates, but in a way that is not well-captured by linear models with interactions is important for PGS method development.

      Thank you for all of the positive comments.

      Weakness:

      While arguably outside the scope of this paper, one shortcoming is the lack of a conceptual model explaining the results. It is interesting and empirically useful that PGS prediction accuracy differs across many covariates, but some of the results are hard to reconcile simultaneously. For example, it is interesting that triglyceride levels are associated with PGS performance across cohorts, but it seems like the effect on performance is discordant across datasets (Figure 2). Similarly, many of these effects have discordant (linear) interactions across cohorts (Figure 3). Overall it is surprising that the same covariates would be important but for presumably different reasons in different cohorts. Similarly, it would be good to discuss how the present results relate to the conceptual models in Mostafavi et al. (eLife 2020) and Zhu et al. (Cell Genomics 2023).

      Thank you for the comments. We agree that more generalizable explanations would be useful, which may be worth exploring in future work. Specifically, if there is heteroskedasticity in the relationship between PGS and BMI (e.g., phenotypic variance increases for higher values of BMI while PGS variance does not, or at least by a different amount), then that may partially explain the performance differences when stratifying by covariates that have main effects on BMI, somewhat similarly to what is presented in Figure 2 of Mostafavi et al. Such results may imply that similar performance differences could occur when stratifying by the phenotype itself, although this still may not explain differences in PGS effects, or differences in performance when using nonlinear methods (such as in this work and in Figure 4 of Zhu et al.). While we observe discordant effects for certain covariates across datasets, the findings from the correlation analyses use all cohorts and ancestries, and we expect that these differences in effects across datasets may be due to differences in the covariates' relationship with BMI across datasets (triglyceride levels may be especially noisy due to their sensitivity to fasting, which may have been controlled for differently across datasets).
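
      The heteroskedasticity argument above can be illustrated with a toy simulation. This is a sketch under stated assumptions and not an analysis from the manuscript: the data are simulated, the function name `r2_by_quintile` is hypothetical, and the noise model (residual standard deviation growing linearly with the covariate) is chosen only for illustration. The PGS effect is identical in every stratum, yet the PGS R² differs across covariate quintiles because the residual variance changes.

```python
import numpy as np

def r2_by_quintile(pgs, pheno, covariate):
    """Squared Pearson correlation of PGS and phenotype within each covariate quintile."""
    edges = np.quantile(covariate, [0.2, 0.4, 0.6, 0.8])
    bins = np.digitize(covariate, edges)  # quintile index 0..4 for each individual
    return [
        float(np.corrcoef(pgs[bins == q], pheno[bins == q])[0, 1] ** 2)
        for q in range(5)
    ]

rng = np.random.default_rng(0)
n = 20_000
pgs = rng.normal(size=n)          # simulated polygenic score
cov = rng.uniform(size=n)         # simulated covariate with a main effect on the trait
noise_sd = 0.5 + 2.0 * cov        # assumption: residual spread grows with the covariate
pheno = pgs + 2.0 * cov + rng.normal(size=n) * noise_sd

r2 = r2_by_quintile(pgs, pheno, cov)
print([round(v, 2) for v in r2])  # R² per quintile, lowest to highest covariate values
```

      In this setup the slope of the phenotype on the PGS is 1 in every quintile, so the decline in R² from the lowest to the highest quintile reflects the changing residual variance alone, not a change in the PGS effect.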

      Reviewer #2 (Public Review):

      This work follows in the footsteps of earlier work showing that BMI prediction accuracy can vary dramatically by context, even within a relatively ancestrally homogenous sample. This is an important observation that is worth the extension to different context variables and samples.

      Many of the follow-up analyses commendably try to take us a step further towards explaining the underlying observed trends of variable prediction accuracy for BMI. Some of these analyses, however, are somewhat confounded and hard to interpret.

      For example, many of the covariates by which the authors stratify the sample may drive range restriction effects. Further, the covariates considered could be causally affected by genotype and causally affect BMI, with reverse causality effects; other covariates may be partially causally affected by both genotype and BMI, resulting in collider bias. Finally, population structure differences between quintiles of a covariate may drive variable levels of stratification. These can bias estimation and confound interpretation, and at least one of them intuitively seems like a concern for each of the context variables (e.g., the covariates SES, LDL, diet, age, smoking, and alcohol drinking).

      The increased prediction accuracy observed with some of the age-dependent prediction models is notable. Despite the clear utility of this investigation, I am not aware of much existing work that shows such improvements for context-aware prediction models (compared to additive/main effect models). I would be curious to see if the predictive utility extends to held-out data from a data set distinct from the UKB, where the model was trained, or whether it replicates when predicting variation within families. Such analyses could strengthen the evidence for these models capturing direct causal effects, rather than other reasons for the associations existing in the UKB sample.

      Thank you for the comments. We agree there are certain biases that may be introduced in these analyses. For population structure between quintiles, the analyses are already stratified by ancestry and include the top 5 genetic principal components, which may help with this issue. In the interaction models we also included a separate term for the PGS of the covariate, which was meant to better capture the environmental component of the covariate and which may partially ameliorate issues of collider bias, as SNPs that causally affect both BMI and the covariate would be partially adjusted for. While range restriction effects could introduce bias, in the correlation analyses the main effects and interaction effects (which were estimated without range restriction) have strong and reproducible correlations with PGS R² differences across datasets.
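
      As a rough illustration of the kind of interaction model described above, the sketch below fits a linear model with a PGS term, a covariate term, their product, and a separate term for the PGS of the covariate. The exact model specification in the manuscript may differ: the data here are simulated, all variable names and effect sizes are hypothetical, and the ancestry stratification and genetic principal components are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

pgs_bmi = rng.normal(size=n)              # simulated PGS for BMI
pgs_cov = rng.normal(size=n)              # simulated PGS for the covariate
cov = 0.5 * pgs_cov + rng.normal(size=n)  # covariate = genetic part + environmental part

# Simulated phenotype with a true PGS x covariate interaction of 0.3
bmi = 1.0 * pgs_bmi + 0.5 * cov + 0.3 * pgs_bmi * cov + rng.normal(size=n)

# Design matrix: intercept, PGS, covariate, PGS x covariate, PGS of the covariate
X = np.column_stack([np.ones(n), pgs_bmi, cov, pgs_bmi * cov, pgs_cov])
beta, *_ = np.linalg.lstsq(X, bmi, rcond=None)

names = ["intercept", "PGS", "covariate", "PGS x covariate", "PGS of covariate"]
for name, b in zip(names, beta):
    print(f"{name:>18s}: {b: .3f}")
```

      Including the PGS of the covariate as its own column means the covariate coefficient is estimated conditional on that genetic component, which is the sense in which the covariate term is meant to better reflect its environmental component.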

      We agree the increased prediction performance using PGS created directly from GxAge GWAS effects is notable, as it is essentially a “free” performance increase that doesn’t require any new data, and it is likely generalizable to additional covariates. It would be useful to validate its performance in other datasets, especially ones that are outside of the 40-69 age range of UKBB.

      Reviewer #3 (Public Review):

      Polygenic scores (PGS), constructed based on genetic effect sizes estimated in genome-wide association studies (GWAS) and used to predict phenotypes in humans, have attracted considerable recent interest in human and evolutionary genetics, and in the social sciences. Recent work, however, has shown that PGSs have limited portability across ancestry groups, and that even within an ancestry group, their predictive accuracy varies markedly depending on characteristics such as the socio-economic status, age, and sex of the individuals in the samples used to construct them and to which they are applied. This study takes further steps in investigating and addressing the latter problem, focusing on body mass index, a phenotype of substantial biomedical interest. Specifically, it quantifies the effects of a large number of covariates and of interactions between these covariates and the PGS on prediction accuracy; it also examines the utility of including such covariates and interactions in the construction of predictors using both standard methods and artificial neural networks. This study would be of interest to investigators who develop and apply PGSs.

      I should add that I have not worked on PGSs and am not a statistician, and apologize in advance if this has led to some misunderstandings.

      Strengths:

      • The paper presents a much more comprehensive assessment of the effects of covariates than previous studies. It finds many covariates to have a substantial effect, which further highlights the importance of this problem to the development and application of PGSs for BMI and more generally.
      • The findings re the relationships between the effects of covariates and interactions between covariates and PGSs are, to the best of my knowledge, novel and interesting.
      • The development of predictors that account for multiple covariates and their interaction with the PGS is, to the best of my knowledge, novel and may prove useful in future efforts to produce reliable PGSs.
      • The improvement offered by the predictors that account for PGS and covariates using neural networks highlights the importance of non-linear interactions that are not addressed by standard methods, which is both interesting and likely to be of future utility.

      Thank you for the positive feedback.

      Weaknesses:

      • The paper would benefit substantially from extensive editing. It also uses terminology that is specific to recent literature on PGSs, thus limiting accessibility to a broader readership.
      • The potential meaning of most of the results is not explored. Some examples are provided below:
        • The paper emphasizes that 18/62 covariates examined show significant effects, but this result clearly depends on the covariates included. It would be helpful to provide more detail on how these covariates were chosen. Moreover, many of these covariates are likely to be correlated, making this result more difficult to interpret. Could these questions at least be partially addressed using the predictors constructed using all covariates and their interactions jointly (i.e., with LASSO)? In that regard, it would be helpful to know how many of the covariates and interactions were used in this predictor (I apologize if I missed that).
        • While the relationship between covariate effects and covariate-PGS interaction effects is intriguing, it is difficult to interpret without articulating what one would expect, i.e., what would be an appropriate null.
        • The finding that using artificial neural networks substantially improves prediction over more standard methods is especially intriguing, and highlights the potential importance of non-linear relationships between PGSs and covariates. These relationships remain hidden in a black box, however. Even fairly straightforward analyses, based on using different combinations of the PGS and/or covariates may shed some light on these relationships. For example, analyzing which covariates have a substantial effect on the prediction or varying one covariate at a time for different values of the PGS, etc.
      • The relationship to previous work should be discussed in greater detail.

      Thank you for the comments. Regarding running LASSO with all covariates along with each of their interactions with PGS in one model: upon rereading those sections of the text, we agree that they are a little unclear. However, we had already done something very similar (the related sections have been edited for clarity in our revised manuscript), with these results presented later on in the neural network section (second paragraph, S Table 7; those results specifically aren’t in Figure 5). We only looked at changes in prediction performance and did not try to interpret the model coefficients. We agree that many of the covariates are probably correlated, but based on the correlation results (Figure 4) it doesn’t seem like any covariate is especially important separately from its effect on BMI itself, i.e., whatever covariates were chosen by LASSO may still not be especially important. This explanation is related to the interpretation of the neural network results, where neural networks improved performance even over linear models with just age and sex and their interactions with PGS as additional covariates, which may suggest that the increased performance is coming from nonlinearities apart from multiplicative interaction effects with the PGS. So inspecting the coefficients from LASSO, which is still a linear model, may not substantially aid in explaining the relationships that increase prediction performance using neural networks (additionally, this analysis may be difficult to replicate since many of the covariates are not present in multiple datasets). Such a replication would nonetheless be nice to see in future studies if suitable datasets exist. In terms of the null relationship between covariate main and interaction effects, if they are from the same model they will inherently be correlated, but the main effects from Figure 4 are from a main effects model only. Regarding the other points, the text will be edited for clarity and elaboration on specific topics.

    2. eLife assessment

      This study presents a valuable analysis of the effects of covariates, such as age, sex, socio-economic status, or biomarker levels, on the predictive accuracy of polygenic scores for body mass index; it also presents approaches for improving prediction accuracy by accounting for such covariates. While the analyses are solid, the study falls short of providing a cogent interpretation of key findings, which could be of great interest and utility. The work will be of interest to people using and developing methods for phenotypic prediction based on polygenic scores.

    3. Reviewer #1 (Public Review):

      In this paper, Hui and colleagues investigate how the predictive accuracy of a polygenic score (PGS) for body mass index (BMI) changes when individuals are stratified by 62 different covariates. After showing that the PGS has different predictive power across strata for 18 out of 62 covariates, they turn to understanding why these differences arise and whether predictive performance could be improved. First, they investigated which types of covariates result in the largest differences in PGS predictive power, finding that covariates with larger "main effects" on the trait and covariates with larger interaction effects (interacting with the PGS to affect the trait) tend to better stratify individuals by PGS performance. The authors then test whether including interactions between the PGS and covariates improves predictive accuracy, finding that linear models result in only modest increases in performance but nonlinear models result in more substantial performance gains.

      Overall, the results are interesting and well-supported. The results will be broadly interesting to people using and developing PGS methods. Below I list some strengths and minor weaknesses.

      Strengths:

      A major impediment to the clinical use of PGS is the interaction between the PGS and various other routinely measured covariates, and this work provides a very interesting empirical study along these lines. The problem is interesting, and the work presented here is a convincing empirical study of the problem.

      The result that PGS accuracy differs across covariates, but in a way that is not well-captured by linear models with interactions is important for PGS method development.

      Weakness:

      While arguably outside the scope of this paper, one shortcoming is the lack of a conceptual model explaining the results. It is interesting and empirically useful that PGS prediction accuracy differs across many covariates, but some of the results are hard to reconcile simultaneously. For example, it is interesting that triglyceride levels are associated with PGS performance across cohorts, but it seems like the effect on performance is discordant across datasets (Figure 2). Similarly, many of these effects have discordant (linear) interactions across cohorts (Figure 3). Overall it is surprising that the same covariates would be important but for presumably different reasons in different cohorts. Similarly, it would be good to discuss how the present results relate to the conceptual models in Mostafavi et al. (eLife 2020) and Zhu et al. (Cell Genomics 2023).

    4. Reviewer #2 (Public Review):

      This work follows in the footsteps of earlier work showing that BMI prediction accuracy can vary dramatically by context, even within a relatively ancestrally homogenous sample. This is an important observation that is worth the extension to different context variables and samples.

      Many of the follow-up analyses commendably try to take us a step further towards explaining the underlying observed trends of variable prediction accuracy for BMI. Some of these analyses, however, are somewhat confounded and hard to interpret.

      For example, many of the covariates by which the authors stratify the sample may drive range restriction effects. Further, the covariates considered could be causally affected by genotype and causally affect BMI, with reverse causality effects; other covariates may be partially causally affected by both genotype and BMI, resulting in collider bias. Finally, population structure differences between quintiles of a covariate may drive variable levels of stratification. These can bias estimation and confound interpretation, and at least one of them intuitively seems like a concern for each of the context variables (e.g., the covariates SES, LDL, diet, age, smoking, and alcohol drinking).

      The increased prediction accuracy observed with some of the age-dependent prediction models is notable. Despite the clear utility of this investigation, I am not aware of much existing work that shows such improvements for context-aware prediction models (compared to additive/main effect models). I would be curious to see if the predictive utility extends to held-out data from a data set distinct from the UKB, where the model was trained, or whether it replicates when predicting variation within families. Such analyses could strengthen the evidence for these models capturing direct causal effects, rather than other reasons for the associations existing in the UKB sample.

    5. Reviewer #3 (Public Review):

      Polygenic scores (PGS), constructed based on genetic effect sizes estimated in genome-wide association studies (GWAS) and used to predict phenotypes in humans, have attracted considerable recent interest in human and evolutionary genetics, and in the social sciences. Recent work, however, has shown that PGSs have limited portability across ancestry groups, and that even within an ancestry group, their predictive accuracy varies markedly depending on characteristics such as the socio-economic status, age, and sex of the individuals in the samples used to construct them and to which they are applied. This study takes further steps in investigating and addressing the latter problem, focusing on body mass index, a phenotype of substantial biomedical interest. Specifically, it quantifies the effects of a large number of covariates and of interactions between these covariates and the PGS on prediction accuracy; it also examines the utility of including such covariates and interactions in the construction of predictors using both standard methods and artificial neural networks. This study would be of interest to investigators who develop and apply PGSs.

      I should add that I have not worked on PGSs and am not a statistician, and apologize in advance if this has led to some misunderstandings.

      Strengths:

      - The paper presents a much more comprehensive assessment of the effects of covariates than previous studies. It finds many covariates to have a substantial effect, which further highlights the importance of this problem to the development and application of PGSs for BMI and more generally.
      - The findings re the relationships between the effects of covariates and interactions between covariates and PGSs are, to the best of my knowledge, novel and interesting.
      - The development of predictors that account for multiple covariates and their interaction with the PGS is, to the best of my knowledge, novel and may prove useful in future efforts to produce reliable PGSs.
      - The improvement offered by the predictors that account for PGS and covariates using neural networks highlights the importance of non-linear interactions that are not addressed by standard methods, which is both interesting and likely to be of future utility.

      Weaknesses:

      - The paper would benefit substantially from extensive editing. It also uses terminology that is specific to recent literature on PGSs, thus limiting accessibility to a broader readership.
      - The potential meaning of most of the results is not explored. Some examples are provided below:
        • The paper emphasizes that 18/62 covariates examined show significant effects, but this result clearly depends on the covariates included. It would be helpful to provide more detail on how these covariates were chosen. Moreover, many of these covariates are likely to be correlated, making this result more difficult to interpret. Could these questions at least be partially addressed using the predictors constructed using all covariates and their interactions jointly (i.e., with LASSO)? In that regard, it would be helpful to know how many of the covariates and interactions were used in this predictor (I apologize if I missed that).
        • While the relationship between covariate effects and covariate-PGS interaction effects is intriguing, it is difficult to interpret without articulating what one would expect, i.e., what would be an appropriate null.
        • The finding that using artificial neural networks substantially improves prediction over more standard methods is especially intriguing, and highlights the potential importance of non-linear relationships between PGSs and covariates. These relationships remain hidden in a black box, however. Even fairly straightforward analyses, based on using different combinations of the PGS and/or covariates may shed some light on these relationships. For example, analyzing which covariates have a substantial effect on the prediction or varying one covariate at a time for different values of the PGS, etc.
      - The relationship to previous work should be discussed in greater detail.